mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00
Commit Graph

41195 Commits

Author SHA1 Message Date
Bruno Cardoso Lopes
6fbe7b9ddd Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes
llvm-svn: 112642
2010-08-31 21:15:21 +00:00
Dan Gohman
6822b9d177 Revert r112432. It appears to be exposing a problem in the emacs build.
llvm-svn: 112638
2010-08-31 20:58:44 +00:00
Owen Anderson
233463074b More cleanups of my JumpThreading transforms, including extracting some duplicated code into a helper function.
llvm-svn: 112634
2010-08-31 20:26:04 +00:00
Jakob Stoklund Olesen
d76e5132e7 Ignore unallocatable registers in RegAllocFast.
llvm-svn: 112632
2010-08-31 19:54:25 +00:00
Devang Patel
b94251aea0 Revert r112623. It is causing self host build failures.
llvm-svn: 112631
2010-08-31 19:41:03 +00:00
Owen Anderson
66b51ff843 Add an RAII helper to make cleanup of the RecursionSet more fool-proof.
llvm-svn: 112628
2010-08-31 19:24:27 +00:00
Owen Anderson
5e2c04e417 Only try to clean up the current block if we changed that block already.
llvm-svn: 112625
2010-08-31 18:55:52 +00:00
Jim Grosbach
9cc0a6397a When resolving frame references, SP-relative offsets need to be
adjusted by the local allocation size when determining whether they're
likely to be in range of the SP.

llvm-svn: 112624
2010-08-31 18:52:31 +00:00
Devang Patel
414cbc940a Remember byval argument's frame index during argument lowering and use this info to emit debug info.
Fixes Radar 8367011.

llvm-svn: 112623
2010-08-31 18:50:09 +00:00
Jim Grosbach
d0ebe535e9 this assert should just be a condition, since this function is just asking if
the offset is legally encodable, not actually trying to do the encoding.

llvm-svn: 112622
2010-08-31 18:49:31 +00:00
Owen Anderson
e2b5bd3a7f Refactor my fix for PR5652 to terminate the predecessor lookups after the first failure.
llvm-svn: 112620
2010-08-31 18:48:48 +00:00
Jim Grosbach
ddc265a982 Improve virtual frame base register allocation heuristics.
1. Allocate them in the entry block of the function to enable function-wide
   re-use. The instructions to create them should be re-materializable, so
   there shouldn't be additional cost compared to creating them locally in
   the basic blocks where they are used.
2. Collect all of the frame index references for the function and sort them
   by the local offset referenced. Iterate over the sorted list to
   allocate the virtual base registers. This enables creation of base
   registers optimized for positive-offset access of frame references;
   a sketch of the sorting step follows this entry. (Note: the sorting may
   later be better done via a target hook, in a target-appropriate manner.
   For now it's done here for simplicity.)

llvm-svn: 112609
2010-08-31 17:58:19 +00:00
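
A minimal sketch of the sorting step from item 2, with invented names
(FrameRef and LocalOffset are hypothetical, not the pass's actual types):

#include <algorithm>
#include <cstdint>
#include <vector>

struct FrameRef { int FrameIdx; int64_t LocalOffset; };

// Order frame-index references by the local offset they touch, so the
// virtual base registers allocated while walking the sorted list are
// positioned for small positive-offset access.
void sortFrameRefs(std::vector<FrameRef> &Refs) {
  std::sort(Refs.begin(), Refs.end(),
            [](const FrameRef &A, const FrameRef &B) {
              return A.LocalOffset < B.LocalOffset;
            });
}
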
Dan Gohman
47865eb626 Speculatively revert r112433.
llvm-svn: 112608
2010-08-31 17:56:47 +00:00
Benjamin Kramer
3f8b8c1f5b Allow creation of SHT_NULL sections, from Roman Divacky.
llvm-svn: 112605
2010-08-31 17:03:33 +00:00
Duncan Sands
2a1c11e104 Stop using the dom frontier in DwarfEHPrepare by not promoting alloca's
any more.  I plan to reimplement alloca promotion using SSAUpdater later.
It looks like Bill's URoR logic really always needs domtree, so the pass
now always asks for domtree info.

llvm-svn: 112597
2010-08-31 09:05:06 +00:00
Nick Lewycky
66fe124a92 Fix an infinite loop; merging two functions will create a new function (if the
two are weak, we make them thunks to a new strong function) so don't iterate
through the function list as we're modifying it.

Also add back the outermost loop which got removed during the cleanups.

llvm-svn: 112595
2010-08-31 08:29:37 +00:00
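
The hazard above is plain iterator invalidation; a generic hedged sketch
(names invented, not the pass's actual code):

#include <list>
#include <vector>

// Folding two weak functions appends a new strong function to the list
// being walked, so snapshot the list first and iterate the snapshot;
// appends to Container can no longer disturb the traversal.
template <typename T, typename Fn>
void processAll(std::list<T *> &Container, Fn Process) {
  std::vector<T *> Snapshot(Container.begin(), Container.end());
  for (T *Item : Snapshot)
    Process(Item);
}
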
Owen Anderson
bf12defee5 Don't perform an extra traversal of the function just to do cleanup. We can safely simplify instructions after each block has been processed without worrying about iterator invalidation.
llvm-svn: 112594
2010-08-31 07:55:56 +00:00
Bill Wendling
0409e77e99 - Cleanup some whitespaces.
- Convert {0,1} and friends into 0b01, which is identical and more consistent.

llvm-svn: 112593
2010-08-31 07:50:46 +00:00
Owen Anderson
ccaee65189 Rename ValuePropagation to a more descriptive CorrelatedValuePropagation.
llvm-svn: 112591
2010-08-31 07:48:34 +00:00
Owen Anderson
6853ce863c Rename file to something more descriptive.
llvm-svn: 112590
2010-08-31 07:41:39 +00:00
Owen Anderson
ba28fe3dcb More Chris-inspired JumpThreading fixes: use ConstantExpr to correctly constant-fold undef, and be more careful with its return value.
This actually exposed an infinite recursion bug in ComputeValueKnownInPredecessors which theoretically already existed (in JumpThreading's
handling of and/or of i1's), but never manifested before.  This patch adds a tracking set to prevent this case.

llvm-svn: 112589
2010-08-31 07:36:34 +00:00
Michael J. Spencer
ab565264fa Cleanup Whitespace.
llvm-svn: 112587
2010-08-31 06:36:46 +00:00
Michael J. Spencer
3bca7bf84d System: Fix getMagicNumber on Windows.
getMagicNumber was treating the _binary_ data it read in as a
null-terminated string. This made the std::string compute its own
length, causing an assert in other code that assumed the length it
passed in was the same as the length of the string it would get back.

llvm-svn: 112586
2010-08-31 06:36:33 +00:00
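
A minimal sketch of the bug class (a hypothetical helper, not the actual
System code): building a std::string from a bare pointer stops at the
first NUL byte, which binary data is full of.

#include <cassert>
#include <cstddef>
#include <string>

std::string readMagic(const char *Buf, std::size_t Len) {
  // Wrong: std::string Magic(Buf); -- strlen() stops at the first 0x00
  // byte, so the resulting length depends on the data, not on Len.
  // Right: pass the length explicitly so embedded NULs are preserved.
  std::string Magic(Buf, Len);
  assert(Magic.size() == Len && "caller gets exactly what it asked for");
  return Magic;
}
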
Devang Patel
a1ff33906b Offset is not always an unsigned number.
llvm-svn: 112584
2010-08-31 06:12:08 +00:00
Devang Patel
2eeab37306 Simplify.
llvm-svn: 112583
2010-08-31 06:11:28 +00:00
Nick Lewycky
75dfadbaf9 Switch to DenseSet, simplifying much more code. We now have a single iteration
where we hash, compare and fold, instead of one iteration where we build up
the hash buckets and a second one to fold.

llvm-svn: 112582
2010-08-31 05:53:05 +00:00
Owen Anderson
bd9edea8a3 Remove r111665, which implemented store-narrowing in InstCombine. Chris discovered a miscompilation in it, and it's not easily
fixable at the optimizer level. I'll investigate reimplementing it in DAGCombine.

llvm-svn: 112575
2010-08-31 04:41:06 +00:00
Bruno Cardoso Lopes
ebe80d78ff zap unused method. x86 was the only user and already has a more powerful version
llvm-svn: 112571
2010-08-31 02:36:20 +00:00
Bruno Cardoso Lopes
08d5d62dcb Use X86ISD::MOVSS and MOVSD to represent the movl mask pattern, also fix the handling of those nodes when seeking for scalars inside vector shuffles
llvm-svn: 112570
2010-08-31 02:26:40 +00:00
Eric Christopher
b2756a8b99 Rewrite slightly so we can expand for floating point types easier.
llvm-svn: 112568
2010-08-31 01:28:42 +00:00
Jakob Stoklund Olesen
6fa8a6ac6b Add experimental -disable-physical-join command line option.
Eventually, we want to disable physreg coalescing completely, and let the
register allocator do its job using hints.

This option makes it possible to measure the impact of disabling physreg
coalescing.

llvm-svn: 112567
2010-08-31 01:27:49 +00:00
Owen Anderson
f1cec75012 Fix a typo.
llvm-svn: 112560
2010-08-30 23:59:30 +00:00
Eric Christopher
21b355b522 If we have an unhandled type then assert; we shouldn't get here for
things we can't handle.

llvm-svn: 112559
2010-08-30 23:48:26 +00:00
Owen Anderson
c90a98e0a5 Cleanups suggested by Chris.
llvm-svn: 112553
2010-08-30 23:34:17 +00:00
Owen Anderson
af33f22b40 Re-apply r112539, being more careful to respect the return values of the constant folding methods. Additionally,
use the ConstantExpr::get*() methods to simplify some constant folding.

llvm-svn: 112550
2010-08-30 23:22:36 +00:00
Anton Korobeynikov
851437063a Expand MOVi32imm in ARM mode after regalloc. This provides
scheduling opportunities (extra instruction can go in between
MOVT / MOVW pair removing the stall).

llvm-svn: 112546
2010-08-30 22:50:36 +00:00
Owen Anderson
8e72b2f3f5 Add statistics to evaluate this pass.
llvm-svn: 112545
2010-08-30 22:45:55 +00:00
Owen Anderson
9d301e20e8 Revert r112539. It accidentally introduced a miscompilation.
llvm-svn: 112543
2010-08-30 22:33:41 +00:00
Owen Anderson
479c0c406f Fixes and cleanups pointed out by Chris. In general, be careful to handle 0 results from ComputeValueKnownInPredecessors
(indicating undef), and re-use existing constant folding APIs.

llvm-svn: 112539
2010-08-30 22:07:52 +00:00
Bill Wendling
7532e3418e Use the existing T2I_bin_s_irs pattern instead of creating T2I_bin_sw_irs, which
is meant to do exactly the same thing. Thanks to Jim Grosbach for pointing this
out! :-)

llvm-svn: 112538
2010-08-30 22:05:23 +00:00
NAKAMURA Takumi
2c7cfc7bd4 Fix a comment.
llvm-svn: 112535
2010-08-30 21:54:03 +00:00
Jakob Stoklund Olesen
ce3cfe3e8b Remember to clear the shadow kill flag at the same time as clearing the real
kill flag.

This could cause duplicate kill flags when the same register was used twice in a
continuous sequence of STRs.

There is no small test case. <rdar://problem/8218046>

llvm-svn: 112534
2010-08-30 21:52:40 +00:00
Dan Gohman
6909a51674 Add comments explaining why it's not necessary to include the
is-function-local flag in metadata uniquing bits.

llvm-svn: 112528
2010-08-30 21:18:41 +00:00
Bob Wilson
826a677f94 Remove NEON vmovn intrinsic, replacing it with vector truncate operations.
Auto-upgrade the old intrinsic and update tests.

llvm-svn: 112507
2010-08-30 20:02:30 +00:00
Jim Grosbach
674b25ce31 Make ARM add rN, sp, #imm instructions rematerializable. That's how the address of locals is calculated, so this should
help relieve register pressure a bit. Recalculating the local address is
almost always going to be better than spilling.

llvm-svn: 112503
2010-08-30 19:49:58 +00:00
Chris Lattner
765e59210c two changes:
1) nuke ConstDataCoalSection, which is dead.
2) revise my previous patch for rdar://8018335,
  which was completely wrong.  Specifically, it doesn't 
  make sense to mark __TEXT,__const_coal as PURE_INSTRUCTIONS,
  because it is for readonly data.  templates (it turns out)
  go to const_coal_nt.  The real fix for rdar://8018335 was
  to give ConstTextCoalSection a section kind of ReadOnly 
  instead of Text.

llvm-svn: 112496
2010-08-30 18:12:35 +00:00
Bob Wilson
2b83684be8 When expanding NEON VST pseudo instructions, if the original super-register
operand is killed, add it to the expanded instruction as an implicit kill
operand instead of marking the individual subregs with kill flags.  This
should work better in general and also handles the case for VST3 where one
of the subregs was not referenced in the expanded instruction and so was
not marked killed.

llvm-svn: 112494
2010-08-30 18:10:48 +00:00
Benjamin Kramer
6c5076f317 MCELF: The value of all common symbols is the offset from the start of the section. Patch by Roman Divacky.
llvm-svn: 112492
2010-08-30 17:20:17 +00:00
Owen Anderson
d29fb4b991 It is possible to try to merge a non-constant with a constant range when dealing with ptrtoint ConstantExpr's.
Unfortunately, the only testcase I have for this is huge and doesn't reduce well, because the error is
sensitive to iteration-order issues: the problem only occurs when merging values in a particular order.

llvm-svn: 112489
2010-08-30 17:03:45 +00:00
Benjamin Kramer
b63bf8cf9b Don't print two "0x" prefixes. Use a raw_ostream overload instead of llvm::format.
llvm-svn: 112479
2010-08-30 14:46:53 +00:00
NAKAMURA Takumi
b5613b9f12 EE/JIT: Do not invoke parent's ctors/dtors from main()! (PR3897)
On MinGW and Cygwin, the symbol __main resolves to the callee's (e.g.
tools/lli), invoking the wrong, duplicated ctors (and registering the
wrong callee's dtors with atexit(3)). We expect the callee to call
ExecutionEngine::runStaticConstructorsDestructors() before
ExecutionEngine::runFunctionAsMain().

llvm-svn: 112474
2010-08-30 14:00:29 +00:00
Benjamin Kramer
b540b09a7c The value is the offset from the start of the section for non-common symbols. Submitted by Jordan Gordeev.
llvm-svn: 112473
2010-08-30 12:00:16 +00:00
Benjamin Kramer
ca65cc9222 Index external symbols by symbol table instead of parent section, by Roman Divacky.
llvm-svn: 112472
2010-08-30 11:59:29 +00:00
Benjamin Kramer
3aabb3eb53 Mark all common symbols external. This is not exactly correct but it lets apps
link for now and can be adjusted later. Patch by Roman Divacky.

llvm-svn: 112471
2010-08-30 11:56:55 +00:00
Duncan Sands
3e49fe09db Remove a hack that tries to understand incorrect triples from the
Triple class constructor.  Only valid triples should now be used
inside LLVM - front-ends are now responsible for rejecting or
correcting invalid target triples.  The Triple::normalize method
can be used to straighten out funky triples provided by users.
Give this a whirl through the buildbots to see if I caught all
places where triples enter LLVM.

llvm-svn: 112470
2010-08-30 10:57:54 +00:00
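
For example, a front-end can straighten a user-supplied triple before it
enters LLVM (a sketch; at this point in the tree Triple lives in
llvm/ADT/Triple.h):

#include "llvm/ADT/Triple.h"
#include <string>

// normalize() returns the straightened spelling; feed that to Triple.
llvm::Triple makeTriple(const std::string &UserTriple) {
  return llvm::Triple(llvm::Triple::normalize(UserTriple));
}
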
Bill Wendling
999c8b219d Revert r112461. It was failing on PPC...
llvm-svn: 112463
2010-08-30 04:36:50 +00:00
Bill Wendling
c325a15569 Create Thumb2sI_cpsr and T2sI_cpsr. These new classes indicate that CPSR is the
optional modified register (instead of reg0). Along with r112461 it will make
sure that the optional define of CPSR is marked as "def" and will thus mark the
instructions using these classes (t2ANDS*) as setting the 's' flag.

llvm-svn: 112462
2010-08-30 01:47:35 +00:00
Bill Wendling
450e009b5e When adding a register, we should mark it as "def" if it can optionally define
said (physical) register.

llvm-svn: 112461
2010-08-30 01:36:05 +00:00
Chris Lattner
8331070df0 revert 112457, it looks like it broke selfhost.
llvm-svn: 112459
2010-08-29 22:28:18 +00:00
Chris Lattner
92879a5ba1 rewrite DwarfEHPrepare to use SSAUpdater to promote its allocas
instead of PromoteMemToReg.  This allows it to stop using DF and DT,
eliminating a computation of DT and DF from clang -O3.  Clang is now
down to 2 runs of DomFrontier.

llvm-svn: 112457
2010-08-29 19:54:28 +00:00
Chris Lattner
a38548a56d inline function into its only caller.
llvm-svn: 112455
2010-08-29 19:28:28 +00:00
Chris Lattner
9e35d96cea two changes: 1) make AliasSet hold the list of call sites with an
assertingvh so we get a violent explosion if the pointer dangles.

2) Fix AliasSetTracker::deleteValue to remove call sites with
   by-pointer comparisons instead of by-alias queries.  Using
   findAliasSetForCallSite can cause alias sets to get merged
   when they shouldn't, and can also miss alias sets when the
   call is readonly.

#2 fixes PR6889, which only repros with a .c file :(

llvm-svn: 112452
2010-08-29 18:42:23 +00:00
Chris Lattner
51639dea34 LICM does get dead instructions input to it. Instead of sinking them
out of loops, just delete them.

llvm-svn: 112451
2010-08-29 18:22:25 +00:00
Chris Lattner
aac9263929 use moveBefore instead of remove+insert; it avoids some
symtab manipulation, so it's faster (in addition to being
more elegant)

llvm-svn: 112450
2010-08-29 18:18:40 +00:00
Chris Lattner
5c31ee4663 revert 112448 for now.
llvm-svn: 112449
2010-08-29 18:11:16 +00:00
Chris Lattner
8e922b170c optimize LICM::hoist to use moveBefore. Correct its updating
of AST to remove the hoisted instruction from the AST, since it
is no longer in the loop.

llvm-svn: 112448
2010-08-29 18:03:33 +00:00
Chris Lattner
150bdce5c1 fix some bugs (found by inspection) where LICM would not update
the alias set tracker (AST) correctly.  When sinking an instruction, it
should not add entries for the sunk instruction to the AST; it should
remove the entry for the sunk instruction.  The blocks being sunk to
are not in the loop, so their instructions shouldn't be in the
AST (yet)!

llvm-svn: 112447
2010-08-29 18:00:00 +00:00
Chris Lattner
a3b10c0752 rework the ownership of subloop alias information: instead of
keeping them around until the pass is destroyed, keep them
around a) just when useful (not for outer loops) and b) destroy
them right after we use them.  This should reduce memory use
and fixes potential bugs where a loop is deleted and another
loop gets allocated to the same address.

llvm-svn: 112446
2010-08-29 17:46:00 +00:00
Chris Lattner
4cdad46980 apparently unswitch had the same "feature". Stop it
claiming to preserve domfrontier when it really doesn't.

llvm-svn: 112445
2010-08-29 17:23:19 +00:00
Chris Lattner
7181337888 now that loop passes don't use DomFrontier, there is no reason
for the unroller to pretend it supports updating it.  It still
has a horrible hack for DomTree.

llvm-svn: 112444
2010-08-29 17:21:35 +00:00
Dan Gohman
9ea315c5ca Make IVUsers iterative instead of recursive.
This has the side effect of reversing the order of most of
IVUser's results.

llvm-svn: 112442
2010-08-29 16:40:03 +00:00
Dan Gohman
d7ad97b6f5 Optionally rerun dedicated-register filtering after applying
other filtering techniques, as those may allow it to filter
out more obviously unprofitable candidates.

llvm-svn: 112441
2010-08-29 16:39:22 +00:00
Dan Gohman
d5e0d45b20 Fix several areas in LSR to do a better job keeping the main
LSRInstance data structures up to date. This fixes some
pessimizations caused by stale data which will be exposed
in an upcoming change.

llvm-svn: 112440
2010-08-29 16:32:54 +00:00
Dan Gohman
b62347f067 Refactor the three main groups of code out of
NarrowSearchSpaceUsingHeuristics into separate functions.

llvm-svn: 112439
2010-08-29 16:09:42 +00:00
Dan Gohman
7b45482eab Delete a bogus check.
llvm-svn: 112438
2010-08-29 15:30:29 +00:00
Dan Gohman
bde89495fa Add some comments.
llvm-svn: 112437
2010-08-29 15:27:08 +00:00
Dan Gohman
185c024c53 Move this debug output into GenerateAllReuseFormula, to declutter
the high-level logic.

llvm-svn: 112436
2010-08-29 15:21:38 +00:00
Dan Gohman
5023e9f408 Delete an unused declaration.
llvm-svn: 112435
2010-08-29 15:19:11 +00:00
Dan Gohman
defdc9d59c Do one lookup instead of two.
llvm-svn: 112434
2010-08-29 15:18:49 +00:00
Dan Gohman
4e9013673c Restructure the {A,+,B}<L> * {C,+,D}<L> folding so that it folds
all applicable addrecs before recursing on getMulExpr, instead of
recursing on getMulExpr for each one.

llvm-svn: 112433
2010-08-29 15:16:58 +00:00
Dan Gohman
88b40ae04a Batch up subtracts along with adds, when analyzing long chains of
operations.

llvm-svn: 112432
2010-08-29 15:10:06 +00:00
Dan Gohman
988c90a5e1 Micro-optimize GroupByComplexity.
llvm-svn: 112431
2010-08-29 15:07:13 +00:00
Dan Gohman
690916c085 Hold AddRec->getLoop() in a variable, to make the Mul code more consistent
with the Add code.

llvm-svn: 112430
2010-08-29 14:55:19 +00:00
Dan Gohman
5a3ad9c218 Rename a variable, for consistency.
llvm-svn: 112429
2010-08-29 14:53:34 +00:00
Dan Gohman
d581ece72a Use iterators instead of indices.
llvm-svn: 112428
2010-08-29 14:52:02 +00:00
Kalle Raiskila
daba4ffc75 Fix lowering of INSERT_VECTOR_ELT in SPU.
The IDX was treated as a byte index, not an element index.

llvm-svn: 112422
2010-08-29 12:41:50 +00:00
Bill Wendling
8a7258d771 Fix whitespaces. No functionality changes.
llvm-svn: 112421
2010-08-29 11:31:07 +00:00
Chris Lattner
270d50d48c licm preserves the cfg, so it doesn't have to explicitly say it
preserves domfrontier.  It does preserve AA though.

llvm-svn: 112419
2010-08-29 07:02:56 +00:00
Chris Lattner
40ec41d1c4 now that it doesn't use the PromoteMemToReg function, LICM doesn't
require DomFrontier.  Dropping this doesn't actually save any runs
of the pass though.

llvm-svn: 112418
2010-08-29 06:49:44 +00:00
Chris Lattner
fe3e4cdd30 completely rewrite the memory promotion algorithm in LICM.
Among other things, this uses SSAUpdater instead of 
PromoteMemToReg.

llvm-svn: 112417
2010-08-29 06:43:52 +00:00
Bob Wilson
807d004452 Remove NEON vaddl, vaddw, vsubl, and vsubw intrinsics. Instead, use llvm
IR add/sub operations with one or both operands sign- or zero-extended.
Auto-upgrade the old intrinsics.

llvm-svn: 112416
2010-08-29 05:57:34 +00:00
Chris Lattner
9921d9c3c1 use getUniqueExitBlocks instead of a manual set.
llvm-svn: 112412
2010-08-29 05:12:21 +00:00
Eli Friedman
6ccafafe61 A couple of small missed optimizations.
llvm-svn: 112411
2010-08-29 05:07:40 +00:00
Chris Lattner
2133f877c6 reimplement LICM::sink to use SSAUpdater instead of PromoteMemToReg.
This leads to much simpler code.

llvm-svn: 112410
2010-08-29 04:55:06 +00:00
Chris Lattner
fac07b1dca implement SSAUpdater::RewriteUseAfterInsertions, a helpful form of RewriteUse.
llvm-svn: 112409
2010-08-29 04:54:06 +00:00
Chris Lattner
24927beaff remove dead proto
llvm-svn: 112408
2010-08-29 04:53:24 +00:00
Chris Lattner
65ce6da2f1 reduce indentation in LICM::sink by using early exits, and use
getUniqueExitBlocks instead of getExitBlocks plus a manual
set to eliminate dupes.

llvm-svn: 112405
2010-08-29 04:28:20 +00:00
Chris Lattner
4928fe010e modernize this pass a bit: use efficient set/map and reduce indentation.
llvm-svn: 112404
2010-08-29 04:23:04 +00:00
Chris Lattner
c8947a83e3 when merging two alias sets, the result set is volatile if either
of the sets is volatile.  We were dropping the volatile bit of the
merged in set, leading (luckily) to assertions in cases like 
PR7535.  I cannot produce a testcase that repros with opt, but this
is obviously correct.

llvm-svn: 112402
2010-08-29 04:14:47 +00:00
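
The invariant is just that volatility is sticky under merging; a hedged
sketch with made-up names, not the AliasSetTracker API:

struct AliasSetLike {
  bool Volatile;
  // ... pointers, call sites ...
};

// When Other is merged away into Dest, the result must be volatile if
// either input was; dropping Other's bit was the bug.
void mergeSetInto(AliasSetLike &Dest, const AliasSetLike &Other) {
  Dest.Volatile |= Other.Volatile;
  // ... merge the remaining members ...
}
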
Chris Lattner
741da104b4 more cleanup
llvm-svn: 112401
2010-08-29 04:13:43 +00:00
Chris Lattner
db816c533c clean this up
llvm-svn: 112400
2010-08-29 04:06:55 +00:00
Bill Wendling
6d105ce757 - Add a parameter to T2I_bin_irs for those patterns which set the S bit.
- Create T2I_bin_sw_irs to be like T2I_bin_w_irs, but that it sets the S bit.

llvm-svn: 112399
2010-08-29 03:55:31 +00:00
Chris Lattner
646fee99c3 add a bunch more common shuffles to the instprinter.
llvm-svn: 112397
2010-08-29 03:08:08 +00:00
Bill Wendling
8ad57ff92e Rename ANDflag to ANDS, which is less stupid.
llvm-svn: 112395
2010-08-29 03:06:09 +00:00
Bill Wendling
6e586677a7 File missing from last commit.
llvm-svn: 112394
2010-08-29 03:02:28 +00:00
Bill Wendling
385ad1516f Create an ARMISD::AND node. This node is exactly like the "ARM::AND" node, but
it sets the CPSR register.

llvm-svn: 112393
2010-08-29 03:02:11 +00:00
Chris Lattner
56bc8ba493 I have manually decoded the imm field of an insertps one too many
times.  This patch causes llc and llvm-mc (which both default to
verbose-asm) to print out comments after a few common shuffle 
instructions which indicates the shuffle mask, e.g.:

	insertps	$113, %xmm3, %xmm0     ## xmm0 = zero,xmm0[1,2],xmm3[1]
	unpcklps	%xmm1, %xmm0    ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
	pshufd	$1, %xmm1, %xmm1        ## xmm1 = xmm1[1,0,0,0]

This is carefully factored to keep the information extraction (of the
shuffle mask) separate from the printing logic.  I plan to move the
extraction part out somewhere else at some point for other parts of
the x86 backend that want to introspect on the behavior of shuffles.

llvm-svn: 112387
2010-08-28 20:42:31 +00:00
Chris Lattner
8cb4abbc0e fix the buildvector->insertp[sd] logic to not always create a redundant
insertp[sd] $0, which is a noop.  Before:

_f32:                                   ## @f32
	pshufd	$1, %xmm1, %xmm2
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm2, %xmm3
	addss	%xmm1, %xmm0
                                        ## kill: XMM0<def> XMM0<kill> XMM0<def>
	insertps	$0, %xmm0, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

after:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movdqa	%xmm2, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

The extra movs are due to a random (poor) scheduling decision.

llvm-svn: 112379
2010-08-28 17:59:08 +00:00
Chris Lattner
c3b630d64b fix the BuildVector -> unpcklps logic to not do pointless shuffles
when the top elements of a vector are undefined.  This happens all
the time for X86-64 ABI stuff because only the low 2 elements of
a 4 element vector are defined.  For example, on:

_Complex float f32(_Complex float A, _Complex float B) {
  return A+B;
}

We used to produce (with SSE2, SSE4.1+ uses insertps):

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$16, %xmm2, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm1
	movdqa	%xmm2, %xmm0
	unpcklps	%xmm1, %xmm0
	ret

We now produce:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movaps	%xmm2, %xmm0
	unpcklps	%xmm3, %xmm0
	ret

This implements rdar://8368414

llvm-svn: 112378
2010-08-28 17:28:30 +00:00
Chris Lattner
7fa5fa1207 improve comments in the unpcklps generating logic, introduce
a new EltStride variable instead of reusing NumElems variable
for a non-obvious purpose.  No functionality change.

llvm-svn: 112377
2010-08-28 17:15:43 +00:00
Michael J. Spencer
5cb73a9fb8 Don't cast Win32 FILETIME structs to int64. Patch by Dimitry Andric!
According to the Microsoft documentation here:
http://msdn.microsoft.com/en-us/library/ms724284%28VS.85%29.aspx

this cast used in lib/System/Win32/Path.inc:

__int64 ft = *reinterpret_cast<__int64*>(&fi.ftLastWriteTime);

should not be done.  The documentation says: "Do not cast a pointer to a
FILETIME structure to either a ULARGE_INTEGER* or __int64* value because
it can cause alignment faults on 64-bit Windows."

llvm-svn: 112376
2010-08-28 16:39:32 +00:00
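
The same MSDN page spells out the safe pattern: copy the two 32-bit
halves into a ULARGE_INTEGER instead of type-punning the pointer.
Roughly:

#include <windows.h>

ULONGLONG fileTimeToInt64(const FILETIME &FT) {
  ULARGE_INTEGER ULL;
  ULL.LowPart = FT.dwLowDateTime;   // no misaligned 64-bit access
  ULL.HighPart = FT.dwHighDateTime;
  return ULL.QuadPart;
}
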
Chris Lattner
d16c80e27f remove the MSIL backend. It isn't maintained, is buggy, has no testcases
and hasn't kept up with ToT.  Approved by Anton.

llvm-svn: 112375
2010-08-28 16:33:36 +00:00
Bob Wilson
956e07b985 Use pseudo instructions for VST1 and VST2.
llvm-svn: 112357
2010-08-28 05:12:57 +00:00
Chris Lattner
ecf276b787 remove unions from LLVM IR. They are severely buggy and not
being actively maintained, improved, or extended.

llvm-svn: 112356
2010-08-28 04:09:24 +00:00
Chris Lattner
4b49ada02c remove the ABCD and SSI passes. They don't have any clients that
I'm aware of, aren't maintained, and LVI will be replacing their value.
nlewycky approved this on irc.

llvm-svn: 112355
2010-08-28 03:51:24 +00:00
Chris Lattner
e7afb6fbb0 remove dead proto
llvm-svn: 112354
2010-08-28 03:45:03 +00:00
Chris Lattner
fc1da78d16 for completeness, allow undef also.
llvm-svn: 112351
2010-08-28 03:36:51 +00:00
Chris Lattner
b2dbdbc795 squish dead code.
llvm-svn: 112350
2010-08-28 03:21:03 +00:00
Chris Lattner
fcf6250d57 zap dead code
llvm-svn: 112349
2010-08-28 03:18:45 +00:00
Bruno Cardoso Lopes
1052e6d5d9 Clean up the logic of vector shuffles -> vector shifts.
Also teach this logic how to handle target specific shuffles if
needed, this is necessary while searching recursively for zeroed
scalar elements in vector shuffle operands.

llvm-svn: 112348
2010-08-28 02:46:39 +00:00
Chris Lattner
b61cf1e296 handle the constant case of vector insertion. For something
like this:

struct S { float A, B, C, D; };

struct S g;
struct S bar() { 
  struct S A = g;
  ++A.B;
  A.A = 42;
  return A;
}

we now generate:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	12(%rax), %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	unpcklps	%xmm0, %xmm1
	addss	LCPI1_0(%rip), %xmm2
	pshufd	$16, %xmm2, %xmm2
	movss	LCPI1_1(%rip), %xmm0
	pshufd	$16, %xmm0, %xmm0
	unpcklps	%xmm2, %xmm0
	ret

instead of:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	12(%rax), %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	unpcklps	%xmm0, %xmm1
	addss	LCPI1_0(%rip), %xmm2
	movd	%xmm2, %eax
	shlq	$32, %rax
	addq	$1109917696, %rax       ## imm = 0x42280000
	movd	%rax, %xmm0
	ret

llvm-svn: 112345
2010-08-28 01:50:57 +00:00
Chris Lattner
c70b0c0ee7 optimize bitcasts from large integers to vector into vector
element insertion from the pieces that feed into the vector.
This handles a pattern that occurs frequently due to code
generated for the x86-64 abi.  We now compile something like
this:

struct S { float A, B, C, D; };
struct S g;
struct S bar() { 
  struct S A = g;
  ++A.A;
  ++A.C;
  return A;
}

into all nice vector operations:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	LCPI1_0(%rip), %xmm1
	movss	(%rax), %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	12(%rax), %xmm3
	pshufd	$16, %xmm2, %xmm2
	unpcklps	%xmm2, %xmm0
	addss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	pshufd	$16, %xmm3, %xmm2
	unpcklps	%xmm2, %xmm1
	ret

instead of icky integer operations:

_bar:                                   ## @bar
	movq	_g@GOTPCREL(%rip), %rax
	movss	LCPI1_0(%rip), %xmm1
	movss	(%rax), %xmm0
	addss	%xmm1, %xmm0
	movd	%xmm0, %ecx
	movl	4(%rax), %edx
	movl	12(%rax), %esi
	shlq	$32, %rdx
	addq	%rcx, %rdx
	movd	%rdx, %xmm0
	addss	8(%rax), %xmm1
	movd	%xmm1, %eax
	shlq	$32, %rsi
	addq	%rax, %rsi
	movd	%rsi, %xmm1
	ret

This resolves rdar://8360454

llvm-svn: 112343
2010-08-28 01:20:38 +00:00
Dan Gohman
507f5a8ae7 Completely disable tail calls when fast-isel is enabled, as fast-isel
doesn't currently support dealing with this.

llvm-svn: 112341
2010-08-28 00:51:03 +00:00
Dan Gohman
ffab4c6a7d Trim a #include.
llvm-svn: 112340
2010-08-28 00:49:13 +00:00
Dan Gohman
aa972a1ee3 Fix an index calculation thinko.
llvm-svn: 112337
2010-08-28 00:39:27 +00:00
Bob Wilson
abdcae7f20 We don't need to custom-select VLDMQ and VSTMQ anymore.
llvm-svn: 112336
2010-08-28 00:20:11 +00:00
Benjamin Kramer
edb09ef2df Update CMake build. Add newline at end of file.
llvm-svn: 112332
2010-08-28 00:11:12 +00:00
Bob Wilson
412a170b04 When merging Thumb2 loads/stores, do not give up when the offset is one of
the special values that for ARM would be used with IB or DA modes.  Fall
through and consider materializing a new base address if it would be
profitable.

llvm-svn: 112329
2010-08-27 23:57:52 +00:00
Owen Anderson
dc4703bcd5 Add a prototype of a new peephole optimizing pass that uses LazyValue info to simplify PHIs and select's.
This pass addresses the missed optimizations from PR2581 and PR4420.

llvm-svn: 112325
2010-08-27 23:31:36 +00:00
Owen Anderson
f2255ee253 Improve the precision of getConstant().
llvm-svn: 112323
2010-08-27 23:29:38 +00:00
Bob Wilson
31d487d235 Change ARM VFP VLDM/VSTM instructions to use addressing mode #4, just like
all the other LDM/STM instructions.  This fixes asm printer crashes when
compiling with -O0.  I've changed one of the NEON tests (vst3.ll) to run
with -O0 to check this in the future.

Prior to this change VLDM/VSTM used addressing mode #5, but not really.
The offset field was used to hold a count of the number of registers being
loaded or stored, and the AM5 opcode field was expanded to specify the IA
or DB mode, instead of the standard ADD/SUB specifier.  Much of the backend
was not aware of these special cases.  The crashes occurred when rewriting
a frameindex caused the AM5 offset field to be changed so that it did not
have a valid submode.  I don't know exactly what changed to expose this now.
Maybe we've never done much with -O0 and NEON.  Regardless, there's no longer
any reason to keep a count of the VLDM/VSTM registers, so we can use
addressing mode #4 and clean things up in a lot of places.

llvm-svn: 112322
2010-08-27 23:18:17 +00:00
Chris Lattner
3f880c2097 Enhance the shift propagator to handle the case when you have:
A = shl x, 42
...
B = lshr ..., 38

which can be transformed into:
A = shl x, 4
...

iff we can prove that the would-be-shifted-in bits
are already zero.  This eliminates two shifts in the testcase
and allows elimination of the whole i128 chain in the real example.

llvm-svn: 112314
2010-08-27 22:53:44 +00:00
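
Worked out by hand for 64-bit values (an illustration, not the pass's
code): shl-by-42 keeps only the low 22 bits of x, and the lshr-by-38
moves them down to bits 4..25, so the pair is (x & 0x3FFFFF) << 4.
When bits 22 and up of x are provably zero, the mask is a no-op:

#include <cstdint>

uint64_t before(uint64_t X) {
  uint64_t A = X << 42; // low 22 bits of X land in bits 42..63
  return A >> 38;       // and come back down to bits 4..25
}

// Valid replacement iff bits 22..63 of X are known zero.
uint64_t after(uint64_t X) { return X << 4; }
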
Devang Patel
eb68981283 Simplify.
llvm-svn: 112305
2010-08-27 22:25:51 +00:00
Chris Lattner
80632e5fd9 Implement a pretty general logical shift propagation
framework, which is good at ripping through bitfield
operations.  This generalize a bunch of the existing
xforms that instcombine does, such as 
  (x << c) >> c -> and
to handle intermediate logical nodes.  This is useful for
ripping up the "promote to large integer" code produced by
SRoA.

llvm-svn: 112304
2010-08-27 22:24:38 +00:00
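
The classic base case this generalizes, as a small illustration:

#include <cstdint>

// On an unsigned 32-bit value, (x << 24) >> 24 clears the top 24 bits,
// so the shift pair folds to a single mask: x & 0xFF. The new framework
// handles logical ops sitting between the two shifts as well.
uint32_t extractLowByte(uint32_t X) {
  return (X << 24) >> 24;
}
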
Bob Wilson
09b040a386 Unsigned value cannot be < 0.
llvm-svn: 112300
2010-08-27 21:44:35 +00:00
Dan Gohman
ee0a450648 When merging adjacent operands, scan ahead and merge all equal
adjacent operands at once, instead of just two at a time.

llvm-svn: 112299
2010-08-27 21:39:59 +00:00
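
A generic sketch of the scan-ahead (invented names, not SCEV's actual
operand representation):

#include <cstddef>
#include <vector>

// Find the whole run of operands equal to Ops[I] in one pass, so a run
// of length N is merged once rather than through N-1 pairwise merges.
template <typename T>
std::size_t findRunEnd(const std::vector<T> &Ops, std::size_t I) {
  std::size_t J = I + 1;
  while (J < Ops.size() && Ops[J] == Ops[I])
    ++J; // scan ahead past every equal adjacent operand
  return J; // the run is [I, J); merge it in a single step
}
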
Chris Lattner
5ed3d56ced remove some special shift cases that have been subsumed into the
more general simplify demanded bits logic.

llvm-svn: 112291
2010-08-27 21:04:34 +00:00
Dan Gohman
f950201d01 Make the {A,+,B}<L> + {C,+,D}<L> --> Other + {A+C,+,B+D}<L>
transformation collect all the addrecs with the same loop
and combine them at once rather than starting everything over
at the first chance.

llvm-svn: 112290
2010-08-27 20:45:56 +00:00
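
The underlying identity is componentwise: ({A,+,B} + {C,+,D})(n) =
(A + nB) + (C + nD) = (A+C) + n(B+D).  For instance, {1,+,2}<L> +
{10,+,3}<L> = {11,+,5}<L>: the sequences 1,3,5,... and 10,13,16,...
sum to 11,16,21,...
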
Bill Wendling
09a19ea0bf Remove now unneeded command line flag that enables 'optimize compares.'
llvm-svn: 112287
2010-08-27 20:39:09 +00:00
Owen Anderson
a1a80a3acd Fix typos in comments.
llvm-svn: 112286
2010-08-27 20:32:56 +00:00
Chris Lattner
866b888095 teach the truncation optimization that an entire chain of
computation can be truncated if it is fed by a sext/zext that doesn't
have to be exactly equal to the truncation result type.

llvm-svn: 112285
2010-08-27 20:32:06 +00:00
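
A hand-written illustration of the pattern (not a testcase from the
commit): the chain is fed by a zext from i8, which is narrower than the
i16 result, yet everything can still be computed in i16.

#include <cstdint>

uint16_t demo(uint8_t A) {
  uint32_t X = A;         // zext i8 -> i32, wider than the final result
  uint32_t Y = X * 3 + 5; // intermediate math currently done in i32
  return (uint16_t)Y;     // trunc i32 -> i16: whole chain fits in i16
}
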
Dan Gohman
636c57c5de Switch ScalarEvolution's main Value*->SCEV* map from std::map
to DenseMap.

llvm-svn: 112281
2010-08-27 18:55:03 +00:00
Chris Lattner
69a9143584 Add an instcombine to clean up a common pattern produced
by the SRoA "promote to large integer" code, eliminating
some type conversions like this:

   %94 = zext i16 %93 to i32                       ; <i32> [#uses=2]
   %96 = lshr i32 %94, 8                           ; <i32> [#uses=1]
   %101 = trunc i32 %96 to i8                      ; <i8> [#uses=1]

This also unblocks other xforms from happening, now clang is able to compile:

struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }

into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	pshufd	$1, %xmm0, %xmm2
	addss	%xmm0, %xmm2
	movdqa	%xmm1, %xmm3
	addss	%xmm2, %xmm3
	pshufd	$1, %xmm1, %xmm0
	addss	%xmm3, %xmm0
	ret

on x86-64, instead of:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movd	%xmm0, %rax
	shrq	$32, %rax
	movd	%eax, %xmm2
	addss	%xmm0, %xmm2
	movapd	%xmm1, %xmm3
	addss	%xmm2, %xmm3
	movd	%xmm1, %rax
	shrq	$32, %rax
	movd	%eax, %xmm0
	addss	%xmm3, %xmm0
	ret

This seems pretty close to optimal to me, at least without
using horizontal adds.  This also triggers in lots of other
code, including SPEC.

llvm-svn: 112278
2010-08-27 18:31:05 +00:00
Bob Wilson
c01101e76c Add alignment arguments to all the NEON load/store intrinsics.
Update all the tests using those intrinsics and add support for
auto-upgrading bitcode files with the old versions of the intrinsics.

llvm-svn: 112271
2010-08-27 17:13:24 +00:00
Owen Anderson
35ff7a208e Use LVI to eliminate conditional branches where we've tested a related condition previously. Update tests for this change.
This fixes PR5652.

llvm-svn: 112270
2010-08-27 17:12:29 +00:00
Dan Gohman
e1fcf40b52 Optimize SCEVComplexityCompare. Use a 3-way return instead of a 2-way
return to avoid needing two calls to test for equivalence, and sort
addrecs by their degree before examining their operands.

llvm-svn: 112267
2010-08-27 15:26:01 +00:00
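
The shape of the change, as a hedged sketch (fields invented): a
three-way result decides ordering and equivalence with one call, where
a boolean less-than needs both cmp(L,R) and cmp(R,L) to prove equality.

struct SCEVLike {
  unsigned Degree; // stand-in for an addrec's number of operands
  unsigned TypeID;
};

// Returns <0, 0, or >0. Comparing degrees first keeps the expensive
// operand-by-operand walk for addrecs that already look alike.
int compareComplexity(const SCEVLike &L, const SCEVLike &R) {
  if (L.Degree != R.Degree)
    return L.Degree < R.Degree ? -1 : 1;
  if (L.TypeID != R.TypeID)
    return L.TypeID < R.TypeID ? -1 : 1;
  return 0; // equivalent; no second call needed
}
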
Anton Korobeynikov
62a9879ef4 Properly handle passing of FP values to a varargs function on Win64:
the value should be copied to the corresponding shadow reg as well.
Patch by Cameron Esfahani!

llvm-svn: 112262
2010-08-27 14:43:06 +00:00
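
Background for the fix (a documented Win64 ABI rule, not code from the
patch): each of the first four argument slots has both a GPR and an XMM
register, and a variadic callee may fetch the value from either, so an
FP argument must be mirrored into its integer shadow register.

#include <cstdio>

// The double X sits in slot 2 of this variadic call: it travels in
// XMM1, and must also be copied to RDX (slot 2's shadow GPR), because
// printf's va_arg machinery is allowed to read it from there.
void demo(double X) {
  std::printf("%f\n", X);
}
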
Benjamin Kramer
f5c811611e MCELF: Port EmitInstruction changes from MachO streamer. Patch by Roman Divacky.
llvm-svn: 112260
2010-08-27 10:40:51 +00:00
Benjamin Kramer
4e04c2e69f MCELF: Always overwrite FixedValue.
llvm-svn: 112259
2010-08-27 10:38:39 +00:00
Daniel Dunbar
f642d43594 X86: Fix an encoding issue with LOCK_ADD64mr, which could lead to very hard to find miscompiles with the integrated assembler.
llvm-svn: 112250
2010-08-27 01:30:14 +00:00