mirror of https://github.com/RPCS3/llvm-mirror.git
Commit Graph

41007 Commits

Author SHA1 Message Date
Chris Lattner
40ec41d1c4 now that it doesn't use the PromoteMemToReg function, LICM doesn't
require DomFrontier.  Dropping this doesn't actually save any runs
of the pass though.

llvm-svn: 112418
2010-08-29 06:49:44 +00:00
Chris Lattner
fe3e4cdd30 completely rewrite the memory promotion algorithm in LICM.
Among other things, this uses SSAUpdater instead of 
PromoteMemToReg.

llvm-svn: 112417
2010-08-29 06:43:52 +00:00
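
For reference, a minimal sketch of the usual SSAUpdater rewrite pattern, written against the current C++ API rather than the 2010 one; the names here (StoredVals, PromotedTy) are illustrative and this is not the actual LICM code:

// Illustrative SSAUpdater usage, not the actual patch.
#include "llvm/ADT/ArrayRef.h"
#include "llvm/IR/Instructions.h"
#include "llvm/Transforms/Utils/SSAUpdater.h"
#include <utility>
using namespace llvm;

static void rewritePromotedLoads(
    ArrayRef<std::pair<BasicBlock *, Value *>> StoredVals, // value live out of each store block
    ArrayRef<LoadInst *> Loads, Type *PromotedTy) {
  SSAUpdater SSA;
  SSA.Initialize(PromotedTy, "promoted");
  // Register the value that is available at the end of each defining block.
  for (const auto &BV : StoredVals)
    SSA.AddAvailableValue(BV.first, BV.second);
  // Replace each load with the reaching definition; SSAUpdater inserts any
  // PHI nodes that are needed along the way.
  for (LoadInst *LI : Loads) {
    LI->replaceAllUsesWith(SSA.GetValueInMiddleOfBlock(LI->getParent()));
    LI->eraseFromParent();
  }
}
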
Bob Wilson
807d004452 Remove NEON vaddl, vaddw, vsubl, and vsubw intrinsics. Instead, use LLVM
IR add/sub operations with one or both operands sign- or zero-extended.
Auto-upgrade the old intrinsics.

llvm-svn: 112416
2010-08-29 05:57:34 +00:00
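
The replacement IR pattern can be sketched with IRBuilder: both operands are extended to the wide element type and a plain add is emitted (zero-extension for the unsigned variants; vaddw/vsubw extend only one operand). A hedged sketch, not the actual auto-upgrade code:

// Sketch of the widening-add pattern that replaces the signed vaddl intrinsic.
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

static Value *emitWideningAdd(IRBuilder<> &B, Value *LHS, Value *RHS,
                              VectorType *WideTy) {
  Value *WideL = B.CreateSExt(LHS, WideTy); // CreateZExt for the unsigned forms
  Value *WideR = B.CreateSExt(RHS, WideTy);
  return B.CreateAdd(WideL, WideR);         // CreateSub for the vsubl pattern
}
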
Chris Lattner
9921d9c3c1 use getUniqueExitBlocks instead of a manual set.
llvm-svn: 112412
2010-08-29 05:12:21 +00:00
Eli Friedman
6ccafafe61 A couple of small missed optimizations.
llvm-svn: 112411
2010-08-29 05:07:40 +00:00
Chris Lattner
2133f877c6 reimplement LICM::sink to use SSAUpdater instead of PromoteMemToReg.
This leads to much simpler code.

llvm-svn: 112410
2010-08-29 04:55:06 +00:00
Chris Lattner
fac07b1dca implement SSAUpdater::RewriteUseAfterInsertions, a helpful form of RewriteUse.
llvm-svn: 112409
2010-08-29 04:54:06 +00:00
Chris Lattner
24927beaff remove dead proto
llvm-svn: 112408
2010-08-29 04:53:24 +00:00
Chris Lattner
65ce6da2f1 reduce indentation in LICM::sink by using early exits, use
getUniqueExitBlocks instead of getExitBlocks and a manual
set to eliminate dupes.

llvm-svn: 112405
2010-08-29 04:28:20 +00:00
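
A rough illustration of the shape this kind of cleanup takes, written against current LLVM headers and not taken from the actual LICM::sink code: getUniqueExitBlocks already returns a de-duplicated list, and early exits keep the nesting flat.

// Illustrative only.
#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/IR/Dominators.h"
using namespace llvm;

static bool dominatesAllExits(Loop *L, DominatorTree &DT, BasicBlock *BB) {
  SmallVector<BasicBlock *, 8> ExitBlocks;
  L->getUniqueExitBlocks(ExitBlocks);   // no manual set needed to drop dupes
  for (BasicBlock *Exit : ExitBlocks)
    if (!DT.dominates(BB, Exit))
      return false;                     // early exit instead of deeper nesting
  return true;
}
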
Chris Lattner
4928fe010e modernize this pass a bit: use efficient set/map and reduce indentation.
llvm-svn: 112404
2010-08-29 04:23:04 +00:00
Chris Lattner
c8947a83e3 when merging two alias sets, the result set is volatile if either
of the sets is volatile.  We were dropping the volatile bit of the
merged-in set, leading (luckily) to assertions in cases like 
PR7535.  I cannot produce a testcase that repros with opt, but this
is obviously correct.

llvm-svn: 112402
2010-08-29 04:14:47 +00:00
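
The fix boils down to propagating the volatile flag when one set is merged into another. A hypothetical sketch of the invariant only; the field and method names are illustrative, not the real AliasSetTracker code:

// A merged set must be volatile if either input set was.
struct AliasSetSketch {
  bool Volatile = false;
  void mergeSetIn(AliasSetSketch &Other) {
    Volatile |= Other.Volatile; // previously the merged-in set's bit was dropped
    // ... move Other's pointers into this set ...
  }
};
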
Chris Lattner
741da104b4 more cleanup
llvm-svn: 112401
2010-08-29 04:13:43 +00:00
Chris Lattner
db816c533c clean this up
llvm-svn: 112400
2010-08-29 04:06:55 +00:00
Bill Wendling
6d105ce757 - Add a parameter to T2I_bin_irs for those patterns which set the S bit.
- Create T2I_bin_sw_irs to be like T2I_bin_w_irs, except that it sets the S bit.

llvm-svn: 112399
2010-08-29 03:55:31 +00:00
Chris Lattner
646fee99c3 add a bunch more common shuffles to the instprinter.
llvm-svn: 112397
2010-08-29 03:08:08 +00:00
Bill Wendling
8ad57ff92e Rename ANDflag to ANDS, which is less stupid.
llvm-svn: 112395
2010-08-29 03:06:09 +00:00
Bill Wendling
6e586677a7 File missing from last commit.
llvm-svn: 112394
2010-08-29 03:02:28 +00:00
Bill Wendling
385ad1516f Create an ARMISD::AND node. This node is exactly like the "ARM::AND" node, but
it sets the CPSR register.

llvm-svn: 112393
2010-08-29 03:02:11 +00:00
Chris Lattner
56bc8ba493 I have manually decoded the imm field of an insertps one too many
times.  This patch causes llc and llvm-mc (which both default to
verbose-asm) to print out comments after a few common shuffle 
instructions that indicate the shuffle mask, e.g.:

	insertps	$113, %xmm3, %xmm0     ## xmm0 = zero,xmm0[1,2],xmm3[1]
	unpcklps	%xmm1, %xmm0    ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
	pshufd	$1, %xmm1, %xmm1        ## xmm1 = xmm1[1,0,0,0]

This is carefully factored to keep the information extraction (of the
shuffle mask) separate from the printing logic.  I plan to move the
extraction part out somewhere else at some point for other parts of
the x86 backend that want to introspect on the behavior of shuffles.

llvm-svn: 112387
2010-08-28 20:42:31 +00:00
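
For reference, the insertps immediate packs three fields (source element, destination element, and a zero mask), which is what the printed comments decode. A standalone sketch of the decoding, independent of the actual instprinter code:

// Decode an insertps immediate, e.g. 113 == 0x71 from the example above.
#include <cstdio>

int main() {
  unsigned Imm = 113;                   // insertps $113, %xmm3, %xmm0
  unsigned SrcElt = (Imm >> 6) & 0x3;   // element taken from the source register
  unsigned DstElt = (Imm >> 4) & 0x3;   // destination position being written
  unsigned ZMask  = Imm & 0xF;          // destination elements forced to zero
  for (unsigned i = 0; i != 4; ++i) {
    if (ZMask & (1u << i))
      std::printf("elt %u = zero\n", i);
    else if (i == DstElt)
      std::printf("elt %u = xmm3[%u]\n", i, SrcElt);
    else
      std::printf("elt %u = xmm0[%u]\n", i, i);
  }
  // Prints: zero, xmm0[1], xmm0[2], xmm3[1] -- matching the comment above.
  return 0;
}
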
Chris Lattner
8cb4abbc0e fix the buildvector->insertp[sd] logic to not always create a redundant
insertp[sd] $0, which is a noop.  Before:

_f32:                                   ## @f32
	pshufd	$1, %xmm1, %xmm2
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm2, %xmm3
	addss	%xmm1, %xmm0
                                        ## kill: XMM0<def> XMM0<kill> XMM0<def>
	insertps	$0, %xmm0, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

after:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movdqa	%xmm2, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

The extra movs are due to a random (poor) scheduling decision.

llvm-svn: 112379
2010-08-28 17:59:08 +00:00
Chris Lattner
c3b630d64b fix the BuildVector -> unpcklps logic to not do pointless shuffles
when the top elements of a vector are undefined.  This happens all
the time for X86-64 ABI stuff because only the low 2 elements of
a 4 element vector are defined.  For example, on:

_Complex float f32(_Complex float A, _Complex float B) {
  return A+B;
}

We used to produce (with SSE2, SSE4.1+ uses insertps):

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$16, %xmm2, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm1
	movdqa	%xmm2, %xmm0
	unpcklps	%xmm1, %xmm0
	ret

We now produce:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movaps	%xmm2, %xmm0
	unpcklps	%xmm3, %xmm0
	ret

This implements rdar://8368414

llvm-svn: 112378
2010-08-28 17:28:30 +00:00
Chris Lattner
7fa5fa1207 improve comments in the unpcklps-generating logic and introduce
a new EltStride variable instead of reusing the NumElems variable
for a non-obvious purpose.  No functionality change.

llvm-svn: 112377
2010-08-28 17:15:43 +00:00
Michael J. Spencer
5cb73a9fb8 Don't cast Win32 FILETIME structs to int64. Patch by Dimitry Andric!
According to the Microsoft documentation here:
http://msdn.microsoft.com/en-us/library/ms724284%28VS.85%29.aspx

this cast used in lib/System/Win32/Path.inc:

__int64 ft = *reinterpret_cast<__int64*>(&fi.ftLastWriteTime);

should not be done.  The documentation says: "Do not cast a pointer to a
FILETIME structure to either a ULARGE_INTEGER* or __int64* value because
it can cause alignment faults on 64-bit Windows."

llvm-svn: 112376
2010-08-28 16:39:32 +00:00
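
The conversion the documentation does allow is to copy the two 32-bit halves into a ULARGE_INTEGER and read its QuadPart. A sketch of that approach, assuming the same ftLastWriteTime field as above:

// Documented-safe FILETIME -> 64-bit conversion: copy, don't reinterpret_cast.
#include <windows.h>

static __int64 fileTimeToInt64(const FILETIME &FT) {
  ULARGE_INTEGER ULI;
  ULI.LowPart = FT.dwLowDateTime;
  ULI.HighPart = FT.dwHighDateTime;
  return static_cast<__int64>(ULI.QuadPart);
}
// e.g.  __int64 ft = fileTimeToInt64(fi.ftLastWriteTime);
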
Chris Lattner
d16c80e27f remove the MSIL backend. It isn't maintained, is buggy, has no testcases
and hasn't kept up with ToT.  Approved by Anton.

llvm-svn: 112375
2010-08-28 16:33:36 +00:00
Bob Wilson
956e07b985 Use pseudo instructions for VST1 and VST2.
llvm-svn: 112357
2010-08-28 05:12:57 +00:00
Chris Lattner
ecf276b787 remove unions from LLVM IR. They are severely buggy and not
being actively maintained, improved, or extended.

llvm-svn: 112356
2010-08-28 04:09:24 +00:00
Chris Lattner
4b49ada02c remove the ABCD and SSI passes. They don't have any clients that
I'm aware of, aren't maintained, and LVI will be replacing the value they provide.
nlewycky approved this on irc.

llvm-svn: 112355
2010-08-28 03:51:24 +00:00
Chris Lattner
e7afb6fbb0 remove dead proto
llvm-svn: 112354
2010-08-28 03:45:03 +00:00
Chris Lattner
fc1da78d16 for completeness, allow undef also.
llvm-svn: 112351
2010-08-28 03:36:51 +00:00
Chris Lattner
b2dbdbc795 squish dead code.
llvm-svn: 112350
2010-08-28 03:21:03 +00:00
Chris Lattner
fcf6250d57 zap dead code
llvm-svn: 112349
2010-08-28 03:18:45 +00:00
Bruno Cardoso Lopes
1052e6d5d9 Clean up the logic of vector shuffles -> vector shifts.
Also teach this logic how to handle target-specific shuffles if
needed; this is necessary when searching recursively for zeroed
scalar elements in vector shuffle operands.

llvm-svn: 112348
2010-08-28 02:46:39 +00:00
Chris Lattner
b61cf1e296 handle the constant case of vector insertion. For something
like this:

struct S { float A, B, C, D; };

struct S g;
struct S bar() { 
  struct S A = g;
  ++A.B;
  A.A = 42;
  return A;
}

we now generate:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	12(%rax), %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	unpcklps	%xmm0, %xmm1
	addss	LCPI1_0(%rip), %xmm2
	pshufd	$16, %xmm2, %xmm2
	movss	LCPI1_1(%rip), %xmm0
	pshufd	$16, %xmm0, %xmm0
	unpcklps	%xmm2, %xmm0
	ret

instead of:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	12(%rax), %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	unpcklps	%xmm0, %xmm1
	addss	LCPI1_0(%rip), %xmm2
	movd	%xmm2, %eax
	shlq	$32, %rax
	addq	$1109917696, %rax       ## imm = 0x42280000
	movd	%rax, %xmm0
	ret

llvm-svn: 112345
2010-08-28 01:50:57 +00:00
Chris Lattner
c70b0c0ee7 optimize bitcasts from large integers to vectors into vector
element insertion from the pieces that feed into the vector.
This handles a pattern that occurs frequently due to code
generated for the x86-64 ABI.  We now compile something like
this:

struct S { float A, B, C, D; };
struct S g;
struct S bar() { 
  struct S A = g;
  ++A.A;
  ++A.C;
  return A;
}

into all nice vector operations:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	LCPI1_0(%rip), %xmm1
	movss	(%rax), %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	12(%rax), %xmm3
	pshufd	$16, %xmm2, %xmm2
	unpcklps	%xmm2, %xmm0
	addss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	pshufd	$16, %xmm3, %xmm2
	unpcklps	%xmm2, %xmm1
	ret

instead of icky integer operations:

_bar:                                   ## @bar
	movq	_g@GOTPCREL(%rip), %rax
	movss	LCPI1_0(%rip), %xmm1
	movss	(%rax), %xmm0
	addss	%xmm1, %xmm0
	movd	%xmm0, %ecx
	movl	4(%rax), %edx
	movl	12(%rax), %esi
	shlq	$32, %rdx
	addq	%rcx, %rdx
	movd	%rdx, %xmm0
	addss	8(%rax), %xmm1
	movd	%xmm1, %eax
	shlq	$32, %rsi
	addq	%rax, %rsi
	movd	%rsi, %xmm1
	ret

This resolves rdar://8360454

llvm-svn: 112343
2010-08-28 01:20:38 +00:00
Dan Gohman
507f5a8ae7 Completely disable tail calls when fast-isel is enabled, as fast-isel
doesn't currently support dealing with this.

llvm-svn: 112341
2010-08-28 00:51:03 +00:00
Dan Gohman
ffab4c6a7d Trim a #include.
llvm-svn: 112340
2010-08-28 00:49:13 +00:00
Dan Gohman
aa972a1ee3 Fix an index calculation thinko.
llvm-svn: 112337
2010-08-28 00:39:27 +00:00
Bob Wilson
abdcae7f20 We don't need to custom-select VLDMQ and VSTMQ anymore.
llvm-svn: 112336
2010-08-28 00:20:11 +00:00
Benjamin Kramer
edb09ef2df Update CMake build. Add newline at end of file.
llvm-svn: 112332
2010-08-28 00:11:12 +00:00
Bob Wilson
412a170b04 When merging Thumb2 loads/stores, do not give up when the offset is one of
the special values that for ARM would be used with IB or DA modes.  Fall
through and consider materializing a new base address if it would be
profitable.

llvm-svn: 112329
2010-08-27 23:57:52 +00:00
Owen Anderson
dc4703bcd5 Add a prototype of a new peephole optimizing pass that uses LazyValueInfo to simplify PHIs and selects.
This pass addresses the missed optimizations from PR2581 and PR4420.

llvm-svn: 112325
2010-08-27 23:31:36 +00:00
Owen Anderson
f2255ee253 Improve the precision of getConstant().
llvm-svn: 112323
2010-08-27 23:29:38 +00:00
Bob Wilson
31d487d235 Change ARM VFP VLDM/VSTM instructions to use addressing mode #4, just like
all the other LDM/STM instructions.  This fixes asm printer crashes when
compiling with -O0.  I've changed one of the NEON tests (vst3.ll) to run
with -O0 to check this in the future.

Prior to this change VLDM/VSTM used addressing mode #5, but not really.
The offset field was used to hold a count of the number of registers being
loaded or stored, and the AM5 opcode field was expanded to specify the IA
or DB mode, instead of the standard ADD/SUB specifier.  Much of the backend
was not aware of these special cases.  The crashes occurred when rewriting
a frameindex caused the AM5 offset field to be changed so that it did not
have a valid submode.  I don't know exactly what changed to expose this now.
Maybe we've never done much with -O0 and NEON.  Regardless, there's no longer
any reason to keep a count of the VLDM/VSTM registers, so we can use
addressing mode #4 and clean things up in a lot of places.

llvm-svn: 112322
2010-08-27 23:18:17 +00:00
Chris Lattner
3f880c2097 Enhance the shift propagator to handle the case when you have:
A = shl x, 42
...
B = lshr ..., 38

which can be transformed into:
A = shl x, 4
...

iff we can prove that the would-be-shifted-in bits
are already zero.  This eliminates two shifts in the testcase
and allows elimination of the whole i128 chain in the real example.

llvm-svn: 112314
2010-08-27 22:53:44 +00:00
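
The identity is easy to check standalone (here with 64-bit values instead of the i128 from the real example): when the bits the first shift would push out are already known to be zero, the shl/lshr pair collapses into a single shl by the difference of the two amounts. This is a simplified sufficient condition, not the exact known-bits proof the pass performs:

#include <cassert>
#include <cstdint>

int main() {
  uint64_t x = 0x12345;   // fits in the low 22 bits, so the top 42 bits are zero
  uint64_t a = x << 42;   // A = shl x, 42
  uint64_t b = a >> 38;   // B = lshr A, 38 (logical: x is unsigned)
  assert(b == (x << 4));  // collapses to shl x, 4
  return 0;
}
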
Devang Patel
eb68981283 Simplify.
llvm-svn: 112305
2010-08-27 22:25:51 +00:00
Chris Lattner
80632e5fd9 Implement a pretty general logical shift propagation
framework, which is good at ripping through bitfield
operations.  This generalizes a bunch of the existing
xforms that instcombine does, such as 
  (x << c) >> c -> and
to handle intermediate logical nodes.  This is useful for
ripping up the "promote to large integer" code produced by
SRoA.

llvm-svn: 112304
2010-08-27 22:24:38 +00:00
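
The "(x << c) >> c -> and" xform mentioned above is the simplest instance: for an unsigned value the shift pair just clears the top c bits, which is an and with a mask. A standalone check:

#include <cassert>
#include <cstdint>

int main() {
  uint32_t x = 0xDEADBEEF;
  // Shifting left then logically right by 8 clears the top 8 bits.
  assert(((x << 8) >> 8) == (x & 0x00FFFFFFu));
  return 0;
}
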
Bob Wilson
09b040a386 Unsigned value cannot be < 0.
llvm-svn: 112300
2010-08-27 21:44:35 +00:00
Dan Gohman
ee0a450648 When merging adjacent operands, scan ahead and merge all equal
adjacent operands at once, instead of just two at a time.

llvm-svn: 112299
2010-08-27 21:39:59 +00:00
Chris Lattner
5ed3d56ced remove some special shift cases that have been subsumed into the
more general simplify demanded bits logic.

llvm-svn: 112291
2010-08-27 21:04:34 +00:00
Dan Gohman
f950201d01 Make the {A,+,B}<L> + {C,+,D}<L> --> Other + {A+C,+,B+D}<L>
transformation collect all the addrecs with the same loop
and combine them at once rather than starting everything over
at the first chance.

llvm-svn: 112290
2010-08-27 20:45:56 +00:00
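
In SCEV notation {A,+,B}<L> denotes the sequence A, A+B, A+2B, ... over the iterations of loop L, so adding two addrecs over the same loop simply adds the corresponding coefficients. A tiny numeric check of that rule, illustrative only:

#include <cassert>

int main() {
  const int A = 1, B = 2, C = 5, D = 3;    // arbitrary coefficients
  for (int i = 0; i < 10; ++i) {
    int lhs = (A + B * i) + (C + D * i);   // each addrec evaluated at iteration i
    int rhs = (A + C) + (B + D) * i;       // the combined addrec {A+C,+,B+D}
    assert(lhs == rhs);
  }
  return 0;
}
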