mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 20:43:44 +02:00
Commit Graph

15062 Commits

Bob Wilson
807d004452 Remove NEON vaddl, vaddw, vsubl, and vsubw intrinsics. Instead, use LLVM
IR add/sub operations with one or both operands sign- or zero-extended.
Auto-upgrade the old intrinsics.

llvm-svn: 112416
2010-08-29 05:57:34 +00:00
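For reference, a minimal C sketch of the widening-add semantics behind vaddl
(the helper name is hypothetical; in the new representation the extensions
become explicit sext/zext IR operations feeding a plain add):

    #include <stdint.h>

    /* vaddl.s16-style widening add: extend both operands, then add. */
    void widening_add_s16(int32_t out[4], const int16_t a[4],
                          const int16_t b[4]) {
        for (int i = 0; i < 4; ++i)
            out[i] = (int32_t)a[i] + (int32_t)b[i];  /* sext, sext, add */
    }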
Eli Friedman
6ccafafe61 A couple of small missed optimizations.
llvm-svn: 112411
2010-08-29 05:07:40 +00:00
Bill Wendling
6d105ce757 - Add a parameter to T2I_bin_irs for those patterns which set the S bit.
- Create T2I_bin_sw_irs to be like T2I_bin_w_irs, except that it sets the S bit.

llvm-svn: 112399
2010-08-29 03:55:31 +00:00
Chris Lattner
646fee99c3 add a bunch more common shuffles to the instprinter.
llvm-svn: 112397
2010-08-29 03:08:08 +00:00
Bill Wendling
8ad57ff92e Rename ANDflag to ANDS, which is a less confusing name.
llvm-svn: 112395
2010-08-29 03:06:09 +00:00
Bill Wendling
6e586677a7 File missing from last commit.
llvm-svn: 112394
2010-08-29 03:02:28 +00:00
Bill Wendling
385ad1516f Create an ARMISD::AND node. This node is exactly like the "ARM::AND" node, but
it sets the CPSR register.

llvm-svn: 112393
2010-08-29 03:02:11 +00:00
Chris Lattner
56bc8ba493 I have manually decoded the imm field of an insertps one too many
times.  This patch causes llc and llvm-mc (which both default to
verbose-asm) to print out comments after a few common shuffle
instructions that indicate the shuffle mask, e.g.:

	insertps	$113, %xmm3, %xmm0     ## xmm0 = zero,xmm0[1,2],xmm3[1]
	unpcklps	%xmm1, %xmm0    ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
	pshufd	$1, %xmm1, %xmm1        ## xmm1 = xmm1[1,0,0,0]

This is carefully factored to keep the information extraction (of the
shuffle mask) separate from the printing logic.  I plan to move the
extraction part out somewhere else at some point for other parts of
the x86 backend that want to introspect on the behavior of shuffles.

llvm-svn: 112387
2010-08-28 20:42:31 +00:00
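As a concrete illustration of the extraction step, here is a hedged C sketch
of how an insertps immediate decodes into such a comment (the function name
is invented, and the real printer also merges adjacent lanes like xmm0[1,2]):

    #include <stdio.h>

    /* Decode an insertps imm8 (AT&T operand order: src, dst):
       bits [7:6] pick the src element, [5:4] the dst slot to write,
       and [3:0] is a mask of dst elements forced to zero. */
    static void print_insertps_comment(unsigned imm, const char *src,
                                       const char *dst) {
        unsigned count_s = (imm >> 6) & 3;   /* element read from src  */
        unsigned count_d = (imm >> 4) & 3;   /* element written in dst */
        unsigned zmask   = imm & 0xF;        /* lanes zeroed in dst    */

        printf("## %s =", dst);
        for (unsigned i = 0; i < 4; ++i) {
            if (zmask & (1u << i))
                printf(" zero");
            else if (i == count_d)
                printf(" %s[%u]", src, count_s);
            else
                printf(" %s[%u]", dst, i);
        }
        printf("\n");
    }

    int main(void) {
        /* insertps $113, %xmm3, %xmm0 from the example above:
           prints "## xmm0 = zero xmm0[1] xmm0[2] xmm3[1]" */
        print_insertps_comment(113, "xmm3", "xmm0");
        return 0;
    }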
Chris Lattner
8cb4abbc0e fix the buildvector->insertp[sd] logic to not always create a redundant
insertp[sd] $0, which is a noop.  Before:

_f32:                                   ## @f32
	pshufd	$1, %xmm1, %xmm2
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm2, %xmm3
	addss	%xmm1, %xmm0
                                        ## kill: XMM0<def> XMM0<kill> XMM0<def>
	insertps	$0, %xmm0, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

after:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movdqa	%xmm2, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

The extra movs are due to a random (poor) scheduling decision.

llvm-svn: 112379
2010-08-28 17:59:08 +00:00
Chris Lattner
c3b630d64b fix the BuildVector -> unpcklps logic to not do pointless shuffles
when the top elements of a vector are undefined.  This happens all
the time for X86-64 ABI stuff because only the low 2 elements of
a 4 element vector are defined.  For example, on:

_Complex float f32(_Complex float A, _Complex float B) {
  return A+B;
}

We used to produce (with SSE2, SSE4.1+ uses insertps):

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$16, %xmm2, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm1
	movdqa	%xmm2, %xmm0
	unpcklps	%xmm1, %xmm0
	ret

We now produce:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movaps	%xmm2, %xmm0
	unpcklps	%xmm3, %xmm0
	ret

This implements rdar://8368414

llvm-svn: 112378
2010-08-28 17:28:30 +00:00
Chris Lattner
7fa5fa1207 improve comments in the unpcklps generating logic, introduce
a new EltStride variable instead of reusing the NumElems variable
for a non-obvious purpose.  No functionality change.

llvm-svn: 112377
2010-08-28 17:15:43 +00:00
Chris Lattner
d16c80e27f remove the MSIL backend. It isn't maintained, is buggy, has no testcases
and hasn't kept up with ToT.  Approved by Anton.

llvm-svn: 112375
2010-08-28 16:33:36 +00:00
Bob Wilson
956e07b985 Use pseudo instructions for VST1 and VST2.
llvm-svn: 112357
2010-08-28 05:12:57 +00:00
Chris Lattner
ecf276b787 remove unions from LLVM IR. They are severely buggy and not
being actively maintained, improved, or extended.

llvm-svn: 112356
2010-08-28 04:09:24 +00:00
Bruno Cardoso Lopes
1052e6d5d9 Clean up the logic of vector shuffles -> vector shifts.
Also teach this logic how to handle target-specific shuffles if
needed; this is necessary when searching recursively for zeroed
scalar elements in vector shuffle operands.

llvm-svn: 112348
2010-08-28 02:46:39 +00:00
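For intuition, a scalar C sketch of the equivalence this logic exploits
(assuming the classic case where a shuffle pulling lanes from a zero vector
is really an element-wise shift, e.g. x86 psrldq):

    #include <stdint.h>

    /* shufflevector <4 x i32> %v, zeroinitializer, <1, 2, 3, 4>
       is the same as shifting the whole vector right by one element. */
    void shuffle_as_element_shift(uint32_t v[4]) {
        v[0] = v[1];
        v[1] = v[2];
        v[2] = v[3];
        v[3] = 0;  /* this lane comes from the zero operand */
    }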
Bob Wilson
abdcae7f20 We don't need to custom-select VLDMQ and VSTMQ anymore.
llvm-svn: 112336
2010-08-28 00:20:11 +00:00
Bob Wilson
412a170b04 When merging Thumb2 loads/stores, do not give up when the offset is one of
the special values that for ARM would be used with IB or DA modes.  Fall
through and consider materializing a new base address if it would be
profitable.

llvm-svn: 112329
2010-08-27 23:57:52 +00:00
Bob Wilson
31d487d235 Change ARM VFP VLDM/VSTM instructions to use addressing mode #4, just like
all the other LDM/STM instructions.  This fixes asm printer crashes when
compiling with -O0.  I've changed one of the NEON tests (vst3.ll) to run
with -O0 to check this in the future.

Prior to this change VLDM/VSTM used addressing mode #5, but not really.
The offset field was used to hold a count of the number of registers being
loaded or stored, and the AM5 opcode field was expanded to specify the IA
or DB mode, instead of the standard ADD/SUB specifier.  Much of the backend
was not aware of these special cases.  The crashes occurred when rewriting
a frame index caused the AM5 offset field to be changed so that it did not
have a valid submode.  I don't know exactly what changed to expose this now.
Maybe we've never done much with -O0 and NEON.  Regardless, there's no longer
any reason to keep a count of the VLDM/VSTM registers, so we can use
addressing mode #4 and clean things up in a lot of places.

llvm-svn: 112322
2010-08-27 23:18:17 +00:00
Bob Wilson
09b040a386 Unsigned value cannot be < 0.
llvm-svn: 112300
2010-08-27 21:44:35 +00:00
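The bug class in question, as a tiny hypothetical C example (not from the
commit itself):

    #include <stdio.h>

    int main(void) {
        unsigned n = 0;
        if (n < 0)                   /* always false: n is unsigned */
            printf("unreachable\n");
        return 0;
    }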
Anton Korobeynikov
62a9879ef4 Properly handle passing of FP values to a varargs function on Win64: the
value should be copied to the corresponding shadow register as well.
Patch by Cameron Esfahani!

llvm-svn: 112262
2010-08-27 14:43:06 +00:00
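For context, a hedged C-level illustration of the Win64 varargs rule
involved (the register assignments come from the Windows x64 calling
convention, not from the commit message itself):

    #include <stdio.h>

    /* On Win64, a floating-point argument to a varargs function travels
       in its XMM register and must also be copied into the matching
       integer register (RCX/RDX/R8/R9), since the callee may read the
       value through va_arg as if it had been passed in a GPR. */
    int main(void) {
        double d = 3.14;
        printf("%f\n", d);  /* d goes in XMM1 and is shadowed in RDX */
        return 0;
    }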
Daniel Dunbar
f642d43594 X86: Fix an encoding issue with LOCK_ADD64mr, which could lead to very hard-to-find miscompiles with the integrated assembler.
llvm-svn: 112250
2010-08-27 01:30:14 +00:00
Jim Grosbach
2b81a07dc7 Simplify the eliminateFrameIndex() interface back down now that PEI doesn't
need to try to re-use scavenged frame index reference registers. rdar://8277890

llvm-svn: 112241
2010-08-26 23:32:16 +00:00
Jim Grosbach
d21756ab1e tidy up a bit. no functional change.
llvm-svn: 112228
2010-08-26 21:56:30 +00:00
Jim Grosbach
5b8e21eaa6 Turn off the scavenging-based frame reg reuse briefly to measure whether it's
still having a significant effect. It shouldn't be now that the pre-RA
virtual base reg stuff is in. Assuming that's validated by the nightly
testers, we can simplify a lot of the PEI frame index code.

llvm-svn: 112220
2010-08-26 21:29:54 +00:00
Bruno Cardoso Lopes
6150648a64 zap the now unused MVT::getIntVectorWithNumElements
llvm-svn: 112218
2010-08-26 20:53:12 +00:00
Bob Wilson
efc503afd2 Use pseudo instructions for VST3.
llvm-svn: 112208
2010-08-26 18:51:29 +00:00
Bill Wendling
c76a3e317c Reapply r112176 without removing the other CMN patterns (that was unintentional).
llvm-svn: 112206
2010-08-26 18:33:51 +00:00
Bob Wilson
640cc8ce83 Fix comment typos.
llvm-svn: 112202
2010-08-26 18:08:11 +00:00
Jim Grosbach
5b1ce460ec Restrict the register to tGPR to make sure the str instruction will be
encodable as a 16-bit wide instruction.

llvm-svn: 112195
2010-08-26 17:02:47 +00:00
Dan Gohman
b1020bb551 Revert r112176; it broke test/CodeGen/Thumb2/thumb2-cmn.ll.
llvm-svn: 112191
2010-08-26 15:50:25 +00:00
Dan Gohman
8088d5e31d Reapply r112091 and r111922, support for metadata linking, with a
fix: add a flag to MapValue and friends which indicates whether
any module-level mappings are being made. In the common case of
inlining, no module-level mappings are needed, so MapValue doesn't
need to examine non-function-local metadata, which can be very
expensive in the case of a large module with really deep metadata
(e.g. a large C++ program compiled with -g).

This flag is a little awkward; perhaps eventually it can be moved
into the ClonedCodeInfo class.

llvm-svn: 112190
2010-08-26 15:41:53 +00:00
Bill Wendling
a125fb1689 There seems to be a (potential) hardware bug with the CMN instruction and
comparison with 0. These two pieces of code should give identical results:

  rsbs r1, r1, 0
  cmp  r0, r1
  mov  r0, #0
  it   ls
  mov  r0, #1

and:

  cmn  r0, r1
  mov  r0, #0
  it   ls
  mov  r0, #1

However, the CMN gives the *opposite* result when r1 is 0. This is because the
carry flag is set in the CMP case but not in the CMN case.  In short, CMP
computes AddWithCarry(r0, NOT(r1), 1); when r1 is 0, the sum
r0 + 0xFFFFFFFF + 1 always overflows 32 bits, so the carry flag is always
set.  CMN computes AddWithCarry(r0, r1, 0); with r1 equal to 0 there is no
NOT and no carry-in, so a carry-out never occurs.

The AddWithCarry in the CMP case seems to be relying upon the identity:

  ~x + 1 = -x

However when x is 0 and unsigned, this doesn't hold:

   x = 0
  ~x = 0xFFFF FFFF
  ~x + 1 = 0x1 0000 0000
  (-x = 0) != (0x1 0000 0000 = ~x + 1)

Therefore, we should disable *all* versions of CMN, especially when comparing
against zero, until we can limit when the CMN instruction is used (when we know
that the RHS is not 0) or when we have a hardware fix for this.

(See the ARM docs for the "AddWithCarry" pseudo-code.)

This is related to <rdar://problem/7569620>.

llvm-svn: 112176
2010-08-26 09:07:33 +00:00
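To make the flag difference concrete, here is a hedged C sketch of the
AddWithCarry pseudo-code from the ARM manual, showing the opposite carry
results when r1 is 0 (the function names are ours, not LLVM's):

    #include <stdint.h>
    #include <stdio.h>

    /* AddWithCarry(x, y, carry_in) per the ARM pseudo-code: the carry
       flag is set when the 32-bit result differs from the wide sum. */
    static unsigned add_with_carry(uint32_t x, uint32_t y,
                                   unsigned carry_in, uint32_t *result) {
        uint64_t sum = (uint64_t)x + y + carry_in;
        *result = (uint32_t)sum;
        return (uint64_t)*result != sum;  /* carry-out */
    }

    int main(void) {
        uint32_t r0 = 5, r1 = 0, res;
        /* CMP r0, r1 computes AddWithCarry(r0, NOT(r1), 1). */
        unsigned c_cmp = add_with_carry(r0, ~r1, 1, &res);
        /* CMN r0, r1 computes AddWithCarry(r0, r1, 0). */
        unsigned c_cmn = add_with_carry(r0, r1, 0, &res);
        printf("CMP carry=%u, CMN carry=%u\n", c_cmp, c_cmn);  /* 1, 0 */
        return 0;
    }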
Chris Lattner
148485f707 implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1.
llvm-svn: 112171
2010-08-26 05:51:22 +00:00
Bob Wilson
e74da18e57 Use pseudo instructions for VST1d64Q.
llvm-svn: 112170
2010-08-26 05:33:30 +00:00
Chris Lattner
5256226fc8 fix SSE1-only codegen in x86-64 mode, which is something we
apparently try to support.

llvm-svn: 112168
2010-08-26 05:24:29 +00:00
Chris Lattner
467be04ce6 remove dead proto
llvm-svn: 112131
2010-08-26 01:14:37 +00:00
Bruno Cardoso Lopes
8bb7c79c1a Fix PR7748 without using Microsoft extensions
llvm-svn: 112128
2010-08-26 01:02:53 +00:00
Jim Grosbach
6500a1a2f9 Enable pre-RA virtual frame base register allocation. rdar://8277890
llvm-svn: 112127
2010-08-26 00:58:06 +00:00
Bob Wilson
1df383d9cb Revert svn 107892 (with changes to work with trunk). It caused a crash if
a VLD result was not used (Radar 8355607).  It should also fix PR7988, but
I haven't verified that yet.

llvm-svn: 112118
2010-08-26 00:13:36 +00:00
Chris Lattner
eb4c7e43cc we should pattern match the SSE complex arithmetic ops.
llvm-svn: 112109
2010-08-25 23:31:42 +00:00
Bob Wilson
b85b3cf91f Start converting NEON load/stores to use pseudo instructions, beginning here
with the VST4 instructions.  Until after register allocation, we want to
represent sets of adjacent registers by a single super-register.  These
VST4 pseudo instructions have a single QQ or QQQQ source register operand.
They get expanded to the real VST4 instructions with 4 separate D register
operands.  Once this conversion is complete, we'll be able to remove the
NEONPreAllocPass and avoid some fragile and hacky code elsewhere.

llvm-svn: 112108
2010-08-25 23:27:42 +00:00
Bruno Cardoso Lopes
28f3261dbd Revert this for now; PUNPCKLDQ doesn't operate on v4f32.
llvm-svn: 112090
2010-08-25 21:26:37 +00:00
Daniel Dunbar
1a881a3eca X86: Fix a misencoding of RI64mi8. This fixes OpenSSL / x86_64-apple-darwin10 / clang -O3.
llvm-svn: 112089
2010-08-25 21:11:02 +00:00
Jim Grosbach
50dbbda454 Don't override the var from the enclosing scope.
When doing copy/paste/modify, it's apparently rather important to remember
the 'modify' bit...

llvm-svn: 112075
2010-08-25 19:11:34 +00:00
Chris Lattner
84423f212d zap dead code
llvm-svn: 112073
2010-08-25 19:00:00 +00:00
Benjamin Kramer
4eb0e8bb2c Remove dead recursive function. Yay for clang -Wunused-function.
llvm-svn: 112060
2010-08-25 17:27:58 +00:00
Daniel Dunbar
9b7c2ce591 ARM/Thumb2: Fix a misselect in getARMCmp when attempting to adjust a signed
comparison that would overflow.
 - The other under/overflow cases can't actually happen because the immediates
   which would trigger them are legal (so we don't enter this code), but I
   adjusted the style to make it clear the transform is always valid.

llvm-svn: 112053
2010-08-25 16:58:05 +00:00
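A hedged C illustration of the kind of overflow being guarded against (the
specific immediates are not in the commit message; this shows the generic
"x <= C" to "x < C + 1" rewrite, which is only valid when C + 1 does not
wrap):

    #include <limits.h>
    #include <stdbool.h>

    /* Rewriting a signed "x <= c" as "x < c + 1" can enable a cheaper
       comparison form on some targets, but is wrong when c + 1 wraps. */
    bool le_via_lt(int x, int c) {
        if (c == INT_MAX)
            return true;   /* x <= INT_MAX always holds; c + 1 would wrap */
        return x < c + 1;  /* safe: c + 1 is representable */
    }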
Eric Christopher
1bf07e75ac Do type checks before we bother to do everything else.
llvm-svn: 112039
2010-08-25 08:43:57 +00:00
Anton Korobeynikov
1544f79e36 Fix a nasty mingw32 bug, which e.g. prevented llvm-gcc bootstrap there.
Mark the _alloca call as clobbering EFLAGS; otherwise some DCE might remove
other flags-clobbering instructions (e.g. cmp) occurring after the
_alloca call.

llvm-svn: 112034
2010-08-25 07:50:11 +00:00
Eric Christopher
9e3831d7a9 Reorganize load mechanisms. Handle types in a slightly less rigid way.
Fix some TODOs.  No functional change.

llvm-svn: 112031
2010-08-25 07:23:49 +00:00