llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00

Author	SHA1	Message	Date
Bob Wilson	524123343c	Fix NEON VLD pseudo instruction itineraries that were incorrectly copied from the VST pseudos. The VLD/VST scheduling still needs work (see pr6722), but at least we shouldn't confuse the loads with the stores. llvm-svn: 113473	2010-09-09 05:40:26 +00:00
Jim Grosbach	0f01a7319e	Re-enable usage of the ARM base pointer. r113394 fixed the known failures. Re-running some nightly testers w/ it enabled to verify. llvm-svn: 113399	2010-09-08 20:12:02 +00:00
Eric Christopher	95fe7f6d38	Remove ssp from this test. llvm-svn: 113392	2010-09-08 19:32:34 +00:00
Kalle Raiskila	be22226628	Fix CellSPU vector shuffles, again. Some cases of lowering to rotate were miscompiled. llvm-svn: 113355	2010-09-08 11:53:38 +00:00
Jim Grosbach	7a957dc761	disable for the moment while tracking down a few Thumb2-O0 failure that look related. (attempt deux, complete w/ test update this time) llvm-svn: 113333	2010-09-08 02:00:34 +00:00
Devang Patel	acddbab1a5	remove these tests for now. llvm-svn: 113293	2010-09-07 22:03:44 +00:00
Devang Patel	751d5ec40a	There is no need to force target if the test is going to run on other x86 platforms. llvm-svn: 113285	2010-09-07 20:59:09 +00:00
Devang Patel	e9dcee0a69	Fix command line used to link these test cases. llvm-svn: 113237	2010-09-07 18:17:56 +00:00
Devang Patel	ca974014e7	Reintroduce dbg-declare tests. llvm-svn: 113232	2010-09-07 18:01:49 +00:00
Devang Patel	c1cdbbb6d5	Remove last three tests. I need to make them independent of my setup. llvm-svn: 113213	2010-09-07 17:08:57 +00:00
Devang Patel	cb223026e6	Add a test case to check handling of dbg-declare during hybrid mode where we begin using fast-isel but switch back to DAG building at some point. llvm-svn: 113210	2010-09-07 17:03:44 +00:00
Devang Patel	5b5dd1d32b	Add a test case to check handling of dbg-declare by selection DAG builder. llvm-svn: 113209	2010-09-07 16:56:35 +00:00
Devang Patel	bcdf6f20ac	Add a test case to check handling of dbg-declare by fast-isel. llvm-svn: 113208	2010-09-07 16:40:53 +00:00
Chris Lattner	684ae57b8e	implement rdar://6653118 - fastisel should fold loads where possible. Since mem2reg isn't run at -O0, we get a ton of reloads from the stack, for example, before, this code: int foo(int x, int y, int z) { return x+y+z; } used to compile into: _foo: ## @foo subq $12, %rsp movl %edi, 8(%rsp) movl %esi, 4(%rsp) movl %edx, (%rsp) movl 8(%rsp), %edx movl 4(%rsp), %esi addl %edx, %esi movl (%rsp), %edx addl %esi, %edx movl %edx, %eax addq $12, %rsp ret Now we produce: _foo: ## @foo subq $12, %rsp movl %edi, 8(%rsp) movl %esi, 4(%rsp) movl %edx, (%rsp) movl 8(%rsp), %edx addl 4(%rsp), %edx ## Folded load addl (%rsp), %edx ## Folded load movl %edx, %eax addq $12, %rsp ret Fewer instructions and less register use = faster compiles. llvm-svn: 113102	2010-09-05 02:18:34 +00:00
Dale Johannesen	2f4f8f5705	Remove the rest of the nonexistent 64-bit AVX instructions. Bruno, please review. llvm-svn: 113014	2010-09-03 21:23:00 +00:00
Jim Grosbach	c50df6cfad	Re-apply r112883: "For ARM stack frames that utilize variable sized objects and have either large local stack areas or require dynamic stack realignment, allocate a base register via which to access the local frame. This allows efficient access to frame indices not accessible via the FP (either due to being out of range or due to dynamic realignment) or the SP (due to variable sized object allocation). In particular, this greatly improves efficiency of access to spill slots in Thumb functions which contain VLAs." r112986 fixed a latent bug exposed by the above. llvm-svn: 112989	2010-09-03 18:37:12 +00:00
Daniel Dunbar	3fa5ea53fa	Revert "For ARM stack frames that utilize variable sized objects and have either", it is breaking oggenc with Clang for ARMv6. This reverts commit 8d6e29cfda270be483abf638850311670829ee65. llvm-svn: 112962	2010-09-03 15:26:42 +00:00
NAKAMURA Takumi	736651f79b	test/CodeGen/X86: Add explicit -mtriple=(i686\|x86_64)-linux for Win32 host. llvm-svn: 112947	2010-09-03 03:24:08 +00:00
Bruno Cardoso Lopes	f91bd70e9a	AVX doesn't support mm operations neither its instrinsics. The AVX versions of PALIGN and PABS* should only exist for 128-bit. Remove the unnecessary stuff. llvm-svn: 112944	2010-09-03 02:08:45 +00:00
Bob Wilson	24fa0b33b1	Replace NEON vabdl, vaba, and vabal intrinsics with combinations of the vabd intrinsic and add and/or zext operations. In the case of vaba, this also avoids the need for a DAG combine pattern to combine vabd with add. Update tests. Auto-upgrade the old intrinsics. llvm-svn: 112941	2010-09-03 01:35:08 +00:00
Anton Korobeynikov	32cecc0ecc	Properly emit __chkstk call instead of __alloca on non-mingw windows targets. Patch by Cameron Esfahani! llvm-svn: 112902	2010-09-02 23:03:46 +00:00
Jim Grosbach	fb89154d21	For ARM stack frames that utilize variable sized objects and have either large local stack areas or require dynamic stack realignment, allocate a base register via which to access the local frame. This allows efficient access to frame indices not accessible via the FP (either due to being out of range or due to dynamic realignment) or the SP (due to variable sized object allocation). In particular, this greatly improves efficiency of access to spill slots in Thumb functions which contain VLAs. rdar://7352504 rdar://8374540 rdar://8355680 llvm-svn: 112883	2010-09-02 22:29:01 +00:00
Dan Gohman	6824bfc554	Don't narrow the load and store in a load+twiddle+store sequence unless there are clearly no stores between the load and the store. This fixes this miscompile reported as PR7833. This breaks the test/CodeGen/X86/narrow_op-2.ll optimization, which is safe, but awkward to prove safe. Move it to X86's README.txt. llvm-svn: 112861	2010-09-02 21:18:42 +00:00
Sandeep Patel	1c6b7d8a96	Fix an unnecessary XFAIL llvm-svn: 112853	2010-09-02 20:19:24 +00:00
Jim Grosbach	8f30718112	Now that register allocation properly considers reserved regs, simplify the ARM register class allocation order functions to take advantage of that. llvm-svn: 112841	2010-09-02 18:14:29 +00:00
Bob Wilson	8951c7592c	Convert VLD1 and VLD2 instructions to use pseudo-instructions until after regalloc. llvm-svn: 112825	2010-09-02 16:00:54 +00:00
NAKAMURA Takumi	f5566eebf8	test/loop-strength-reduce4: Add explicit triplet for Win32 host. llvm-svn: 112802	2010-09-02 03:45:58 +00:00
NAKAMURA Takumi	8e620796b6	test/twoaddr-coalesce: Do not use @main. Win32 codegen emits implicit invoking __main into, to fail. llvm-svn: 112801	2010-09-02 03:45:51 +00:00
Bob Wilson	3348d2eb50	Remove NEON vmull, vmlal, and vmlsl intrinsics, replacing them with multiply, add, and subtract operations with zero-extended or sign-extended vectors. Update tests. Add auto-upgrade support for the old intrinsics. llvm-svn: 112773	2010-09-01 23:50:19 +00:00
Bruno Cardoso Lopes	601bf4c6d3	Using target specific nodes for shuffle nodes makes the mask check more strict, breaking some cases not checked in the testsuite, but also exposes some foldings not done before, as this example: movaps (%rdi), %xmm0 movaps (%rax), %xmm1 movaps %xmm0, %xmm2 movss %xmm1, %xmm2 shufps $36, %xmm2, %xmm0 now is generated as: movaps (%rdi), %xmm0 movaps %xmm0, %xmm1 movlps (%rax), %xmm1 shufps $36, %xmm1, %xmm0 llvm-svn: 112753	2010-09-01 22:33:20 +00:00
Jakob Stoklund Olesen	fe11c81560	Teach RemoveCopyByCommutingDef to check all aliases, not just subregisters. This caused a miscompilation in WebKit where %RAX had conflicting defs when RemoveCopyByCommutingDef was commuting a %EAX use. llvm-svn: 112751	2010-09-01 22:15:35 +00:00
Chris Lattner	b74759a9fa	temporarily revert r112664, it is causing a decoding conflict, and the testcases should be merged. llvm-svn: 112711	2010-09-01 16:00:50 +00:00
Dan Gohman	ad593f9d93	Revert 112442 and 112440 until the compile time problems introduced by 112440 are resolved. llvm-svn: 112692	2010-09-01 01:45:53 +00:00
Bill Wendling	bb6052cfd6	We have a chance for an optimization. Consider this code: int x(int t) { if (t & 256) return -26; return 0; } We generate this: tst.w r0, #256 mvn r0, #25 it eq moveq r0, #0 while gcc generates this: ands r0, r0, #256 it ne mvnne r0, #25 bx lr Scandalous really! During ISel time, we can look for this particular pattern. One where we have a "MOVCC" that uses the flag off of a CMPZ that itself is comparing an AND instruction to 0. Something like this (greatly simplified): %r0 = ISD::AND ... ARMISD::CMPZ %r0, 0 @ sets [CPSR] %r0 = ARMISD::MOVCC 0, -26 @ reads [CPSR] All we have to do is convert the "ISD::AND" into an "ARM::ANDS" that sets [CPSR] when it's zero. The zero value will all ready be in the %r0 register and we only need to change it if the AND wasn't zero. Easy! llvm-svn: 112664	2010-08-31 22:41:22 +00:00
Jim Grosbach	0a357d79c7	Update test for 112609 llvm-svn: 112610	2010-08-31 17:58:47 +00:00
Anton Korobeynikov	c3f039784a	Fix borken test llvm-svn: 112555	2010-08-30 23:41:49 +00:00
Bob Wilson	826a677f94	Remove NEON vmovn intrinsic, replacing it with vector truncate operations. Auto-upgrade the old intrinsic and update tests. llvm-svn: 112507	2010-08-30 20:02:30 +00:00
Chris Lattner	765e59210c	two changes: 1) nuke ConstDataCoalSection, which is dead. 2) revise my previous patch for rdar://8018335, which was completely wrong. Specifically, it doesn't make sense to mark __TEXT,__const_coal as PURE_INSTRUCTIONS, because it is for readonly data. templates (it turns out) go to const_coal_nt. The real fix for rdar://8018335 was to give ConstTextCoalSection a section kind of ReadOnly instead of Text. llvm-svn: 112496	2010-08-30 18:12:35 +00:00
Duncan Sands	254f8ff0a6	Correct bogus module triple specifications. llvm-svn: 112469	2010-08-30 10:48:29 +00:00
Dan Gohman	9ea315c5ca	Make IVUsers iterative instead of recursive. This has the side effect of reversing the order of most of IVUser's results. llvm-svn: 112442	2010-08-29 16:40:03 +00:00
Dan Gohman	0fdab32e0f	Make this test less dependent on register allocation choices. llvm-svn: 112426	2010-08-29 14:49:42 +00:00
Kalle Raiskila	daba4ffc75	Fix lowering of INSERT_VECTOR_ELT in SPU. The IDX was treated as byte index, not element index. llvm-svn: 112422	2010-08-29 12:41:50 +00:00
Bob Wilson	807d004452	Remove NEON vaddl, vaddw, vsubl, and vsubw intrinsics. Instead, use llvm IR add/sub operations with one or both operands sign- or zero-extended. Auto-upgrade the old intrinsics. llvm-svn: 112416	2010-08-29 05:57:34 +00:00
Chris Lattner	5f911e6fe9	merge a bunch of shuffle tests into sse2.ll llvm-svn: 112398	2010-08-29 03:19:04 +00:00
Chris Lattner	c2ef3180a8	add some nounwind's llvm-svn: 112396	2010-08-29 03:07:47 +00:00
Chris Lattner	8cb4abbc0e	fix the buildvector->insertp[sd] logic to not always create a redundant insertp[sd] $0, which is a noop. Before: _f32: ## @f32 pshufd $1, %xmm1, %xmm2 pshufd $1, %xmm0, %xmm3 addss %xmm2, %xmm3 addss %xmm1, %xmm0 ## kill: XMM0<def> XMM0<kill> XMM0<def> insertps $0, %xmm0, %xmm0 insertps $16, %xmm3, %xmm0 ret after: _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm3 addss %xmm1, %xmm3 movdqa %xmm2, %xmm0 insertps $16, %xmm3, %xmm0 ret The extra movs are due to a random (poor) scheduling decision. llvm-svn: 112379	2010-08-28 17:59:08 +00:00
Chris Lattner	c3b630d64b	fix the BuildVector -> unpcklps logic to not do pointless shuffles when the top elements of a vector are undefined. This happens all the time for X86-64 ABI stuff because only the low 2 elements of a 4 element vector are defined. For example, on: _Complex float f32(_Complex float A, _Complex float B) { return A+B; } We used to produce (with SSE2, SSE4.1+ uses insertps): _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $16, %xmm2, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm0 addss %xmm1, %xmm0 pshufd $16, %xmm0, %xmm1 movdqa %xmm2, %xmm0 unpcklps %xmm1, %xmm0 ret We now produce: _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm3 addss %xmm1, %xmm3 movaps %xmm2, %xmm0 unpcklps %xmm3, %xmm0 ret This implements rdar://8368414 llvm-svn: 112378	2010-08-28 17:28:30 +00:00
Dan Gohman	507f5a8ae7	Completely disable tail calls when fast-isel is enabled, as fast-isel doesn't currently support dealing with this. llvm-svn: 112341	2010-08-28 00:51:03 +00:00
Bob Wilson	31d487d235	Change ARM VFP VLDM/VSTM instructions to use addressing mode #4 , just like all the other LDM/STM instructions. This fixes asm printer crashes when compiling with -O0. I've changed one of the NEON tests (vst3.ll) to run with -O0 to check this in the future. Prior to this change VLDM/VSTM used addressing mode #5, but not really. The offset field was used to hold a count of the number of registers being loaded or stored, and the AM5 opcode field was expanded to specify the IA or DB mode, instead of the standard ADD/SUB specifier. Much of the backend was not aware of these special cases. The crashes occured when rewriting a frameindex caused the AM5 offset field to be changed so that it did not have a valid submode. I don't know exactly what changed to expose this now. Maybe we've never done much with -O0 and NEON. Regardless, there's no longer any reason to keep a count of the VLDM/VSTM registers, so we can use addressing mode #4 and clean things up in a lot of places. llvm-svn: 112322	2010-08-27 23:18:17 +00:00
Chris Lattner	d5777f8e47	get this test passing on linux builders. llvm-svn: 112280	2010-08-27 18:49:08 +00:00

1 2 3 4 5 ...

3584 Commits