llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00

Author	SHA1	Message	Date
Rafael Espindola	83257fe618	Some test code to check if correct code is being generated. Patch by Sanjoy Das. llvm-svn: 138820	2011-08-30 19:51:29 +00:00
Roman Divacky	7ac1bc57f7	Set CR1EQ only when lowering vararg floating arguments (not any vararg arguments as before), unset CR1EQ otherwise. llvm-svn: 138802	2011-08-30 17:04:16 +00:00
Evan Cheng	1eacb83316	Change ARM / Thumb2 addc / adde and subc / sube modeling to use physical register dependency (rather than gluing them together). This is general goodness as it gives the scheduler more freedom. However it is motivated by a nasty bug in isel. When an i64 sub is expanded to subc + sube. libcall #1 \ \ subc \ / \ \ / \ \ / libcall #2 sube If the libcalls are not serialized (i.e. both have chains which are dag entry), legalizer can serialize them in arbitrary orders. If it's unlucky, it can force libcall #2 before libcall #1 in the above case. subc | libcall #2 | libcall #1 | sube However since subc and sube are "glued" together, this ends up being a cycle when the scheduler combines subc and sube as a single scheduling unit. The right solution is to fix LegalizeType to chain the libcalls together. However, LegalizeType is not processing nodes in order so that's harder than it should be. For now, the move to physical register dependency will do. rdar://10019576 llvm-svn: 138791	2011-08-30 01:34:54 +00:00
Eli Friedman	4d90e53381	Explicitly zero out parts of a vector which are required to be zero by the algorithm in LowerUINT_TO_FP_i32. This only has a substantial effect on the generated code when the input is extracted from a vector register; other ways of loading an i32 do the appropriate zeroing implicitly. Fixes PR10802. llvm-svn: 138768	2011-08-29 21:15:46 +00:00
Owen Anderson	d2fc51c0e5	Add testcase for r138746. llvm-svn: 138747	2011-08-29 18:02:40 +00:00
Duncan Sands	1a69e0119a	Fix PR5329: pay attention to constructor/destructor priority when outputting them. With this, the entire LLVM testsuite passes when built with dragonegg. llvm-svn: 138724	2011-08-28 13:17:22 +00:00
Bill Wendling	aeeb59947e	Update to new EH scheme. llvm-svn: 138699	2011-08-27 04:53:41 +00:00
Bill Wendling	38b5c3a5bc	Cannot have an llvm.eh.exception call in a non-landing pad block. llvm-svn: 138698	2011-08-27 04:53:28 +00:00
Eli Friedman	9f95c7d381	Add support for generating CMPXCHG16B on x86-64 for the cmpxchg IR instruction. llvm-svn: 138660	2011-08-26 21:21:21 +00:00
Bill Wendling	5b7cbeacad	Revert r138606 until LowerInvoke has been converted to the new EH scheme. llvm-svn: 138656	2011-08-26 21:11:23 +00:00
Eli Friedman	802dd20495	Atomic load/store on ARM/Thumb. I don't really like the patterns, but I'm having trouble coming up with a better way to handle them. I plan on making other targets use the same legalization ARM-without-memory-barriers is using... it's not especially efficient, but if anyone cares, it's not that hard to fix for a given target if there's some better lowering. llvm-svn: 138621	2011-08-26 02:59:24 +00:00
Bill Wendling	077e9ea84b	Update to the new EH scheme. llvm-svn: 138606	2011-08-25 23:48:37 +00:00
Bruno Cardoso Lopes	5b3d2c9e17	Add support for AVX 256-bit version of MOVDDUP! llvm-svn: 138588	2011-08-25 21:40:37 +00:00
Andrew Trick	0dd0ae11f8	ARM fix for missing implicit operands on ldmia_ret. rdar://10005094: miscompile of 176.gcc llvm-svn: 138568	2011-08-25 17:50:53 +00:00
Bill Wendling	1eec5affec	LSR wants to split the landing pad's critical edge. Let it do it, but use the proper function to do it. llvm-svn: 138550	2011-08-25 05:55:40 +00:00
Bruno Cardoso Lopes	5d34219953	Add support for 256-bit versions of VSHUFPD and VSHUFPS. llvm-svn: 138546	2011-08-25 02:58:26 +00:00
Eli Friedman	b6597a2e70	Hook up 64-bit atomic load/store on x86-32. I plan to write more efficient implementations eventually. llvm-svn: 138505	2011-08-24 22:33:28 +00:00
Eli Friedman	688794e1d1	Basic tests for atomic load and store on x86. llvm-svn: 138486	2011-08-24 21:16:59 +00:00
Richard Osborne	6b6b0b535d	Add Uses=[SP] to call instructions. This fixes a miscompilation with a variable sized alloca. llvm-svn: 138433	2011-08-24 13:32:43 +00:00
Craig Topper	1da38a34a6	Break 256-bit vector int add/sub/mul into two 128-bit operations to avoid costly scalarization. Fixes PR10711. llvm-svn: 138427	2011-08-24 06:14:18 +00:00
Bruno Cardoso Lopes	8959b54713	Fix a nasty bug where a v4i64 was being wrongly emitted with 32-bit permutations. Also tidy up some patterns and make them close to their instruction definition! llvm-svn: 138392	2011-08-23 22:06:37 +00:00
Nick Lewycky	11874a4e0a	PerformSubCombine to work on integers larger than i128. Fixes a crasher. llvm-svn: 138354	2011-08-23 19:01:24 +00:00
Craig Topper	67b22aedb4	Add support for breaking 256-bit v16i16 and v32i8 VSETCC into two 128-bit ones, avoiding scalarization. Add vex form of pcmpeqq and pcmpgtq. Fixes more cases for PR10712. llvm-svn: 138321	2011-08-23 04:36:33 +00:00
Bruno Cardoso Lopes	8024703a16	Introduce a pass to insert vzeroupper instructions to avoid AVX to SSE transition penalty. The pass is enabled through the "x86-use-vzeroupper" llc command line option. This is only the first step (very naive and conservative one) to sketch out the idea, but proper DFA is coming next to allow smarter decisions. Comments and ideas now and in further commits will be very appreciated. llvm-svn: 138317	2011-08-23 01:14:17 +00:00
Bruno Cardoso Lopes	8007165688	Add support for breaking 256-bit int VSETCC into two 128-bit ones, avoiding scalarization of the compare. Reduces code from 59 to 6 instructions. Fix PR10712. llvm-svn: 138271	2011-08-22 20:31:04 +00:00
Chad Rosier	0bfea70d09	With the fix in r138164: "Add <imp-def> operands to QQ and QQQQ stack loads." -verify-machineinstrs can be enabled for this test case. llvm-svn: 138171	2011-08-20 00:34:45 +00:00
Chad Rosier	55c57f07dd	VMOVQQQQs pseudo instructions are only created by ARMBaseInstrInfo::copyPhysReg. Therefore, rather than generating a pseudo instruction, which is later expanded, generate the necessary instructions in place. llvm-svn: 138163	2011-08-20 00:17:25 +00:00
Devang Patel	e4127d626e	Do not use named md nodes to track variables that are completely optimized. This does not scale while doing LTO with debug info. New approach is to include list of variables in the subprogram info directly. llvm-svn: 138145	2011-08-19 23:28:12 +00:00
Jim Grosbach	969c7a9037	Use regex to remove false dependencies on register allocation. llvm-svn: 138137	2011-08-19 23:10:31 +00:00
Jim Grosbach	5481e15390	Update tests. llvm-svn: 138116	2011-08-19 22:19:48 +00:00
Jakob Stoklund Olesen	f847cb77db	Add test case for r138018. llvm-svn: 138033	2011-08-19 04:30:24 +00:00
Akira Hatanaka	163382894e	Use subword loads instead of a 4-byte load when the size of a structure (or a piece of it) that is being passed by value is smaller than a word. llvm-svn: 138007	2011-08-18 23:39:37 +00:00
Ivan Krasin	338df71d60	FastISel: avoid function calls between the materialization of the constant and its use. llvm-svn: 137993	2011-08-18 22:06:10 +00:00
Jim Grosbach	7ecefeb594	Thumb assembly parsing and encoding for LDM instruction. Fix base register type and canonicalize to the "ldm" spelling rather than "ldmia." Add diagnostics for incorrect writeback token and out-of-range registers. llvm-svn: 137986	2011-08-18 21:50:53 +00:00
Richard Osborne	415c5ff412	Add intrinsics for SETEV, GETED, GETET. llvm-svn: 137938	2011-08-18 13:00:48 +00:00
Bruno Cardoso Lopes	c174d8ac48	Clean up vector logical ops in AVX and add use of int versions for simple v2i64 llvm-svn: 137919	2011-08-18 02:11:34 +00:00
Bruno Cardoso Lopes	82795e6b41	Fix PR10688. Add support for splitting 256-bit vector shifts when the shift amount is variable llvm-svn: 137885	2011-08-17 22:12:20 +00:00
Jim Grosbach	0115c6f75b	Thumb assembly parsing and encoding for ADR. llvm-svn: 137864	2011-08-17 20:37:40 +00:00
Bruno Cardoso Lopes	98531dfd08	Introduce matching patterns for vbroadcast AVX instruction. The idea is to match splats in the form (splat (scalar_to_vector (load ...))) whenever the load can be folded. All the logic and instruction emission is working but because of PR8156, there is no way to match loads, because they can never be folded for splats. Thus, the tests are XFAILed, but I've tested and exercised all the logic using a relaxed version for checking the foldable loads, as if the bug was already fixed. This should work out of the box once PR8156 gets fixed since MayFoldLoad will work as expected. llvm-svn: 137810	2011-08-17 02:29:19 +00:00
Bruno Cardoso Lopes	0a3b3123fd	Update test to not use the scalar type to splat from a load llvm-svn: 137809	2011-08-17 02:29:15 +00:00
Bruno Cardoso Lopes	4ff4ed28af	Now that we have a canonical way to handle 256-bit splats: vinsertf128 $1 + vpermilps $0, remove the old code that used to first do the splat in a 128-bit vector and then insert it into a larger one. This is better because the handling code gets simpler and also makes better room for the upcoming vbroadcast! llvm-svn: 137807	2011-08-17 02:29:10 +00:00
Akira Hatanaka	0179c7fa68	Add support for ext and ins. llvm-svn: 137804	2011-08-17 02:05:42 +00:00
Bruno Cardoso Lopes	d64294fb0a	Instead of always leaving the work to the generic legalizer when there is no support for native 256-bit shuffles, be more smart in some cases, for example, when you can extract specific 128-bit parts and use regular 128-bit shuffles for them. Example: For this shuffle: shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> <i32 1, i32 0, i32 7, i32 6> This was expanded to: vextractf128 $1, %ymm1, %xmm2 vpextrq $0, %xmm2, %rax vmovd %rax, %xmm1 vpextrq $1, %xmm2, %rax vmovd %rax, %xmm2 vpunpcklqdq %xmm1, %xmm2, %xmm1 vpextrq $0, %xmm0, %rax vmovd %rax, %xmm2 vpextrq $1, %xmm0, %rax vmovd %rax, %xmm0 vpunpcklqdq %xmm2, %xmm0, %xmm0 vinsertf128 $1, %xmm1, %ymm0, %ymm0 ret Now we get: vshufpd $1, %xmm0, %xmm0, %xmm0 vextractf128 $1, %ymm1, %xmm1 vshufpd $1, %xmm1, %xmm1, %xmm1 vinsertf128 $1, %xmm1, %ymm0, %ymm0 llvm-svn: 137733	2011-08-16 18:21:54 +00:00
Akira Hatanaka	dcbf455b98	Add test case for r137711. llvm-svn: 137725	2011-08-16 17:32:01 +00:00
Akira Hatanaka	12df91513e	Fix handling of double precision loads and stores when Mips1 is targeted. Mips1 does not support double precision loads or stores, therefore two single precision loads or stores must be used in place of these instructions. This patch treats double precision loads and stores as if they are legal instructions until MCInstLowering, instead of generating the single precision instructions during instruction selection or Prolog/Epilog code insertion. Without the changes made in this patch, llc produces code that has the same problem described in r137484 or bails out when MipsInstrInfo::storeRegToStackSlot or loadRegFromStackSlot is called before register allocation. llvm-svn: 137711	2011-08-16 03:51:51 +00:00
Bruno Cardoso Lopes	b81c3ed76d	Fix PR10656. It's only profitable to use 128-bit inserts and extracts when AVX mode is on. Otherwise it is just more work for the type legalizer. llvm-svn: 137661	2011-08-15 21:45:54 +00:00
Eric Christopher	1bb5eaa978	Fix this test to avoid leaving a temporary file behind. llvm-svn: 137651	2011-08-15 20:55:03 +00:00
Bob Wilson	90799621b3	Expand VMOVQQQQ pseudo instructions. Apparently we never added code to expand these pseudo instructions, and in over a year, no one has noticed. Our register allocator must be awesome! llvm-svn: 137551	2011-08-13 05:14:55 +00:00
Bruno Cardoso Lopes	2d100ca13c	The VPERM2F128 is an AVX instruction which permutes between two 256-bit vectors. It operates on 128-bit elements instead of regular scalar types. Recognize shuffles that are suitable for VPERM2F128 and teach the x86 legalizer how to handle them. llvm-svn: 137519	2011-08-12 21:48:26 +00:00
Akira Hatanaka	c9c0190cbe	Define unaligned load and store. llvm-svn: 137515	2011-08-12 21:30:06 +00:00
Akira Hatanaka	6caf61a6ac	Test case for 137484 llvm-svn: 137486	2011-08-12 18:12:06 +00:00
Akira Hatanaka	b787f8a8a5	Enclose directive .cprestore with .set macro and nomacro to silence assembler warning. llvm-svn: 137378	2011-08-11 22:42:31 +00:00
Bruno Cardoso Lopes	328a6a980b	Add a dag combine to xform 256-bit shuffles into simple vector inserts and extracts. This simple combine makes us generate only 1 instruction instead of 11 in the v8 case. llvm-svn: 137362	2011-08-11 21:50:44 +00:00
Bruno Cardoso Lopes	884d8b9cb5	Fix the test added by Nadav in r137308. Make it more strict: 1) check for the "v" version of movaps 2) add a couple of CHECK-NOT to guarantee the behavior 3) move to a more appropriate test file llvm-svn: 137361	2011-08-11 21:50:35 +00:00
Bruno Cardoso Lopes	38d4afa02f	Fix PR10492 by teaching MOVHLPS and MOVLPS mask matching to be more strict. llvm-svn: 137324	2011-08-11 18:59:13 +00:00
Jim Grosbach	9717a9c0d3	ARM push of a single register encodes as pre-indexed STR. Per the ARM ARM, a 'push' of a single register encodes as an STR, not an STM. llvm-svn: 137318	2011-08-11 18:07:11 +00:00
Jim Grosbach	abaaf4513f	ARM pop of a single register encodes as post-indexed LDR. Per the ARM ARM, a 'pop' of a single register encodes as an LDR, not an LDM. llvm-svn: 137316	2011-08-11 17:35:48 +00:00
Nadav Rotem	de1b485f3f	[AVX] If the data which is going to be saved is already in two XMM registers (for example, after integer operation), do not pack the registers into a YMM before saving. It's better to save as two XMM registers. Before: vinsertf128 $1, %xmm3, %ymm0, %ymm3 vinsertf128 $0, %xmm1, %ymm3, %ymm1 vmovaps %ymm1, 416(%rsp) After: vmovaps %xmm3, 416+16(%rsp) vmovaps %xmm1, 416(%rsp) llvm-svn: 137308	2011-08-11 16:41:21 +00:00
Chris Lattner	3ae8704c4f	add missing colon, thanks peter. llvm-svn: 137306	2011-08-11 16:15:10 +00:00
Chris Lattner	575057916a	fix PR10605 / rdar://9930964 by adding a pretty scary missed check. It's somewhat surprising anything works without this. Before we would compile the testcase into: test: # @test movl $4, 8(%rdi) movl 8(%rdi), %eax orl %esi, %eax cmpl $32, %edx movl %eax, -4(%rsp) # 4-byte Spill je .LBB0_2 now we produce: test: # @test movl 8(%rdi), %eax movl $4, 8(%rdi) orl %esi, %eax cmpl $32, %edx movl %eax, -4(%rsp) # 4-byte Spill je .LBB0_2 llvm-svn: 137303	2011-08-11 06:26:54 +00:00
Bruno Cardoso Lopes	8674ddf55a	Splats for v8i32/v8f32 can be handled by VPERMILPSY. This was causing infinite recursive calls in legalize. Fix PR10562 llvm-svn: 137296	2011-08-11 02:49:44 +00:00
Bruno Cardoso Lopes	954ac403c7	Use the splat index to generate the desired shuffle. Otherwise we could only get undefs and the vector shuffle becomes an undef, generating wrong code. llvm-svn: 137295	2011-08-11 02:49:41 +00:00
Eli Friedman	17bd9e5d7c	Fix X86TargetLowering::LowerExternalSymbol so that it actually works in non-trivial cases. This hasn't been an issue before because the function isn't normally called (but apparently is used to generate a tail-call to sin() on ELF x86-32 with PIC and SSE2). Fixes PR9693. llvm-svn: 137292	2011-08-11 01:48:05 +00:00
NAKAMURA Takumi	5d316f7632	test/CodeGen/X86/opt-shuff-tstore.ll: Add explicit -mtriple=x86_64-linux. llvm-svn: 137262	2011-08-10 22:52:48 +00:00
Devang Patel	393d6e1fd0	While extending definition range of a debug variable, consult lexical scopes also. There is no point extending debug variable outside its lexical block. This provides 6x compile time speedup in some cases. llvm-svn: 137250	2011-08-10 21:25:34 +00:00
Nadav Rotem	1b3075c0ab	Fix the test. Add cpu target. llvm-svn: 137241	2011-08-10 19:49:19 +00:00
Nadav Rotem	4a8d78d24a	When performing a truncating store, it is sometimes possible to rearrange the data in-register prior to saving to memory. When we reorder the data in memory we prevent the need to save multiple scalars to memory, making a single regular store. llvm-svn: 137238	2011-08-10 19:30:14 +00:00
Bruno Cardoso Lopes	565ab1542a	The following X86 pattern is incorrect: def : Pat<(X86Movss VR128:$src1, (bc_v4i32 (v2i64 (load addr:$src2)))), (MOVLPSrm VR128:$src1, addr:$src2)>; This matches a MOVSS dag with a MOVLPS instruction. However, MOVSS will replace only the low 32 bits of the register, while the MOVLPS instruction will replace the low 64 bits. A testcase that illustrates the bug is added, and the one that was already present is modified. Patch by Tanya Lattner. llvm-svn: 137227	2011-08-10 17:45:17 +00:00
Rafael Espindola	45cd7316b5	Add support for the R and Q constraints. llvm-svn: 137217	2011-08-10 16:26:42 +00:00
Bruno Cardoso Lopes	4a435a361d	Fix a bug in vpermilps mask checking. Fix PR10560 llvm-svn: 137194	2011-08-10 01:54:17 +00:00
Bruno Cardoso Lopes	9a695724bd	Add 256-bit support for v8i32, v4i64 and v4f64 ISD::SELECT. Fix PR10556 llvm-svn: 137179	2011-08-09 23:27:13 +00:00
Bruno Cardoso Lopes	7461b930f3	Add v16i16 and v32i8 store patterns llvm-svn: 137166	2011-08-09 22:39:53 +00:00
Bruno Cardoso Lopes	028c6aa951	Use fp unpack instructions to unpack int types. Until we have AVX2, this is the best we can do for these patterns. This fixes PR10554. llvm-svn: 137161	2011-08-09 22:18:37 +00:00
Eli Friedman	44fd5b2b59	Fix a couple ridiculous copy-paste errors. rdar://9914773 . llvm-svn: 137160	2011-08-09 22:17:39 +00:00
Bill Wendling	250ea7930e	Revert r137134. It breaks some code as Eli pointed out. llvm-svn: 137135	2011-08-09 18:56:35 +00:00
Bill Wendling	ca256c0d2d	Print out the variable declaration only if it is a declaration. Otherwise, a 'static' variable will be emitted twice. PR10081 llvm-svn: 137134	2011-08-09 18:31:50 +00:00
Jakob Stoklund Olesen	e43aca1c39	Inflate register classes after coalescing. Coalescing can remove copy-like instructions with sub-register operands that constrained the register class. Examples are: x86: GR32_ABCD:sub_8bit_hi -> GR32 arm: DPR_VFP2:ssub0 -> DPR Recompute the register class of any virtual registers that are used by fewer instructions after coalescing. This affects code generation for the Cortex-A8 where we use NEON instructions for f32 operations, c.f. fp_convert.ll: vadd.f32 d16, d1, d0 vcvt.s32.f32 d0, d16 The register allocator is now free to use d16 for the temporary, and that comes first in the allocation order because it doesn't interfere with any s-registers. llvm-svn: 137133	2011-08-09 18:19:41 +00:00
Bruno Cardoso Lopes	633400ee00	Reapply a more appropriate solution than in r137114. AVX supports v4f64 = sitofp v4i32. This fixes PR10559. Also add support for v4i32 = fptosi v4f64. llvm-svn: 137128	2011-08-09 17:39:13 +00:00
Bruno Cardoso Lopes	1962a341d8	Revert r137114 llvm-svn: 137127	2011-08-09 17:39:01 +00:00
Justin Holewinski	021ab783b7	PTX: Add initial support for device function calls - Calls are supported on SM 2.0+ for functions with no return values llvm-svn: 137125	2011-08-09 17:36:31 +00:00
Bruno Cardoso Lopes	5dac86dac6	Handle sitofp between v4f64 <- v4i32. Fix PR10559 llvm-svn: 137114	2011-08-09 05:48:01 +00:00
Bruno Cardoso Lopes	d521431558	Add support for avx vector fextend llvm-svn: 137105	2011-08-09 03:04:29 +00:00
Bruno Cardoso Lopes	81534df169	Rename and tidy up tests llvm-svn: 137103	2011-08-09 03:04:23 +00:00
Bruno Cardoso Lopes	1025d1eb3b	Add two patterns to match special vmovss and vmovsd cases. Also fix the patterns already there to be more strict regarding the predicate. This fixes PR10558 llvm-svn: 137100	2011-08-09 01:43:09 +00:00
Bruno Cardoso Lopes	d7eac41193	Make LowerVSETCC aware of AVX types and add patterns to match them. llvm-svn: 137090	2011-08-09 00:46:57 +00:00
Bruno Cardoso Lopes	d8534855ff	Add support for several vector shifts operations while in AVX mode. Fix PR10581 llvm-svn: 137067	2011-08-08 21:31:08 +00:00
Eli Friedman	7a34419c6f	Fix up the patterns for SXTB, SXTH, UXTB, and UXTH so that they are correctly active without HasT2ExtractPack. PR10611. llvm-svn: 137061	2011-08-08 19:49:37 +00:00
Jakob Stoklund Olesen	85931574b0	Don't clobber pending ST regs when FP regs are killed. X86FloatingPoint keeps track of pending ST registers for an upcoming inline asm instruction with fixed stack register constraints. It does this by remembering which FP register holds the value that should appear at a fixed stack position for the inline asm. When that FP register is killed before the inline asm, make sure to duplicate it to a scratch register, so the ST register still has a live FP reference. This could happen when the same FP register was copied to two ST registers, or when a spill instruction is inserted between the ST copy and the inline asm. This fixes PR10602. llvm-svn: 137050	2011-08-08 17:15:43 +00:00
Rafael Espindola	2da6e6a1d8	print st_shndx with the correct number of bits. llvm-svn: 136880	2011-08-04 15:50:13 +00:00
Rafael Espindola	c1a076eeb1	print st_other with the correct number of bits. llvm-svn: 136877	2011-08-04 15:38:19 +00:00
Rafael Espindola	368850841d	print st_type with the correct number of bits. llvm-svn: 136875	2011-08-04 15:24:00 +00:00
Rafael Espindola	e08bb3d50f	Print st_bind with the correct number of bits. llvm-svn: 136874	2011-08-04 15:10:35 +00:00
Rafael Espindola	865ab6cb05	Print r_sym with the correct number of bits. llvm-svn: 136873	2011-08-04 14:48:27 +00:00
Rafael Espindola	f65dd30907	Print r_type with the correct number of bits. llvm-svn: 136872	2011-08-04 14:39:30 +00:00
Rafael Espindola	edfafcbfb0	Change another counter to decimal. llvm-svn: 136870	2011-08-04 14:01:03 +00:00
Rafael Espindola	3e8393e6f7	Don't print a counter in hex. llvm-svn: 136869	2011-08-04 13:39:15 +00:00
Bill Wendling	60e17f8212	Only access both operands of an INSERT_SUBVECTOR if it is an INSERT_SUBVECTOR. Fixes PR10527. llvm-svn: 136853	2011-08-04 00:32:58 +00:00
Benjamin Kramer	d93ac7d0b6	Remove underscore that's breaking linux buildbots. llvm-svn: 136833	2011-08-03 23:13:01 +00:00
Jakub Staszak	9d083611d4	Use MachineBranchProbabilityInfo in If-Conversion instead of its own heuristics. llvm-svn: 136826	2011-08-03 22:34:43 +00:00
Jakob Stoklund Olesen	002075193b	Handle IMPLICIT_DEF instructions in X86FloatingPoint. This fixes PR10575. llvm-svn: 136787	2011-08-03 16:33:19 +00:00
Devang Patel	99a2f0d98c	Use byte offset, instead of element number, to access merged global. llvm-svn: 136759	2011-08-03 01:25:46 +00:00
Rafael Espindola	cefc38659a	Assume .cfi_startproc is the first thing in a function. If the function is externally visible, create a local symbol to use in the CFE. If not, use the function label itself. Fixes PR10420. llvm-svn: 136716	2011-08-02 20:24:22 +00:00
Bruno Cardoso Lopes	ac0984dc7e	Make this kind of lowering to be supported by 256-bit instructions: shuffle (scalar_to_vector (load (ptr + 4))), undef, <0, 0, 0, 0> To: shuffle (vload ptr), undef, <1, 1, 1, 1> Fix PR10494 llvm-svn: 136691	2011-08-02 16:06:18 +00:00
Bruno Cardoso Lopes	771876cade	Add v4f64 -> v2f32 fp_round support. Also add a testcase to exercise the legalizer. This commit together with the two previous ones fixes PR10495. llvm-svn: 136654	2011-08-01 21:54:09 +00:00
Bruno Cardoso Lopes	d3a5171087	Since vectors with all ones can't be created with a 256-bit instruction, avoid returning early for v8i32 types, which would only be valid for vector with all zeros. Also split the handling of zeros and ones into separate checking logic since they are handled differently. This fixes PR10547 llvm-svn: 136642	2011-08-01 19:51:53 +00:00
Richard Osborne	2cd07cf351	Fix crash with varargs function with no named parameters. llvm-svn: 136623	2011-08-01 16:45:59 +00:00
Jakob Stoklund Olesen	0f099a3c58	Revert "Don't check liveness of unallocatable registers." The ARM target depends on CPSR liveness being tracked after register allocation. llvm-svn: 136548	2011-07-30 00:57:25 +00:00
Jakob Stoklund Olesen	a05b70241c	Don't check liveness of unallocatable registers. This includes registers like EFLAGS and ST0-ST7. We don't check for liveness issues in the verifier and scavenger because registers will never be allocated from these classes. While in SSA form, we do care about the liveness of unallocatable unreserved registers. Liveness of EFLAGS and ST0 needs to be correct for MachineDCE and MachineSinking. llvm-svn: 136541	2011-07-29 23:36:21 +00:00
Eric Christopher	96b31d5681	Add support for the 'Q' constraint. Fixes rdar://9866494 llvm-svn: 136523	2011-07-29 21:18:58 +00:00
Bruno Cardoso Lopes	871df895f4	Fix two tests that I crashed in the previous commits. The mask elts on the second half must be reindexed. llvm-svn: 136454	2011-07-29 02:05:28 +00:00
Bruno Cardoso Lopes	2b3d85d81c	Match VPERMIL masks more strictly and update the target specific mask generation to always catch the weird cases. llvm-svn: 136453	2011-07-29 01:31:15 +00:00
Bruno Cardoso Lopes	473d982caf	Add v8i32 and v4i64 vpermil patterns llvm-svn: 136451	2011-07-29 01:31:07 +00:00
Jakob Stoklund Olesen	cc29034b4c	Transfer implicit operands in NEONMoveFixPass. Later passes /are/ using this information when running the register scavenger. This fixes the second problem in PR10520. llvm-svn: 136440	2011-07-29 00:27:35 +00:00
Jakob Stoklund Olesen	f97f492104	Add -verify-arm-pseudo-expand. This hidden llc option runs the machine code verifier after expanding ARM pseudo-instructions, but before if-conversion. The machine code verifier is much better at pointing out liveness errors that can trip up the register scavenger. llvm-svn: 136439	2011-07-29 00:27:32 +00:00
Jakob Stoklund Olesen	5f429460ba	Handle REG_SEQUENCE with implicitly defined operands. Code like that would only be produced by bugpoint, but we should still handle it correctly. When a register is defined by a REG_SEQUENCE of undefs, the register itself is undef. Previously, we would create a register with uses but no defs. Fixes part of PR10520. llvm-svn: 136401	2011-07-28 21:38:51 +00:00
Bruno Cardoso Lopes	e24a043703	Add patterns to generate copies for extract_subvector instead of using vextractf128. This will reduce the number of issued instructions for several avx codes. llvm-svn: 136323	2011-07-28 01:26:50 +00:00
Bruno Cardoso Lopes	1f63a37172	Add a few patterns to match allzeros without having to use the fp unit. Take advantage that the 128-bit vpxor zeros the higher part and use it. This also fixes PR10491 llvm-svn: 136321	2011-07-28 01:26:43 +00:00
Bruno Cardoso Lopes	06d8be564f	Add SINT_TO_FP and FP_TO_SINT support for v8i32 types. Also move a convert pattern close to the instruction definition. llvm-svn: 136320	2011-07-28 01:26:39 +00:00
Bruno Cardoso Lopes	8830fde434	The vpermilps and vpermilpd have different behaviour regarding the usage of the shuffle bitmask. Both work in 128-bit lanes without crossing, but in the former the mask of the high part is the same used by the low part while in the latter both lanes have independent masks. Handle this properly and add support for vpermilpd. llvm-svn: 136200	2011-07-27 00:56:34 +00:00
Devang Patel	e85a416d4e	It is quite possible that inlined function body is split into multiple chunks of consecutive instructions. But, there is not any way to describe this in .debug_inline accelerator table used by gdb. However, describe non contiguous ranges of inlined function body appropriately using AT_range of DW_TAG_inlined_subroutine debug info entry. llvm-svn: 136196	2011-07-27 00:34:13 +00:00
Jakob Stoklund Olesen	3f729850d3	Eliminate copies of undefined values during coalescing. These copies would coalesce easily, but the resulting value would be defined by a deleted instruction. Now we also remove the undefined value number from the destination register. This fixes PR10503. llvm-svn: 136174	2011-07-26 23:00:24 +00:00
Benjamin Kramer	32a2ce8416	Update test. llvm-svn: 136170	2011-07-26 22:45:39 +00:00
Benjamin Kramer	bfc2dfe3f7	Add a neat little two's complement hack for x86. On x86 we can't encode an immediate LHS of a sub directly. If the RHS comes from a XOR with a constant we can fold the negation into the xor and add one to the immediate of the sub. Then we can turn the sub into an add, which can be commuted and encoded efficiently. This code is generated for __builtin_clz and friends. llvm-svn: 136167	2011-07-26 22:42:13 +00:00
Bruno Cardoso Lopes	e53bb853ea	Recognize unpckh* masks and match 256-bit versions. The new versions are different from the previous 128-bit because they work in lanes. Update a few comments and add testcases llvm-svn: 136157	2011-07-26 22:03:40 +00:00
Eli Friedman	4e16c5341a	Prevent x86-specific DAGCombine from creating nodes with illegal type (which could not be selected). Fixes a minor isel issue that was breaking the testcase from r136130. llvm-svn: 136148	2011-07-26 21:02:58 +00:00
Jim Grosbach	906ecb46ed	FileCheck'ize test. llvm-svn: 136135	2011-07-26 20:49:44 +00:00
Eli Friedman	8779017138	XFAIL this test while I investigate it; it's failing for an unexpected reason. llvm-svn: 136131	2011-07-26 20:41:03 +00:00
Eli Friedman	e52bee3cc9	Add obvious missing case to switch. PR10497. llvm-svn: 136130	2011-07-26 20:38:49 +00:00
Bruno Cardoso Lopes	ab40a57cce	Add 256-bit isel for movsldup/movshdup llvm-svn: 136051	2011-07-26 02:39:32 +00:00
Bruno Cardoso Lopes	c94d6a2d2c	Codegen allonesvector better while using AVX: vpcmpeqd + vinsertf128 This also fixes PR10452 llvm-svn: 136004	2011-07-25 23:05:32 +00:00
Bruno Cardoso Lopes	9380919dc5	- Handle special scalar_to_vector case: splats. Using a native 128-bit shuffle before inserting on a 256-bit vector. - Add AVX versions of movd/movq instructions - Introduce a few COPY patterns to match insert_subvector instructions. This turns a trivial insert_subvector instruction into a register copy, coalescing the xmm into a ymm and avoiding emitting one more instruction. llvm-svn: 136002	2011-07-25 23:05:25 +00:00
Eli Friedman	dc213dadcc	Attempt to fix test failure reported on llvm-commits. llvm-svn: 135995	2011-07-25 22:28:51 +00:00
Eli Friedman	99fd6d41b5	Make sure this DAGCombine actually returns an UNDEF of the correct type; PR10476. llvm-svn: 135993	2011-07-25 22:25:42 +00:00
Eli Friedman	234bbb2b95	Get rid of an incorrect optimization for shuffles with PALIGNR and simplify isPALIGNRMask. Addresses PR10466, although the crash from that PR only triggers in cases where DAGCombine misses optimizing a shuffle. llvm-svn: 135980	2011-07-25 21:36:45 +00:00
Jakob Stoklund Olesen	0e4f7f92a2	Correctly handle <undef> tied uses when rewriting after a split. This fixes PR10463. A two-address instruction with an <undef> use operand was incorrectly rewritten so the def and use no longer used the same register, violating the tie constraint. Fix this by always rewriting <undef> operands with the register a def operand would use. llvm-svn: 135885	2011-07-24 20:23:50 +00:00
Bruno Cardoso Lopes	7347599e42	Fix test check! llvm-svn: 135802	2011-07-22 20:55:28 +00:00
Bruno Cardoso Lopes	50a38b479a	Fix PR10422 by adding the necessary AVX UCOMISD memory versions to load folding logic llvm-svn: 135801	2011-07-22 20:53:20 +00:00
Rafael Espindola	0c8190c4a3	Turn shuffles into unpacks for VT == MVT::v2i64 and MVT::v2f64 too. Patch by Jeff Muizelaar. llvm-svn: 135789	2011-07-22 18:56:05 +00:00
Bruno Cardoso Lopes	b7b9688aa5	- Inspected an AVX code block added by someone in early Feb. This was never used and was actually very wrong, fix it and make it simpler. Also remove the ConcatVectors function, which is unused now. - Fix an introduction of useless nodes in r126664 and r126264. The VUNPCKL* should never be introduced because we don't want duplicate nodes for 128 AVX and non-AVX modes, the actual instruction difference only exists during isel, but not for target specific DAG nodes. We only introduce V* target nodes when there is no 128-bit version already there. - Fix a fragile test and make it more useful. llvm-svn: 135729	2011-07-22 00:15:07 +00:00
Bruno Cardoso Lopes	85357a460f	Although we already support this, add testcases for consistency llvm-svn: 135728	2011-07-22 00:15:03 +00:00
Bruno Cardoso Lopes	1ee6122518	Add a DAGCombine for transforming 128->256 casts into a simple vxorps + vinsertf128 pair of instructions llvm-svn: 135727	2011-07-22 00:15:00 +00:00
Bruno Cardoso Lopes	3691063149	- Register v16i16 as valid VR256 register class - Add more bitcasts for v16i16 - Since 135661 and 135662 already added the splat logic, just add one more splat test for v16i16 llvm-svn: 135663	2011-07-21 02:24:08 +00:00
Bruno Cardoso Lopes	ba1a2a9135	Add support for 256-bit versions of VPERMIL instruction. This is a new instruction introduced in AVX, which can operate on 128 and 256-bit vectors. It considers a 256-bit vector as two independent 128-bit lanes. It can permute any 32- or 64-bit elements inside a lane, and restricts the second lane to have the same permutation as the first one. With the improved splat support introduced earlier today, adding codegen for this instruction enables more efficient 256-bit code: Instead of: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vextractf128 $1, %ymm0, %xmm1 shufps $1, %xmm1, %xmm1 movss %xmm1, 28(%rsp) movss %xmm1, 24(%rsp) movss %xmm1, 20(%rsp) movss %xmm1, 16(%rsp) vextractf128 $0, %ymm0, %xmm0 shufps $1, %xmm0, %xmm0 movss %xmm0, 12(%rsp) movss %xmm0, 8(%rsp) movss %xmm0, 4(%rsp) movss %xmm0, (%rsp) vmovaps (%rsp), %ymm0 We get: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vpermilps $85, %ymm0, %ymm0 llvm-svn: 135662	2011-07-21 01:55:47 +00:00
Devang Patel	9914fe1aca	While emitting a constant value, look through the derived type and use the underlying basic type to determine the size and signedness of the constant value. llvm-svn: 135627	2011-07-20 21:57:04 +00:00
Eli Friedman	3af0eb7b5f	PR10421: Fix a straightforward bug in the widening logic for CONCAT_VECTORS. llvm-svn: 135595	2011-07-20 18:14:33 +00:00
Evan Cheng	380dc98371	Add MCObjectFileInfo and sink the MCSections initialization code from TargetLoweringObjectFileImpl down to MCObjectFileInfo. TargetAsmInfo is down to one last method. It's almost gone! llvm-svn: 135569	2011-07-20 05:58:47 +00:00
Eric Christopher	7510091996	New pointer rotate test. llvm-svn: 135562	2011-07-20 03:09:11 +00:00
Akira Hatanaka	a50bbdfe15	Lower memory barriers to sync instructions. llvm-svn: 135537	2011-07-19 23:30:50 +00:00
Evan Cheng	9a80b0a7e6	Fix an obvious typo that's preventing x86 (32-bit) from using .literal16. llvm-svn: 135535	2011-07-19 23:14:32 +00:00
Akira Hatanaka	14e517df43	Use the correct opcodes: SLLV/SRLV or AND must be used instead of SLL/SRL or ANDi, when the instruction does not have any immediate operands. llvm-svn: 135520	2011-07-19 20:34:00 +00:00
Akira Hatanaka	f59cbeec14	Remove redundant instructions. - In EmitAtomicBinaryPartword, mask incr in loopMBB only if atomic.swap is the instruction being expanded, instead of masking it in thisMBB. - Remove redundant Or in EmitAtomicCmpSwap. llvm-svn: 135495	2011-07-19 18:14:26 +00:00
Richard Osborne	b469141419	Add intrinsics for the zext / sext instructions. llvm-svn: 135476	2011-07-19 13:28:50 +00:00
Richard Osborne	50303e0d38	Add intrinsics for the testct, testwct instructions. llvm-svn: 135475	2011-07-19 13:00:40 +00:00
Richard Osborne	409c0d7768	Add intrinsics for the peek and endin instructions. llvm-svn: 135474	2011-07-19 12:50:25 +00:00
Evan Cheng	bfc0cac54d	Introduce MCCodeGenInfo, which keeps information that can affect codegen (including compilation, assembly). Move relocation model Reloc::Model from TargetMachine to MCCodeGenInfo so it's accessible even without TargetMachine. llvm-svn: 135468	2011-07-19 06:37:02 +00:00
Devang Patel	72886ba8d8	Revert r135423. llvm-svn: 135454	2011-07-19 00:28:24 +00:00
Eli Friedman	887bb0b25a	FileCheck-ize a couple tests. llvm-svn: 135427	2011-07-18 21:23:42 +00:00
Devang Patel	389cb9d8c6	During bottom up fast-isel, instructions emitted to materialize registers are at the top of the basic block and do not have a debug location. This may misguide the debugger while entering the basic block, and sometimes the debugger provides a semi-useful view of the current location to the developer by picking up the previous known location as the current location. Assign a sensible location to the first instruction in a basic block, if it does not have a location derived from the source file, so that the debugger can provide a meaningful user experience to developers in edge cases. [take 2] llvm-svn: 135423	2011-07-18 20:55:23 +00:00
Akira Hatanaka	52263f51f1	Do not treat atomic.load.sub differently than other atomic binary intrinsics. llvm-svn: 135418	2011-07-18 19:58:59 +00:00
Akira Hatanaka	79f38f0ae7	Set mayLoad or mayStore flags for SC and LL in order to prevent LICM from moving them out of the loop. Previously, stores and loads to a stack frame object were inserted to accomplish this. Remove the code that was needed to do this. Patch by Sasa Stankovic. llvm-svn: 135415	2011-07-18 18:52:12 +00:00
Jakob Stoklund Olesen	89e84069d2	Fix a crash when building 177.mesa for armv6. When splitting a live range immediately before an LDR_POST instruction that redefines the address register, make sure to use the correct value number in leaveIntvBefore. We need the value number entering the instruction. <rdar://problem/9793765> llvm-svn: 135413	2011-07-18 18:47:13 +00:00
Bruno Cardoso Lopes	da90f383ab	Add AVX 128-bit sqrt versions llvm-svn: 135404	2011-07-18 17:51:40 +00:00
Nick Lewycky	47f28ebead	Delete empty unused file. llvm-svn: 135379	2011-07-18 05:54:06 +00:00
Bruno Cardoso Lopes	d258749f73	Add AVX 128-bit patterns for sint_to_fp llvm-svn: 135332	2011-07-16 00:50:20 +00:00
Bruno Cardoso Lopes	d5b62f3403	Fix a couple of things: 1) Make non-legal 256-bit loads be promoted to v4i64. This lets us canonicalize the loads and handle things the same way we handle 128-bit registers. Despite what one of the removed comments explained, the load promotion would not mess with VPERM, it's only a matter of doing the appropriate bitcasts when these instructions come to be introduced. Also make LOAD v8i32 legal. 2) Doing 1) exposed two bugs: - v4i64 was being promoted to itself for several opcodes (introduced in r124447 by David Greene) causing endless recursion and the stack to explode. - there was no support for allOnes BUILD_VECTORs and ANDNP would fail to match because it was generating early target constant pools during lowering. 3) The testcases are already checked-in, doing 1) exposed the bugs in the current testcases. 4) Tidy up code to be more clear and explicit about AVX. llvm-svn: 135313	2011-07-15 22:24:33 +00:00
Owen Anderson	7a380bac06	Remove VMOVDneon and VMOVQ, which are just aliases for VORR. This continues to simplify the path towards an auto-generated disassembler. llvm-svn: 135290	2011-07-15 18:46:47 +00:00
Eric Christopher	ca7ae418a5	Check register class matching instead of width of type matching when determining validity of matching constraint. Allow i1 types access to the GR8 reg class for x86. Fixes PR10352 and rdar://9777108 llvm-svn: 135180	2011-07-14 20:13:52 +00:00
Bruno Cardoso Lopes	d24f039847	Add 256-bit load/store recognition and matching in several places. llvm-svn: 135171	2011-07-14 18:50:58 +00:00
Eric Christopher	be21240f6f	Add a testcase for r135123. Part of rdar://9761830 llvm-svn: 135133	2011-07-14 06:23:09 +00:00
Benjamin Kramer	1cab6179ab	Don't emit a bit test if there is only one case the test can yield false. A simple SETNE is sufficient. llvm-svn: 135126	2011-07-14 01:38:42 +00:00
Bruno Cardoso Lopes	f29783ee55	We already support 256-bit packed ADD, SUB, DIV, MUL. Add testcases. llvm-svn: 135099	2011-07-13 22:28:55 +00:00
Bruno Cardoso Lopes	c0401dddf7	Make X86ISD::ANDNP more general and Codegen 256-bit VANDNP. A more general version of X86ISD::ANDNP also opened the room for a little bit of refactoring. llvm-svn: 135088	2011-07-13 21:36:51 +00:00
Eli Friedman	30d557cc28	Make sure we don't combine a large displacement and a frame index in the same addressing mode on x86-64. It can overflow, leading to a crash/miscompile. <rdar://problem/9763308> llvm-svn: 135084	2011-07-13 21:29:53 +00:00
Bruno Cardoso Lopes	cb49278ad6	AVX Codegen support for 256-bit versions of vandps, vandpd, vorps, vorpd, vxorps, vxorpd llvm-svn: 135023	2011-07-13 01:15:33 +00:00
Evan Cheng	37ff73dfaf	Improve codegen for select's: if (x != 0) x = 1 if (x == 1) x = 1 Previous codegen looks like this: mov r1, r0 cmp r1, #1 mov r0, #0 moveq r0, #1 The naive lowering selects between two different values. It should recognize that the test is an equality test, so it is more a conditional move than a select: cmp r0, #1 movne r0, #0 rdar://9758317 llvm-svn: 135017	2011-07-13 00:42:17 +00:00
Jim Grosbach	863f0216d5	Improve test cases from r134746. Use memory barriers to force if-conversion off for these tests instead of the internal llc command line option ifcvt-limit. llvm-svn: 134986	2011-07-12 16:06:01 +00:00
Andrew Trick	a53688c65c	Comment correction. llvm-svn: 134958	2011-07-12 03:39:22 +00:00
Jim Grosbach	93f2ebb5e7	Simplify printing of ARM shifted immediates. Print shifted immediate values directly rather than as a payload+shifter value pair. This makes for more readable output assembly code, simplifies the instruction printer, and is consistent with how Thumb immediates are displayed. llvm-svn: 134902	2011-07-11 16:48:36 +00:00
NAKAMURA Takumi	183ec41f4a	test/CodeGen/PowerPC/vector.ll: Tweak redirection >%t >%t to >%t >>%t. See also r134814 (test/CodeGen/X86/vector.ll). llvm-svn: 134900	2011-07-11 16:21:52 +00:00
Cameron Zwarich	1efde78890	Add a missing test for r134882. llvm-svn: 134889	2011-07-11 08:35:17 +00:00
Chris Lattner	a106725fc5	Land the long talked about "type system rewrite" patch. This patch brings numerous advantages to LLVM. One way to look at it is through diffstat: 109 files changed, 3005 insertions(+), 5906 deletions(-) Removing almost 3K lines of code is a good thing. Other advantages include: 1. Value::getType() is a simple load that can be CSE'd, not a mutating union-find operation. 2. Types are uniqued and never move once created, defining away PATypeHolder. 3. Structs can be "named" now, and their name is part of the identity that uniques them. This means that the compiler doesn't merge them structurally which makes the IR much less confusing. 4. Now that there is no way to get a cycle in a type graph without a named struct type, "upreferences" go away. 5. Type refinement is completely gone, which should make LTO much MUCH faster in some common cases with C++ code. 6. Types are now generally immutable, so we can use "Type *" instead of "const Type *" everywhere. Downsides of this patch are that it removes some functions from the C API, so people using those will have to upgrade to (not yet added) new API. "LLVM 3.0" is the right time to do this. There are still some cleanups pending after this, this patch is large enough as-is. llvm-svn: 134829	2011-07-09 17:41:24 +00:00
Chris Lattner	4ddffa2acc	more tests not making the jump into the brave new world. llvm-svn: 134820	2011-07-09 16:57:10 +00:00
NAKAMURA Takumi	2cbabf301a	test/CodeGen/X86/vector.ll: Tweak temporary output to appease Win32 hosts. With Lit (not bash) in a test, multiple redirects >%t might open(%t, "w") multiple. It can be avoided if latter redirect is >>%t. It might work even if ">/dev/null" were used. llvm-svn: 134814	2011-07-09 10:22:28 +00:00
Jakob Stoklund Olesen	fe41eb3bda	Hoist spills within a basic block. Try to move spills as early as possible in their basic block. This can help eliminate interferences by shortening the live range being spilled. This fixes PR10221. llvm-svn: 134776	2011-07-09 00:25:03 +00:00
Evan Cheng	9719ca7c76	Fix broken x86_64 tests which specify non-64-bit cpu's. llvm-svn: 134756	2011-07-08 22:29:33 +00:00
Eli Friedman	0ea2c325a9	Default 64-bit target features and SSE2 on when a triple specifies x86-64. Clean up all the other hacks which are now unnecessary. llvm-svn: 134753	2011-07-08 22:16:47 +00:00
Jim Grosbach	2b8103505a	Make tBX_RET and tBX_RET_vararg predicable. The normal tBX instruction is predicable, so there's no reason the pseudos for using it as a return shouldn't be. Gives us some nice code-gen improvements as can be seen by the test changes. In particular, several tests now have to disable if-conversion because it works too well and defeats the test. llvm-svn: 134746	2011-07-08 21:50:04 +00:00
Julien Lerouge	75e462e164	Add _allrem, _aullrem and _allmul to the runtime for MSVC. http://llvm.org/bugs/show_bug.cgi?id=10305 llvm-svn: 134744	2011-07-08 21:40:25 +00:00
Cameron Zwarich	c23366d357	Add an intrinsic and codegen support for fused multiply-accumulate. The intent is to use this for architectures that have a native FMA instruction. llvm-svn: 134742	2011-07-08 21:39:21 +00:00
Jakob Stoklund Olesen	acaf9e9ce1	Be more aggressive about following hints. RAGreedy::tryAssign will now evict interference from the preferred register even when another register is free. To support this, add the EvictionCost struct that counts how many hints are broken by an eviction. We don't want to break one hint just to satisfy another. Rename canEvict to shouldEvict, and add the first bit of eviction policy that doesn't depend on spill weights: Always make room in the preferred register as long as the evictees can be split and aren't already assigned to their preferred register. Also make the CSR avoidance more accurate. When looking for a cheaper register it is OK to use a new volatile register. Only CSR aliases that have never been used before should be avoided. llvm-svn: 134735	2011-07-08 20:46:18 +00:00
Jim Grosbach	435ca7304c	Use ARMPseudoExpand for ARM tail calls. llvm-svn: 134719	2011-07-08 18:50:22 +00:00
Benjamin Kramer	44c76d239a	Emit a more efficient magic number multiplication for exact sdivs. We have to do this in DAGBuilder instead of DAGCombiner, because the exact bit is lost after building. struct foo { char x[24]; }; long bar(struct foo *a, struct foo *b) { return a-b; } is now compiled into movl 4(%esp), %eax subl 8(%esp), %eax sarl $3, %eax imull $-1431655765, %eax, %eax instead of movl 4(%esp), %eax subl 8(%esp), %eax movl $715827883, %ecx imull %ecx movl %edx, %eax shrl $31, %eax sarl $2, %edx addl %eax, %edx movl %edx, %eax llvm-svn: 134695	2011-07-08 10:31:30 +00:00
Jakob Stoklund Olesen	99c67603c7	Fix more register allocation sensitive tests. llvm-svn: 134667	2011-07-08 00:24:06 +00:00
Jakob Stoklund Olesen	47bc41b3c3	Remove a test that no longer makes sense. It was testing a linear scan feature: Test if linearscan is unfavoring registers for allocation to allow more reuse of reloads from stack slots. The greedy register allocator doesn't access any stack slots in this function, so the linear scan feature was not being tested. llvm-svn: 134666	2011-07-08 00:24:03 +00:00
Nick Lewycky	a82f7a687e	Let the inline asm 'q' constraint match float, and on 64-bit double too. Fixes PR9602! llvm-svn: 134665	2011-07-08 00:19:27 +00:00
Eric Christopher	5fb023bb10	Go ahead and emit the barrier on x86-64 even without sse2. The processor supports it just fine. Fixes PR9675 and rdar://9740801 llvm-svn: 134664	2011-07-08 00:04:56 +00:00
Eric Christopher	b7597bc669	Add support for the X86 'l' constraint. Fixes PR10149 and rdar://9738585 llvm-svn: 134648	2011-07-07 22:29:07 +00:00
Evan Cheng	bbed81df25	Add Mode64Bit feature and sink it down to MC layer. llvm-svn: 134641	2011-07-07 21:06:52 +00:00
Evan Cheng	952943f744	Change some ARM subtarget features to be single bit yes/no in order to sink them down to MC layer. Also fix tests. llvm-svn: 134590	2011-07-07 03:55:05 +00:00
Lang Hames	2c2f6ed1f7	Added a testcase for PR10220. llvm-svn: 134573	2011-07-07 00:36:02 +00:00
Jakub Staszak	28bcc8673e	Introduce "expect" intrinsic instructions. llvm-svn: 134516	2011-07-06 18:22:43 +00:00
Dan Gohman	151e8ce446	Revert r134366 and add an explicit triple to make this test host-independent. llvm-svn: 134447	2011-07-05 22:09:19 +00:00
Jakob Stoklund Olesen	f95a1068bd	Fix PR10277. Remat during spilling triggers dead code elimination. If a phi-def becomes unused, that may also cause live ranges to split into separate connected components. This type of splitting is different from normal live range splitting. In particular, there may not be a common original interval. When the split range is its own original, make sure that the new siblings are also their own originals. The range being split cannot be used as an original since it doesn't cover the new siblings. llvm-svn: 134413	2011-07-05 15:38:41 +00:00
NAKAMURA Takumi	c0837d703b	test/CodeGen/X86/lsr-nonaffine.ll: Relax expressions for Win64 CC to appease Win32 hosts. llvm-svn: 134366	2011-07-03 09:26:14 +00:00
Chandler Carruth	e07bb36a9e	FileCheck-ize another test. Reduces the llc invocations from 8 to 1, and makes one of the tests actually mean something (as the string 'add' will always appear in the output of this file). llvm-svn: 134358	2011-07-02 21:34:52 +00:00
Chandler Carruth	78b12b3ed4	FileCheck-ize another X86 test, making it more precisely verify the desired result based on the comments in the file. llvm-svn: 134354	2011-07-02 20:43:16 +00:00
Chandler Carruth	1926e141f1	FileCheck-ize and simplify RUN lines. llvm-svn: 134352	2011-07-02 20:43:11 +00:00
Chandler Carruth	5de1d825e4	FileCheck-ize llvm-svn: 134351	2011-07-02 20:43:08 +00:00
Chandler Carruth	01e8f9314e	FileCheck-ize and tighten up assertions to only check the relevant sections. llvm-svn: 134350	2011-07-02 20:43:04 +00:00
Chandler Carruth	500b05b1bb	FileCheck-ize and cleanup IR. llvm-svn: 134349	2011-07-02 20:43:01 +00:00
Chandler Carruth	c674fb38ef	FileCheck-ize llvm-svn: 134348	2011-07-02 20:42:59 +00:00
Chandler Carruth	341ed5f0a0	Remove a grep that is already checked with FileCheck. llvm-svn: 134346	2011-07-02 20:42:56 +00:00
Chandler Carruth	88e183829b	FileCheck-ize llvm-svn: 134345	2011-07-02 20:42:53 +00:00
Chandler Carruth	7a0f51e003	FileCheck-ize and modernize IR. llvm-svn: 134344	2011-07-02 20:42:50 +00:00
Chandler Carruth	4af34fe339	FileCheck-ize and simplify RUNs. llvm-svn: 134343	2011-07-02 20:42:48 +00:00
Chandler Carruth	9e114fc3ee	FileCheck-ize and modernize the RUN line. llvm-svn: 134342	2011-07-02 20:42:44 +00:00
Chandler Carruth	df1690a113	FileCheck-ize, tightening checks and avoiding a temporary file. llvm-svn: 134341	2011-07-02 20:42:42 +00:00
Chandler Carruth	a5b1de166b	FileCheck-ize, tightening checks and avoiding a temporary file. llvm-svn: 134340	2011-07-02 20:42:39 +00:00
Chandler Carruth	c041ee0766	FileCheck-ize llvm-svn: 134339	2011-07-02 20:42:36 +00:00
Chandler Carruth	4f82b948fd	FileCheck-ize llvm-svn: 134338	2011-07-02 20:42:33 +00:00
Chandler Carruth	e344d9c676	FileCheck-ize a test, avoiding a temporary file. llvm-svn: 134337	2011-07-02 20:42:31 +00:00
Chandler Carruth	d939fba46d	FileCheck-ize and simplify this test. llvm-svn: 134336	2011-07-02 20:42:28 +00:00
Chandler Carruth	b870175dd5	FileCheck-ize llvm-svn: 134335	2011-07-02 20:42:25 +00:00
Chandler Carruth	d98a57cc5a	FileCheck-ize another codegen test. llvm-svn: 134334	2011-07-02 20:42:22 +00:00
Chandler Carruth	4c7e28777b	Partially FileCheck-ize a test to remove a weird quoting situation. llvm-svn: 134333	2011-07-02 20:42:20 +00:00
Chandler Carruth	0d1da937eb	FileCheck-ize another test, and upgrade its syntax a bit. llvm-svn: 134332	2011-07-02 20:42:17 +00:00
Chandler Carruth	4fd8502d12	FileCheck-ize another codegen test, tightening it up. llvm-svn: 134331	2011-07-02 20:42:14 +00:00
Chandler Carruth	b74aff3ce8	FileCheck-ize another test, making it much more precise for testing the individual cases, while hard coding less about registers in use. llvm-svn: 134330	2011-07-02 20:42:11 +00:00
Chandler Carruth	70fa55f478	FileCheck-ize another test. This one is more clear and runs fewer commands as a result. llvm-svn: 134329	2011-07-02 20:42:08 +00:00
Chandler Carruth	72358a4bf8	FileCheck-ize a test, no functionality changed. llvm-svn: 134328	2011-07-02 20:42:06 +00:00
Jakob Stoklund Olesen	b94d989634	Better diagnostics when inline asm fails to allocate. asm.c:2:7: error: ran out of registers during register allocation asm(""::"r"(0), "r"(1), "r"(2), "r"(3), "r"(4), "r"(5), "r"(6), "r"(7), "r"(8), "r"(9)); ^ llvm-svn: 134310	2011-07-02 07:17:37 +00:00
Eric Christopher	9689f96b1e	Be less specific about register allocation ordering. llvm-svn: 134308	2011-07-02 04:06:41 +00:00
Eric Christopher	7260817287	TargetConstant immediates won't be placed into registers so tighten up the valid constant check earlier. rdar://9692967 llvm-svn: 134286	2011-07-01 23:04:38 +00:00
Dan Gohman	c093f48834	Teach IVUsers to stop at non-affine expressions unless they are both outside the loop and reducible. This more completely hides them from LSR, which isn't usually able to do anything meaningful with non-affine expressions anyway, and this consequently hides them from SCEVExpander, which is acutely unprepared for non-affine expressions. Replace test/CodeGen/X86/lsr-nonaffine.ll with a new test that tests the new behavior. This works around the bug in PR10117 / rdar://problem/9633149, and is generally an improvement besides. llvm-svn: 134268	2011-07-01 22:05:19 +00:00
Jim Grosbach	461adc233e	ARMv7M vs. ARMv7E-M support. The DSP instructions in the Thumb2 instruction set are an optional extension in the Cortex-M* architecture. When present, the implementation is considered an "ARMv7E-M implementation," and when not, an "ARMv7-M implementation." Add a subtarget feature hook for the v7e-m instructions and hook it up. The cortex-m3 cpu is an example of a v7m implementation, while the cortex-m4 is a v7e-m implementation. rdar://9572992 llvm-svn: 134261	2011-07-01 21:12:19 +00:00
Eric Christopher	d369a9fe83	Add support for the 'j' immediate constraint. This is conditionalized on supporting the instruction that the constraint is for 'movw'. Part of rdar://9119939 llvm-svn: 134222	2011-07-01 01:00:07 +00:00
Eric Christopher	4bc6b7e1a6	Add support for the ARM 't' register constraint. And another testcase for the 'x' register constraint. Part of rdar://9119939 llvm-svn: 134220	2011-07-01 00:30:46 +00:00
Eric Christopher	d40f06b48f	Add support for the 'x' constraint. Part of rdar://9307836 and rdar://9119939 llvm-svn: 134215	2011-07-01 00:14:47 +00:00
Jakob Stoklund Olesen	8b22811785	Fix a problem with fast-isel return values introduced in r134018. We would put the return value from long double functions in the wrong register. This fixes gcc.c-torture/execute/conversion.c llvm-svn: 134205	2011-06-30 23:42:18 +00:00
Eric Christopher	2582061ec1	Add support for the 'h' constraint. Part of rdar://9119939 llvm-svn: 134203	2011-06-30 23:23:01 +00:00
Jim Grosbach	32d3b2625b	Thumb1 register to register MOV instruction is predicable. Fix a FIXME and allow predication (in Thumb2) for the T1 register to register MOV instructions. This allows some better codegen with if-conversion (as seen in the test updates), plus it lays the groundwork for pseudo-izing the tMOVCC instructions. llvm-svn: 134197	2011-06-30 22:10:46 +00:00
Jim Grosbach	8c1fb3c4e1	Pseudo-ize the t2LDMIA_RET instruction. It's just a t2LDMIA_UPD instruction with extra codegen properties, so it doesn't need the encoding information. As a side-benefit, we now correctly recognize for instruction printing as a 'pop' instruction. llvm-svn: 134173	2011-06-30 18:25:42 +00:00
Eric Christopher	7ce905754f	Fix a small thinko for constant i64 lock/orq optimization where we didn't have an opcode for 64-bit constant or expressions. Fixes rdar://9692967 llvm-svn: 134121	2011-06-30 00:48:30 +00:00
Devang Patel	66c4bc1dda	Revert r133953 for now. llvm-svn: 134116	2011-06-29 23:50:13 +00:00
Cameron Zwarich	2ffbcf9b96	In the ARM global merging pass, allow extraneous alignment specifiers. This pass already makes the assumption, which is correct on ARM, that a type's alignment is less than its alloc size. This improves codegen with Clang (which inserts a lot of extraneous alignment specifiers) and fixes <rdar://problem/9695089>. llvm-svn: 134106	2011-06-29 22:24:25 +00:00
Benjamin Kramer	d97872524b	Don't depend on the optimization reverted in r134067. llvm-svn: 134068	2011-06-29 14:07:18 +00:00
Benjamin Kramer	cc91642a94	Revert a part of r126557 which could create unschedulable DAGs. llvm-svn: 134067	2011-06-29 13:47:25 +00:00
Jakob Stoklund Olesen	7d3e1553d2	Clean up the handling of the x87 fp stack to make it more robust. Drop the FpMov instructions, use plain COPY instead. Drop the FpSET/GET instruction for accessing fixed stack positions. Instead use normal COPY to/from ST registers around inline assembly, and provide a single new FpPOP_RETVAL instruction that can access the return value(s) from a call. This is still necessary since you cannot tell from the CALL instruction alone if it returns anything on the FP stack. Teach fast isel to use this. This provides a much more robust way of handling fixed stack registers - we can tolerate arbitrary FP stack instructions inserted around calls and inline assembly. Live range splitting could sometimes break x87 code by inserting spill code in unfortunate places. As a bonus we handle floating point inline assembly correctly now. llvm-svn: 134018	2011-06-28 18:32:28 +00:00
Roman Divacky	736e37d9b9	Implement ISD::VAARG lowering on PPC32. llvm-svn: 134005	2011-06-28 15:30:42 +00:00
Jakob Stoklund Olesen	55a0ce1776	FileCheckize a couple of tests. Also add a test for popping dead return values and avoid testing the spill precision. llvm-svn: 133997	2011-06-28 06:25:03 +00:00
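
Commit 44c76d239a (r134695) in the list above replaces an exact signed division by 24 with `sarl $3` followed by `imull $-1431655765`. The arithmetic behind that pair can be sketched as follows; this is only an illustration of the idea, not LLVM's code. The helper names (`to_signed32`, `inverse_mod_2_32`, `exact_sdiv`) are hypothetical, and the sketch assumes 32-bit values, a positive divisor, and a dividend that the divisor divides exactly.

```python
MASK32 = 0xFFFFFFFF  # work modulo 2**32, like a 32-bit register


def to_signed32(x):
    """Reinterpret the low 32 bits of x as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def inverse_mod_2_32(odd):
    """Multiplicative inverse of an odd number modulo 2**32.

    Newton iteration: the starting guess is correct to 3 bits
    (odd * odd == 1 mod 8) and each step doubles the correct bits.
    """
    assert odd & 1, "only odd numbers are invertible modulo a power of two"
    inv = odd
    for _ in range(5):
        inv = (inv * (2 - odd * inv)) & MASK32
    return inv


def exact_sdiv(n, d):
    """Compute n / d, assuming d > 0 and d divides n exactly (hypothetical helper)."""
    shift = (d & -d).bit_length() - 1  # trailing zero bits of d (3 for d = 24)
    odd = d >> shift                   # remaining odd factor (3 for d = 24)
    # Arithmetic shift removes the power of two; multiplying by the inverse
    # of the odd factor removes the rest, with no high-half multiply needed.
    return to_signed32((n >> shift) * inverse_mod_2_32(odd))


if __name__ == "__main__":
    print(to_signed32(inverse_mod_2_32(3)))  # the multiplier for d = 24
    print(exact_sdiv(48, 24), exact_sdiv(-72, 24))
```

For d = 24 the shift is 3 and the odd factor is 3, whose inverse modulo 2**32 is 0xAAAAAAAB, i.e. -1431655765 as a signed 32-bit value, matching the immediate in the commit message. The trick is valid only when the division is known to be exact, which is why the commit depends on the exact bit.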


5122 Commits