llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 21:42:54 +02:00

Author	SHA1	Message	Date
Joel Jones	3d5ae56be4	This change handles a another case for generating the bic instruction when a compile time constant is known. This occurs when implicitly zero extending function arguments from 16 bits to 32 bits. The 8 bit case doesn't need to be handled, as the 8 bit constants are encoded directly, thereby not needing a separate load instruction to form the constant into a register. <rdar://problem/11481151> llvm-svn: 158659	2012-06-18 14:51:32 +00:00
Manman Ren	7ffcd63dea	ARM: optimization for sub+abs. This patch will optimize abs(x-y) FROM sub, movs, rsbmi TO subs, rsbmi For abs, we will use cmp instead of movs. This is necessary because we already have an existing peephole pass which optimizes away cmp following sub. rdar: 11633193 llvm-svn: 158551	2012-06-15 21:32:12 +00:00
Jakob Stoklund Olesen	6fd22231ba	Preserve <undef> flags in ARMExpandPseudo. This probably mostly shows up in bugpoint-generated code. llvm-svn: 158527	2012-06-15 17:46:54 +00:00
Manman Ren	797e146fae	Revert: test/CodeGen/ARM/iabs.ll in r158441 Sorry that I accidently checked in this file with my previous commit. llvm-svn: 158442	2012-06-14 06:04:02 +00:00
Manman Ren	e3471c0bdf	InstCombine: fix a bug when combining (fcmp cc0 x, y) && (fcmp cc1 x, y). uno && ueq was converted to ueq, it should be converted to uno. llvm-svn: 158441	2012-06-14 05:57:42 +00:00
Andrew Trick	1115f0067a	sched: fix latency of memory dependence chain edges for consistency. For store->load dependencies that may alias, we should always use TrueMemOrderLatency, which may eventually become a subtarget hook. In effect, we should guarantee at least TrueMemOrderLatency on at least one DAG path from a store to a may-alias load. This should fix the standard mode as well as -enable-aa-sched-mi". llvm-svn: 158380	2012-06-13 02:39:03 +00:00
Chad Rosier	9734752b2c	[arm-fast-isel] Add support for -arm-long-calls. Patch by Jush Lu <jush.msn@gmail.com>. llvm-svn: 158368	2012-06-12 19:25:13 +00:00
Jakob Stoklund Olesen	b6d5f36f3d	Fix test that depends on register allocation. The test is really checking the prolog/epilog load/store multiple formation. llvm-svn: 158328	2012-06-11 21:14:28 +00:00
Bill Wendling	2320ff76d7	Re-enable the CMN instruction. We turned off the CMN instruction because it had semantics which we weren't getting correct. If we are comparing with an immediate, then it's okay to use the CMN instruction. <rdar://problem/7569620> llvm-svn: 158302	2012-06-11 08:07:26 +00:00
Jakob Stoklund Olesen	ce0f9aef12	Don't run RAFast in the optimizing regalloc pipeline. The fast register allocator is not supposed to work in the optimizing pipeline. It doesn't make sense to compute live intervals, run full copy coalescing, and then run RAFast. Fast register allocation in the optimizing pipeline is better done by RABasic. llvm-svn: 158242	2012-06-08 23:15:12 +00:00
Joel Jones	6fb688ff18	Revert commit r157966 llvm-svn: 157972	2012-06-05 00:47:21 +00:00
Joel Jones	fdd5284d04	This change handles a another case for generating the bic instruction when a compile time constant is known. This occurs when implicitly zero extending function arguments from 16 bits to 32 bits. <rdar://problem/11481151> llvm-svn: 157966	2012-06-04 23:38:57 +00:00
Nadav Rotem	6969a673e9	Remove the "-promote-elements" flag. This flag is now enabled by default. llvm-svn: 157925	2012-06-04 11:27:21 +00:00
Manman Ren	f8ff637b8d	ARM: add testing case for struct byval rdar://9877866 llvm-svn: 157876	2012-06-02 05:37:44 +00:00
Owen Anderson	060176473a	Make this testcase independent of register allocation. llvm-svn: 157761	2012-05-31 18:07:02 +00:00
Owen Anderson	38e510d831	Switch the canonical FMA term operand order to match both the comment I wrote and the usual LLVM convention. llvm-svn: 157708	2012-05-30 18:54:50 +00:00
Owen Anderson	49ddb93c30	Teach DAGCombine to canonicalize the position of a constant in the term operands of an FMA node. llvm-svn: 157707	2012-05-30 18:50:39 +00:00
Chad Rosier	80159b26a5	[arm-fast-isel] Add support for the llvm.frameaddress() intrinsic. Patch by Jush Lu <jush.msn@gmail.com>. llvm-svn: 157696	2012-05-30 17:23:22 +00:00
Evan Cheng	a46443d6db	Teach taildup to update livein set. rdar://11538365 llvm-svn: 157663	2012-05-30 00:42:39 +00:00
Chris Lattner	3ae033bbe9	These tests used intrinsics with the wrong prototype. They weren't caught because the old verifier just checked that something "was a pointer", but not that the pointee was correct. llvm-svn: 157544	2012-05-27 19:35:41 +00:00
Chad Rosier	d9dca3aef3	[arm-fast-isel] Add support for non-global callee. Patch by Jush Lu <jush.msn@gmail.com>. llvm-svn: 157336	2012-05-23 18:38:57 +00:00
Nuno Lopes	944814b41a	revert my previous patches that introduced an additional parameter to the objectsize intrinsic. After a lot of discussion, we realized it's not the best option for run-time bounds checking llvm-svn: 157255	2012-05-22 15:25:31 +00:00
Jakob Stoklund Olesen	3bee83c3a5	Transfer memory operands to the right instruction. They need to go on the PICLDR as the verifier points out. llvm-svn: 157151	2012-05-20 06:38:42 +00:00
Jim Grosbach	343a996ca5	Refactor data-in-code annotations. Use a dedicated MachO load command to annotate data-in-code regions. This is the same format the linker produces for final executable images, allowing consistency of representation and use of introspection tools for both object and executable files. Data-in-code regions are annotated via ".data_region"/".end_data_region" directive pairs, with an optional region type. data_region_directive := ".data_region" { region_type } region_type := "jt8" \| "jt16" \| "jt32" \| "jta32" end_data_region_directive := ".end_data_region" The previous handling of ARM-style "$d.*" labels was broken and has been removed. Specifically, it didn't handle ARM vs. Thumb mode when marking the end of the section. rdar://11459456 llvm-svn: 157062	2012-05-18 19:12:01 +00:00
Tim Northover	fc7737e7dc	Remove incorrect pattern for ARM SMML instruction. Patch by Meador Inge. llvm-svn: 156989	2012-05-17 13:12:13 +00:00
Jakob Stoklund Olesen	1da786c936	Enable sub-sub-register copy coalescing. It is now possible to coalesce weird skewed sub-register copies by picking a super-register class larger than both original registers. The included test case produces code like this: vld2.32 {d16, d17, d18, d19}, [r0]! vst2.32 {d18, d19, d20, d21}, [r0] We still perform interference checking as if it were a normal full copy join, so this is still quite conservative. In particular, the f1 and f2 functions in the included test case still have remaining copies because of false interference. llvm-svn: 156878	2012-05-15 23:31:35 +00:00
Chad Rosier	4a65a2a197	[fast-isel] Add support for selecting @llvm.trap(). llvm-svn: 156646	2012-05-11 21:33:49 +00:00
Chad Rosier	72bd34ca71	[fast-isel] Remove -disable-arm-fast-isel option. -fast-isel=0 suffices. Minor cleanup. llvm-svn: 156632	2012-05-11 19:40:25 +00:00
Chad Rosier	c20de37076	[fast-isel] Cleaner fix for when we're unable to handle a non-double multi-reg retval. Hoists check before emitting the call to avoid unnecessary work. rdar://11430407 PR12796 llvm-svn: 156628	2012-05-11 18:51:55 +00:00
Manman Ren	c82d0e71b9	ARM: peephole optimization to remove cmp instruction This patch will optimize the following cases: sub r1, r3 \| sub r1, imm cmp r3, r1 or cmp r1, r3 \| cmp r1, imm bge L1 TO subs r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction. rdar: 10734411 llvm-svn: 156599	2012-05-11 01:30:47 +00:00
Manman Ren	5abcae1320	Revert: 156550 "ARM: peephole optimization to remove cmp instruction" This commit broke an external linux bot and gave a compile-time warning. llvm-svn: 156556	2012-05-10 18:49:43 +00:00
Manman Ren	727b7d5e4c	ARM: peephole optimization to remove cmp instruction This patch will optimize the following cases: sub r1, r3 \| sub r1, imm cmp r3, r1 or cmp r1, r3 \| cmp r1, imm bge L1 TO subs r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction. rdar: 10734411 llvm-svn: 156550	2012-05-10 16:48:21 +00:00
Nuno Lopes	e8880a9916	change the objectsize intrinsic signature: add a 3rd parameter to denote the maximum runtime performance penalty that the user is willing to accept. This commit only adds the parameter. Code taking advantage of it will follow. llvm-svn: 156473	2012-05-09 15:52:43 +00:00
Owen Anderson	8adb0322ce	Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled. llvm-svn: 156324	2012-05-07 20:51:25 +00:00
Owen Anderson	e3d41b44cc	Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs. llvm-svn: 156029	2012-05-02 22:17:40 +00:00
Owen Anderson	3cfe269707	Teach DAG combine that multiplication by 1.0 can always be constant folded. llvm-svn: 156023	2012-05-02 21:32:35 +00:00
Bob Wilson	d2f6ff588b	Don't introduce illegal types when creating vmull operations. <rdar://11324364> ARM BUILD_VECTORs created after type legalization cannot use i8 or i16 operands, since those types are not legal. Instead use i32 operands, which will be implicitly truncated by the BUILD_VECTOR to match the element type. llvm-svn: 155824	2012-04-30 16:53:34 +00:00
Lang Hames	7d83af4ed0	Fix the order of the operands in the llvm.fma intrinsic patterns for ARM, <rdar://problem/11325085>. llvm-svn: 155724	2012-04-27 18:51:24 +00:00
Evan Cheng	f35523d08a	Implement a bastardized ABI. llvm-svn: 155686	2012-04-27 02:11:10 +00:00
Tim Northover	876c151146	Use VLD1 in NEON extenting-load patterns instead of VLDR. On some cores it's a bad idea for performance to mix VFP and NEON instructions and since these patterns are NEON anyway, the NEON load should be used. llvm-svn: 155630	2012-04-26 08:46:29 +00:00
Evan Cheng	7f9bf43bcf	MachineBasicBlock::SplitCriticalEdge() should follow LLVM IR variant and refuse to break edge to EH landing pad. rdar://11300144 llvm-svn: 155470	2012-04-24 19:06:55 +00:00
James Molloy	44927f5296	Fix bad EXTRACT_SUBREG in instruction selection for extending-loads on NEON. llvm-svn: 154915	2012-04-17 08:18:00 +00:00
Jakob Stoklund Olesen	8bea8a2373	FileCheckize these tests. Add an extra test to ldr_post with an immediate increment. llvm-svn: 154859	2012-04-16 20:56:42 +00:00
Jakob Stoklund Olesen	8d6641a2b2	Disable code placement for this test. It makes it less sensitive to small changes in heuristics. llvm-svn: 154857	2012-04-16 20:49:06 +00:00
Chandler Carruth	728acc9bd9	Flip the new block-placement pass to be on by default. This is mostly to test the waters. I'd like to get results from FNT build bots and other bots running on non-x86 platforms. This feature has been pretty heavily tested over the last few months by me, and it fixes several of the execution time regressions caused by the inlining work by preventing inlining decisions from radically impacting block layout. I've seen very large improvements in yacr2 and ackermann benchmarks, along with the expected noise across all of the benchmark suite whenever code layout changes. I've analyzed all of the regressions and fixed them, or found them to be impossible to fix. See my email to llvmdev for more details. I'd like for this to be in 3.1 as it complements the inliner changes, but if any failures are showing up or anyone has concerns, it is just a flag flip and so can be easily turned off. I'm switching it on tonight to try and get at least one run through various folks' performance suites in case SPEC or something else has serious issues with it. I'll watch bots and revert if anything shows up. llvm-svn: 154816	2012-04-16 13:49:17 +00:00
Evan Cheng	3499593c7e	On Darwin targets, only use vfma etc. if the source use fma() intrinsic explicitly. llvm-svn: 154689	2012-04-13 18:59:28 +00:00
Evan Cheng	f138fb4599	Add more fused mul+add/sub patterns. rdar://10139676 llvm-svn: 154484	2012-04-11 06:59:47 +00:00
Evan Cheng	b5291aea18	Match (fneg (fma) to vfnma. rdar://10139676 llvm-svn: 154469	2012-04-11 01:21:25 +00:00
Evan Cheng	eaf8eba8c4	Merge fma.ll into fusedMAC.ll llvm-svn: 154466	2012-04-11 01:03:11 +00:00
Owen Anderson	a8319713a4	Move the constant-folding support for FP_ROUND in SelectionDAG from the one-operand version of getNode() to the two-operand version, since it became a two-operand node at sound point. Zap a testcase that this allows us to completely fold away. llvm-svn: 154447	2012-04-10 22:46:53 +00:00
Evan Cheng	f9617f7f54	Handle llvm.fma.* intrinsics. rdar://10914096 llvm-svn: 154439	2012-04-10 21:40:28 +00:00
Eric Christopher	ec1405e930	To ensure that we have more accurate line information for a block don't elide the branch instruction if it's the only one in the block, otherwise it's ok. PR9796 and rdar://11215207 llvm-svn: 154417	2012-04-10 18:18:10 +00:00
Anton Korobeynikov	0fc5fe0430	Transform div to mul with reciprocal only when fp imm is legal. This fixes PR12516 and uncovers one weird problem in legalize (workarounded) llvm-svn: 154394	2012-04-10 13:22:49 +00:00
Evan Cheng	d9ff163215	Add proper checks. llvm-svn: 154379	2012-04-10 03:15:42 +00:00
Evan Cheng	5825e9dbf5	Fix a long standing tail call optimization bug. When a libcall is emitted legalizer always use the DAG entry node. This is wrong when the libcall is emitted as a tail call since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store) use that as the tail call input chain. PR12419 rdar://9770785 rdar://11195178 llvm-svn: 154370	2012-04-10 01:51:00 +00:00
Chad Rosier	a588421976	When performing a truncating store, it's possible to rearrange the data in-register, such that we can use a single vector store rather then a series of scalar stores. For func_4_8 the generated code vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vmov.u16 r0, d16[3] strb r0, [r2, #3] vmov.u16 r0, d16[2] strb r0, [r2, #2] vmov.u16 r0, d16[1] strb r0, [r2, #1] vmov.u16 r0, d16[0] strb r0, [r2] bx lr becomes vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vuzp.8 d16, d17 vst1.32 {d16[0]}, [r2, :32] bx lr I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll, but I couldn't think of a way to judiciously apply this combine. This ldrh r0, [r0, #4] strh r0, [r1] becomes vldr d16, [r0] vmov.u16 r0, d16[2] vmov.32 d16[0], r0 vuzp.16 d16, d17 vst1.32 {d16[0]}, [r1, :32] PR11158 rdar://10703339 llvm-svn: 154340	2012-04-09 20:32:02 +00:00
Duncan Sands	cd52f3d447	Convert floating point division by a constant into multiplication by the reciprocal if converting to the reciprocal is exact. Do it even if inexact if -ffast-math. This substantially speeds up ac.f90 from the polyhedron benchmarks. llvm-svn: 154265	2012-04-07 20:04:00 +00:00
Jakob Stoklund Olesen	bb7b631def	Allow negative immediates in ARM and Thumb2 compares. ARM and Thumb2 mode can use cmn instructions to compare against negative immediates. Thumb1 mode can't. llvm-svn: 154183	2012-04-06 17:45:04 +00:00
James Molloy	3604b95957	An oversight when applying the patches for r150956 and r150957 to a vanilla tree meant I forgot to svn add these testcases. Noticed while investigating PR12274! llvm-svn: 154090	2012-04-05 10:01:12 +00:00
Jakob Stoklund Olesen	e1ae4f161c	Pass the right sign to TLI->isLegalICmpImmediate. LSR can fold three addressing modes into its ICmpZero node: ICmpZero BaseReg + Offset => ICmp BaseReg, -Offset ICmpZero -1ScaleReg + Offset => ICmp ScaleReg, Offset ICmpZero BaseReg + -1ScaleReg => ICmp BaseReg, ScaleReg The first two cases are only used if TLI->isLegalICmpImmediate() likes the offset. Make sure the right Offset sign is passed to this method in the second case. The ARM version is not symmetric. <rdar://problem/11184260> llvm-svn: 154079	2012-04-05 03:10:56 +00:00
Jakob Stoklund Olesen	0419ed395c	Implement ARMBaseInstrInfo::commuteInstruction() for MOVCCr. A MOVCCr instruction can be commuted by inverting the condition. This can help reduce register pressure and remove unnecessary copies in some cases. <rdar://problem/11182914> llvm-svn: 154033	2012-04-04 18:23:42 +00:00
Jakob Stoklund Olesen	97f47c37b6	Allocate virtual registers in ascending order. This is just the fallback tie-breaker ordering, the main allocation order is still descending size. Patch by Shamil Kurmangaleev! llvm-svn: 153904	2012-04-02 22:30:39 +00:00
Lang Hames	dbc3175c89	During two-address lowering, rescheduling an instruction does not untie operands. Make TryInstructionTransform return false to reflect this. Fixes PR11861. llvm-svn: 153892	2012-04-02 19:58:43 +00:00
Nadav Rotem	2729f54295	This commit contains a few changes that had to go in together. 1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) (and also scalar_to_vector). 2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src). Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B)) 3. Optimize swizzles of shuffles: shuff(shuff(x, y), undef) -> shuff(x, y). 4. Fix an X86ISelLowering optimization which was very bitcast-sensitive. Code which was previously compiled to this: movd (%rsi), %xmm0 movdqa .LCPI0_0(%rip), %xmm2 pshufb %xmm2, %xmm0 movd (%rdi), %xmm1 pshufb %xmm2, %xmm1 pxor %xmm0, %xmm1 pshufb .LCPI0_1(%rip), %xmm1 movd %xmm1, (%rdi) ret Now compiles to this: movl (%rsi), %eax xorl %eax, (%rdi) ret llvm-svn: 153848	2012-04-01 19:31:22 +00:00
Evan Cheng	f3c23907f5	ARM target should allow codegenprep to duplicate ret instructions to enable tailcall opt. rdar://11140249 llvm-svn: 153717	2012-03-30 01:24:39 +00:00
Lang Hames	5685c98688	Change the constant in this testcase so that it results in a constant pool load. llvm-svn: 153704	2012-03-29 23:52:38 +00:00
Evan Cheng	72cf12e416	ARM has a peephole optimization which looks for a def / use pair. The def produces a 32-bit immediate which is consumed by the use. It tries to fold the immediate by breaking it into two parts and fold them into the immmediate fields of two uses. e.g movw r2, #40885 movt r3, #46540 add r0, r0, r3 => add.w r0, r0, #3019898880 add.w r0, r0, #30146560 ; However, this transformation is incorrect if the user produces a flag. e.g. movw r2, #40885 movt r3, #46540 adds r0, r0, r3 => add.w r0, r0, #3019898880 adds.w r0, r0, #30146560 Note the adds.w may not set the carry flag even if the original sequence would. rdar://11116189 llvm-svn: 153484	2012-03-26 23:31:00 +00:00
Eli Bendersky	3ef88c1833	Continue cleanup of LIT, getting rid of the remaining artifacts from dejagnu * Removed test/lib/llvm.exp - it is no longer needed * Deleted the dg.exp reading code from test/lit.cfg. There are no dg.exp files left in the test suite so this code is no longer required. test/lit.cfg is now much shorter and clearer * Removed a lot of duplicate code in lit.local.cfg files that need access to the root configuration, by adding a "root" attribute to the TestingConfig object. This attribute is dynamically computed to provide the same information as was previously provided by the custom getRoot functions. * Documented the config.root attribute in docs/CommandGuide/lit.pod llvm-svn: 153408	2012-03-25 09:02:19 +00:00
Andrew Trick	121e3e153c	Remove -enable-lsr-nested in time for 3.1. Tests cases have been removed but attached to open PR12330. llvm-svn: 153286	2012-03-22 22:42:45 +00:00
Chad Rosier	ce3a78be60	[fast-isel] Fold "urem x, pow2" -> "and x, pow2-1". This should fix the 271% execution-time regression for nsieve-bits on the ARMv7 -O0 -g nightly tester. This may also improve compile-time on architectures that would otherwise generate a libcall for urem (e.g., ARM) or fall back to the DAG selector. rdar://10810716 llvm-svn: 153230	2012-03-22 00:21:17 +00:00
Chad Rosier	c6293b3e41	Fix test case from r153135. llvm-svn: 153140	2012-03-20 21:49:54 +00:00
Anton Korobeynikov	ccc669ff8f	Perform mul combine when multiplying wiht negative constants. Patch by Weiming Zhao! This fixes PR12212 llvm-svn: 153049	2012-03-19 19:19:50 +00:00
Chad Rosier	e007850778	[fast-isel] Address Eli's comments for r152847. Specifically, add a test case and still allow immediate encoding, just not with cmn. rdar://11038907 llvm-svn: 152869	2012-03-15 22:54:20 +00:00
Jim Grosbach	3812c82b92	ARM case-insensitive checking for APSR_nzcv. rdar://11056591 llvm-svn: 152846	2012-03-15 21:34:14 +00:00
Lang Hames	7918b0b225	Use vmov.f32 to materialize f32 consts on ARM. This relaxes constraints on register allocation by allowing all 32 D-registers to be used. Patch by Cameron Zwarich. llvm-svn: 152824	2012-03-15 18:49:02 +00:00
Evan Cheng	155a7230b7	DAG combine incorrectly optimize (i32 vextract (v4i16 load $addr), c) to (i16 load $addr+csizeof(i16)) and replace uses of (i32 vextract) with the i16 load. It should issue an extload instead: (i32 extload $addr+csizeof(i16)). rdar://11035895 llvm-svn: 152675	2012-03-13 22:00:52 +00:00
Evan Cheng	f04f2e7a52	Extend r148086 to check for [r +/- reg] address mode. This fixes queens performance regression (due to increased register pressure from overly aggressive pre-inc formation). llvm-svn: 152162	2012-03-06 23:33:32 +00:00
Jakob Stoklund Olesen	d4e1cb591a	Add <imp-def> operands when reloading into physregs. When an instruction only writes sub-registers, it is still necessary to add an <imp-def> operand for the super-register. When reloading into a virtual register, rewriting will add the operand, but when loading directly into a virtual register, the <imp-def> operand is still necessary. llvm-svn: 152095	2012-03-06 02:48:17 +00:00
Lang Hames	a49054ac9c	Split fpscr into two registers: FPSCR and FPSCR_NZCV. The fpscr register contains both flags (set by FP operations/comparisons) and control bits. The control bits (FPSCR) should be reserved, since they're always available and needn't be defined before use. The flag bits (FPSCR_NZCV) should like to be unreserved so they can be hoisted by MachineCSE. This fixes PR12165. llvm-svn: 152076	2012-03-06 00:19:55 +00:00
Sebastian Pop	e6eeed8151	updated patch for the ARM fused multiply add/sub In this update: - I assumed neon2 does not imply vfpv4, but neon and vfpv4 imply neon2. - I kept setting .fpu=neon-vfpv4 code attribute because that is what the assembler understands. Patch by Ana Pazos <apazos@codeaurora.org> llvm-svn: 152036	2012-03-05 17:39:52 +00:00
Jakob Stoklund Olesen	fd29132e44	Use <def,undef> operands when spilling NEON bundles. MachineOperands that define part of a virtual register must have an <undef> flag if they are not intended as read-modify-write operands. The old trick of adding an <imp-def> operand doesn't work any longer. Fixes PR12177. llvm-svn: 152008	2012-03-04 18:40:30 +00:00
Bill Wendling	88f55b45b2	Do trivial CSE of dead BBs during codegen preparation. Some BBs can become dead after codegen preparation. If we delete them here, it could help enable tail-call optimizations later on. <rdar://problem/10256573> llvm-svn: 152002	2012-03-04 10:46:01 +00:00
Jakob Stoklund Olesen	cfef07fd05	Fix RA-dependent test. llvm-svn: 151958	2012-03-03 00:26:30 +00:00
Evan Cheng	31b407de17	Neuter the optimization I implemented with r107852 and r108258 which turn some floating point equality comparisons into integer ones with -ffast-math. The issue is the optimization causes +0.0 != -0.0. Now the optimization is only done when one side is known to be 0.0. The other side's sign bit is masked off for the comparison. rdar://10964603 llvm-svn: 151861	2012-03-01 23:27:13 +00:00
Chad Rosier	3bdd700004	Revert r151816 as Jim has the appropriate fix. llvm-svn: 151818	2012-03-01 17:41:19 +00:00
Chad Rosier	89b9ecae75	Fix testcases from r151807. llvm-svn: 151816	2012-03-01 17:31:30 +00:00
Jim Grosbach	e41072bbb4	Add missing triple for tests. Make darwin bots happier. llvm-svn: 151813	2012-03-01 17:30:32 +00:00
James Molloy	1038b57cac	Fix a codegen fault in which log2 or exp2 could be dead-code eliminated even though they could have sideeffects. Only allow log2/exp2 to be converted to an intrinsic if they are declared "readnone". llvm-svn: 151807	2012-03-01 14:32:18 +00:00
Daniel Dunbar	b448d31a6b	Revert r151623 "Some ARM implementaions, e.g. A-series, does return stack prediction. ...", it is breaking the Clang build during the Compiler-RT part. llvm-svn: 151630	2012-02-28 15:36:07 +00:00
Evan Cheng	d29a22e4b0	Some ARM implementaions, e.g. A-series, does return stack prediction. That is, the processor keeps a return addresses stack (RAS) which stores the address and the instruction execution state of the instruction after a function-call type branch instruction. Calling a "noreturn" function with normal call instructions (e.g. bl) can corrupt RAS and causes 100% return misprediction so LLVM should use a unconditional branch instead. i.e. mov lr, pc b _foo The "mov lr, pc" is issued in order to get proper backtrace. rdar://8979299 llvm-svn: 151623	2012-02-28 06:42:03 +00:00
Jakob Stoklund Olesen	c74b7b271e	Handle regmasks in MachineCSE. Don't attempt to extend physreg live ranges across calls. <rdar://problem/10942095> llvm-svn: 151610	2012-02-28 02:08:50 +00:00
Kristof Beyls	3f16b0ead0	test commit. removing unnecessary whitespace. llvm-svn: 151363	2012-02-24 13:52:45 +00:00
Jim Grosbach	4ff2fb2fbc	Thumb2 size reduction fix for tied operands of tMUL. The tied source operand of tMUL is the second source operand, not the first like every other two-address thumb instruction. Special case it in the size reduction pass to make sure we create the tMUL instruction properly. llvm-svn: 151315	2012-02-24 00:33:36 +00:00
Dan Gohman	8da4093a80	When emitting a cmp with 0 for a lowered select, mask out the high bits of the value carying the boolean condition, as their contents are undefined. This fixes rdar://10887484. llvm-svn: 151310	2012-02-24 00:09:36 +00:00
Jakob Stoklund Olesen	3809cf9ffe	Make tests less sensitive to scheduling changes. llvm-svn: 151260	2012-02-23 17:19:34 +00:00
Anton Korobeynikov	fb863cd279	Fix to make sure that a comdat group gets generated correctly for a static member of instantiated C++ templates. Patch by Kristof Beyls! llvm-svn: 151250	2012-02-23 10:36:04 +00:00
Evan Cheng	9d9b58cc0d	Canonicalize (srl (bswap x), 16) to (rotr (bswap x), 16) if the high 16 bits of x are zero. This optimizes rev + lsr 16 to rev16. rdar://10750814 llvm-svn: 151230	2012-02-23 02:58:19 +00:00
Evan Cheng	d18a688213	Optimize a couple of common patterns involving conditional moves where the false value is zero. Instead of a cmov + op, issue an conditional op instead. e.g. cmp r9, r4 mov r4, #0 moveq r4, #1 orr lr, lr, r4 should be: cmp r9, r4 orreq lr, lr, #1 That is, optimize (or x, (cmov 0, y, cond)) to (or.cond x, y). Similarly extend this to xor as well as (and x, (cmov -1, y, cond)) => (and.cond x, y). It's possible to extend this to ADD and SUB but I don't think they are common. rdar://8659097 llvm-svn: 151224	2012-02-23 01:19:06 +00:00
Evan Cheng	9759637dc1	Proper support for a bastardized darwin-eabi hybird ABI. llvm-svn: 151083	2012-02-21 20:46:00 +00:00
Chad Rosier	7867a0bd92	[fast-isel] Add support for returning non-legal types with no sign- or zero- entend flag. llvm-svn: 150774	2012-02-17 01:21:28 +00:00

1 2 3 4 5 ...

1347 Commits