llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00

Author	SHA1	Message	Date
James Molloy	220547e625	Fix ordering of operands on lowering of atomicrmw min/max nodes on ARM. llvm-svn: 164685	2012-09-26 09:48:32 +00:00
Bill Wendling	28f1f0139e	Generate an error message instead of asserting or segfaulting when we have a scalar-to-vector conversion that we cannot handle. For instance, when an invalid constraint is used in an inline asm statement. <rdar://problem/12284092> llvm-svn: 164662	2012-09-26 06:16:18 +00:00
Bill Wendling	9a7fa167f9	Generate an error message instead of asserting or segfaulting when we have a scalar-to-vector conversion that we cannot handle. For instance, when an invalid constraint is used in an inline asm statement. <rdar://problem/12284092> llvm-svn: 164657	2012-09-26 04:04:19 +00:00
Chad Rosier	a58913fc00	[fast-isel] Fallback to SelectionDAG isel if we require strict alignment for non-aligned i32 loads/stores. rdar://12304911 llvm-svn: 164381	2012-09-21 16:58:35 +00:00
NAKAMURA Takumi	a7a5a7cd7d	llvm/test/CodeGen/ARM/fast-isel.ll: Fix possible typos, s/@unaligned_i16_store/@unaligned_i16_load/g. I guess this had apparently passed in +Asserts possibly due to verborsity. llvm-svn: 164350	2012-09-21 01:15:05 +00:00
Chad Rosier	d80fb0b13d	Testcase does not need to be this strict. llvm-svn: 164347	2012-09-21 00:47:08 +00:00
Chad Rosier	c15c6508e0	Add newline. llvm-svn: 164346	2012-09-21 00:43:18 +00:00
Chad Rosier	8a1b0217f6	[fast-isel] Fallback to SelectionDAG isel if we require strict alignment for non-halfword-aligned i16 loads/stores. rdar://12304911 llvm-svn: 164345	2012-09-21 00:41:42 +00:00
Jim Grosbach	135898ebe3	ARM: Use a dedicated intrinsic for vector bitwise select. The expression based expansion too often results in IR level optimizations splitting the intermediate values into separate basic blocks, preventing the formation of the VBSL instruction as the code author intended. In particular, LICM would often hoist part of the computation out of a loop. rdar://11011471 llvm-svn: 164340	2012-09-21 00:18:20 +00:00
Jakob Stoklund Olesen	801e92ce89	Ignore PHI-defs for -new-coalescer interference checks. A PHI can't create interference on its own. If two live ranges interfere at a PHI, they must also interfere when leaving one of the PHI predecessors. llvm-svn: 164330	2012-09-20 23:08:42 +00:00
Evan Cheng	959ad65636	Try to make these tests more portable. llvm-svn: 164320	2012-09-20 21:35:21 +00:00
Jakob Stoklund Olesen	557b4e64be	Resolve conflicts involving dead vector lanes for -new-coalescer. A common coalescing conflict in vector code is lane insertion: %dst = FOO %src = BAR %dst:ssub0 = COPY %src The live range of %src interferes with the ssub0 lane of %dst, but that lane is never read after %src would have clobbered it. That makes it safe to merge the live ranges and eliminate the COPY: %dst = FOO %dst:ssub0 = BAR This patch teaches the new coalescer to resolve conflicts where dead vector lanes would be clobbered, at least as long as the clobbered vector lanes don't escape the basic block. llvm-svn: 164250	2012-09-19 21:29:18 +00:00
Evan Cheng	1a3416521f	MOVi16 (movw) is only legal on cpus with V6T2 support. rdar://12300648 llvm-svn: 164169	2012-09-18 21:24:16 +00:00
James Molloy	4cb3751b3e	More domain conversion; convert VFP VMOVS to NEON instructions in more cases - when we may clobber the other S-lane by converting an S to a D instruction, make an effort to work out if the S lane is clobberable or not. llvm-svn: 164114	2012-09-18 08:31:15 +00:00
Evan Cheng	82c85585f9	Use vld1 / vst2 for unaligned v2f64 load / store. e.g. Use vld1.16 for 2-byte aligned address. Based on patch by David Peixotto. Also use vld1.64 / vst1.64 with 128-bit alignment to take advantage of alignment hints. rdar://12090772, rdar://12238782 llvm-svn: 164089	2012-09-18 01:42:45 +00:00
Jakob Stoklund Olesen	b761721812	Merge into undefined lanes under -new-coalescer. Add LIS::pruneValue() and extendToIndices(). These two functions are used by the register coalescer when merging two live ranges requires more than a trivial value mapping as supported by LiveInterval::join(). The pruneValue() function can remove the part of a value number that is going to conflict in join(). Afterwards, extendToIndices can restore the live range, using any new dominating value numbers and updating the SSA form. Use this complex value mapping to support merging a register into a vector lane that has a conflicting value, but the clobbered lane is undef. llvm-svn: 164074	2012-09-17 23:03:25 +00:00
Silviu Baranga	aa267976b5	Removed the VMLxForwarding feature for the Cortex-A15 target. llvm-svn: 164030	2012-09-17 14:10:54 +00:00
Silviu Baranga	11ff2a551d	This patch introduces A15 as a target in LLVM. llvm-svn: 163803	2012-09-13 15:05:10 +00:00
Kristof Beyls	76d939497b	Fix constant folding through bitcasts by no longer relying on undefined behaviour (converting NaN values between float and double). SelectionDAG::getConstantFP(double Val, EVT VT, bool isTarget); should not be used when Val is not a simple constant (as the comment in SelectionDAG.h indicates). This patch avoids using this function when folding an unknown constant through a bitcast, where it cannot be guaranteed that Val will be a simple constant. llvm-svn: 163703	2012-09-12 11:25:02 +00:00
Jakob Stoklund Olesen	ab77839866	Don't attempt to use flags from predicated instructions. The ARM backend can eliminate cmp instructions by reusing flags from a nearby sub instruction with similar arguments. Don't do that if the sub is predicated - the flags are not written unconditionally. <rdar://problem/12263428> llvm-svn: 163535	2012-09-10 19:17:25 +00:00
James Molloy	fe38f1d2b0	Fix an assertion failure when optimising a shufflevector incorrectly into concat_vectors, and a followup bug with SelectionDAG::getNode() creating nodes with invalid types. llvm-svn: 163511	2012-09-10 14:01:21 +00:00
Craig Topper	eb1db45675	Set operation action for FFLOOR to Expand for all vector types for X86. Set FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct. llvm-svn: 163458	2012-09-08 04:58:43 +00:00
James Molloy	791ec0aa52	Improve codegen for BUILD_VECTORs on ARM. If we have a BUILD_VECTOR that is mostly a constant splat, it is often better to splat that constant then insertelement the non-constant lanes instead of insertelementing every lane from an undef base. llvm-svn: 163304	2012-09-06 09:55:02 +00:00
James Molloy	90179e600b	Optimize codegen for VSETLNi{8,16,32} operating on Q registers. Degenerate to a VSETLN on D registers, instead of an (INSERT_SUBREG (VSETLN (EXTRACT_SUBREG ))) sequence to help the register coalescer. llvm-svn: 163298	2012-09-06 09:16:01 +00:00
Jakob Stoklund Olesen	0324528c8c	Use predication instead of pseudo-opcodes when folding into MOVCC. Now that it is possible to dynamically tie MachineInstr operands, predicated instructions are possible in SSA form: %vreg3<def> = SUBri %vreg1, -2147483647, pred:14, pred:%noreg, %opt:%noreg %vreg4<def,tied1> = MOVCCr %vreg3<tied0>, %vreg1, %pred:12, pred:%CPSR Becomes a predicated SUBri with a tied imp-use: SUBri %vreg1, -2147483647, pred:13, pred:%CPSR, opt:%noreg, %vreg1<imp-use,tied0> This means that any instruction that is safe to move can be folded into a MOVCC, and the *CC pseudo-instructions are no longer needed. The test case changes reflect that Thumb2SizeReduce recognizes the predicated instructions. It didn't understand the pseudos. llvm-svn: 163274	2012-09-05 23:58:02 +00:00
Tim Northover	4e03b89c79	Strip old MachineInstrs after we know we can put them back. Previous patch accidentally decided it couldn't convert a VFP to a NEON instruction after it had already destroyed the old one. Not a good move. llvm-svn: 163230	2012-09-05 18:37:53 +00:00
Silviu Baranga	6f46bb1705	Fixed the DAG combiner to better handle the folding of AND nodes for vector types. The previous code was making the assumption that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value has at least the length of the vector lane type, then we only use as many fields as we have available in the splat value. llvm-svn: 163203	2012-09-05 08:57:21 +00:00
Arnold Schwaighofer	d606c6fcdf	Patch to implement UMLAL/SMLAL instructions for the ARM architecture This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 163136	2012-09-04 14:37:49 +00:00
Nadav Rotem	d1815a0763	Not all targets have efficient ISel code generation for select instructions. For example, the ARM target does not have efficient ISel handling for vector selects with scalar conditions. This patch adds a TLI hook which allows the different targets to report which selects are supported well and which selects should be converted to CF duting codegen prepare. llvm-svn: 163093	2012-09-02 12:10:19 +00:00
Nadav Rotem	3425b10f0f	Generate better select code by allowing the target to use scalar select, and not sign-extend. llvm-svn: 163086	2012-09-02 08:20:07 +00:00
Owen Anderson	27ba45c764	Teach DAG combine a number of tricks to simplify FMA expressions in fast-math mode. llvm-svn: 163051	2012-09-01 06:04:27 +00:00
Jakob Stoklund Olesen	eb687a399c	Fix a couple of typos in EmitAtomic. Thumb2 instructions are mostly constrained to rGPR, not tGPR which is for Thumb1. rdar://problem/12203728 llvm-svn: 162968	2012-08-31 02:08:34 +00:00
Nadav Rotem	17b9d8cba2	Currently targets that do not support selects with scalar conditions and vector operands - scalarize the code. ARM is such a target because it does not support CMOV of vectors. To implement this efficientlyi, we broadcast the condition bit and use a sequence of NAND-OR to select between the two operands. This is the same sequence we use for targets that don't have vector BLENDs (like SSE2). rdar://12201387 llvm-svn: 162926	2012-08-30 19:17:29 +00:00
Tim Northover	627f946e05	Add support for moving pure S-register to NEON pipeline if desired llvm-svn: 162898	2012-08-30 10:17:45 +00:00
Jush Lu	5a78c68e1d	[arm-fast-isel] Add support for ARM PIC. llvm-svn: 162823	2012-08-29 02:41:21 +00:00
Bill Wendling	d49e183a6f	Make sure we add the predicate after all of the registers are added. <rdar://problem/12183003> llvm-svn: 162703	2012-08-27 22:12:44 +00:00
Stepan Dyatkovskiy	56ead97c8d	Rejected 169195. As Duncan commented, bitcasting to proper type is wrong approach. We need to insert some valid TRANCATE node here. llvm-svn: 162354	2012-08-22 09:33:55 +00:00
Tim Northover	ff29cd0939	Add correct set of regression tests for r162094 commit. llvm-svn: 162276	2012-08-21 12:43:03 +00:00
Jakob Stoklund Olesen	4403f82dbf	Add a missing def flag. * Bad machine code: Explicit definition marked as use * - function: test_cos - basic block: BB#0 L.entry (0x7ff2a2024fd0) - instruction: VSETLNi32 %D11, %D11<undef>, %R0, 0, pred:14, pred:%noreg, %Q5<imp-use,kill>, %Q5<imp-def> - operand 0: %D11 llvm-svn: 162247	2012-08-21 00:34:53 +00:00
Jakob Stoklund Olesen	a3264c242c	Don't add CFG edges for redundant conditional branches. IR that hasn't been through SimplifyCFG can look like this: br i1 %b, label %r, label %r Make sure we don't create duplicate Machine CFG edges in this case. Fix the machine code verifier to accept conditional branches with a single CFG edge. llvm-svn: 162230	2012-08-20 21:39:52 +00:00
Stepan Dyatkovskiy	0a4850d1ef	Forget to add testcase for r162195. Sorry. llvm-svn: 162196	2012-08-20 08:03:18 +00:00
Jakob Stoklund Olesen	e78d4a5b08	Also combine zext/sext into selects for ARM. This turns common i1 patterns into predicated instructions: (add (zext cc), x) -> (select cc (add x, 1), x) (add (sext cc), x) -> (select cc (add x, -1), x) For a function like: unsigned f(unsigned s, int x) { return s + (x>0); } We now produce: cmp r1, #0 it gt addgt.w r0, r0, #1 Instead of: movs r2, #0 cmp r1, #0 it gt movgt r2, #1 add r0, r2 llvm-svn: 162177	2012-08-18 21:25:22 +00:00
Jakob Stoklund Olesen	ece4a53017	Also pass logical ops to combineSelectAndUse. Add these transformations to the existing add/sub ones: (and (select cc, -1, c), x) -> (select cc, x, (and, x, c)) (or (select cc, 0, c), x) -> (select cc, x, (or, x, c)) (xor (select cc, 0, c), x) -> (select cc, x, (xor, x, c)) The selects can then be transformed to a single predicated instruction by peephole. This transformation will make it possible to eliminate the ISD::CAND, COR, and CXOR custom DAG nodes. llvm-svn: 162176	2012-08-18 21:25:16 +00:00
Jakob Stoklund Olesen	40eb30013e	Avoid folding ADD instructions with FI operands. PEI can't handle the pseudo-instructions. This can be removed when the pseudo-instructions are replaced by normal predicated instructions. Fixes PR13628. llvm-svn: 162130	2012-08-17 20:55:34 +00:00
Tim Northover	1de091468c	Implement NEON domain switching for scalar <-> S-register vmovs on ARM llvm-svn: 162094	2012-08-17 11:32:52 +00:00
Jakob Stoklund Olesen	88217b055d	Add ADD and SUB to the predicable ARM instructions. It is not my plan to duplicate the entire ARM instruction set with predicated versions. We need a way of representing predicated instructions in SSA form without requiring a separate opcode. Then the pseudo-instructions can go away. llvm-svn: 162061	2012-08-16 23:21:55 +00:00
Jush Lu	767c82d4e0	[arm-fast-isel] Add support for fastcc. Without fastcc support, the caller just falls through to CallingConv::C for fastcc, but callee still uses fastcc, this inconsistency of calling convention is a problem, and fastcc support can fix it. llvm-svn: 162013	2012-08-16 05:15:53 +00:00
Jakob Stoklund Olesen	55aee8b58a	Fold predicable instructions into MOVCC / t2MOVCC. The ARM select instructions are just predicated moves. If the select is the only use of an operand, the instruction defining the operand can be predicated instead, saving one instruction and decreasing register pressure. This implementation can turn AND/ORR/EOR instructions into their corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to predicate any instruction, but we don't yet support predicated instructions in SSA form. llvm-svn: 161994	2012-08-15 22:16:39 +00:00
Bill Wendling	8c8ad77086	Rework test so that it reproduces the error without the horrible flag. llvm-svn: 161989	2012-08-15 21:10:18 +00:00
Evan Cheng	625c0ca5ee	Use vld1/vst1 to load/store f64 if alignment is < 4 and the target allows unaligned access. rdar://12091029 llvm-svn: 161962	2012-08-15 17:44:53 +00:00
Anton Korobeynikov	d13403fbd1	The names of VFP variants of half-to-float conversion instructions were reversed. This leads to wrong codegen for float-to-half conversion intrinsics which are used to support storage-only fp16 type. NEON variants of same instructions are fine. llvm-svn: 161907	2012-08-14 23:36:01 +00:00
Nadav Rotem	eb22b069bb	During the CodeGenPrepare we often lower intrinsics (such as objsize) and allow some optimizations to turn conditional branches into unconditional. This commit adds a simple control-flow optimization which merges two consecutive basic blocks which are connected by a single edge. This allows the codegen to operate on larger basic blocks. rdar://11973998 llvm-svn: 161852	2012-08-14 05:19:07 +00:00
NAKAMURA Takumi	79503bef02	llvm/test/CodeGen/ARM/floorf.ll: Add explicit -mtriple=arm-unknown-unknown, or it fails on msvc. llvm-svn: 161825	2012-08-14 00:56:06 +00:00
Owen Anderson	c5b77c0317	Add a roundToIntegral method to APFloat, which can be parameterized over various rounding modes. Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC. llvm-svn: 161807	2012-08-13 23:32:49 +00:00
Nadav Rotem	03c4d5f036	Do not optimize (or (and X,Y), Z) into BFI and other sequences if the AND ISDNode has more than one user. rdar://11876519 llvm-svn: 161775	2012-08-13 18:52:44 +00:00
Eric Christopher	3aea549423	Add support for the %H output modifier. Patch by Weiming Zhao. llvm-svn: 161768	2012-08-13 18:18:52 +00:00
Tim Northover	7acf444c11	Add test for previous commit correcting NEON load patterns. llvm-svn: 161750	2012-08-13 10:38:45 +00:00
Arnold Schwaighofer	c751a25aed	Revert 161581: Patch to implement UMLAL/SMLAL instructions for the ARM architecture It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7 thumb O3. llvm-svn: 161736	2012-08-12 05:11:56 +00:00
Arnold Schwaighofer	f3d4d73157	Patch to implement UMLAL/SMLAL instructions for the ARM architecture This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 161581	2012-08-09 15:25:52 +00:00
Nadav Rotem	63ebca3806	Fix the legalization of ExtLoad on ARM. ExpandUnalignedLoad did not properly handle the cases where the memory value type was illegal. PR 13111. llvm-svn: 161565	2012-08-09 01:56:44 +00:00
Bob Wilson	b173ddc53e	Add test triples to fix win32 failures. Revert workaround from r161292. I don't have a win32 system to test, so hopefully I got them all fixed here. llvm-svn: 161519	2012-08-08 20:31:37 +00:00
Anton Korobeynikov	6dd5c91aae	Add stack spill / reload instructions for DTriple and DQuad register classes, which were missed for no reason. This fixes PR13377 llvm-svn: 161299	2012-08-04 13:16:12 +00:00
Bob Wilson	9f6e25017a	Refactor and check "onlyReadsMemory" before optimizing builtins. This patch is mostly just refactoring a bunch of copy-and-pasted code, but it also adds a check that the call instructions are readnone or readonly. That check was already present for sin, cos, sqrt, log2, and exp2 calls, but it was missing for the rest of the builtins being handled in this code. llvm-svn: 161282	2012-08-03 23:29:17 +00:00
Jush Lu	ca5d760bf2	[arm-fast-isel] Add support for shl, lshr, and ashr. llvm-svn: 161230	2012-08-03 02:37:48 +00:00
Jakob Stoklund Olesen	ed1a4d695a	Clear kill flags in removeCopyByCommutingDef(). We are extending live ranges, so kill flags are not accurate. They aren't needed until they are recomputed after RA anyway. <rdar://problem/11950722> llvm-svn: 161023	2012-07-31 02:47:24 +00:00
Jush Lu	54c7329b88	[arm-fast-isel] Add support for vararg function calls. llvm-svn: 160500	2012-07-19 09:49:00 +00:00
Chandler Carruth	5d1c4f0605	Fix a somewhat nasty crasher in PR13378. This crashes inside of LiveIntervals due to the two-addr pass generating bogus MI code. The crux of the issue was a loop nesting problem. The intent of the code which attempts to transform instructions before converting them to two-addr form is to defer and reprocess any transformed instructions as the second processing is likely to have more opportunities to coalesce copies, etc. Unfortunately, there was one section of processing that was not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this rewriting proceeded, not only did it occur early, it removed the bits of information needed for the deferred processing to correctly generate the necessary two address form (specifically inserting a copy), but didn't trigger any immediate assertions and produced what appeared to be already valid two-address from code. Thus, the assertion only fired much later in the pipeline. The fix is to hoist the transformation logic up layer to where it can more firmly defer all further processing, and to teach the normal processing to handle an edge case previously handled as part of the transformation logic. This edge case (already matched tied register operands) needs to not defer any steps. As has been brought up repeatedly in the process: wow does this code need refactoring. I may squeeze in some time to at least bring sanity to this loop... but wow... =] Thanks to Jakob for helpful hints on the way here, and the review. llvm-svn: 160443	2012-07-18 18:58:22 +00:00
Joel Jones	4ce75efda5	More replacing of target-dependent intrinsics with target-indepdent intrinsics. The second instruction(s) to be handled are the vector versions of count set bits (ctpop). The changes here are to clang so that it generates a target independent vector ctpop when it sees an ARM dependent vector bits set count. The changes in llvm are to match the target independent vector ctpop and in VMCore/AutoUpgrade.cpp to update any existing bc files containing ARM dependent vector pop counts with target-independent ctpops. There are also changes to an existing test case in llvm for ARM vector count instructions and to a test for the bitcode upgrade. <rdar://problem/11892519> There is deliberately no test for the change to clang, as so far as I know, no consensus has been reached regarding how to test neon instructions in clang; q.v. <rdar://problem/8762292> llvm-svn: 160410	2012-07-18 00:02:16 +00:00
Joel Jones	12ea066486	This is one of the first steps at moving to replace target-dependent intrinsics with target-indepdent intrinsics. The first instruction(s) to be handled are the vector versions of count leading zeros (ctlz). The changes here are to clang so that it generates a target independent vector ctlz when it sees an ARM dependent vector ctlz. The changes in llvm are to match the target independent vector ctlz and in VMCore/AutoUpgrade.cpp to update any existing bc files containing ARM dependent vector ctlzs with target-independent ctlzs. There are also changes to an existing test case in llvm for ARM vector count instructions and a new test for the bitcode upgrade. <rdar://problem/11831778> There is deliberately no test for the change to clang, as so far as I know, no consensus has been reached regarding how to test neon instructions in clang; q.v. <rdar://problem/8762292> llvm-svn: 160200	2012-07-13 23:25:25 +00:00
Manman Ren	0479bff92d	ARM: Fix optimizeCompare to correctly check safe condition. It is safe if CPSR is killed or re-defined. When we are done with the basic block, check whether CPSR is live-out. Do not optimize away cmp if CPSR is live-out. llvm-svn: 160090	2012-07-11 22:51:44 +00:00
Owen Anderson	e0c4f94d45	Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values. Previously, this would become an integer extension operation, followed by a real integer->float conversion. llvm-svn: 159957	2012-07-09 20:31:12 +00:00
NAKAMURA Takumi	bd5a31d598	Revert r159804, "[arm-fast-isel] Add support for vararg function calls." It broke LLVM :: CodeGen/Thumb2/large-call.ll on several hosts. llvm-svn: 159817	2012-07-06 11:12:44 +00:00
Jush Lu	4fdc23801c	[arm-fast-isel] Add support for vararg function calls. llvm-svn: 159804	2012-07-06 03:02:37 +00:00
Chandler Carruth	5d3a0ce4e5	Fix the remaining TCL-style quotes found in the testsuite. This is another mechanical change accomplished though the power of terrible Perl scripts. I have manually switched some "s to 's to make escaping simpler. While I started this to fix tests that aren't run in all configurations, the massive number of tests is due to a really frustrating fragility of our testing infrastructure: things like 'grep -v', 'not grep', and 'expected failures' can mask broken tests all too easily. Essentially, I'm deeply disturbed that I can change the testsuite so radically without causing any change in results for most platforms. =/ llvm-svn: 159547	2012-07-02 19:09:46 +00:00
Chandler Carruth	d200829a4f	Convert the uses of '\|&' to use '2>&1 \|' instead, which works on old versions of Bash. In addition, I can back out the change to the lit built-in shell test runner to support this. This should fix the majority of fallout on Darwin, but I suspect there will be a few straggling issues. llvm-svn: 159544	2012-07-02 18:37:59 +00:00
Chandler Carruth	8a358b3669	Convert all tests using TCL-style quoting to use shell-style quoting. This was done through the aid of a terrible Perl creation. I will not paste any of the horrors here. Suffice to say, it require multiple staged rounds of replacements, state carried between, and a few nested-construct-parsing hacks that I'm not proud of. It happens, by luck, to be able to deal with all the TCL-quoting patterns in evidence in the LLVM test suite. If anyone is maintaining large out-of-tree test trees, feel free to poke me and I'll send you the steps I used to convert things, as well as answer any painful questions etc. IRC works best for this type of thing I find. Once converted, switch the LLVM lit config to use ShTests the same as Clang. In addition to being able to delete large amounts of Python code from 'lit', this will also simplify the entire test suite and some of lit's architecture. Finally, the test suite runs 33% faster on Linux now. ;] For my 16-hardware-thread (2x 4-core xeon e5520): 36s -> 24s llvm-svn: 159525	2012-07-02 12:47:22 +00:00
Rafael Espindola	dd05a97f8e	Now that RegistersDefinedFromSameValue handles one instruction being an implicit_def, the other instruction can be anything, including instructions that define multiple values. Be careful about that and don't assume what operand 0 is. Fixes pr13249. llvm-svn: 159509	2012-07-01 17:08:01 +00:00
Manman Ren	bd339c27e1	ARM: update peephole optimization. More condition codes are included when deciding whether to remove cmp after a sub instruction. Specifically, we extend from GE\|LT\|GT\|LE to GE\|LT\|GT\|LE\|HS\|LS\|HI\|LO\|EQ\|NE. If we have "sub a, b; cmp b, a; movhs", we should be able to replace with "sub a, b; movls". rdar: 11725965 llvm-svn: 159166	2012-06-25 21:49:38 +00:00
Pete Cooper	9f89f00988	DAG legalisation can now handle illegal fma vector types by scalarisation llvm-svn: 159092	2012-06-24 00:05:44 +00:00
Hans Wennborg	8c011bd43a	Extend the IL for selecting TLS models (PR9788) This allows the user/front-end to specify a model that is better than what LLVM would choose by default. For example, a variable might be declared as @x = thread_local(initialexec) global i32 42 if it will not be used in a shared library that is dlopen'ed. If the specified model isn't supported by the target, or if LLVM can make a better choice, a different model may be used. llvm-svn: 159077	2012-06-23 11:37:03 +00:00
Evan Cheng	2d498dc096	(sub X, imm) gets canonicalized to (add X, -imm) There are patterns to handle immediates when they fit in the immediate field. e.g. %sub = add i32 %x, -123 => sub r0, r0, #123 Add patterns to catch immediates that do not fit but should be materialized with a single movw instruction rather than movw + movt pair. e.g. %sub = add i32 %x, -65535 => movw r1, #65535 sub r0, r0, r1 rdar://11726136 llvm-svn: 159057	2012-06-23 00:29:06 +00:00
Lang Hames	7d298105e5	Rename fp-op fusion option (yet again) for compatibility with GCC option. llvm-svn: 159042	2012-06-22 22:31:00 +00:00
Andrew Trick	764fe3bfef	ARM scheduling fix: compute predicated implicit use properly. Minor drive by fix to cleanup latency computation. Calling getOperandLatency with a deliberately incorrect operand index does not give you the latency you want. llvm-svn: 158959	2012-06-22 02:50:31 +00:00
Lang Hames	68cf87e3ef	Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a boolean flag to an enum: { Fast, Standard, Strict } (default = Standard). This option controls the creation by optimizations of fused FP ops that store intermediate results in higher precision than IEEE allows (E.g. FMAs). The behavior of this option is intended to match the behaviour specified by a soon-to-be-introduced frontend flag: '-ffuse-fp-ops'. Fast mode - allows formation of fused FP ops whenever they're profitable. Standard mode - allow fusion only for 'blessed' FP ops. At present the only blessed op is the fmuladd intrinsic. In the future more blessed ops may be added. Strict mode - allow fusion only if/when it can be proven that the excess precision won't effect the result. Note: This option only controls formation of fused ops by the optimizers. Fused operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic) will always be honored, regardless of the value of this option. Internally TargetOptions::AllowExcessFPPrecision has been replaced by TargetOptions::AllowFPOpFusion. llvm-svn: 158956	2012-06-22 01:09:09 +00:00
Lang Hames	662801dbc8	Add a missing llvm.fma -> VFNMS pattern to the ARM backend. llvm-svn: 158902	2012-06-21 06:10:00 +00:00
Evan Cheng	5e3a175c65	Emit a single _udivmodsi4 libcall instead of two separate _udivsi3 and _umodsi3 libcalls if they have the same arguments. This optimization was apparently broken if one of the node was replaced in place. rdar://11714607 llvm-svn: 158900	2012-06-21 05:56:05 +00:00
Lang Hames	f0b9601a6d	Add DAG-combines for aggressive FMA formation. This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or FSUB + FMUL. The combines are performed when: (a) Either AllowExcessFPPrecision option (-enable-excess-fp-precision for llc) OR UnsafeFPMath option (-enable-unsafe-fp-math) are set, and (b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of the FADD/FSUB, and (c) The FMUL only has one user (the FADD/FSUB). If your target has fast FMA instructions you can make use of these combines by overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for types supported by your FMA instruction, and adding patterns to match ISD::FMA to your FMA instructions. llvm-svn: 158757	2012-06-19 22:51:23 +00:00
Manman Ren	6d2895c506	ARM: use NOEN loads and stores if possible when handling struct byval. This change is to be enabled in clang. rdar://9877866 llvm-svn: 158684	2012-06-18 22:23:48 +00:00
Joel Jones	3d5ae56be4	This change handles a another case for generating the bic instruction when a compile time constant is known. This occurs when implicitly zero extending function arguments from 16 bits to 32 bits. The 8 bit case doesn't need to be handled, as the 8 bit constants are encoded directly, thereby not needing a separate load instruction to form the constant into a register. <rdar://problem/11481151> llvm-svn: 158659	2012-06-18 14:51:32 +00:00
Manman Ren	7ffcd63dea	ARM: optimization for sub+abs. This patch will optimize abs(x-y) FROM sub, movs, rsbmi TO subs, rsbmi For abs, we will use cmp instead of movs. This is necessary because we already have an existing peephole pass which optimizes away cmp following sub. rdar: 11633193 llvm-svn: 158551	2012-06-15 21:32:12 +00:00
Jakob Stoklund Olesen	6fd22231ba	Preserve <undef> flags in ARMExpandPseudo. This probably mostly shows up in bugpoint-generated code. llvm-svn: 158527	2012-06-15 17:46:54 +00:00
Manman Ren	797e146fae	Revert: test/CodeGen/ARM/iabs.ll in r158441 Sorry that I accidently checked in this file with my previous commit. llvm-svn: 158442	2012-06-14 06:04:02 +00:00
Manman Ren	e3471c0bdf	InstCombine: fix a bug when combining (fcmp cc0 x, y) && (fcmp cc1 x, y). uno && ueq was converted to ueq, it should be converted to uno. llvm-svn: 158441	2012-06-14 05:57:42 +00:00
Andrew Trick	1115f0067a	sched: fix latency of memory dependence chain edges for consistency. For store->load dependencies that may alias, we should always use TrueMemOrderLatency, which may eventually become a subtarget hook. In effect, we should guarantee at least TrueMemOrderLatency on at least one DAG path from a store to a may-alias load. This should fix the standard mode as well as -enable-aa-sched-mi". llvm-svn: 158380	2012-06-13 02:39:03 +00:00
Chad Rosier	9734752b2c	[arm-fast-isel] Add support for -arm-long-calls. Patch by Jush Lu <jush.msn@gmail.com>. llvm-svn: 158368	2012-06-12 19:25:13 +00:00
Jakob Stoklund Olesen	b6d5f36f3d	Fix test that depends on register allocation. The test is really checking the prolog/epilog load/store multiple formation. llvm-svn: 158328	2012-06-11 21:14:28 +00:00
Bill Wendling	2320ff76d7	Re-enable the CMN instruction. We turned off the CMN instruction because it had semantics which we weren't getting correct. If we are comparing with an immediate, then it's okay to use the CMN instruction. <rdar://problem/7569620> llvm-svn: 158302	2012-06-11 08:07:26 +00:00
Jakob Stoklund Olesen	ce0f9aef12	Don't run RAFast in the optimizing regalloc pipeline. The fast register allocator is not supposed to work in the optimizing pipeline. It doesn't make sense to compute live intervals, run full copy coalescing, and then run RAFast. Fast register allocation in the optimizing pipeline is better done by RABasic. llvm-svn: 158242	2012-06-08 23:15:12 +00:00
Joel Jones	6fb688ff18	Revert commit r157966 llvm-svn: 157972	2012-06-05 00:47:21 +00:00
Joel Jones	fdd5284d04	This change handles a another case for generating the bic instruction when a compile time constant is known. This occurs when implicitly zero extending function arguments from 16 bits to 32 bits. <rdar://problem/11481151> llvm-svn: 157966	2012-06-04 23:38:57 +00:00

1 2 3 4 5 ...

1435 Commits