llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00

Author	SHA1	Message	Date
Matt Beaumont-Gay	dd419ee85c	Add a 'using' declaration to suppress GCC's -Woverloaded-virtual while we decide what pattern we want to follow in the future. llvm-svn: 169561	2012-12-06 23:15:36 +00:00
Evan Cheng	4cdc6c4eef	Replace r169459 with something safer. Rather than having computeMaskedBits to understand target implementation of any_extend / extload, just generate zero_extend in place of any_extend for liveouts when the target knows the zero_extend will be implicit (e.g. ARM ldrb / ldrh) or folded (e.g. x86 movz). rdar://12771555 llvm-svn: 169536	2012-12-06 19:13:27 +00:00
Evan Cheng	c1db873871	Let targets provide hooks that compute known zero and ones for any_extend and extload's. If they are implemented as zero-extend, or implicitly zero-extend, then this can enable more demanded bits optimizations. e.g. define void @foo(i16* %ptr, i32 %a) nounwind { entry: %tmp1 = icmp ult i32 %a, 100 br i1 %tmp1, label %bb1, label %bb2 bb1: %tmp2 = load i16* %ptr, align 2 br label %bb2 bb2: %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ] %cmp = icmp ult i16 %tmp3, 24 br i1 %cmp, label %bb3, label %exit bb3: call void @bar() nounwind br label %exit exit: ret void } This compiles to the followings before: push {lr} mov r2, #0 cmp r1, #99 bhi LBB0_2 @ BB#1: @ %bb1 ldrh r2, [r0] LBB0_2: @ %bb2 uxth r0, r2 cmp r0, #23 bhi LBB0_4 @ BB#3: @ %bb3 bl _bar LBB0_4: @ %exit pop {lr} bx lr The uxth is not needed since ldrh implicitly zero-extend the high bits. With this change it's eliminated. rdar://12771555 llvm-svn: 169459	2012-12-06 01:28:01 +00:00
Chandler Carruth	a98c778194	Sort includes for all of the .h files under the 'lib' tree. These were missed in the first pass because the script didn't yet handle include guards. Note that the script is now able to handle all of these headers without manual edits. =] llvm-svn: 169224	2012-12-04 07:12:27 +00:00
Silviu Baranga	d93d64a5fd	Added atomic 64 min/max/umin/umax intrinsics support in the ARM backend. llvm-svn: 168886	2012-11-29 14:41:25 +00:00
Benjamin Kramer	bd65c85dc1	ARM: Implement CanLowerReturn so large vectors get expanded into sret. Fixes 14337. llvm-svn: 168809	2012-11-28 20:55:10 +00:00
Dmitri Gribenko	610d06f00e	Use empty parens for empty function parameter list instead of '(void)'. llvm-svn: 168049	2012-11-15 16:51:49 +00:00
Stepan Dyatkovskiy	ece4c2a9c1	ARM: Removed extra stack frame object for fixed byval arguments, VarArgsStyleRegisters invocation was reworked due to some improper usage in past. PR14099 also demonstrates it. llvm-svn: 166273	2012-10-19 08:23:06 +00:00
Stepan Dyatkovskiy	09c6b0a273	Issue: Stack is formed improperly for long structures passed as byval arguments for EABI mode. If we took AAPCS reference, we can found the next statements: A: "If the argument requires double-word alignment (8-byte), the NCRN (Next Core Register Number) is rounded up to the next even register number." (5.5 Parameter Passing, Stage C, C.3). B: "The alignment of an aggregate shall be the alignment of its most-aligned component." (4.3 Composite Types, 4.3.1 Aggregates). So if we have structure with doubles (9 double fields) and 3 Core unused registers (r1, r2, r3): caller should use r2 and r3 registers only. Currently r1,r2,r3 set is used, but it is invalid. Callee VA routine should also use r2 and r3 regs only. All is ok here. This behaviour is guessed by rounding up SP address with ADD+BFC operations. Fix: Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and 8 byte alignment, we waste odd registers then. P.S.: I also improved LDRB_POST_IMM regression test. Since ldrb instruction will not generated by current regression test after this patch. llvm-svn: 166018	2012-10-16 07:16:47 +00:00
Stepan Dyatkovskiy	5182bb8695	Issue description: SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack objects and byval parameters. So loading byval parameters from stack may be inserted before it will be stored, since these operations are treated as independent. Fix: Currently ARMTargetLowering::LowerFormalArguments saves byval registers with FixedStack MachinePointerInfo. To fix the problem we need to store byval registers with MachinePointerInfo referenced to first the "byval" parameter. Also commit adds two new fields to the InputArg structure: Function's argument index and InputArg's part offset in bytes relative to the start position of Function's argument. E.g.: If function's argument is 128 bit width and it was splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index, but different offset values. llvm-svn: 165616	2012-10-10 11:37:36 +00:00
Arnold Schwaighofer	d606c6fcdf	Patch to implement UMLAL/SMLAL instructions for the ARM architecture This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 163136	2012-09-04 14:37:49 +00:00
Nadav Rotem	d1815a0763	Not all targets have efficient ISel code generation for select instructions. For example, the ARM target does not have efficient ISel handling for vector selects with scalar conditions. This patch adds a TLI hook which allows the different targets to report which selects are supported well and which selects should be converted to CF during codegen prepare. llvm-svn: 163093	2012-09-02 12:10:19 +00:00
Jakob Stoklund Olesen	abf0a9ec82	Remove the CAND/COR/CXOR custom ISD nodes and their select code. These nodes are no longer needed because the peephole pass can fold CMOV+AND into ANDCC etc. llvm-svn: 162179	2012-08-18 21:49:50 +00:00
Arnold Schwaighofer	c751a25aed	Revert 161581: Patch to implement UMLAL/SMLAL instructions for the ARM architecture It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7 thumb O3. llvm-svn: 161736	2012-08-12 05:11:56 +00:00
Craig Topper	4d9cbceefd	Change addTypeForNeon to use MVT instead of EVT so all the calls to getSimpleVT can be removed. llvm-svn: 161735	2012-08-12 03:16:37 +00:00
Arnold Schwaighofer	f3d4d73157	Patch to implement UMLAL/SMLAL instructions for the ARM architecture This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 161581	2012-08-09 15:25:52 +00:00
Bob Wilson	d1eefbeac2	Fall back to selection DAG isel for calls to builtin functions. Fast isel doesn't currently have support for translating builtin function calls to target instructions. For embedded environments where the library functions are not available, this is a matter of correctness and not just optimization. Most of this patch is just arranging to make the TargetLibraryInfo available in fast isel. <rdar://problem/12008746> llvm-svn: 161232	2012-08-03 04:06:28 +00:00
Bill Wendling	2320ff76d7	Re-enable the CMN instruction. We turned off the CMN instruction because it had semantics which we weren't getting correct. If we are comparing with an immediate, then it's okay to use the CMN instruction. <rdar://problem/7569620> llvm-svn: 158302	2012-06-11 08:07:26 +00:00
Manman Ren	2a5898a76b	ARM: properly handle alignment for struct byval. Factor out the expansion code into a function. This change is to be enabled in clang. rdar://9877866 llvm-svn: 157830	2012-06-01 19:33:18 +00:00
Manman Ren	f30d4765c5	ARM: support struct byval in llvm We handle struct byval by inserting a pseudo op, which will be expanded to a loop at ExpandISelPseudos. A separate patch for clang will be submitted to enable struct byval. rdar://9877866 llvm-svn: 157793	2012-06-01 02:44:42 +00:00
Justin Holewinski	77c4679dae	Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall to pass around a struct instead of a large set of individual values. This cleans up the interface and allows more information to be added to the struct for future targets without requiring changes to each and every target. NV_CONTRIB llvm-svn: 157479	2012-05-25 16:35:28 +00:00
Hans Wennborg	b3c41d012d	Make ARM and Mips use TargetMachine::getTLSModel() This moves the logic for selecting a TLS model to a single place, instead of the previous three (ARM, Mips, and X86 which already uses this function). llvm-svn: 156162	2012-05-04 09:40:39 +00:00
Evan Cheng	5825e9dbf5	Fix a long standing tail call optimization bug. When a libcall is emitted legalizer always use the DAG entry node. This is wrong when the libcall is emitted as a tail call since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store) use that as the tail call input chain. PR12419 rdar://9770785 rdar://11195178 llvm-svn: 154370	2012-04-10 01:51:00 +00:00
Rafael Espindola	88a1aeb123	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Craig Topper	0534d071b7	Reorder includes to match coding standards. Fix an issue or two exposed by that. llvm-svn: 152978	2012-03-17 07:33:42 +00:00
Lang Hames	7918b0b225	Use vmov.f32 to materialize f32 consts on ARM. This relaxes constraints on register allocation by allowing all 32 D-registers to be used. Patch by Cameron Zwarich. llvm-svn: 152824	2012-03-15 18:49:02 +00:00
Evan Cheng	c5ead6c49e	Re-commit r151623 with fix. Only issue special no-return calls if it's a direct call. llvm-svn: 151645	2012-02-28 18:51:51 +00:00
Daniel Dunbar	b448d31a6b	Revert r151623 "Some ARM implementations, e.g. A-series, does return stack prediction. ...", it is breaking the Clang build during the Compiler-RT part. llvm-svn: 151630	2012-02-28 15:36:07 +00:00
Evan Cheng	d29a22e4b0	Some ARM implementations, e.g. A-series, does return stack prediction. That is, the processor keeps a return addresses stack (RAS) which stores the address and the instruction execution state of the instruction after a function-call type branch instruction. Calling a "noreturn" function with normal call instructions (e.g. bl) can corrupt RAS and causes 100% return misprediction so LLVM should use a unconditional branch instead. i.e. mov lr, pc b _foo The "mov lr, pc" is issued in order to get proper backtrace. rdar://8979299 llvm-svn: 151623	2012-02-28 06:42:03 +00:00
Evan Cheng	d18a688213	Optimize a couple of common patterns involving conditional moves where the false value is zero. Instead of a cmov + op, issue an conditional op instead. e.g. cmp r9, r4 mov r4, #0 moveq r4, #1 orr lr, lr, r4 should be: cmp r9, r4 orreq lr, lr, #1 That is, optimize (or x, (cmov 0, y, cond)) to (or.cond x, y). Similarly extend this to xor as well as (and x, (cmov -1, y, cond)) => (and.cond x, y). It's possible to extend this to ADD and SUB but I don't think they are common. rdar://8659097 llvm-svn: 151224	2012-02-23 01:19:06 +00:00
Craig Topper	3ed929de0a	Make all pointers to TargetRegisterClass const since they are all pointers to static data that should not be modified. llvm-svn: 151134	2012-02-22 05:59:10 +00:00
Bob Wilson	0e88464871	Fix ARM SjLj-EH dispatch setup code. <rdar://problem/10444602> The EmitBasePointerRecalculation function has 2 problems, one minor and one fatal. The minor problem is that it inserts the code at the setjmp instead of in the dispatch block. The fatal problem is that at the point where this code runs, we don't know whether there will be a base pointer, so the entire function is a no-op. The base pointer recalculation needs to be handled as it was before, by inserting a pseudo instruction that gets expanded late. Most of the support for the old approach is still here, but it no longer has any connection to the eh_sjlj_dispatchsetup intrinsic. Clean up the parts related to the intrinsic and just generate the pseudo instruction directly. llvm-svn: 144781	2011-11-16 07:11:57 +00:00
Evan Cheng	47d8f8af84	Add vmov.f32 to materialize f32 immediate splats which cannot be handled by integer variants. rdar://10437054 llvm-svn: 144608	2011-11-15 02:12:34 +00:00
Lang Hames	ba63f9da8b	Fixed parameter name. llvm-svn: 143594	2011-11-02 23:37:04 +00:00
Lang Hames	ceec8ec67e	Try to lower memset/memcpy/memmove to vector instructions on ARM where the alignment permits. llvm-svn: 143582	2011-11-02 22:52:45 +00:00
Bill Wendling	1f7c16c63f	Take the code that was emitted for the llvm.eh.dispatch.setup intrinsic and emit it with the new SjLj emitter stuff. This way there's no need to emit that kind-of-hacky intrinsic. llvm-svn: 141419	2011-10-07 22:08:37 +00:00
Bill Wendling	3c68e1d212	Refactor some of the code that sets up the entry block for SjLj EH. No functionality change. llvm-svn: 141323	2011-10-06 22:18:16 +00:00
Bill Wendling	834bb83a41	Check-pointing the new SjLj EH lowering. This code will replace the version in ARMAsmPrinter.cpp. It creates a new machine basic block, which is the dispatch for the return from a longjmp call. It then shoves the address of that machine basic block into the correct place in the function context so that the EH runtime will jump to it directly instead of having to go through a compare-and-jump-to-the-dispatch bit. This should be more efficient in the common case. llvm-svn: 141031	2011-10-03 21:25:38 +00:00
Jim Grosbach	d94ffffc87	ARM fix encoding of VMOV.f32 and VMOV.f64 immediates. Encode the immediate into its 8-bit form as part of isel rather than later, which simplifies things for mapping the encoding bits, allows the removal of the custom disassembler decoding hook, makes the operand printer trivial, and prepares things more cleanly for handling these in the asm parser. rdar://10211428 llvm-svn: 140834	2011-09-30 00:50:06 +00:00
Duncan Sands	d1311488fe	Add codegen support for vector select (in the IR this means a select with a vector condition); such selects become VSELECT codegen nodes. This patch also removes VSETCC codegen nodes, unifying them with SETCC nodes (codegen was actually often using SETCC for vector SETCC already). This ensures that various DAG combiner optimizations kick in for vector comparisons. Passes dragonegg bootstrap with no testsuite regressions (nightly testsuite as well as "make check-all"). Patch mostly by Nadav Rotem. llvm-svn: 139159	2011-09-06 19:07:46 +00:00
Eli Friedman	5d3814e0c4	64-bit atomic cmpxchg for ARM. llvm-svn: 138868	2011-08-31 17:52:22 +00:00
Eli Friedman	d71c865ae0	Some 64-bit atomic operations on ARM. 64-bit cmpxchg coming next. llvm-svn: 138845	2011-08-31 00:31:29 +00:00
Evan Cheng	91aa81acaa	Follow up to r138791. Add a instruction flag: hasPostISelHook which tells the pre-RA scheduler to call a target hook to adjust the instruction. For ARM, this is used to adjust instructions which may be setting the 's' flag. ADC, SBC, RSB, and RSC instructions have implicit def of CPSR (required since it now uses CPSR physical register dependency rather than "glue"). If the carry flag is used, then the target hook will fill in the optional operand with CPSR. Otherwise, the hook will remove the CPSR implicit def from the MachineInstr. llvm-svn: 138810	2011-08-30 19:09:48 +00:00
Evan Cheng	1eacb83316	Change ARM / Thumb2 addc / adde and subc / sube modeling to use physical register dependency (rather than glue them together). This is general goodness as it gives scheduler more freedom. However it is motivated by a nasty bug in isel. When a i64 sub is expanded to subc + sube. libcall #1 \ \ subc \ / \ \ / \ \ / libcall #2 sube If the libcalls are not serialized (i.e. both have chains which are dag entry), legalizer can serialize them in arbitrary orders. If it's unlucky, it can force libcall #2 before libcall #1 in the above case. subc \| libcall #2 \| libcall #1 \| sube However since subc and sube are "glued" together, this ends up being a cycle when the scheduler combine subc and sube as a single scheduling unit. The right solution is to fix LegalizeType too chains the libcalls together. However, LegalizeType is not processing nodes in order so that's harder than it should be. For now, the move to physical register dependency will do. rdar://10019576 llvm-svn: 138791	2011-08-30 01:34:54 +00:00
Chris Lattner	e1fe7061ce	land David Blaikie's patch to de-constify Type, with a few tweaks. llvm-svn: 135375	2011-07-18 04:54:35 +00:00
Evan Cheng	37ff73dfaf	Improve codegen for select's: if (x != 0) x = 1 if (x == 1) x = 1 Previous codegen looks like this: mov r1, r0 cmp r1, #1 mov r0, #0 moveq r0, #1 The naive lowering select between two different values. It should recognize the test is equality test so it's more a conditional move rather than a select: cmp r0, #1 movne r0, #0 rdar://9758317 llvm-svn: 135017	2011-07-13 00:42:17 +00:00
Eric Christopher	2090e793c1	Remove getRegClassForInlineAsmConstraint from the ARM port. Part of rdar://9643582 llvm-svn: 134095	2011-06-29 21:10:36 +00:00
Eric Christopher	d68494ffdd	Have LowerOperandForConstraint handle multiple character constraints. Part of rdar://9119939 llvm-svn: 132510	2011-06-02 23:16:42 +00:00
Eli Friedman	12e590e760	Make the logic for determining function alignment more explicit. No functionality change. llvm-svn: 131012	2011-05-06 20:34:06 +00:00
Dan Gohman	7beb845bab	Add an unfolded offset field to LSR's Formula record. This is used to model constants which can be added to base registers via add-immediate instructions which don't require an additional register to materialize the immediate. llvm-svn: 130743	2011-05-03 00:46:49 +00:00
Jim Grosbach	77d45564c3	ARM and Thumb2 support for atomic MIN/MAX/UMIN/UMAX loads. rdar://9326019 llvm-svn: 130234	2011-04-26 19:44:18 +00:00
Andrew Trick	a130d110d1	Thumb2 and ARM add/subtract with carry fixes. Fixes Thumb2 ADCS and SBCS lowering: <rdar://problem/9275821>. t2ADCS/t2SBCS are now pseudo instructions, consistent with ARM, so the assembly printer correctly prints the 's' suffix. Fixes Thumb2 adde -> SBC matching to check for live/dead carry flags. Fixes the internal ARM machine opcode mnemonic for ADCS/SBCS. Fixes ARM SBC lowering to check for live carry (potential bug). llvm-svn: 130048	2011-04-23 03:55:32 +00:00
Andrew Trick	31c7962ce5	whitespace llvm-svn: 130046	2011-04-23 03:24:11 +00:00
Stuart Hastings	a552942e02	ARM byval support. Will be enabled by another patch to the FE. <rdar://problem/7662569> llvm-svn: 129858	2011-04-20 16:47:52 +00:00
Cameron Zwarich	1b8f91d2c8	Add a ARM-specific SD node for VBSL so that forms with a constant first operand can be recognized. This fixes <rdar://problem/9183078>. llvm-svn: 128584	2011-03-30 23:01:21 +00:00
Evan Cheng	dd99a0a548	Re-apply r127953 with fixes: eliminate empty return block if it has no predecessors; update dominator tree if cfg is modified. llvm-svn: 127981	2011-03-21 01:19:09 +00:00
Daniel Dunbar	34c65737c3	Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR", it broke a lot of things. llvm-svn: 127954	2011-03-19 21:47:14 +00:00
Evan Cheng	c5f50f7322	SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR to have single return block (at least getting there) for optimizations. This is general goodness but it would prevent some tailcall optimizations. One specific case is code like this: int f1(void); int f2(void); int f3(void); int f4(void); int f5(void); int f6(void); int foo(int x) { switch(x) { case 1: return f1(); case 2: return f2(); case 3: return f3(); case 4: return f4(); case 5: return f5(); case 6: return f6(); } } => LBB0_2: ## %sw.bb callq _f1 popq %rbp ret LBB0_3: ## %sw.bb1 callq _f2 popq %rbp ret LBB0_4: ## %sw.bb3 callq _f3 popq %rbp ret This patch teaches codegenprep to duplicate returns when the return value is a phi and where the phi operands are produced by tail calls followed by an unconditional branch: sw.bb7: ; preds = %entry %call8 = tail call i32 @f5() nounwind br label %return sw.bb9: ; preds = %entry %call10 = tail call i32 @f6() nounwind br label %return return: %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ] ret i32 %retval.0 This allows codegen to generate better code like this: LBB0_2: ## %sw.bb jmp _f1 ## TAILCALL LBB0_3: ## %sw.bb1 jmp _f2 ## TAILCALL LBB0_4: ## %sw.bb3 jmp _f3 ## TAILCALL rdar://9147433 llvm-svn: 127953	2011-03-19 17:17:39 +00:00
Bill Wendling	388dad6d62	Some minor cleanups based on feedback. llvm-svn: 127694	2011-03-15 20:47:26 +00:00
Bill Wendling	da1364d669	Generate a VTBL instruction instead of a series of loads and stores when we can. As Nate pointed out, VTBL isn't super performant, but it has to be better than this: _shuf: @ BB#0: @ %entry push {r4, r7, lr} add r7, sp, #4 sub sp, #12 mov r4, sp bic r4, r4, #7 mov sp, r4 mov r2, sp vmov d16, r0, r1 orr r0, r2, #6 orr r3, r2, #7 vst1.8 {d16[0]}, [r3] vst1.8 {d16[5]}, [r0] subs r4, r7, #4 orr r0, r2, #5 vst1.8 {d16[4]}, [r0] orr r0, r2, #4 vst1.8 {d16[4]}, [r0] orr r0, r2, #3 vst1.8 {d16[0]}, [r0] orr r0, r2, #2 vst1.8 {d16[2]}, [r0] orr r0, r2, #1 vst1.8 {d16[1]}, [r0] vst1.8 {d16[3]}, [r2] vldr.64 d16, [sp] vmov r0, r1, d16 mov sp, r4 pop {r4, r7, pc} The "illegal" testcase in vext.ll is no longer illegal. <rdar://problem/9078775> llvm-svn: 127630	2011-03-14 23:02:38 +00:00
Bob Wilson	f8c4d1ded9	Fix a compiler crash where a Glue value had multiple uses. Radar 9049552. llvm-svn: 127198	2011-03-08 01:17:20 +00:00
Cameron Zwarich	a1920d7f51	Move getRegPressureLimit() from TargetLoweringInfo to TargetRegisterInfo. llvm-svn: 127175	2011-03-07 21:56:36 +00:00
Bob Wilson	1497601a7b	Remove unused conditional negate operations. llvm-svn: 127090	2011-03-05 16:54:31 +00:00
Stuart Hastings	539d4e1460	Support for byval parameters on ARM. Will be enabled by a forthcoming patch to the front-end. Radar 7662569. llvm-svn: 126655	2011-02-28 17:17:53 +00:00
Bob Wilson	65f4a70b82	Add codegen support for using post-increment NEON load/store instructions. The vld1-lane, vld1-dup and vst1-lane instructions do not yet support using post-increment versions, but all the rest of the NEON load/store instructions should be handled now. llvm-svn: 125014	2011-02-07 17:43:21 +00:00
Evan Cheng	c7ce7e2ac3	Given a pair of floating point load and store, if there are no other uses of the load, then it may be legal to transform the load and store to integer load and store of the same width. This is done if the target specified the transformation as profitable. e.g. On arm, this can transform: vldr.32 s0, [] vstr.32 s0, [] to ldr r12, [] str r12, [] rdar://8944252 llvm-svn: 124708	2011-02-02 01:06:55 +00:00
Evan Cheng	0dfe28a9b5	Last round of fixes for movw + movt global address codegen. 1. Fixed ARM pc adjustment. 2. Fixed dynamic-no-pic codegen 3. CSE of pc-relative load of global addresses. It's now enabled by default for Darwin. llvm-svn: 123991	2011-01-21 18:55:51 +00:00
Evan Cheng	53ec6fc591	Materialize GA addresses with movw + movt pairs for Darwin in PIC mode. e.g. movw r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4)) movt r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4)) LPC0_0: add r0, pc, r0 It's not yet enabled by default as some tests are failing. I suspect bugs in down stream tools. llvm-svn: 123619	2011-01-17 08:03:18 +00:00
Evan Cheng	1afd04fc59	Recognize inline asm 'rev $0, $1' as a bswap intrinsic call. llvm-svn: 123048	2011-01-08 01:24:27 +00:00
Bob Wilson	c485ff3ced	Lower some BUILD_VECTORS using VEXT+shuffle. Patch by Tim Northover. llvm-svn: 123035	2011-01-07 21:37:30 +00:00
Evan Cheng	f7e586d749	Enable sibling call optimization of libcalls which are expanded during legalization time. Since at legalization time there is no mapping from SDNode back to the corresponding LLVM instruction and the return SDNode is target specific, this requires a target hook to check for eligibility. Only x86 and ARM support this form of sibcall optimization right now. rdar://8707777 llvm-svn: 120501	2010-11-30 23:55:39 +00:00
Bob Wilson	3bb61d1932	Add support for NEON VLD2-dup instructions. llvm-svn: 120236	2010-11-28 06:51:26 +00:00
Owen Anderson	52e3873edc	Add support for ARM's specialized vector-compare-against-zero instructions. llvm-svn: 118453	2010-11-08 23:21:22 +00:00
Owen Anderson	f26ea37db7	Disallow the certain NEON modified-immediate forms when generating vorr or vbic. llvm-svn: 118300	2010-11-05 21:57:54 +00:00
Owen Anderson	add19dd6dd	Add codegen and encoding support for the immediate form of vbic. llvm-svn: 118291	2010-11-05 19:27:46 +00:00
Owen Anderson	1a89511e5d	Add support for code generation of the one register with immediate form of vorr. We could be more aggressive about making this work for a larger range of constants, but this seems like a good start. llvm-svn: 118201	2010-11-03 22:44:51 +00:00
Evan Cheng	eab7251695	Fix preload instruction isel. Only v7 supports pli, and only v7 with mp extension supports pldw. Add subtarget attribute to denote mp extension support and legalize illegal ones to nothing. llvm-svn: 118160	2010-11-03 06:34:55 +00:00
Bob Wilson	183c466006	Overhaul memory barriers in the ARM backend. Radar 8601999. There were a number of issues to fix up here: * The "device" argument of the llvm.memory.barrier intrinsic should be used to distinguish the "Full System" domain from the "Inner Shareable" domain. It has nothing to do with using DMB vs. DSB instructions. * The compiler should never need to emit DSB instructions. Remove the ARMISD::SYNCBARRIER node and also remove the instruction patterns for DSB. * Merge the separate DMB/DSB instructions for options only used for the disassembler with the default DMB/DSB instructions. Add the default "full system" option ARM_MB::SY to the ARM_MB::MemBOpt enum. * Add a separate ARMISD::MEMBARRIER_MCR node for subtargets that implement a data memory barrier using the MCR instruction. * Fix up encodings for these instructions (except MCR). I also updated the tests and added a few new ones to check for DMB options that were not currently being exercised. llvm-svn: 117756	2010-10-30 00:54:37 +00:00
John Thompson	6115a7f1d4	Inline asm multiple alternative constraints development phase 2 - improved basic logic, added initial platform support. llvm-svn: 117667	2010-10-29 17:29:13 +00:00
Jim Grosbach	a8c0be5343	Add a pre-dispatch SjLj EH hook on the unwind edge for targets to do any setup they require. Use this for ARM/Darwin to rematerialize the base pointer from the frame pointer when required. rdar://8564268 llvm-svn: 116879	2010-10-19 23:27:08 +00:00
Bob Wilson	6b6b53ad6f	Remove unused ARMISD::AND selection DAG node. llvm-svn: 116566	2010-10-15 04:34:40 +00:00
Bob Wilson	c4345abcc0	Define the TargetLowering::getTgtMemIntrinsic hook for ARM so that NEON load and store intrinsics are represented with MemIntrinsicSDNodes. llvm-svn: 114454	2010-09-21 17:56:22 +00:00
Evan Cheng	c9cb37516d	Teach if-converter to be more careful with predicating instructions that would take multiple cycles to decode. For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should take treat them as non-micro-coded simple instructions. llvm-svn: 113570	2010-09-10 01:29:16 +00:00
Bob Wilson	3348d2eb50	Remove NEON vmull, vmlal, and vmlsl intrinsics, replacing them with multiply, add, and subtract operations with zero-extended or sign-extended vectors. Update tests. Add auto-upgrade support for the old intrinsics. llvm-svn: 112773	2010-09-01 23:50:19 +00:00
Bill Wendling	385ad1516f	Create an ARMISD::AND node. This node is exactly like the "ARM::AND" node, but it sets the CPSR register. llvm-svn: 112393	2010-08-29 03:02:11 +00:00
Bill Wendling	f10d5c00fc	Consider this code snippet: float t1(int argc) { return (argc == 1123) ? 1.234f : 2.38213f; } We would generate truly awful code on ARM (those with a weak stomach should look away): _t1: movw r1, #1123 movs r2, #1 movs r3, #0 cmp r0, r1 mov.w r0, #0 it eq moveq r0, r2 movs r1, #4 cmp r0, #0 it ne movne r3, r1 adr r0, #LCPI1_0 ldr r0, [r0, r3] bx lr The problem was that legalization was creating a cascade of SELECT_CC nodes, for for the comparison of "argc == 1123" which was fed into a SELECT node for the ?: statement which was itself converted to a SELECT_CC node. This is because the ARM back-end doesn't have custom lowering for SELECT nodes, so it used the default "Expand". I added a fairly simple "LowerSELECT" to the ARM back-end. It takes care of this testcase, but can obviously be expanded to include more cases. Now we generate this, which looks optimal to me: _t1: movw r1, #1123 movs r2, #0 cmp r0, r1 adr r0, #LCPI0_0 it eq moveq r2, #4 ldr r0, [r0, r2] bx lr .align 2 LCPI0_0: .long 1075344593 @ float 2.382130e+00 .long 1067316150 @ float 1.234000e+00 llvm-svn: 110799	2010-08-11 08:43:16 +00:00
Nate Begeman	b506e13a32	Add support for getting & setting the FPSCR application register on ARM when VFP is enabled. Add support for using the FPSCR in conjunction with the vcvtr instruction, for controlling fp to int rounding. Add support for the FLT_ROUNDS_ node now that the FPSCR is exposed. llvm-svn: 110152	2010-08-03 21:31:55 +00:00
Jim Grosbach	03b130774b	Remove dead prototype llvm-svn: 109691	2010-07-28 23:16:12 +00:00
Anton Korobeynikov	7ae895e007	Hook in GlobalMerge pass llvm-svn: 109359	2010-07-24 21:52:08 +00:00
Evan Cheng	f215e55d5f	- Allow target to specify when is register pressure "too high". In most cases, it's too late to start backing off aggressive latency scheduling when most of the registers are in use so the threshold should be a bit tighter. - Correctly handle live out's and extract_subreg etc. - Enable register pressure aware scheduling by default for hybrid scheduler. For ARM, this is almost always a win on # of instructions. It's runtime neutral for most of the tests. But for some kernels with high register pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by 54 and sped up by 20%. llvm-svn: 109279	2010-07-23 22:39:59 +00:00
Eric Christopher	3d118d5e8a	Baby steps towards ARM fast-isel. llvm-svn: 109047	2010-07-21 22:26:11 +00:00
Evan Cheng	df725c25dd	Teach bottom up pre-ra scheduler to track register pressure. Work in progress. llvm-svn: 108991	2010-07-21 06:09:07 +00:00
Evan Cheng	b2ad0066f5	ARM has to provide its own TargetLowering::findRepresentativeClass because its scalar floating point registers alias its vector registers. llvm-svn: 108761	2010-07-19 22:15:08 +00:00
Jim Grosbach	dc21ac2e0a	Since ARM emits inline jump tables as part of the ConstantIsland pass, it should set the jump table encoding to EK_Inline. This prevents a second, unused, copy of the table from being emitted after the function body. PR6581. llvm-svn: 108730	2010-07-19 17:20:38 +00:00
Jim Grosbach	5b8c14ce8a	revert so I can get the right PR# in the log message. llvm-svn: 108727	2010-07-19 17:19:40 +00:00
Jim Grosbach	42f3134738	Since ARM emits inline jump tables as part of the ConstantIsland pass, it should set the jump table encoding to EK_Inline. This prevents a second, unused, copy of the table from being emitted after the function body. PR7499. llvm-svn: 108722	2010-07-19 17:18:28 +00:00
Jim Grosbach	749f4fca0a	Add basic support to code-gen the ARM/Thumb2 bit-field insert (BFI) instruction and a combine pattern to use it for setting a bit-field to a constant value. More to come for non-constant stores. llvm-svn: 108570	2010-07-16 23:05:05 +00:00
Bob Wilson	34f481e895	Add support for NEON VMVN immediate instructions. llvm-svn: 108324	2010-07-14 06:31:50 +00:00
Bob Wilson	7feb850d36	Use a target-specific VMOVIMM DAG node instead of BUILD_VECTOR to represent NEON VMOV-immediate instructions. This simplifies some things. llvm-svn: 108275	2010-07-13 21:16:48 +00:00
Evan Cheng	069f1f7c9a	Extend the r107852 optimization which turns some fp compare to code sequence using only i32 operations. It now optimize some f64 compares when fp compare is exceptionally slow (e.g. cortex-a8). It also catches comparison against 0.0. llvm-svn: 108258	2010-07-13 19:27:42 +00:00

278 Commits