llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 20:43:44 +02:00

Author	SHA1	Message	Date
Quentin Colombet	ca131ef450	[AArch64] Run a peephole pass right after AdvSIMD pass. The AdvSIMD pass may produce copies that are not coalescer-friendly. The peephole optimizer knows how to fix that as demonstrated in the test case. <rdar://problem/12702965> llvm-svn: 216200	2014-08-21 18:10:07 +00:00
Juergen Ributzka	83c6e645d7	[FastISel][AArch64] Remove redundant test. These tests and many more are already covered by fast-isel-addressing-modes.ll. llvm-svn: 216186	2014-08-21 16:40:05 +00:00
Jiangning Liu	c14e7de948	Revert r216066, "Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG Builder and type". llvm-svn: 216147	2014-08-21 01:59:30 +00:00
Juergen Ributzka	7c8f6aa104	[FastISel][AArch64] Don't fold the sign-/zero-extend from i1 into the compare. This fixes a bug I introduced in a previous commit (r216033). Sign-/Zero- extension from i1 cannot be folded into the ADDS/SUBS instructions. Instead both operands have to be sign-/zero-extended with separate instructions. Related to <rdar://problem/17913111>. llvm-svn: 216073	2014-08-20 16:34:15 +00:00
Jiangning Liu	c3dd378a9e	Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG Builder and type legalization stage. With those two optimizations, fewer signed/zero extension instructions can be inserted, and then we can expose more opportunities to Machine CSE pass in back-end. llvm-svn: 216066	2014-08-20 12:05:15 +00:00
Yi Kong	57329b1cc2	ARM: Fix codegen for rbit intrinsic LLVM generates illegal `rbit r0, #352` instruction for rbit intrinsic. According to ARM ARM, rbit only takes register as argument, not immediate. The correct instruction should be rbit <Rd>, <Rm>. The bug was originally introduced in r211057. Differential Revision: http://reviews.llvm.org/D4980 llvm-svn: 216064	2014-08-20 10:40:20 +00:00
Juergen Ributzka	ce5953230a	[FastISel][AArch64] Use the proper FMOV instruction to materialize a +0.0. Use FMOVWSr/FMOVXDr instead of FMOVSr/FMOVDr, which have the proper register class to be used with the zero register. This makes the MachineInstruction verifier happy again. This is related to <rdar://problem/18027157>. llvm-svn: 216040	2014-08-20 01:10:36 +00:00
Juergen Ributzka	21b19be38f	[FastISel][AArch64] Factor out ADDS/SUBS instruction emission and add support for extensions and shift folding. Factor out the ADDS/SUBS instruction emission code into helper functions and make the helper functions more clever to support most of the different ADDS/SUBS instructions the architecture supports. This includes better immediate support, shift folding, and sign-/zero-extend folding. This fixes <rdar://problem/17913111>. llvm-svn: 216033	2014-08-19 22:29:55 +00:00
Juergen Ributzka	233cb7bf1a	[FastISel][AArch64] Extend floating-point materialization test. This adds the missing test that I promised for r215753 to test the materialization of the floating-point value +0.0. Related to <rdar://problem/18027157>. llvm-svn: 216019	2014-08-19 20:35:07 +00:00
Juergen Ributzka	f39a032c8b	Reapply [FastISel][AArch64] Add support for more addressing modes (r215597). Note: This was originally reverted to track down a buildbot error. Reapply without any modifications. Original commit message: FastISel didn't take much advantage of the different addressing modes available to it on AArch64. This commit allows the ComputeAddress method to recognize more addressing modes that allow shifts and sign-/zero-extensions to be folded into the memory operation itself. For example: `lsl x1, x1, #3; ldr x0, [x0, x1]` --> `ldr x0, [x0, x1, lsl #3]`, and `sxtw x1, w1; lsl x1, x1, #3; ldr x0, [x0, x1]` --> `ldr x0, [x0, x1, sxtw #3]`. llvm-svn: 216013	2014-08-19 19:44:17 +00:00
Juergen Ributzka	1cb2d0a61e	Reapply [FastISel][AArch64] Make use of the zero register when possible (r215591). Note: This was originally reverted to track down a buildbot error. Reapply without any modifications. Original commit message: This change now materializes the value "0" from the zero register. The zero register can be folded by several instructions, so no materialization is needed at all. Fixes <rdar://problem/17924413>. llvm-svn: 216009	2014-08-19 19:44:02 +00:00
Juergen Ributzka	9261bd7fe9	[FastISel][AArch64] Fix a few BuildMI callsites where the result register was added as an operand register. This fixes a few BuildMI callsites where the result register was added by using addReg, which is per default a use and therefore an operand register. Also use the zero register as result register when emitting a compare instruction (SUBS with unused result register). llvm-svn: 215997	2014-08-19 17:41:53 +00:00
Oliver Stannard	0f36700d69	Teach the AArch64 backend to handle f16 This allows the AArch64 backend to handle fadd, fsub, fmul and fdiv operations on f16 (half-precision) types by promoting to f32. llvm-svn: 215891	2014-08-18 14:22:39 +00:00
Oliver Stannard	159a549ea3	[ARM,AArch64] Do not tail-call to an externally-defined function with weak linkage Externally-defined functions with weak linkage should not be tail-called on ARM or AArch64, as the AAELF spec requires normal calls to undefined weak functions to be replaced with a NOP or jump to the next instruction. The behaviour of branch instructions in this situation (as used for tail calls) is implementation-defined, so we cannot rely on the linker replacing the tail call with a return. llvm-svn: 215890	2014-08-18 12:42:15 +00:00
Amara Emerson	03cfd262eb	[AArch64] Narrow arguments passed in wrong position on the stack in big-endian mode. Patch by Asiri Rathnayake. Differential Revision: http://reviews.llvm.org/D4922 llvm-svn: 215716	2014-08-15 14:29:57 +00:00
Juergen Ributzka	a981de1e50	Revert several FastISel commits to track down a buildbot error. This reverts: r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants." r215594 "[FastISel][X86] Use XOR to materialize the "0" value." r215593 "[FastISel][X86] Emit more efficient instructions for integer constant materialization." r215591 "[FastISel][AArch64] Make use of the zero register when possible." r215588 "[FastISel] Let the target decide first if it wants to materialize a constant." r215582 "[FastISel][AArch64] Cleanup constant materialization code. NFCI." llvm-svn: 215673	2014-08-14 19:56:28 +00:00
Juergen Ributzka	49e924e64e	Revert "[FastISel][AArch64] Add support for more addressing modes." This reverts commits r215597, because it might have broken the build bots. llvm-svn: 215659	2014-08-14 17:10:54 +00:00
Akira Hatanaka	87c53cc314	[AArch64, fast-isel] Fall back to SelectionDAG to select tail calls. Certain functions such as objc_autoreleaseReturnValue have to be called as tail-calls even at -O0. Since normal fast-isel doesn't emit calls as tail calls, we have to fall back to SelectionDAG to select calls that are marked as tail. <rdar://problem/17991614> llvm-svn: 215600	2014-08-13 23:23:58 +00:00
Juergen Ributzka	1428168ab7	[FastISel][AArch64] Add support for more addressing modes. FastISel didn't take much advantage of the different addressing modes available to it on AArch64. This commit allows the ComputeAddress method to recognize more addressing modes that allow shifts and sign-/zero-extensions to be folded into the memory operation itself. For example: `lsl x1, x1, #3; ldr x0, [x0, x1]` --> `ldr x0, [x0, x1, lsl #3]`, and `sxtw x1, w1; lsl x1, x1, #3; ldr x0, [x0, x1]` --> `ldr x0, [x0, x1, sxtw #3]`. llvm-svn: 215597	2014-08-13 22:53:29 +00:00
Juergen Ributzka	410902c24a	[FastISel][AArch64] Make use of the zero register when possible. This change now materializes the value "0" from the zero register. The zero register can be folded by several instructions, so no materialization is needed at all. Fixes <rdar://problem/17924413>. llvm-svn: 215591	2014-08-13 22:13:14 +00:00
Gerolf Hoflehner	59d9e5a101	[MachineCombiner] Fix for ICE bug 20598 The combiner ignored DBG nodes when checking the uses of a virtual register. It combined a sequence like `%vreg1 = madd %vreg2, %vreg3,...; DBG_VALUE (%vreg1 ...); %vreg4 = add %vreg1,...` to `%vreg4 = madd %vreg2, %vreg3`, leaving behind a dangling DBG_VALUE with a definition. This triggered an assertion in the MachineTraceMetrics.cpp module. llvm-svn: 215431	2014-08-12 07:54:12 +00:00
Quentin Colombet	03e2418837	[AArch64] Fix registerAllocator assigns same register for base and wback in pre/post-index load and store. Patch by Steven Wu <stevenwu@apple.com> llvm-svn: 215390	2014-08-11 21:39:53 +00:00
Jiangning Liu	b923ade46b	In Machine CSE pass, the source register of a COPY machine instruction can be propagated to all its users, and this propagation could increase the probability of finding common subexpressions. If the COPY has only one user, the COPY itself can be removed. llvm-svn: 215344	2014-08-11 05:17:19 +00:00
Jiangning Liu	264179daf4	[AArch64] Fix a type conversion bug for analyzing compare. The bug can cause spec2006/483.xalancbmk failure. Patched by David Xu. llvm-svn: 215206	2014-08-08 14:19:29 +00:00
James Molloy	c2be58dbb0	[AArch64] Add an FP load balancing pass for Cortex-A57 For best-case performance on Cortex-A57, we should try to use a balanced mix of odd and even D-registers when performing a critical sequence of independent, non-quadword FP/ASIMD floating-point multiply or multiply-accumulate operations. This pass attempts to detect situations where the register allocation may adversely affect this load balancing and to change the registers used so as to better utilize the CPU. Ideally we'd just take each multiply or multiply-accumulate in turn and allocate it alternating even or odd registers. However, multiply-accumulates are most efficiently performed in the same functional unit as their accumulation operand. Therefore this pass tries to find maximal sequences ("Chains") of multiply-accumulates linked via their accumulation operand, and assign them all the same "color" (oddness/evenness). This optimization affects S-register and D-register floating point multiplies and FMADD/FMAs, as well as vector (floating point only) muls and FMADD/FMA. Q register instructions (and 128-bit vector instructions) are not affected. llvm-svn: 215199	2014-08-08 12:33:21 +00:00
Tim Northover	43f2f4770b	AArch64: stop trying to take control of all UnknownArch triples. This short-circuited our error reporting for incorrectly specified target triples (you'd get AArch64 code instead). Should fix PR20567. llvm-svn: 215191	2014-08-08 08:27:44 +00:00
Akira Hatanaka	6d09dc6814	[stack protector] Look through bitcasts to get global variable __stack_chk_guard. Handle the case where the pointer operand of the load instruction that loads the stack guard is not a global variable but instead a bitcast. %StackGuard = load i8 bitcast (i64 @__stack_chk_guard to i8*) call void @llvm.stackprotector(i8 %StackGuard, i8** %StackGuardSlot) Original test case provided by Ana Pazos. This fixes PR20558. llvm-svn: 215167	2014-08-07 23:08:24 +00:00
Gerolf Hoflehner	35c59e408d	MachineCombiner Pass for selecting faster instruction sequence on AArch64 Re-commit of r214832,r21469 with a work-around that avoids the previous problem with gcc build compilers The work-around is to use SmallVector instead of ArrayRef of basic blocks in preservesResourceLen()/MachineCombiner.cpp llvm-svn: 215151	2014-08-07 21:40:58 +00:00
James Molloy	7518c61a09	[AArch64] Add a testcase for r214957. llvm-svn: 214965	2014-08-06 13:31:32 +00:00
Yi Kong	a05145f812	AArch64: Add support for instruction prefetch intrinsic Instruction prefetch is not implemented for AArch64, it is incorrectly translated into data prefetch instruction. Differential Revision: http://reviews.llvm.org/D4777 llvm-svn: 214860	2014-08-05 12:46:47 +00:00
Juergen Ributzka	ec5a9526be	[FastISel][AArch64] Implement the FastLowerArguments hook. This implements basic argument lowering for AArch64 in FastISel. It only handles a small subset of the C calling convention. It supports simple arguments that can be passed in GPR and FPR registers. This should cover most of the trivial cases without falling back to SelectionDAG. This fixes <rdar://problem/17890986>. llvm-svn: 214846	2014-08-05 05:43:48 +00:00
Kevin Qin	2c9a6f5e83	Revert "r214832 - MachineCombiner Pass for selecting faster instruction" It broke compilation of most benchmarks and internal tests, as clang crashed with a segmentation fault or assertion. llvm-svn: 214845	2014-08-05 05:43:47 +00:00
Juergen Ributzka	20218b6fa0	[FastISel][AArch64] Don't perform sign-/zero-extension for function arguments that have already been sign-/zero-extended. llvm-svn: 214844	2014-08-05 05:43:44 +00:00
Gerolf Hoflehner	40e09fb7dd	MachineCombiner Pass for selecting faster instruction sequence on AArch64 Re-commit of r214669 without changes to test cases LLVM::CodeGen/AArch64/arm64-neon-mul-div.ll and LLVM:: CodeGen/AArch64/dp-3source.ll This resolves the reported compfails of the original commit. llvm-svn: 214832	2014-08-05 01:16:13 +00:00
Juergen Ributzka	f39fd7e490	[FastISel][AArch64] Fix shift lowering for i8 and i16 value types. This fix changes the parameters #r and #s that are passed to the UBFM/SBFM instruction to get the zero/sign-extension for free. The original problem was that the shift left would use the 32-bit shift even for i8/i16 value types, which could leave the upper bits set with "garbage" values. The arithmetic shift right on the other side would use the wrong MSB as sign-bit to determine what bits to shift into the value. This fixes <rdar://problem/17907720>. llvm-svn: 214788	2014-08-04 21:49:51 +00:00
Chad Rosier	d843d13525	[AArch64] Extend the number of scalar instructions supported in the AdvSIMD scalar integer instruction pass. This is a patch I had lying around from a few months ago. The pass is currently disabled by default, so nothing too interesting. llvm-svn: 214779	2014-08-04 21:20:25 +00:00
Kevin Qin	348a8dd760	Revert "r214669 - MachineCombiner Pass for selecting faster instruction" This commit broke "make check" for several hours, so get it reverted. llvm-svn: 214697	2014-08-04 05:10:33 +00:00
Gerolf Hoflehner	265dd68643	MachineCombiner Pass for selecting faster instruction sequence - AArch64 target support This patch turns off madd/msub generation in the DAGCombiner and generates them in the MachineCombiner instead. It replaces the original code sequence with the combined sequence when it is beneficial to do so. When there is no machine model support it always generates the madd/msub instruction. This is also true when the objective is to optimize for code size: when the combined sequence is shorter, it is always chosen and does not get evaluated. When there is a machine model, the combined instruction sequence is evaluated for critical path and resource length using machine trace metrics, and the original code sequence is replaced when the combined sequence is determined to be faster. rdar://16319955 llvm-svn: 214669	2014-08-03 22:03:40 +00:00
James Molloy	b4d821ca5b	Update test to use a more modern AArch64 triple, as requested by Renato. llvm-svn: 214637	2014-08-02 17:15:11 +00:00
James Molloy	240087c61a	[AArch64] Teach DAGCombiner that converting two consecutive loads into a vector load is not a good transform when paired loads are available. The combiner was creating Q-register loads and stores, which then had to be spilled because there are no callee-save Q registers! llvm-svn: 214634	2014-08-02 14:51:24 +00:00
Juergen Ributzka	bb4322fe8b	[FastISel][AArch64] Fold offset into the memory operation. Fold simple offsets into the memory operation: `add x0, x0, #8; ldr x0, [x0]` --> `ldr x0, [x0, #8]`. Fixes <rdar://problem/17887945>. llvm-svn: 214545	2014-08-01 19:40:16 +00:00
Juergen Ributzka	5fad1ddebc	[FastISel][AArch64] Add branch weights. Add branch weights to branch instructions, so that the following passes can optimize based on it (i.e. basic block ordering). Fixes <rdar://problem/17887137>. llvm-svn: 214537	2014-08-01 18:39:24 +00:00
Chad Rosier	f6bd4ee872	[AArch64] Fix test from r214518 in an attempt to appease buildbots. llvm-svn: 214521	2014-08-01 15:30:41 +00:00
Chad Rosier	69caabf908	[AArch64] Generate tbz/tbnz when comparing against zero. The tbz/tbnz checks the sign bit to convert `op w1, w1, w10; cmp w1, #0; b.lt .LBB0_0` to `op w1, w1, w10; tbnz w1, #31, .LBB0_0`. Differential Revision: http://reviews.llvm.org/D4440 llvm-svn: 214518	2014-08-01 14:48:56 +00:00
Juergen Ributzka	1686869897	[FastISel][AArch64] Fix the immediate versions of the {s\|u}{add\|sub}.with.overflow intrinsics. ADDS and SUBS cannot encode negative immediates or immediates larger than 12 bits. This fix checks if the immediate version can be used under these constraints and if we can convert ADDS to SUBS or vice versa to support negative immediates. Also update the test cases to test the immediate versions. llvm-svn: 214470	2014-08-01 01:25:55 +00:00
Juergen Ributzka	ed5bbc5130	[FastISel][AArch64] Add basic bitcast support for conversion between float and int. Fixes <rdar://problem/17867078>. llvm-svn: 214389	2014-07-31 06:25:37 +00:00
Juergen Ributzka	35e07f4cd0	[FastISel][AArch64] Add sqrt intrinsic support. Fixes <rdar://problem/17867067>. llvm-svn: 214388	2014-07-31 06:25:33 +00:00
Juergen Ributzka	4e074494e8	[FastISel][AArch64] Update and enable patchpoint and stackmap intrinsic tests for FastISel. This commit updates the existing SelectionDAG tests for the stackmap and patchpoint intrinsics and enables FastISel testing. It also splits up the tests into separate files, due to different codegen between SelectionDAG and FastISel. llvm-svn: 214382	2014-07-31 04:10:43 +00:00
Juergen Ributzka	622f41919a	[FastISel][AArch64] Add MachO large code model support for function calls. Currently the large code model for MachO uses the GOT to make function calls. Emit the required adrp and ldr instructions to load the address from the GOT. Related to <rdar://problem/17733076>. llvm-svn: 214381	2014-07-31 04:10:40 +00:00
Juergen Ributzka	d1fc9d2924	[FastISel][AArch64] Add select folding support for the XALU intrinsics. This improves the code generation for the XALU intrinsics when the condition is feeding a select instruction. This also updates and enables the XALU unit tests for FastISel. This fixes <rdar://problem/17831117>. llvm-svn: 214350	2014-07-30 22:04:37 +00:00
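
The immediate check described in r214470 (ADDS/SUBS accept only non-negative immediates of at most 12 bits, and a negative immediate can be handled by switching between ADDS and SUBS) can be illustrated with a small Python sketch. This is not the FastISel code from the commit: the function name `select_adds_subs_imm` is hypothetical, only the plain unshifted 12-bit form is modeled, and the effect of the swap on the condition flags is ignored.

```python
from typing import Optional, Tuple


def select_adds_subs_imm(is_add: bool, imm: int) -> Optional[Tuple[str, int]]:
    """Hypothetical helper: pick an immediate-form mnemonic, or None.

    Returns (mnemonic, immediate) when the immediate form can be used,
    and None when the value would have to be placed in a register first.
    """
    # A negative immediate is handled by flipping the operation:
    # x + (-5) is computed as x - 5, and x - (-5) as x + 5.
    if imm < 0:
        is_add = not is_add
        imm = -imm
    # Assumption: only the unshifted unsigned 12-bit range 0..4095 is modeled.
    if imm > 0xFFF:
        return None
    return ("ADDS" if is_add else "SUBS", imm)
```

For instance, an add of -8 maps to `("SUBS", 8)` in this sketch, while an add of 4096 returns `None` because the value does not fit in 12 bits.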


424 Commits