llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00

Author	SHA1	Message	Date
Colin LeMahieu	07e42127a4	[Hexagon] Adding sub/and/or reg, imm forms llvm-svn: 223522	2014-12-05 21:38:29 +00:00
Sanjay Patel	88b824a8d3	Optimize merging of scalar loads for 32-byte vectors [X86, AVX] Fix the poor codegen seen in PR21710 ( http://llvm.org/bugs/show_bug.cgi?id=21710 ). Before we crack 32-byte build vectors into smaller chunks (and then subsequently glue them back together), we should look for the easy case where we can just load all elements in a single op. An example of the codegen change is: From: vmovss 16(%rdi), %xmm1 vmovups (%rdi), %xmm0 vinsertps $16, 20(%rdi), %xmm1, %xmm1 vinsertps $32, 24(%rdi), %xmm1, %xmm1 vinsertps $48, 28(%rdi), %xmm1, %xmm1 vinsertf128 $1, %xmm1, %ymm0, %ymm0 retq To: vmovups (%rdi), %ymm0 retq Differential Revision: http://reviews.llvm.org/D6536 llvm-svn: 223518	2014-12-05 21:28:14 +00:00
Colin LeMahieu	4f286f5f23	[Hexagon] Updating mux_ir/ri/ii/rr with encoding bits llvm-svn: 223515	2014-12-05 21:09:27 +00:00
Jan Wen Voung	b856ac92dc	Use 32-bit ebp for NaCl64 in a limited case: llvm.frameaddress. Summary: Follow up to [x32] "Use ebp/esp as frame and stack pointer": http://reviews.llvm.org/D4617 In that earlier patch, NaCl64 was made to always use rbp. That's needed for most cases because rbp should hold a full 64-bit address within the NaCl sandbox so that load/stores off of rbp don't require sandbox adjustment (zeroing the top 32-bits, then filling those by adding r15). However, llvm.frameaddress returns a pointer and pointers are 32-bit for NaCl64. In this case, use ebp instead, which will make the register copy type check. A similar mechanism may be needed for llvm.eh.return, but is not added in this change. Test Plan: test/CodeGen/X86/frameaddr.ll Reviewers: dschuff, nadav Subscribers: jfb, llvm-commits Differential Revision: http://reviews.llvm.org/D6514 llvm-svn: 223510	2014-12-05 20:55:53 +00:00
Bill Seurer	b4d665d454	[PowerPC]Add VSX loads/stores to fastisel for PPC target This patch adds VSX floating point loads and stores to fastisel. Along with the change to tablegen (D6220), VSX instructions are now fully supported in fastisel. http://reviews.llvm.org/D6274 llvm-svn: 223507	2014-12-05 20:15:56 +00:00
Colin LeMahieu	ea4694aa6c	[Hexagon] Adding tfrih/l instructions. llvm-svn: 223506	2014-12-05 20:07:19 +00:00
Andrea Di Biagio	38a80209d6	[X86] Improved lowering of packed vector shifts to vpsllq/vpsrlq. SSE2/AVX non-constant packed shift instructions only use the lower 64-bit of the shift count. This patch teaches function 'getTargetVShiftNode' how to deal with shifts where the shift count node is of type MVT::i64. Before this patch, function 'getTargetVShiftNode' only knew how to deal with shift count nodes of type MVT::i32. This forced the backend to wrongly truncate the shift count to MVT::i32, and then zero-extend it back to MVT::i64. llvm-svn: 223505	2014-12-05 20:02:22 +00:00
Colin LeMahieu	873ff2f4a3	[Hexagon] Adding add reg, imm form with encoding bits and test. llvm-svn: 223504	2014-12-05 19:51:23 +00:00
Colin LeMahieu	5f7eada35a	[Hexagon] Adding DoubleRegs decoder. Moving C2_mux and A2_nop. Adding combine imm-imm form. llvm-svn: 223494	2014-12-05 18:24:06 +00:00
Colin LeMahieu	65798940e5	[Hexagon] [NFC] Rearranging patterns and mux instruction. llvm-svn: 223488	2014-12-05 17:58:06 +00:00
Colin LeMahieu	71d62a88df	[Hexagon] [NFC] Rearranging def order. llvm-svn: 223487	2014-12-05 17:55:51 +00:00
Colin LeMahieu	9d15eacd68	[Hexagon] Adding combine reg-reg forms. llvm-svn: 223485	2014-12-05 17:38:36 +00:00
Colin LeMahieu	77e5a6b190	[Hexagon] Marking several instructions as isCodeGenOnly=0 and adding direct disassembly tests for many instructions. llvm-svn: 223482	2014-12-05 17:27:39 +00:00
Andrea Di Biagio	549dad7c4c	[X86] Avoid introducing extra shuffles when lowering packed vector shifts. When lowering a vector shift node, the backend checks if the shift count is a shuffle with a splat mask. If so, then it introduces an extra dag node to extract the splat value from the shuffle. The splat value is then used to generate a shift count of a target specific shift. However, if we know that the shift count is a splat shuffle, we can use the splat index 'I' to extract the I-th element from the first shuffle operand. The advantage is that the splat shuffle may become dead since we no longer use it. Example: ;; define <4 x i32> @example(<4 x i32> %a, <4 x i32> %b) { %c = shufflevector <4 x i32> %b, <4 x i32> undef, <4 x i32> zeroinitializer %shl = shl <4 x i32> %a, %c ret <4 x i32> %shl } ;; Before this patch, llc generated the following code (-mattr=+avx): vpshufd $0, %xmm1, %xmm1 # xmm1 = xmm1[0,0,0,0] vpxor %xmm2, %xmm2 vpblendw $3, %xmm1, %xmm2, %xmm1 # xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7] vpslld %xmm1, %xmm0, %xmm0 retq With this patch, the redundant splat operation is removed from the code. vpxor %xmm2, %xmm2 vpblendw $3, %xmm1, %xmm2, %xmm1 # xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7] vpslld %xmm1, %xmm0, %xmm0 retq llvm-svn: 223461	2014-12-05 12:13:30 +00:00
Charlie Turner	1fb529a753	Add missing FP build attribute tests. The test file test/CodeGen/ARM/build-attributes.ll was missing several floating-point build attribute tests. The intention of this commit is that for each CPU / architecture currently tested, there are now tests that make sure the following attributes are sufficiently checked, * Tag_ABI_FP_rounding * Tag_ABI_FP_denormal * Tag_ABI_FP_exceptions * Tag_ABI_FP_user_exceptions * Tag_ABI_FP_number_model Also in this commit, the -unsafe-fp-math flag has been augmented with the full suite of flags Clang sends to LLVM when you pass -ffast-math to Clang. That is, `-unsafe-fp-math' has been changed to `-enable-unsafe-fp-math -disable-fp-elim -enable-no-infs-fp-math -enable-no-nans-fp-math -fp-contract=fast' Change-Id: I35d766076bcbbf09021021c0a534bf8bf9a32dfc llvm-svn: 223454	2014-12-05 08:22:47 +00:00
Eric Christopher	f0ccd8ac1c	Rename the x86 isTargetMacho to isTargetMachO for uniformity. llvm-svn: 223421	2014-12-05 00:22:38 +00:00
Eric Christopher	8ab71c495e	Both of these subtargets have functions that check whether or not the target is mach-o. Use them. llvm-svn: 223420	2014-12-05 00:22:35 +00:00
Ahmed Bougacha	18eba7c45f	[X86] Delete dead code in fcopysign lowering. NFC. r32900 introduced custom lowering for fcopysign, with two checks to change the magnitude value's type if it's larger/smaller than the sign value's type. r32932 replaced that code for the smaller case. r43205 did the same for the larger case, but left the old code, now dead. llvm-svn: 223415	2014-12-04 23:52:15 +00:00
Roman Divacky	30850dee4d	Add a FIXME as requested by Renato Golin. llvm-svn: 223390	2014-12-04 21:39:24 +00:00
Bruno Cardoso Lopes	cdb73ec2f9	[x86] Fix isOffsetSuitableForCodeModel kernel code model offset Offset == 0 is a valid offset for kernel code model according to the x86_64 System V ABI. Found by inspection, no testcase. llvm-svn: 223383	2014-12-04 20:36:06 +00:00
Weiming Zhao	b889d65e01	[AArch64] Combining Load and IntToFp should check for neon availability llvm-svn: 223382	2014-12-04 20:25:50 +00:00
Asiri Rathnayake	e5f983fb48	Fix yet another unseen regression caused by r223113 r223113 added support for ARM modified immediate assembly syntax. Which assumes all immediate operands are prefixed with a '#'. This assumption is wrong as per the ARMARM - which recommends that all '#' characters be treated optional. The current patch fixes this regression and adds a test case. A follow-up patch will expand the test coverage to other instructions. llvm-svn: 223381	2014-12-04 19:34:59 +00:00
Jonathan Roelofs	348155e124	Fix thumbv4t indirect calls So there are a couple of issues with indirect calls on thumbv4t. First, the most 'obvious' instruction, 'blx' isn't available until v5t. And secondly, the next-most-obvious sequence: 'mov lr, pc; bx rN' doesn't DTRT in thumb code because the saved off pc has its thumb bit cleared, so when the callee returns we end up in ARM mode.... yuck. The solution is to 'bl' to a nearby landing pad with a 'bx rN' in it. We could cut down on code size by sharing the landing pads between call sites that are close enough, but for the moment let's do correctness first and look at performance later. Patch by: Iain Sandoe http://reviews.llvm.org/D6519 llvm-svn: 223380	2014-12-04 19:34:50 +00:00
Asiri Rathnayake	7121b6d3a3	Fix a minor regression introduced in r223113 r223113 added support for ARM modified immediate assembly syntax. That patch has broken support for immediate expressions, as in: add r0, #(4 * 4) It wasn't caught because we don't have any tests for this feature. This patch fixes this regression and adds test cases. llvm-svn: 223366	2014-12-04 14:49:07 +00:00
Rafael Espindola	ffb0b8fbb1	Revert "[Thumb/Thumb2] Added restrictions on PC, LR, SP in the register list for PUSH/POP/LDM/STM. <Differential Revision: http://reviews.llvm.org/D6090 >" This reverts commit r223356. It was failing check-all (MC/ARM/thumb.s in particular). llvm-svn: 223363	2014-12-04 14:10:20 +00:00
Michael Kuperstein	7a925b41a3	[X86] Improve a dag-combine that handles a vector extract -> zext sequence. The current DAG combine turns a sequence of extracts from <4 x i32> followed by zexts into a store followed by scalar loads. According to measurements by Martin Krastev (see PR 21269) for x86-64, a sequence of an extract, movs and shifts gives better performance. However, for 32-bit x86, the previous sequence still seems better. Differential Revision: http://reviews.llvm.org/D6501 llvm-svn: 223360	2014-12-04 13:49:51 +00:00
Jyoti Allur	a4f3d11768	[Thumb/Thumb2] Added restrictions on PC, LR, SP in the register list for PUSH/POP/LDM/STM. <Differential Revision: http://reviews.llvm.org/D6090 > llvm-svn: 223356	2014-12-04 11:52:49 +00:00
Andrea Di Biagio	66e5e80d7c	[X86] Simplify code. NFC. Replaced some logic that checked if a build_vector node is doing a splat of a non-undef value with a call to method BuildVectorSDNode::getSplatValue(). No functional change intended. llvm-svn: 223354	2014-12-04 11:21:44 +00:00
Elena Demikhovsky	befed29343	Masked Load / Store Intrinsics - the CodeGen part. I'm recommitting the codegen part of the patch. The vectorizer part will be sent to review again. Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator. Examples: <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask) declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask) Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch. http://reviews.llvm.org/D6191 llvm-svn: 223348	2014-12-04 09:40:44 +00:00
Michael Liao	59822d4755	[X86] Clean up whitespace as well as minor coding style llvm-svn: 223339	2014-12-04 05:20:33 +00:00
Colin LeMahieu	a0eacebee5	[Hexagon] Marking some instructions as CodeGenOnly=0 and adding disassembly tests. llvm-svn: 223334	2014-12-04 03:41:21 +00:00
Michael Liao	f24ea6579d	[X86] Restore X86 base pointer after call to llvm.eh.sjlj.setjmp Commit on - This patch fixes the bug described in http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-May/062343.html The fix allocates an extra slot just below the GPRs and stores the base pointer there. This is done only for functions containing llvm.eh.sjlj.setjmp that also need a base pointer. Because code containing llvm.eh.sjlj.setjmp saves all of the callee-save GPRs in the prologue, the offset to the extra slot can be computed before prologue generation runs. Impact at run-time on affected functions is:: - One extra store in the prologue, The store saves the base pointer. - One extra load after a llvm.eh.sjlj.setjmp. The load restores the base pointer. Because the extra slot is just above a gap between frame-pointer-relative and base-pointer-relative chunks of memory, there is no impact on other offset calculations other than ensuring there is room for the extra slot. http://reviews.llvm.org/D6388 Patch by Arch Robison <arch.robison@intel.com> llvm-svn: 223329	2014-12-04 00:56:38 +00:00
Hal Finkel	84b0b8cc3b	[PowerPC] 'cc' should be an alias only to 'cr0' We had mistakenly believed that GCC's 'cc' referred to the entire condition-code register (cr0 through cr7) -- and implemented this in r205630 to fix PR19326, but 'cc' is actually an alias only to 'cr0'. This is causing LLVM to clobber too much with legacy code with inline asm using the 'cc' clobber. Fixes PR21451. llvm-svn: 223328	2014-12-04 00:46:20 +00:00
NAKAMURA Takumi	c1e875f8b5	HexagonMCInst.h: Qualify constants explicitly to appease msc17. llvm-svn: 223325	2014-12-04 00:26:39 +00:00
Matt Arsenault	04135ddb57	Allow target to specify prefix for labels Use the MCAsmInfo instead of the DataLayout, and allow specifying a custom prefix for labels specifically. HSAIL requires that labels begin with @, but global symbols with &. llvm-svn: 223323	2014-12-04 00:06:57 +00:00
Hal Finkel	bf150eceab	[PowerPC] Fix inline asm memory operands not to use r0 On PowerPC, inline asm memory operands might be expanded as 0($r), where $r is a register containing the address. As a result, this register cannot be r0, and we need to enforce this register subclass constraint to prevent miscompiling the code (we'd get this constraint for free with the usual instruction definitions, but that scheme has no knowledge of how we end up printing inline asm memory operands, and so here we need to do it 'by hand'). We can accomplish this within the current address-mode selection framework by introducing an explicit COPY_TO_REGCLASS node. Fixes PR21443. llvm-svn: 223318	2014-12-03 23:40:13 +00:00
Jacques Pienaar	6c238c088a	Test commit. llvm-svn: 223310	2014-12-03 23:21:02 +00:00
Sanjay Patel	8f534da2fd	fix typos, grammar, formatting; NFC llvm-svn: 223276	2014-12-03 22:28:05 +00:00
Colin LeMahieu	e5b2561ad8	[Hexagon] Converting member InstrDesc to static variable. llvm-svn: 223268	2014-12-03 21:40:25 +00:00
Colin LeMahieu	b559c5e4fc	[Hexagon] Converting subclass members to an implicit operand. llvm-svn: 223264	2014-12-03 20:23:22 +00:00
Will Schmidt	67ea953093	Add TableGen info for Power8. This is based on the Power7 version, with units added and renamed to match P8. Differential Revision: http://reviews.llvm.org/D6358 llvm-svn: 223257	2014-12-03 18:46:30 +00:00
Roman Divacky	6409d170b1	Change the name to be in style. llvm-svn: 223255	2014-12-03 18:39:44 +00:00
Tom Stellard	11a07f331e	R600/SI: Move SIInsertWaits into AMDGPUPassConfig::addPreSched2() This pass needs to be run after PrologEpilogInserter, because that pass may insert spill code which reads or writes memory. llvm-svn: 223253	2014-12-03 18:27:08 +00:00
Tom Stellard	41dd9bba8c	R600/SI: Don't run SI passes on R600 subtargets llvm-svn: 223252	2014-12-03 18:27:05 +00:00
Tim Northover	24557369d4	AArch64: fix wrong-endian parameter passing. The blocked arguments code didn't take account of the hacks needed to support it. llvm-svn: 223247	2014-12-03 17:49:26 +00:00
Colin LeMahieu	ffe6923b15	[NFC] Fixing pedantic warning extra semicolons. llvm-svn: 223246	2014-12-03 17:36:39 +00:00
Colin LeMahieu	6020d0e681	[Hexagon] [NFC] Moving function implementations out of header. Clang-formatting files. llvm-svn: 223245	2014-12-03 17:35:39 +00:00
Colin LeMahieu	93c8792d13	[Hexagon] [NFC] Renaming packetStart to packetBegin llvm-svn: 223243	2014-12-03 17:31:43 +00:00
Aaron Ballman	73be12a12b	Silencing a 32-bit implicit conversion warning in MSVC; NFC. llvm-svn: 223237	2014-12-03 14:39:58 +00:00
Hal Finkel	e95528845d	[PowerPC] Print all inline-asm consts as signed numbers Almost all immediates in PowerPC assembly (both 32-bit and 64-bit) are signed numbers, and it is important that we print them as such. To make sure that happens, we change PPCTargetLowering::LowerAsmOperandForConstraint so that it does all intermediate checks on a signed-extended int64_t value, and then creates the resulting target constant using MVT::i64. This will ensure that all negative values are printed as negative values (mirroring what is done in other backends to achieve the same sign-extension effect). This came up in the context of inline assembly like this: "add%I2 %0,%0,%2", ..., "Ir"(-1ll) where we used to print: addi 3,3,4294967295 and gcc would print: addi 3,3,-1 and gas accepts both forms, but our builtin assembler (correctly) does not. Now we print -1 like gcc does. While here, I replaced a bunch of custom integer checks with isInt<16> and friends from MathExtras.h. Thanks to Paul Hargrove for the bug report. llvm-svn: 223220	2014-12-03 09:37:50 +00:00
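
The last entry above (e95528845d) turns on the difference between printing an immediate as a signed 64-bit value and printing its low 32 bits as unsigned. The following is a minimal stand-alone sketch of that arithmetic, not code from the patch: `fitsSigned`, `formatSImm16`, and `formatZeroExtended32` are hypothetical helpers written for illustration. `fitsSigned<N>` is assumed to behave like the `isInt<N>` range check the commit message mentions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for an isInt<N>-style check: true when x is
// representable as an N-bit two's-complement signed integer.
template <unsigned N>
constexpr bool fitsSigned(int64_t x) {
  static_assert(N > 0 && N < 64, "width must be in 1..63");
  return x >= -(INT64_C(1) << (N - 1)) && x < (INT64_C(1) << (N - 1));
}

// Format an immediate for a 16-bit signed field, keeping the value as a
// signed 64-bit integer so negative values print as negative.
inline std::string formatSImm16(int64_t v) {
  assert(fitsSigned<16>(v) && "immediate out of range for a 16-bit signed field");
  return std::to_string(v);
}

// Illustration of the unwanted form: the low 32 bits reinterpreted as an
// unsigned number, which is how -1 ends up printed as 4294967295.
inline std::string formatZeroExtended32(int64_t v) {
  return std::to_string(static_cast<uint32_t>(v));
}
```

With these helpers, `formatSImm16(-1)` yields `-1` (the form gcc prints), while `formatZeroExtended32(-1)` yields `4294967295`, which no longer passes `fitsSigned<16>` if it is read back as a number.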


30941 Commits