llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00

Author	SHA1	Message	Date
Ulrich Weigand	6d010ece69	[SystemZ] Add program mask and addressing mode instructions Add several instructions that operate on the program mask or the addressing mode. These are not really needed for code generation under Linux, but are provided for completeness for the assembler/disassembler. llvm-svn: 286284	2016-11-08 20:17:02 +00:00
Ulrich Weigand	e0f6c13cd6	[SystemZ] Model access registers as LLVM registers Add the 16 access registers as LLVM registers. This allows removing a lot of special cases in the assembler and disassembler where we were handling access registers; this can all just use the generic register code now. Also add a bunch of instructions to operate on access registers, for assembler/disassembler use only. No change in code generation intended. llvm-svn: 286283	2016-11-08 20:15:26 +00:00
Dan Gohman	37e7b16e05	[WebAssembly] Convert stackified IMPLICIT_DEF into constant 0. Since IMPLICIT_DEF instructions are omitted in the output, when the output of an IMPLICIT_DEF instruction is stackified, the resulting register lacks an explicit push, leading to a push/pop mismatch. Fix this by converting such IMPLICIT_DEFs into CONST_I32 0 instructions so that they have explicit pushes. llvm-svn: 286274	2016-11-08 19:40:38 +00:00
Davide Italiano	ca68c4db42	[LibcallsShrinkWrap] This pass doesn't preserve the CFG. For example, it invalidates the domtree, causing assertions in later passes which need dominator infos. Make it preserve GlobalsAA, as suggested by Eli. Differential Revision: https://reviews.llvm.org/D26381 llvm-svn: 286271	2016-11-08 19:18:20 +00:00
Nirav Dave	d549d31eb9	[MC][AArch64] Cleanup end-of-line parsing in AArch64 AsmParser. Reviewers: t.p.northover, rengolin Subscribers: llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D26309 llvm-svn: 286265	2016-11-08 18:31:04 +00:00
Ulrich Weigand	8bf37ac8ac	[SystemZ] Refactor branch and conditional instruction patterns Rework patterns for branches, call & return instructions, compare-and-branch, compare-and-trap, and conditional move instructions. In particular, simplify creation of patterns for the extended opcodes of instructions that take a CC mask. Also, use semantical instruction classes for all the instructions instead of open-coding them in SystemZInstrInfo.td. Adds a couple of the basic branch instructions (that are unused for codegen) for the assembler/disassembler. llvm-svn: 286263	2016-11-08 18:30:50 +00:00
Sanjay Patel	a887467540	[InstCombine] move min/max tests to min/max test file; NFC llvm-svn: 286256	2016-11-08 18:12:19 +00:00
Sanjay Patel	6b010278d0	[InstCombine] update checks; NFC llvm-svn: 286255	2016-11-08 18:06:14 +00:00
Tim Northover	9d8ad3d2eb	GlobalISel: support selecting fpext/fptrunc instructions on AArch64. llvm-svn: 286253	2016-11-08 17:44:07 +00:00
Anton Korobeynikov	5f215eefea	Fix PR27500: on MSP430 the branch destination offset is measured in words, not bytes. Summary: In addition, the branch instructions will have proper BB destinations, not offsets, like before. Reviewers: asl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23718 llvm-svn: 286252	2016-11-08 17:19:59 +00:00
Simon Pilgrim	8c72d075d5	[X86][SSE] Regenerate test (just adds missing header) llvm-svn: 286241	2016-11-08 15:42:49 +00:00
Simon Pilgrim	ebd7d35206	[TargetLowering] Fix undef vector element issue with true/false result handling Fixed an issue with vector usage of TargetLowering::isConstTrueVal / TargetLowering::isConstFalseVal boolean result matching. The comment said we shouldn't handle constant splat vectors with undef elements, but the actual code was returning false if the build vector contained no undef elements. This patch now ignores the number of undefs (getConstantSplatNode will return null if the build vector is all undefs). The change has also unearthed a couple of missed opportunities in AVX512 comparison code that will need to be addressed. Differential Revision: https://reviews.llvm.org/D26031 llvm-svn: 286238	2016-11-08 15:07:01 +00:00
Pablo Barrio	81c02096c5	[JumpThreading] Unfold selects that depend on the same condition Summary: These are good candidates for jump threading. This enables later opts (such as InstCombine) to combine instructions from the selects with instructions out of the selects. SimplifyCFG will fold the select again if unfolding wasn't worth it. Patch by James Molloy and Pablo Barrio. Reviewers: rengolin, haicheng, sebpop Subscribers: jojo, jmolloy, llvm-commits Differential Revision: https://reviews.llvm.org/D26391 llvm-svn: 286236	2016-11-08 14:53:30 +00:00
Simon Pilgrim	f59c806c0f	[VectorLegalizer] Expansion of CTLZ using CTPOP when possible This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available. This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when it's useful. Differential Revision: https://reviews.llvm.org/D25910 llvm-svn: 286233	2016-11-08 14:10:28 +00:00
Roger Ferrer Ibanez	1ef8a759be	[AArch64] Fix incorrect CSEL node created Under -enable-unsafe-fp-math, SELECT_CC lowering in AArch64 transforms floating point comparisons of the form "a == 0.0 ? 0.0 : x" to "a == 0.0 ? a : x". But it incorrectly assumes that 'x' and 'a' have the same type which can lead to a wrong CSEL node that crashes later due to nonsensical copies. Differential Revision: https://reviews.llvm.org/D26394 llvm-svn: 286231	2016-11-08 13:34:41 +00:00
Simon Dardis	80ea0632d1	[mips] Re-enable small data section test. llvm-svn: 286230	2016-11-08 13:03:45 +00:00
Craig Topper	770cc418cd	[AVX-512] Add an avx512f without avx512vl command line to vec_fp_to_int.ll and regenerate. This will make a change in a future patch easier to see. NFC llvm-svn: 286216	2016-11-08 06:58:53 +00:00
Tim Northover	55013bfe46	GlobalISel: support selecting G_SELECT on AArch64. llvm-svn: 286185	2016-11-08 00:45:29 +00:00
Tim Northover	7fe755bfd4	GlobalISel: constrain PHI registers on AArch64. Self-referencing PHI nodes need their destination operands to be constrained because nothing else is likely to do so. For now we just pick a register class naively. Patch mostly by Ahmed again. llvm-svn: 286183	2016-11-08 00:34:06 +00:00
Chad Rosier	acf03cea42	[AArch64] Remove dead check prefixes after r286110. NFC. llvm-svn: 286174	2016-11-07 23:13:59 +00:00
Chad Rosier	ce3a06bfd7	[AArch64] Rename test to reflect changes after r286110. NFC. llvm-svn: 286173	2016-11-07 23:13:55 +00:00
Stanislav Mekhanoshin	8fd3071d3d	[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies Codegen prepare sinks comparisons close to a user if we have only one register for conditions. For AMDGPU we have many SGPRs capable of holding vector conditions. Changed the backend to report that we have many condition registers. That way the IR LICM pass will hoist an invariant comparison out of a loop and codegen prepare will not sink it. With that done, a condition is calculated in one block and used in another. The current behavior is to store the workitem's condition in a VGPR using v_cndmask and then restore it with yet another v_cmp instruction from that v_cndmask's result. To mitigate the issue, forward propagation of a v_cmp 64-bit result to a user is implemented. An additional side effect is that we may consume fewer VGPRs at the cost of more SGPRs when multiple conditions need to be held, and that is a clear win in most cases. llvm-svn: 286171	2016-11-07 23:04:50 +00:00
Adam Nemet	afb5f69d51	[OptDiag, opt-viewer] Save callee's location and display as link With this we get a new field in the YAML record if the value being streamed out has a debug location. For examples, please see the changes to the tests. This is then used in opt-viewer to display a link for the callee function in the inlining remarks. Differential Revision: https://reviews.llvm.org/D26366 llvm-svn: 286169	2016-11-07 22:41:13 +00:00
Sanjin Sijaric	f2f1248de9	[AArch64] Transfer memory operands when lowering vector load/store intrinsics Summary: Some vector loads and stores generated from AArch64 intrinsics alias each other unnecessarily, preventing better scheduling. We just need to transfer memory operands during lowering. Reviewers: mcrosier, t.p.northover, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D26313 llvm-svn: 286168	2016-11-07 22:39:02 +00:00
Derek Schuff	82f1056a88	[WebAssembly] Emit a BasePointer when we have overly-aligned stack objects Because we shift the stack pointer by an unknown amount, we need an additional pointer. In the case where we have variable-size objects as well, we can't reuse the frame pointer, thus three pointers. Patch by Jacob Gravelle Differential Revision: https://reviews.llvm.org/D26263 llvm-svn: 286160	2016-11-07 22:00:48 +00:00
Sanjoy Das	582475344b	Avoid tail recursion elimination across calls with operand bundles Summary: In some specific scenarios with well understood operand bundle types (like `"deopt"`) it may be possible to go ahead and convert recursion to iteration, but TailRecursionElimination does not have that logic today so avoid doing the right thing for now. I need some input on whether `"funclet"` operand bundles should also block tail recursion elimination. If not, I'll allow TRE across calls with `"funclet"` operand bundles and add a test case. Reviewers: rnk, majnemer, nlewycky, ahatanak Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D26270 llvm-svn: 286147	2016-11-07 21:01:49 +00:00
Kuba Brecka	912a036573	[tsan] Cast floating-point types correctly when instrumenting atomic accesses, LLVM part Although rare, atomic accesses to floating-point types seem to be valid, i.e. `%a = load atomic float ...`. The TSan instrumentation pass, however, tries to emit inttoptr, which is incorrect; we should use a bitcast here. Anyway, IRBuilder already has a convenient helper function for this. Differential Revision: https://reviews.llvm.org/D26266 llvm-svn: 286135	2016-11-07 19:09:56 +00:00
Matt Arsenault	74174e09f8	AMDGPU: Remove unnecessary and on conditional branch The comment explaining why this was necessary is incorrect in its description of v_cmp's behavior for inactive workitems. llvm-svn: 286134	2016-11-07 19:09:33 +00:00
Matt Arsenault	2e0eb02554	AMDGPU: Preserve vcc undef flags when inverting branch If the branch was on a read-undef of vcc, passes that used analyzeBranch to invert the branch condition wouldn't preserve the undef flag resulting in a verifier error. Fixes verifier failures in a future commit. Also fix verifier error when inserting copy for vccz corruption bug. llvm-svn: 286133	2016-11-07 19:09:27 +00:00
Benjamin Kramer	5edd1d8287	[MemCpyOpt] Don't emit IR in an unspecified order Argument evaluation order is one of the edge cases where Clang differs from GCC, yielding different IR depending on which compiler LLVM was built with. Make the order deterministic and tune the test to actually verify the order instead of trying to hide it. llvm-svn: 286126	2016-11-07 17:47:28 +00:00
Richard Smith	78679af9aa	Add -O0 support for @llvm.invariant.group.barrier by discarding it if it gets to ISel. Differential Revision: https://reviews.llvm.org/D26292 llvm-svn: 286119	2016-11-07 16:47:20 +00:00
Sanjay Patel	e3709a01b8	[InstCombine] allow splat vector folds in adjustMinMax() (retry r285732) This was reverted at r285866 because there was a crash handling a scalar select of vectors. I added a check for that pattern and a test case based on the example provided in the post-commit thread for r285732. llvm-svn: 286113	2016-11-07 15:52:45 +00:00
Amara Emerson	ff221e0718	This patch adds support for 16 bit floating point registers to the inline asm register selection on AArch64. Without this patch, register allocation for the example below fails. define half @test(half %a1, half %a2) #0 { entry: %0 = tail call half asm "sqrshl ${0:h}, ${1:h}, ${2:h}", "=w,w,w" (half %a1, half %a2) #1 ret half %0 } Patch by Florian Hahn. Differential Revision: https://reviews.llvm.org/D25080 llvm-svn: 286111	2016-11-07 15:42:12 +00:00
Chad Rosier	9635b8783c	[AArch64] Removed the narrow load merging code in the ld/st optimizer. This feature has been disabled for some time now, so remove cruft. Differential Revision: https://reviews.llvm.org/D26248 llvm-svn: 286110	2016-11-07 15:27:22 +00:00
Chad Rosier	f56813d057	[AliasSetTracker] Make AST smarter about assume intrinsics that don't actually affect memory. Differential Revision: https://reviews.llvm.org/D26252 llvm-svn: 286108	2016-11-07 14:11:45 +00:00
James Molloy	39cc4bbf20	[Thumb1] Move padding earlier when synthesizing TBBs off of the PC When the base register (register pointing to the jump table) is the PC, we expect the jump table to directly follow the jump sequence with no intervening padding. If there is intervening padding, the calculated offsets will not be correct. One solution would be to account for any padding in the emitted LDRB instruction, but at the moment we don't support emitting MCExprs for the load offset. In the meantime, it's correct and only a slight amount worse to just move the padding up, from just before the jump table to just before the jump instruction sequence. We can do that by emitting code alignment before the jump sequence, as we know the number of instructions in the sequence is always 4. llvm-svn: 286107	2016-11-07 13:38:21 +00:00
Simon Pilgrim	e8ff7800a3	[X86][AVX512] Add AVX512VL/AVX512BWVL vector truncation tests llvm-svn: 286105	2016-11-07 13:34:29 +00:00
Simon Pilgrim	02f5b60018	[X86][SSE] Drop unnecessary -mcpu argument from trunc tests cpu/triple duplication llvm-svn: 286104	2016-11-07 13:28:20 +00:00
Craig Topper	e3b3201e23	[AVX-512] Remove masked pmovzx/pmovsx builtins and autoupgrade them to selects and native zext/sext. This mostly reuses earlier autoupgrade support for the sse and avx equivalents. Just needed to add the code to add the select. llvm-svn: 286092	2016-11-07 02:12:57 +00:00
Craig Topper	d66c30ad5a	[AVX-512] Remove 128/256 masked pshufb intrinsics. Autoupgrade them to legacy intrinsics and a select. llvm-svn: 286089	2016-11-07 00:13:39 +00:00
Saleem Abdulrasool	dd02e99d65	ARM: lower fpowi appropriately for Windows ARM This handles the last case of builtin function calls for which we would generate code that differed from Microsoft's ABI. Rather than generating a call to `__pow{d,s}i2` we now promote the parameter to a float or double and invoke `powf` or `pow` instead. Addresses PR30825! llvm-svn: 286082	2016-11-06 19:46:54 +00:00
Simon Pilgrim	6ff3359a0b	[SelectionDAG] Add support for vector demandedelts in XOR opcodes llvm-svn: 286075	2016-11-06 16:49:19 +00:00
Simon Pilgrim	ad8582a1b0	[X86] Add knownbits vector xor test In preparation for demandedelts support llvm-svn: 286074	2016-11-06 16:36:29 +00:00
Craig Topper	2adf7d91d8	[AVX-512] Remove intrinsics for 128/256-bit masked variable shift. Instead upgrade them to a select and the older AVX2 intrinsic. llvm-svn: 286073	2016-11-06 16:29:19 +00:00
Craig Topper	303a104cf4	[AVX-512] Remove intrinsics for 128/256-bit masked shift by immediate. Instead upgrade them to a select and the older SSE/AVX2 intrinsic. llvm-svn: 286072	2016-11-06 16:29:14 +00:00
Simon Pilgrim	04f0d377ac	[SelectionDAG] Add support for vector demandedelts in OR opcodes llvm-svn: 286071	2016-11-06 16:29:09 +00:00
Craig Topper	1c0e04ad66	[AVX-512] Remove intrinsics for 128/256-bit masked shift by single element in xmm. Instead upgrade them to a select and the older SSE/AVX2 intrinsic. llvm-svn: 286070	2016-11-06 16:29:08 +00:00
Craig Topper	ce6181cd9f	[AVX-512] Remove a 512-bit test case from the avx512vl test file. It already exists in the avx512f test file. llvm-svn: 286069	2016-11-06 16:29:03 +00:00
Simon Pilgrim	e95f112c54	[X86] Add knownbits vector or test In preparation for demandedelts support llvm-svn: 286068	2016-11-06 16:05:59 +00:00
Craig Topper	4cb57b6a7a	[X86] Add a few more fptoui test cases to the vec_fp_to_int.ll. The codegen for these test cases will be improved for AVX512 in a future commit. llvm-svn: 286063	2016-11-06 07:50:25 +00:00
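
Note on f59c806c0f (r286233): the commit message says CTLZ can be expanded using CTPOP, citing "Hacker's Delight". The identity behind that can be sketched in scalar form as below. This is only an illustrative Python sketch under stated assumptions, not the VectorLegalizer code: the function names `popcount` and `ctlz_via_ctpop` and the `bits` parameter are hypothetical, and the real expansion operates on vector SelectionDAG nodes.

```python
def popcount(x: int) -> int:
    """Count the set bits of a non-negative integer (stand-in for CTPOP)."""
    return bin(x).count("1")


def ctlz_via_ctpop(x: int, bits: int = 32) -> int:
    """Count leading zeros of a `bits`-wide value using only shifts, ORs,
    a NOT and a population count (hypothetical helper for illustration)."""
    mask = (1 << bits) - 1
    x &= mask
    # Smear the highest set bit into every lower position:
    # x |= x >> 1; x |= x >> 2; x |= x >> 4; ...
    shift = 1
    while shift < bits:
        x |= x >> shift
        shift <<= 1
    # Every bit at or below the highest set bit is now 1, so the zeros
    # that remain are exactly the leading zeros; count them via NOT + CTPOP.
    return popcount(~x & mask)


if __name__ == "__main__":
    for value in (0, 1, 0x00010000, 0x80000000):
        print(hex(value), ctlz_via_ctpop(value))
```

In this sketch a zero input yields `bits`, since no bit is ever smeared and the inverted value is all ones.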

40618 Commits