llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00

Author	SHA1	Message	Date
Craig Topper	e73b5e0aba	[X86] Add avx512vpopcntdq to Knights Mill As indicated by Table 1-1 in Intel Architecture Instruction Set Extensions and Future Features Programming Reference from October 2017. llvm-svn: 316592	2017-10-25 17:10:32 +00:00
Simon Dardis	b53aa7b645	[mips] Clean up some whitespace (NFC). Also test that my email address was updated. llvm-svn: 316575	2017-10-25 13:35:53 +00:00
Diana Picus	f077d0bade	[ARM GlobalISel] Fix call opcodes We were generating BLX for all the calls, which was incorrect in most cases. Update ARMCallLowering to generate BL for direct calls, and BLX, BX_CALL or BMOVPCRX_CALL for indirect calls. llvm-svn: 316570	2017-10-25 11:42:40 +00:00
Sam Parker	7b10d04981	[ARM] OrCombineToBFI function Extract the functionality to combine OR to BFI into its own function. Differential Revision: https://reviews.llvm.org/D39001 llvm-svn: 316563	2017-10-25 08:37:33 +00:00
Sam Parker	b5345a57e5	[ARM] Swap cmp operands for automatic shifts Swap the compare operands if the lhs is a shift and the rhs isn't, as in arm and T2 the shift can be performed by the compare for its second operand. Differential Revision: https://reviews.llvm.org/D39004 llvm-svn: 316562	2017-10-25 08:33:06 +00:00
Martin Storsjo	abdaedf85b	[AArch64] Add support for dllimport of values and functions Previously, the dllimport attribute did the right thing in terms of treating it as a pointer to a value, but this makes sure the names get mangled properly, and calls to such functions load the function from the __imp_ pointer. This is based on SVN r212431 and r212430 where the same was implemented for ARM. Differential Revision: https://reviews.llvm.org/D38530 llvm-svn: 316555	2017-10-25 07:25:18 +00:00
Matt Arsenault	630768ab44	AMDGPU: Add max-mix-insts subtarget feature llvm-svn: 316553	2017-10-25 07:00:51 +00:00
Yonghong Song	b12f0ea4c2	bpf: fix an uninitialized variable issue Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 316519	2017-10-24 21:36:33 +00:00
David Blaikie	c1d52145bb	ARMAddressingModes.h: Don't mark header functions as file local llvm-svn: 316517	2017-10-24 21:29:21 +00:00
David Blaikie	3334cea8e6	HexagonDepTimingClasses.h: Don't mark header functions as file local llvm-svn: 316508	2017-10-24 21:29:16 +00:00
David Blaikie	ad7ec36059	WebassemblyAsmPrinter.h: Include WebAssemblyMachineFunctionInfo for use with MachineFunction::getInfo llvm-svn: 316507	2017-10-24 21:29:15 +00:00
David Blaikie	972b0da039	X86Operand.h: Include X86MCTargetDesc.h for SSE register enum/names llvm-svn: 316506	2017-10-24 21:29:15 +00:00
David Blaikie	55aef8c59a	X86AsmPrinter.h: Add missing header for complete type needed for MCCodeEmitter dtor. llvm-svn: 316505	2017-10-24 21:29:14 +00:00
Artem Belevich	dab780c7cb	[NVPTX] allow address space inference for volatile loads/stores. If particular target supports volatile memory access operations, we can avoid AS casting to generic AS. Currently it's only enabled in NVPTX for loads and stores that access global & shared AS. Differential Revision: https://reviews.llvm.org/D39026 llvm-svn: 316495	2017-10-24 20:31:44 +00:00
Gadi Haber	671f5b9cbf	[X86][Broadwell] Added the instruction scheduling information for the Broadwell CPU. Adding the scheduling information for the Broadwell (BDW) CPU target. This patch adds the instruction scheduling information for the Broadwell (BDW) architecture target by adding the file X86SchedBroadwell.td located under the X86 Target. We used the scheduling information retrieved from the Broadwell architects in order to create the file. The scheduling information includes latency, number of micro-Ops and used ports by each BDW instruction. The patch continues the scheduling replacement and insertion effort started with the SandyBridge (SNB) target in r310792, the Haswell (HSW) target in r311879, the SkylakeClient (SKL) target in rL313613 + rL315978 and the SkylakeServer (SKX) in rL315175. Performance fluctuations may be expected due to code alignment effects. Reviewers: zvi, RKSimon, craig.topper Differential Revision: https://reviews.llvm.org/D39054 Change-Id: If6f799e5ff60e1091c8d43b05ea78c53581bae01 llvm-svn: 316492	2017-10-24 20:19:47 +00:00
Yonghong Song	34fe2515e8	bpf: fix a bug in trunc-op optimization Previous implementation for per-function scope is incorrect and too conservative. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 316481	2017-10-24 18:21:10 +00:00
Stefan Pintilie	1e1dcf2d50	[PowerPC] Try to simplify a Swap if it feeds a Splat If we have the situation where a Swap feeds a Splat we can sometimes change the index on the Splat and then remove the Swap instruction. Fixed the test case that was failing and recommit after pulling the original commit. Original revision is here: https://reviews.llvm.org/D39009 llvm-svn: 316478	2017-10-24 17:44:27 +00:00
Yonghong Song	a5f6d3078d	bpf: fix a bug in bpf-isel trunc-op optimization In BPF backend, we try to optimize away redundant trunc operations so that kernel verifier rewrite remains valid. Previous implementation only works for a single function. This patch fixed the issue for multiple functions. It clears internal map data structure before performing optimization for each function. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 316469	2017-10-24 17:29:03 +00:00
Simon Pilgrim	cf7778b801	[X86][AVX] ComputeNumSignBitsForTargetNode - add support for X86ISD::VTRUNC llvm-svn: 316462	2017-10-24 17:04:57 +00:00
Saleem Abdulrasool	d2f2439b8e	PowerPC: support the separator character in the IAS PowerPC uses ; as a comment leader and the @ as a separator character. Support this properly. llvm-svn: 316454	2017-10-24 16:19:56 +00:00
Simon Pilgrim	699fd4c1e9	[X86] truncateVectorCompareWithPACKSS - use PACKSSDW/PACKSSWB instead of just PACKSSWB. By using the widest type possible for PACKSS truncation we have a better chance of being able to peek through bitcasts and improves other combines driven by ComputeNumSignBits. llvm-svn: 316448	2017-10-24 15:38:16 +00:00
Oliver Stannard	ddb7bd4314	[ARM] Error for invalid shift in memory operand Report a diagnostic when we fail to parse a shift in a memory operand because the shift type is not an identifier. Without this, we were silently ignoring the whole instruction. Differential revision: https://reviews.llvm.org/D39237 llvm-svn: 316441	2017-10-24 14:19:08 +00:00
Simon Pilgrim	b5b0de59c7	[X86] truncateVectorCompareWithPACKSS - remove duplicate variables. NFCI. llvm-svn: 316440	2017-10-24 14:18:32 +00:00
Andrew V. Tischenko	9dd9aef0bb	Update f16c instruction scheduling on btver2. Differential Revision: https://reviews.llvm.org/D39051 llvm-svn: 316435	2017-10-24 13:38:30 +00:00
Zvi Rackover	8709876805	X86CallFrameOptimization: Update comments and variable names. NFCI. Following up on D38738. llvm-svn: 316434	2017-10-24 13:24:26 +00:00
Zvi Rackover	168ce5d5ba	X86CallFrameOptimization: Recognize 'store 0/-1 using and/or' idioms Summary: r264440 added or/and patterns for storing -1 or 0 with the intention of decreasing code size. However, X86CallFrameOptimization does not recognize these memory accesses so it will not replace them with push's when profitable. This patch fixes this problem by teaching X86CallFrameOptimization these store 0/-1 idioms. An alternative fix would be to prevent the 'store 0/1 idioms' patterns from firing when accessing the stack. This would save the need to teach the pass about these idioms. However, because X86CallFrameOptimization does not always fire we may result in cases where neither X86CallFrameOptimization nor the patterns for 'store 0/1 idioms' fire. Fixes pr34863 Reviewers: DavidKreitzer, guyblank, aymanmus Reviewed By: aymanmus Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38738 llvm-svn: 316431	2017-10-24 12:13:05 +00:00
Marek Olsak	f5ea109f9c	AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1) Summary: Kill the thread if operand 0 == false. llvm.amdgcn.wqm.vote can be applied to the operand. Also allow kill in all shader stages. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D38544 llvm-svn: 316427	2017-10-24 10:27:13 +00:00
Marek Olsak	68ff4e0dd2	AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D38543 llvm-svn: 316426	2017-10-24 10:26:59 +00:00
Oliver Stannard	31d2b230dc	[ARM] Replace development diagnostics with normal DEBUG macro * Remove the -arm-asm-parser-dev-diags option. * Use normal DEBUG(dbgs()) printing for the extra development information about missing diagnostics. Differential Revision: https://reviews.llvm.org/D39194 llvm-svn: 316423	2017-10-24 09:46:56 +00:00
Oliver Stannard	a03f6228a7	[ARM] tSETEND needs IsThumb This is the Thumb encoding, so the Requires list must include IsThumb. No test because we happen to select the ARM one first, but that's just luck. Differential Revision: https://reviews.llvm.org/D39190 llvm-svn: 316421	2017-10-24 09:03:33 +00:00
Oliver Stannard	efed1c1110	[ARM] Remove tCPS alias which just crashed This alias caused a crash when trying to print the "cps #0" instruction in a diagnostic for thumbv6 (which doesn't have that instruction). The comment was incorrect, this instruction is UNPREDICTABLE if no flag bits are set, so I don't think it's worth keeping. Differential Revision: https://reviews.llvm.org/D39191 llvm-svn: 316420	2017-10-24 08:55:36 +00:00
Zvi Rackover	7b39771114	X86: Fix X86CallFrameOptimization to search for the COPY StackPointer SelectionDAG inserts a copy of ESP into a virtual register. X86CallFrameOptimization assumed that the COPY, if present, is always right after the call-frame setup instruction (ADJCALLSTACKDOWN). This was a wrong assumption as the COPY can be located anywhere between the call-frame setup instruction and its first use. If the COPY happened to be located in a different location than what X86CallFrameOptimization assumed, visiting it while processing the call chain would lead to a conservative bail-out. The fix is quite straightforward: scan ahead for the stack-pointer copy and make note of it so it can be ignored while processing the call chain. Fixes pr34903 Differential Revision: https://reviews.llvm.org/D38730 llvm-svn: 316416	2017-10-24 07:38:29 +00:00
Omer Paparo Bivas	3f4c58083e	[MC] Adding code padding for performance stability - infrastructure. NFC. Infrastructure designed for padding code with nop instructions in key places such that performance improvement will be achieved. The infrastructure is implemented such that the padding is done in the Assembler after the layout is done and all IPs and alignments are known. This patch by itself is NFC. Future patches will make use of this infrastructure to implement required policies for code padding. Reviewers: aaboud zvi craig.topper gadi.haber Differential revision: https://reviews.llvm.org/D34393 Change-Id: I92110d0c0a757080a8405636914a93ef6f8ad00e llvm-svn: 316413	2017-10-24 06:16:03 +00:00
Zvi Rackover	ff00bdfe53	X86: Register the X86CallFrameOptimization pass Summary: The motivation of this change is to enable .mir testing for this pass. Added one test case to cover the functionality, this same case will be improved by a future patch. Reviewers: igorb, guyblank, DavidKreitzer Reviewed By: guyblank, DavidKreitzer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38729 llvm-svn: 316412	2017-10-24 05:47:07 +00:00
Konstantin Zhuravlyov	cc9ffa76b1	AMDGPU: Initialize WavefrontSize from TD files Differential Revision: https://reviews.llvm.org/D39205 llvm-svn: 316389	2017-10-23 23:02:39 +00:00
Simon Pilgrim	78dab77c07	[X86][SSE] combineBitcastvxi1 - use PACKSSWB directly to pack v8i16 to v16i8 Avoid difficulties determining the number of sign bits later on in shuffle lowering to lower to PACKSS llvm-svn: 316383	2017-10-23 22:05:02 +00:00
Stefan Pintilie	7df5a27d54	Revert "[PowerPC] Try to simplify a Swap if it feeds a Splat" Revert commit r316366. Previous commit causes p8-scalar_vector_conversions.ll to fail. This reverts commit 990e764ad8a2eec206ce5dda6aefab059ccd4e92. llvm-svn: 316371	2017-10-23 20:22:23 +00:00
Krzysztof Parzyszek	07f5a8acc7	[Hexagon] Return the correct chain edge for i1 function calls In HexagonISelLowering, there is code to handle the case when a function returns an i1 type. In this case, we need to generate extra nodes to copy the result from R0 to a predicate register. The code was returning the wrong value for the chain edge which caused an assert "Wrong topological sorting" when converting the instructions to MIs. This patch fixes the problem by returning the chain for the final copy. Patch by Brendon Cahoon. llvm-svn: 316367	2017-10-23 19:35:25 +00:00
Stefan Pintilie	789ee2f699	[PowerPC] Try to simplify a Swap if it feeds a Splat If we have the situation where a Swap feeds a Splat we can sometimes change the index on the Splat and then remove the Swap instruction. Differential Revision: https://reviews.llvm.org/D39009 llvm-svn: 316366	2017-10-23 19:33:31 +00:00
Krzysztof Parzyszek	d67f165bc2	[Hexagon] Add extra pattern for S4_addaddi One combination was missing: add(add(x,y),c). llvm-svn: 316363	2017-10-23 19:07:50 +00:00
Daniel Sanders	572038831c	[globalisel][tablegen] Import stores and allow GISel to automatically substitute zero regs like WZR/XZR/$zero. This patch enables the import of stores. Unfortunately, doing so by itself, loses an optimization where storing 0 to memory makes use of WZR/XZR. To mitigate this, this patch also introduces a new feature that allows register operands to nominate a zero register. When this is done, GlobalISel will substitute (G_CONSTANT 0) with the nominated register automatically. This is currently configured to only apply to the stores. Applying it to GPR32/GPR64 register classes in general will be done after review see (https://reviews.llvm.org/D39150). llvm-svn: 316360	2017-10-23 18:19:24 +00:00
Matt Arsenault	6ae9698d8a	AMDGPU: Cleanup local atomic node names llvm-svn: 316349	2017-10-23 17:16:43 +00:00
Matt Arsenault	a23dd86c65	AMDGPU: Fix default range in non-kernel functions The range should be assumed to be the hardware maximum if a workitem intrinsic is used in a callable function which does not know the restricted limit of the calling kernel. llvm-svn: 316346	2017-10-23 17:09:35 +00:00
Craig Topper	0a5a3dcf72	[X86] Change VMPTRST to use PS instead of TB to match VMPTRLD. llvm-svn: 316340	2017-10-23 16:22:40 +00:00
Craig Topper	308925e2ff	[X86] Change RDRAND to use PS instead of TB. Should be no functional change for now. A future disassembler change will prevent disassembling with 0xf2/0xf3. llvm-svn: 316339	2017-10-23 16:22:38 +00:00
Craig Topper	2e177825af	[X86] Change XRSTOR to use PS instead of TB to match XSAVE. I don't think this changes anything functionally yet, but I plan to fix the disassembler to use this to disable matching certain instructions with 0xf3/0xf2/0x66 prefixes. llvm-svn: 316337	2017-10-23 16:11:33 +00:00
Simon Pilgrim	08fead44ad	[X86][SSE] Remove AssertZext stage from PEXTRW/PEXTRB lowering. NFCI. Remove AssertZext and instead add PEXTRW/PEXTRB support to computeKnownBitsForTargetNode to simplify instruction selection. Differential Revision: https://reviews.llvm.org/D39169 llvm-svn: 316336	2017-10-23 16:00:57 +00:00
Andrew V. Tischenko	399b71394c	Update DPPD/DPPS instruction scheduling on btver2. Differential Revision: https://reviews.llvm.org/D39046 llvm-svn: 316334	2017-10-23 15:53:30 +00:00
Craig Topper	59b7ca4de6	[X86] Add PTWRITE instruction for assembler and disassembler. llvm-svn: 316333	2017-10-23 15:53:21 +00:00
Craig Topper	a0c1fe3d0d	[X86] Add RDPID instruction for assembler and disassembler. llvm-svn: 316332	2017-10-23 15:53:16 +00:00

44480 Commits