llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00

Author	SHA1	Message	Date
Simon Pilgrim	369005c7c7	[X86][AVX512] Add DQ+VLX scalar int<->fp test cases for D43441 llvm-svn: 325804	2018-02-22 16:29:08 +00:00
Stefan Maksimovic	1018fe2e77	[mips] Generate memory dependencies for byVal arguments There were no memory dependencies made between stores generated when lowering formal arguments and loads generated when call lowering byVal arguments which made the Post-RA scheduler place a load before a matching store. Make the fixed object stored to mutable so that the load instructions can have their memory dependencies added. Set the frame object as isAliased which clears the underlying objects vector in ScheduleDAGInstrs::buildSchedGraph(). This results in addition of all stores as dependencies for loads. This problem appeared when passing a byVal parameter coupled with a fastcc function call. Differential Revision: https://reviews.llvm.org/D37515 llvm-svn: 325782	2018-02-22 13:40:42 +00:00
Simon Dardis	e27e32a4b5	[mips] Regenerate tests for D38128 (NFC) llvm-svn: 325770	2018-02-22 11:53:01 +00:00
Sjoerd Meijer	eeeedc6071	Recommit: [ARM] f16 constant pool fix This recommits r325754; the modified and failing test case actually didn't need any modifications. llvm-svn: 325765	2018-02-22 10:43:57 +00:00
David Green	86bf0a6446	[ARM] Fix issue with large xor constants. Fixup to rL325573 for large xor constants. Thanks to Eli Friedman for the catch. Differential revision: https://reviews.llvm.org/D43549 llvm-svn: 325761	2018-02-22 09:38:57 +00:00
Sjoerd Meijer	49c327d9d4	Revert r325754 and r325755 (f16 literal pool) because buildbots were unhappy. llvm-svn: 325756	2018-02-22 08:41:55 +00:00
Sjoerd Meijer	2e623a8f3f	Added a test that I forgot to svn add in my previous commit r325754. llvm-svn: 325755	2018-02-22 08:20:50 +00:00
Sjoerd Meijer	963d510433	[ARM] f16 constant pool fix This is a follow up of r325012, that allowed half types in constant pools. Proper alignment was enforced when a big basic block was split up, but not when a CPE was placed before/after a block; the successor block had the wrong alignment. Differential Revision: https://reviews.llvm.org/D43580 llvm-svn: 325754	2018-02-22 08:16:05 +00:00
Nemanja Ivanovic	237f644f7b	[PowerPC] Do not produce invalid CTR loop with an FRem An FRem instruction inside a loop should prevent the loop from being converted into a CTR loop since this is not an operation that is legal on any PPC subtarget. This will always be a call to a library function which means the loop will be invalid if this instruction is in the body. Fixes PR36292. llvm-svn: 325739	2018-02-22 03:02:41 +00:00
Simon Pilgrim	b1c2f455d0	[X86][MMX] Generalize MMX_MOVD64rr combines to accept v4i16/v8i8 build vectors as well as v2i32 Also handle both cases where the lower 32-bits of the MMX is undef or zero extended. llvm-svn: 325736	2018-02-21 23:07:30 +00:00
Simon Pilgrim	79971ea2d5	[X86][MMX] Add MMX_MOVD64rr build vector tests showing undef elements in the lower half llvm-svn: 325729	2018-02-21 22:10:48 +00:00
Simon Pilgrim	cfbbbe13ff	[X86][MMX] Run MMX bitcast test on 32 and 64-bit targets llvm-svn: 325707	2018-02-21 18:52:16 +00:00
Simon Pilgrim	e11407b82a	[X86][MMX] Regenerate MMX MASKMOV test llvm-svn: 325698	2018-02-21 16:38:08 +00:00
Jonas Paulsson	1ba71f37a4	[Hexagon] Return true in enableMultipleCopyHints(). Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Krzysztof Parzyszek llvm-svn: 325697	2018-02-21 16:37:45 +00:00
Simon Pilgrim	1367bf131c	[X86][MMX] Regenerate MMX arithmetic tests llvm-svn: 325696	2018-02-21 16:37:10 +00:00
Jonas Devlieghere	79ff112122	[Sparc] Include __tls_get_addr in symbol table for TLS calls to it Global Dynamic and Local Dynamic call relocations only implicitly reference __tls_get_addr; there is no connection in the ELF file between the relocations and the symbol other than the specification for the relocations' semantics. However, it still needs to be in the symbol table despite the lack of explicit references to the symbol table entry, since it needs to be bound at link time for these relocations, otherwise any objects will fail to link. For details, see https://sourceware.org/bugzilla/show_bug.cgi?id=22832. Patch by: James Clarke (jrtc27) Differential revision: https://reviews.llvm.org/D43271 llvm-svn: 325688	2018-02-21 15:25:26 +00:00
Simon Pilgrim	df67c3be6f	[X86][MMX] Regenerate MMX PSUB commutation test llvm-svn: 325685	2018-02-21 15:07:47 +00:00
Simon Pilgrim	b9bcc5c1cf	[X86] Regenerate GPR:XMM bitcast test llvm-svn: 325684	2018-02-21 15:05:47 +00:00
Nicolai Haehnle	ab865ff17a	AMDGPU: Do not combine loads/store across physreg defs Summary: Since this pass operates on machine SSA form, this should only really affect M0 in practice. Fixes various piglit variable-indexing/vs-varying-array-mat4-index-* Change-Id: Ib2a1dc3a8d7b08225a8da49a86f533faa0986aa8 Fixes: r317751 ("AMDGPU: Merge S_BUFFER_LOAD_DWORD_IMM into x2, x4") Reviewers: arsenm, mareko, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D40343 llvm-svn: 325677	2018-02-21 13:31:35 +00:00
Simon Pilgrim	f41068eb9c	[X86][MMX] Add PR29222 test case llvm-svn: 325675	2018-02-21 12:06:27 +00:00
Simon Pilgrim	5e12a3341d	[X86][MMX] Add some MMX build vector tests llvm-svn: 325674	2018-02-21 12:01:30 +00:00
Craig Topper	bf52c09a2e	[X86] Disable CLWB for Cannon Lake Cannon Lake does not support CLWB, therefore it does not include all features listed under SKX anymore. Instead, enumerate all SKX features with the exception of CLWB. Patch by Gabor Buella Differential Revision: https://reviews.llvm.org/D43380 llvm-svn: 325654	2018-02-21 00:15:48 +00:00
Simon Dardis	8181753289	[mips] Spectre variant two mitigation for MIPSR2 This patch provides mitigation for CVE-2017-5715, Spectre variant two, which affects the P5600 and P6600. It implements the LLVM part of -mindirect-jump=hazard. It is _not_ enabled by default for the P5600. The mitigation strategy suggested by MIPS for these processors is to use hazard barrier instructions. 'jalr.hb' and 'jr.hb' are hazard barrier variants of the 'jalr' and 'jr' instructions respectively. These instructions impede the execution of instruction stream until architecturally defined hazards (changes to the instruction stream, privileged registers which may affect execution) are cleared. These instructions in MIPS' designs are not speculated past. These instructions are used with the attribute +use-indirect-jump-hazard when branching indirectly and for indirect function calls. These instructions are defined by the MIPS32R2 ISA, so this mitigation method is not compatible with processors which implement an earlier revision of the MIPS ISA. Performance benchmarking of this option with -fpic and lld using -z hazardplt shows an overall time increase of roughly 10% for the LLVM testsuite. Certain benchmarks such as methcall show a substantially larger increase in time due to their nature. Reviewers: atanasyan, zoran.jovanovic Differential Revision: https://reviews.llvm.org/D43486 llvm-svn: 325653	2018-02-21 00:06:53 +00:00
Konstantin Zhuravlyov	b376e1565e	Revert "[AMDGPU] Increased vector length for global/constant loads." https://reviews.llvm.org/rL325518 It breaks following OpenCL conformance tests: - Basic - parameter_types - Basic - vload_private llvm-svn: 325643	2018-02-20 23:30:21 +00:00
Craig Topper	e0ad66947f	[X86] Fix copy/paste mistake in test. The contents of the test case didn't match the name of the test case. And they were identical to the test above. llvm-svn: 325635	2018-02-20 22:33:23 +00:00
Craig Topper	6d7a4fefb4	[SelectionDAG] Support known true/false SimplifySetCC cases for comparing against vector splats of constants. This is split off from D42948 and includes just the cases that constant fold to true or false. It also includes some refactoring to keep predicate checks together. This supports things like (setcc uge X, 0) -> true Differential Revision: https://reviews.llvm.org/D43489 llvm-svn: 325627	2018-02-20 21:48:14 +00:00
Evandro Menezes	b11fbd7d6f	[AArch64] Refactor instructions using SIMD immediates Get rid of icky goto loops and make the code easier to maintain. Otherwise, NFC. Restore r324903 and fix PR36369. Differential revision: https://reviews.llvm.org/D43364 llvm-svn: 325621	2018-02-20 20:31:45 +00:00
Sjoerd Meijer	c8b0ed116a	[ARM] Lower BR_CC for f16 This case wasn't handled yet. Differential Revision: https://reviews.llvm.org/D43508 llvm-svn: 325616	2018-02-20 19:28:05 +00:00
Stanislav Mekhanoshin	f23fe1bf79	[AMDGPU] Removed redundant run lines for fmuladd.f16 test. NFC. llvm-svn: 325615	2018-02-20 19:19:56 +00:00
Simon Pilgrim	92e6f3ba0f	[X86][MMX] Regenerate MMX bitcast test llvm-svn: 325611	2018-02-20 18:48:29 +00:00
Simon Pilgrim	34a2d7157c	[X86][3DNow] Regenerate intrinsics tests llvm-svn: 325609	2018-02-20 18:44:21 +00:00
Krzysztof Parzyszek	69728227c1	[Hexagon] Handle *Low8 register classes in early if-conversion llvm-svn: 325606	2018-02-20 18:19:17 +00:00
Craig Topper	df44496118	[X86] Correct SHRUNKBLEND creation to work correctly when there are multiple uses of the condition. SimplifyDemandedBits forces the demanded mask to all 1s if the node has multiple uses, unless the AssumeSingleUse flag is set. So previously we were only really likely to simplify something if the condition had a single use. And on the off chance we did simplify with multiple uses the demanded mask being used was all ones so there was no reason to create a shrunkblend. This patch now checks that the condition is only used by selects first, and then sets the AssumeSingleUse flag for the simplification. Then we convert the selects to shrunkblend, and finally replace condition. Differential Revision: https://reviews.llvm.org/D43446 llvm-svn: 325604	2018-02-20 17:58:17 +00:00
Craig Topper	0558ecc62b	[SelectionDAG] Add LegalTypes flag to getShiftAmountTy. Use it to unify and simplify DAGCombiner and simplifySetCC code and fix a bug. DAGCombiner and SimplifySetCC both use getPointerTy for shift amounts pre-legalization. DAGCombiner uses a single helper function to hide this. SimplifySetCC does it in multiple places. This patch adds a defaulted parameter to getShiftAmountTy that can make it return getPointerTy for scalar types. Use this parameter to simplify the SimplifySetCC and DAGCombiner. Additionally, there were two places in SimplifySetCC that were creating shifts using the target's preferred shift amount pre-legalization. If the target uses a narrow type and the type is illegal, this can cause SimplifySetCC to create a shift with an amount that can't represent all possible shift values for the type. To fix this we should use pointer type there too. Alternatively we could make getScalarShiftAmountTy for each target return a safe value for large types as proposed in D43445. And maybe we should still do that, but fixing the SimplifySetCC code keeps other targets from tripping over this in the future. Fixes PR36250. Differential Revision: https://reviews.llvm.org/D43449 llvm-svn: 325602	2018-02-20 17:41:05 +00:00
Craig Topper	198aed3c9e	[X86] Promote 16-bit cmovs to 32-bits This allows us to avoid an opsize prefix. And forcing some move immediates to i32 avoids a length changing prefix on those instructions. This mostly replaces the existing combine we had for zext/sext+cmov of constants. I left in a case for sign extending a 32 bit cmov of constants to 64 bits. Differential Revision: https://reviews.llvm.org/D43327 llvm-svn: 325601	2018-02-20 17:41:00 +00:00
Lei Huang	21235789c0	[PowerPC] Reduce stack frame for fastcc functions by only allocating parameter save area when needed Current implementation always allocates the parameter save area conservatively for fastcc functions. There is no reason to allocate the parameter save area if all the parameters can be passed via registers. Differential Revision: https://reviews.llvm.org/D42602 llvm-svn: 325581	2018-02-20 15:09:45 +00:00
Krzysztof Parzyszek	9eec84ff0c	[Hexagon] Fix alignment calculation of stack objects in Hexagon bit tracker llvm-svn: 325580	2018-02-20 14:29:43 +00:00
Simon Pilgrim	2e780a63b4	[X86] Regenerate XOR tests llvm-svn: 325579	2018-02-20 14:08:39 +00:00
David Green	25f3a586cf	[ARM] Mark -1 as cheap in xor's for thumb1 We can always convert xor %a, -1 into MVN, even in thumb 1 where the -1 would not otherwise be considered a cheap constant. This prevents the -1's from being pulled out into constants and potentially hoisted. Differential Revision: https://reviews.llvm.org/D43451 llvm-svn: 325573	2018-02-20 11:07:35 +00:00
George Rimar	b4d7f82f75	[llvm-mc] - Produce R_X86_64_PLT32 for "call/jmp foo". For instructions like call foo and jmp foo patch changes relocation produced from R_X86_64_PC32 to R_X86_64_PLT32. Relocation can be used as a marker for 32-bit PC-relative branches. Linker will reduce PLT32 relocation to PC32 if function is defined locally. Differential revision: https://reviews.llvm.org/D43383 llvm-svn: 325569	2018-02-20 10:17:57 +00:00
Tim Renouf	1c9d5cdeab	[AMDGPU] stop buffer_store being moved illegally Summary: The machine instruction scheduler was illegally moving a buffer store past a buffer load with the same descriptor and offset. Fixed by marking buffer ops as mayAlias and isAliased. This may be overly conservative, and we may need to revisit. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D43332 Change-Id: Iff3173d9e0653e830474546276ab9d30318b8ef7 llvm-svn: 325567	2018-02-20 10:03:38 +00:00
Craig Topper	2377f160f4	[X86] Add 512-bit unmasked pmulhrsw/pmulhw/pmulhuw intrinsics. Remove and auto upgrade 128/256/512 bit masked pmulhrsw/pmulhw/pmulhuw intrinsics. The 128 and 256 bit versions were already not used by clang. This adds an equivalent unmasked 512 bit version. Then autoupgrades all sizes to use unmasked intrinsics plus select. llvm-svn: 325559	2018-02-20 07:28:14 +00:00
Amara Emerson	a99b25a021	[AArch64][GlobalISel] When copying from a gpr32 to an fpr16 reg, convert to fpr32 first. This is a follow on commit to r[x] where we fix the other direction of copy. For this case, after converting the source from gpr32 -> fpr32, we use a subregister copy, which is essentially what EXTRACT_SUBREG does in SDAG land. https://reviews.llvm.org/D43444 llvm-svn: 325550	2018-02-20 05:11:57 +00:00
Craig Topper	fc84bda806	[X86] Mark XOP vpmac* and vpmadc intrinsics as being commutative so that tablegen will generate patterns with the load in operand 0. This allows loads to be folded during isel without the peephole pass. llvm-svn: 325548	2018-02-20 03:58:14 +00:00
Craig Topper	6b23b7924a	[X86] Use vpmovq2m/vpmovd2m for truncate to vXi1 when possible. Previously we used vptestmd, but the scheduling data for SKX says vpmovq2m/vpmovd2m is lower latency. We already used vpmovb2m/vpmovw2m for byte/word truncates. So this is more consistent anyway. llvm-svn: 325534	2018-02-19 22:07:31 +00:00
Craig Topper	deb8d1bdf2	[X86] Stop swapping the operands of AVX512 setge. We swapped the operands and used setle, but I don't see any reason to do that. I think this is a holdover from SSE where we swap and then invert to use pcmpgt. But with AVX512 we don't want an invert so we won't use pcmpgt. So there's no need to swap. llvm-svn: 325527	2018-02-19 19:23:35 +00:00
Craig Topper	e96cad0b0e	[X86] Reduce the number of isel pattern variations needed for VPTESTM/VPTESTNM matching. Canonicalize EQ/NE PCMPM to have build vector all zeros on the RHS so we don't have to pattern match it in both locations. This significantly reduces the number of isel patterns needed since we also had to multiply it out with loads being in either operand of the 'and' input node and in the 'and' masking node. This removes over 24000 bytes from the isel table. llvm-svn: 325526	2018-02-19 19:23:31 +00:00
Mark Searles	70cba954aa	[AMDGPU] Make note of existing waitcnt instrs; this is add-on work related to suppression of redundant waitcnt instrs. It is necessary to make note of these existing waitcnt instrs so that we do not fall into an infinite loop when handling loops. Also, [NFC] some minor code clean-up. llvm-svn: 325524	2018-02-19 19:19:59 +00:00
Simon Pilgrim	e1f55f9c90	[SelectionDAG] ComputeKnownBits - add support for SMIN+SMAX clamp patterns If we have a clamp pattern, SMIN(SMAX(X, LO),HI) or SMAX(SMIN(X, HI),LO) then we can deduce that the number of signbits (zeros/ones) will be at least the minimum of the LO and HI constants. ComputeKnownBits equivalent of D43338. Differential Revision: https://reviews.llvm.org/D43463 llvm-svn: 325521	2018-02-19 18:08:16 +00:00
Mark Searles	bf29d8d265	[AMDGPU] Increased vector length for global/constant loads. Summary: GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache; loadstoreVectorizer should take advantage of the wider vector length and pack 16/8 elements of dwords/quadwords. Author: FarhanaAleen Reviewed By: rampitec Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D43275 llvm-svn: 325518	2018-02-19 16:42:49 +00:00
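
The SMIN+SMAX clamp commit above (e1f55f9c90) states that for SMIN(SMAX(X, LO), HI) the number of sign bits is at least the minimum of the sign-bit counts of the LO and HI constants. The following is only a small Python sketch of that reasoning, not LLVM's implementation: the helper names (`num_sign_bits`, `clamp`, `clamp_sign_bits_lower_bound`, `bound_holds`) are hypothetical, and the check brute-forces every value of a narrow integer width, assuming LO <= HI.

```python
def num_sign_bits(value: int, width: int) -> int:
    """Count the top bits of a `width`-bit two's-complement value that equal its sign bit."""
    bits = value & ((1 << width) - 1)
    sign = (bits >> (width - 1)) & 1
    count = 0
    for i in range(width - 1, -1, -1):
        if (bits >> i) & 1 != sign:
            break
        count += 1
    return count


def clamp(x: int, lo: int, hi: int) -> int:
    """Signed clamp, i.e. SMIN(SMAX(x, lo), hi)."""
    return min(max(x, lo), hi)


def clamp_sign_bits_lower_bound(lo: int, hi: int, width: int) -> int:
    """Lower bound on the sign bits of any clamped value (hypothetical helper)."""
    return min(num_sign_bits(lo, width), num_sign_bits(hi, width))


def bound_holds(lo: int, hi: int, width: int = 8) -> bool:
    """Brute-force check of the bound over every signed `width`-bit input."""
    bound = clamp_sign_bits_lower_bound(lo, hi, width)
    smallest = -(1 << (width - 1))
    largest = (1 << (width - 1)) - 1
    return all(
        num_sign_bits(clamp(x, lo, hi), width) >= bound
        for x in range(smallest, largest + 1)
    )


if __name__ == "__main__":
    # Clamping an 8-bit value to [-8, 7] leaves at least 5 identical top bits.
    print(clamp_sign_bits_lower_bound(-8, 7, 8))
    print(bound_holds(-8, 7))
```

The bound works because every non-negative result is at most HI and every negative result is at least LO, so neither can need more significant bits than the corresponding constant.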


23546 Commits