llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00

Author	SHA1	Message	Date
Craig Topper	604077c19d	[X86] Add a new format type for instructions that represent named prefix bytes like data16 and rep. Use it to make a simpler version of isPrefix. isPrefix was added to support the patches to align branches. it relies on a switch over instruction names. This moves those opcodes to a new format so the information is tablegen and we can just check for a specific value in some bits in TSFlags instead. I've left the other function in place for now so that the existing patches in phabricator will still work. I'll work with the owner to get them migrated.	2020-02-21 12:34:59 -08:00
Reid Kleckner	9dbb4c30c4	[IR] Update BasicBlock::validateInstrOrdering comments, NFC Pointed out by Jay Foad.	2020-02-21 12:33:16 -08:00
Francesco Petrogalli	147105c3eb	[llvm][CodeGen][aarch64] Add contiguous prefetch intrinsics for SVE. Summary: The patch covers both register/register and register/immediate addressing modes. Reviewers: efriedma, andwar, sdesmalen Reviewed By: sdesmalen Subscribers: sdesmalen, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74581	2020-02-21 20:22:25 +00:00
Sanjay Patel	dd89b4293b	[VectorCombine] refactor cost calcs to reduce duplication; NFC More cleanup is possible now, but we probably need to resolve the TODO about the existing difference between compares and binops.	2020-02-21 15:12:00 -05:00
Francesco Petrogalli	ab5eafb42f	[llvm][aarch64] SVE addressing modes. Summary: Added register + immediate and register + register addressing modes for the following intrinsics: 1. Masked load and stores: * Sign and zero extended load and truncated stores. * No extension or truncation. 2. Masked non-temporal load and store. Reviewers: andwar, efriedma Subscribers: cameron.mcinally, sdesmalen, tschuett, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74254	2020-02-21 20:02:34 +00:00
Krzysztof Parzyszek	11344326c1	[Hexagon] Simplify intrinsic (vandvrt (vandqrt q b) m) -> q if possible When each byte in b&m is non-zero, this conversion Q->V->Q is a no-op.	2020-02-21 13:56:04 -06:00
Cameron McInally	a8ae9715b8	[AArch64][SVE] Add backend support for splats of immediates This patch adds backend support for splats of both Int and FP immediates. Differential Revision: https://reviews.llvm.org/D74856	2020-02-21 13:21:47 -06:00
Fangrui Song	318004ba1e	[ARM] Change ARMAttributeParser::Parse to use support::endianness and simplify	2020-02-21 11:05:33 -08:00
Matt Arsenault	3f406340ef	AMDGPU/GlobalISel: Fix SALU mapping for v2s16 min/max The legalizer helper functions are unusably awkward to perform the 3-5 part legalization. This needs to be widened, scalarized, lowered, and we should avoid creating vector extends and truncates. Manually do all of this and expand.	2020-02-21 14:02:16 -05:00
Matt Arsenault	13ad7999a2	AMDGPU: Move dot intrinsic patterns to instruction def I tried to use some of the new tablegen features to avoid creating different operand list permutations, but I still don't see a way to programmatically build a source pattern dag. Also add GlobalISel tests, which now all import successfully. Some of the fneg fold tests are incorrect, which need to be fixed in a future commit	2020-02-21 13:35:40 -05:00
Matt Arsenault	4dec0b4740	AMDGPU/GlobalISel: Select llvm.amdgcn.fdot2 I'm slightly worried about the generated checks, since they won't catch incorrect modifiers being added at the end of the line.	2020-02-21 13:35:40 -05:00
Matt Arsenault	0c425ae0df	AMDGPU/GlobalISel: Select VOP3P instructions This only handles the basic cases. More work is needed to make better use of op_sel.	2020-02-21 13:35:40 -05:00
Matt Arsenault	a73f91a16b	AMDGPU/GlobalISel: Select G_SHUFFLE_VECTOR G_SHUFFLE_VECTOR is legal since it theoretically may help match op_sel for VOP3P instructions. Expand it in some other way in case it doesn't fold into the use instructions.	2020-02-21 13:35:40 -05:00
Simon Pilgrim	bf8e285e39	[LoopVectorize][X86] Regenerate tests. NFCI.	2020-02-21 18:23:55 +00:00
jasonliu	de676f52d6	[XCOFF][AIX] Put undefined symbol name into StringTable when necessary Summary: When we have a long name for the undefined symbol, we would hit this assertion: Assertion failed: I != StringIndexMap.end() && "String is not in table!" This patch addresses that. Reviewed by: DiggerLin, daltenty Differential Revision: https://reviews.llvm.org/D74924	2020-02-21 18:18:31 +00:00
Francesco Petrogalli	07c06734bb	[llvm][CodeGen] DAG Combiner folds for vscale. Summary: This patch simplifies the DAGs generated when using the intrinsic `@llvm.vscale.` as follows: * Fold (add (vscale * C0), (vscale * C1)) to (vscale * (C0 + C1)). * Canonicalize (sub X, (vscale * C)) to (add X, (vscale * -C)). * Fold (mul (vscale * C0), C1) to (vscale * (C0 * C1)). * Fold (shl (vscale * C0), C1) to (vscale * (C0 << C1)). The test `sve-gep-ll` has been updated to reflect the folding introduced by this patch. Reviewers: efriedma, sdesmalen, andwar, rengolin Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74782	2020-02-21 18:03:12 +00:00
Nikita Popov	01133bb6c1	[InstCombine] Improve simplify demanded bits worklist management This fixes a small mistake from D72944: The worklist add should happen before assigning the new operand, not after. In case an actual replacement happens, the old operand needs to be added for DCE. If no actual replacement happens, then old/new are the same, so it doesn't matter. This drops one iteration from the annotated test case.	2020-02-21 18:51:41 +01:00
Hiroshi Yamauchi	e2050248f6	[BFI] Fix missed BFI updates in MachineSink. Summary: This prevents BFI queries on new blocks (from MachineSinking::GetAllSortedSuccessors) and fixes a bunch of assert failures under -check-bfi-unknown-block-queries=true. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74511	2020-02-21 09:50:54 -08:00
Fangrui Song	a6c30bdac4	[Clang interpreter] Rename Block.{h,cpp} to InterpBlock.{h,cpp} The Blocks runtime provide a header named Block.h. It is generally preferable to avoid name collision with system headers (reducing reliance on -isystem order, more friendly when navigating files in an editor, etc). Reviewed By: gribozavr2 Differential Revision: https://reviews.llvm.org/D74934	2020-02-21 09:47:28 -08:00
Nikita Popov	934fdf7731	[InstCombine] Use replaceOperand() in more places Followup to D73919 with another batch of replacements of setOperand() -> replaceOperand(), to make sure the old operand gets DCEd right away. Differential Revision: https://reviews.llvm.org/D74932	2020-02-21 18:41:16 +01:00
Florian Hahn	f3ccc20f98	[DSE,MSSA] Dbg counters required assertions. Mark test accordingly.	2020-02-21 17:34:34 +00:00
Florian Hahn	52f9d43bd8	[VectorUtils] Move ToVectorTy to VectorUtils.h (NFC). ToVectorTy is defined and used in multiple places. Hoist it to VectorUtils.h to avoid duplication and improve re-usability. Reviewers: rengolin, hsaito, Ayal, gilr, fpetrogalli Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D74959	2020-02-21 17:31:24 +00:00
Nikita Popov	1ff763a2aa	[X86] Fix SDLoc initialization Fixes -Wparentheses warning, in this case indicating a genuine bug.	2020-02-21 18:26:05 +01:00
Nikita Popov	055f58eb4f	[SimplifyLibCalls][IRBuilder] Accept any IRBuilder in SimplifyLibCalls This changes the SimplifyLibCalls utility to accept an IRBuilderBase, which allows us to pass through the IRBuilder used by InstCombine. This will ensure that new instructions get added to the worklist. The annotated test-case drops from 4 to 2 InstCombine iterations thanks to this. To achieve this, I'm adding an IRBuilderBase::OperandBundlesGuard, which is basically the same as the existing InsertPointGuard and FastMathFlagsGuard, but for operand bundles. Also add a setDefaultOperandBundles() method so these can be set outside the constructor. Differential Revision: https://reviews.llvm.org/D74792	2020-02-21 18:26:05 +01:00
LLVM GN Syncbot	45d2dc2fe4	[gn build] Port 23444edf30b	2020-02-21 17:21:54 +00:00
Jonas Paulsson	035c4568cc	[SystemZ] Return scalarized costs for vector instructions on older archs. A cost query for a vector instruction should return a cost even without target vector support, and not trigger an assert. VectorCombine does this with an input containing source code vectors. Review: Ulrich Weigand	2020-02-21 09:17:37 -08:00
Matt Arsenault	f6d46e3d59	AMDGPU: Use default operand for VOP3P clamp We don't use this, and matching from the def doesn't make much sense. There are multiple tablegen bugs with default operand handling. undef_tied_input should work to handle the vdst_in correctly, but this breaks the operand register class constraint which it should be able to infer.	2020-02-21 12:14:18 -05:00
Danilo Carvalho Grael	4362892483	[AArch64][SVE] Add intrinsics for SVE2 bitwise ternary operations Summary: Add intrinsics for the following operations: - eor3, bcax - bsl, bsl1n, bsl2n, nbsl Fix MC tests for bsl instructions. Reviewers: kmclaughlin, c-rhodes, sdesmalen, efriedma, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74785	2020-02-21 12:15:51 -05:00
Sanjay Patel	e7f26faadd	[VectorCombine] refactor matching code to reduce duplication; NFC cmp/binop were already diverging even though they are largely the same logic.	2020-02-21 12:06:51 -05:00
Florian Hahn	65dfa4cc78	[DSE,MSSA] Add debug counter. Can be used like -debug-counter=dse-memoryssa-skip=10,dse-memoryssa-counter-count=20 Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72147	2020-02-21 17:04:37 +00:00
Cameron McInally	bb2f9a9478	[AArch64][SVE] Add +fullfp16 to sve-vector-splat.ll Add +fullfp16 to sve-vector-splat.ll so we can test folding of immediates into moves. This attribute can go away later when SVE has a full set of fp16 patterns in place. Differential Revision: https://reviews.llvm.org/D74965	2020-02-21 10:56:39 -06:00
Jay Foad	4e11e43581	GlobalISel: Fix narrowing of (G_ASHR i64:x, 32) Reviewers: arsenm Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74950	2020-02-21 16:51:03 +00:00
Matt Arsenault	d53a4a6af8	AMDGPU/GlobalISel: Commit test changes I forgot to squash These should have been in ac7abe0ba9ae4c6a2248cc3ef4e4fe7e6d270105	2020-02-21 11:43:39 -05:00
Matt Arsenault	e4371f23e0	AMDGPU/GlobalISel: Fix xnor matching We should try the generated matchers before the manual selection. This means the patterns are now handling the common cases, but the manual selection code is not yet dead. It's still handling the non-s32/s64 cases (like v2s16 and v2s32). Currently tablegen doesn't have a nice way to have a single pattern that covers multiple types.	2020-02-21 11:42:49 -05:00
Simon Pilgrim	f58ec4a815	[TargetLowering] Apply basic shift combines before recursive SimplifyDemandedBits calls. Minor refactor/cleanup before we begin adding non-uniform support.	2020-02-21 16:31:20 +00:00
Matt Arsenault	a641278fb6	AMDGPU/GlobalISel: Precommit xnor matching test	2020-02-21 11:09:59 -05:00
David Green	cad3617e89	[ARM] Correct Formatting. NFC Also removed an unnecessary TODO that I don't believe is relevant for the instruction in question.	2020-02-21 16:08:56 +00:00
Matt Arsenault	2de80e174e	AMDGPU/GlobalISel: Manually select G_BUILD_VECTOR_TRUNC We have patterns for s_pack* selection, but they assume the inputs are a build_vector with 16-bit inputs, not a truncating build vector. Since there's still outstanding work for how to handle mismatched result and source element vector operations, and since I'm trying a different packed vector strategy than SelectionDAG, just manually select this for now.	2020-02-21 10:34:11 -05:00
Matt Arsenault	0d92d9e170	AMDGPU/GlobalISel: Legalize G_FPOW There are few differences from the DAG handling. First, the DAG handling uses a primitive selection pattern instead of custom legalizing it. Because of this, this makes use of source modifiers while the DAG does not. Also instead of promoting f16, try to use the f16 log/exp. There's no f16 fmul_legacy, so widen just for the multiply, although I'm not sure that's the best solution.	2020-02-21 10:31:13 -05:00
Matt Arsenault	7243852e77	AMDGPU/GlobalISel: Select llvm.amdgcn.fmul.legacy	2020-02-21 10:30:26 -05:00
Matt Arsenault	54857d1e2e	AMDGPU/GlobalISel: Fix constant bus violation with source modifiers This looked through copies to find the source modifiers, which may have been SGPR->VGPR copies added to avoid potential constant bus violations. Re-insert a copy to a VGPR if this happens.	2020-02-21 10:30:23 -05:00
Eric Astor	fff4a5e91f	Remove unused functions in llvm-ml On review, these functions will likely not be needed even in the final MasmParser.	2020-02-21 10:04:24 -05:00
Sean Fertile	8a3553276e	[PowerPC][NFC] Add a test for vrsave usage in inline asm. Add a lit test that uses the vrsave register in the clobber list, and tests the extended mnemonics mtvrsave and mfvrsave.	2020-02-21 09:56:15 -05:00
Sean Fertile	0974dd862a	[PowerPC][NFC] Remove Darwin specific logic in frame finalization. Remove some cumbersome Darwin specific logic for updating the frame offsets of the condition-register spill slots. The containing function has an early return if the subtarget is not ELF based which makes the Darwin logic dead.	2020-02-21 09:32:24 -05:00
Pavel Labath	ed1f03baaf	[Error/unittests] Add a FailedWithMessage gtest matcher Summary: We already have a "Failed" matcher, which can be used to check any property of the Error object. However, most frequently one just wants to check the error message, and while this is possible with the "Failed" matcher, it is also very convoluted (Failed<ErrorInfoBase>(testing::Property(&ErrorInfoBase::message, "the message"))). Now, one can just write: FailedWithMessage("the message"). I expect that most of the usages will remain this simple, but the argument of the matcher is not limited to simple strings -- the argument of the matcher can be any other matcher, so one can write more complicated assertions if needed (FailedWithMessage(ContainsRegex("foo\|bar"))). If one wants to match multiple error messages, he can pass multiple arguments to the matcher. If one wants to match the message list as a whole (perhaps to check the message count), I've also included a FailedWithMessageArray matcher, which takes a single matcher receiving a vector of error message strings. Reviewers: sammccall, dblaikie, jhenderson Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74898	2020-02-21 15:29:48 +01:00
Simon Pilgrim	9776683c03	[X86] Regenerate hi reg tests	2020-02-21 14:23:54 +00:00
Simon Pilgrim	b7aa20de0a	[TargetLowering] SimplifyDemandedBits - use getValidShiftAmountConstant helper. Use the SelectionDAG::getValidShiftAmountConstant helper to get const/constsplat shift amounts, which allows us to drop the out of range shift amount early-out. First step towards better non-uniform shift amount support in SimplifyDemandedBits.	2020-02-21 14:23:53 +00:00
Krzysztof Parzyszek	952f45fdef	[Hexagon] Introduce noop intrinsic to cast between vector predicate types The (overloaded) intrinsic is llvm.hexagon.V6.pred.typecast[.128B]. The types of the operand and the return value are HVX boolean vector types. For each cast, there needs to be a corresponding intrinsic declared, with different suffixes appended to the name, e.g. ; cast <128 x i1> to <32 x i1> declare <32 x i1> @llvm.hexagon.V6.pred.typecast.128B.s1(<128 x i1>) ; cast <32 x i1> to <64 x i1> declare <64 x i1> @llvm.hexagon.V6.pred.typecast.128B.s2(<32 x i1>) etc.	2020-02-21 07:37:59 -06:00
Evgeniy Brevnov	836e38e46a	[DependenceAnalysis] Memory dependence analysis internal caching mechanism is broken in presence of TBAA (PR42733). Summary: There is a flaw in memory dependence analysis caching mechanism when memory accesses with TBAA are involved. Assume we first analysed and cached results for access with TBAA. Later we request dependence for the same memory but without TBAA (or different TBAA). By design these two queries should share one entry in the internal cache which corresponds to a general access (without TBAA). Thus upon second request internal cache is cleared and we continue analysis for access as if there is no TBAA. The problem is that even though internal cache is cleared the set of visited nodes is not. That means we won't traverse visited nodes again and populate internal cache with the corresponding dependence results. So we end up with internal cache in an incomplete state. Current implementation tries to signal that situation by resetting CacheInfo->Pair at line 1104. But that doesn't actually help since later code ignores this invalidation and relies on 'Cache->empty()' property to decide on cache completeness. Reviewers: reames, hfinkel, chandlerc, fedor.sergeev, asbirlea, fhahn, john.brawn, Prazek, sunfish Reviewed By: john.brawn Subscribers: DaniilSuchkov, kosarev, jfb, dantrushin, hiraditya, bmahjour, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73032	2020-02-21 20:20:36 +07:00
Sanjay Patel	6fe2359aa7	[ConstantFold] fold fsub -0.0, undef to undef rather than NaN A question about this behavior came up on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2020-February/139003.html ...and as part of backend improvements in D73978, but this is an IR change first because we already have fairly thorough tests in place here. We decided not to implement a more general change that would have folded any FP binop with nearly arbitrary constant + undef operand to undef because that is not theoretically correct (even if it is practically correct). Differential Revision: https://reviews.llvm.org/D74713	2020-02-21 08:03:19 -05:00
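
The four vscale folds listed in commit 07c06734bb are plain algebraic identities, and they can be sanity-checked outside LLVM. The sketch below is not LLVM code: it models the operands as 64-bit wrapping integers in Python (the bit width, the `wrap` helper, and the `folds_hold` name are assumptions made for illustration) and compares each left-hand side with its folded form.

```python
MASK = (1 << 64) - 1  # assumption: model the values as i64 with wraparound


def wrap(value):
    """Reduce a Python integer to its 64-bit two's-complement bit pattern."""
    return value & MASK


def folds_hold(vscale, c0, c1, x):
    """Return True if all four folds agree for the given inputs.

    c1 doubles as the shift amount in the shl fold, so it must be in 0..63.
    """
    # (add (vscale * C0), (vscale * C1)) == (vscale * (C0 + C1))
    add_ok = wrap(wrap(vscale * c0) + wrap(vscale * c1)) == wrap(vscale * (c0 + c1))
    # (sub X, (vscale * C)) == (add X, (vscale * -C))
    sub_ok = wrap(x - wrap(vscale * c0)) == wrap(x + wrap(vscale * -c0))
    # (mul (vscale * C0), C1) == (vscale * (C0 * C1))
    mul_ok = wrap(wrap(vscale * c0) * c1) == wrap(vscale * (c0 * c1))
    # (shl (vscale * C0), C1) == (vscale * (C0 << C1))
    shl_ok = wrap(wrap(vscale * c0) << c1) == wrap(vscale * (c0 << c1))
    return add_ok and sub_ok and mul_ok and shl_ok
```

Calling `folds_hold(4, -3, 5, 100)` checks all four identities for one choice of vscale and constants. Wrapping arithmetic is used so that the check also covers negative constants and overflow.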


192339 Commits