llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00

Author	SHA1	Message	Date
Craig Topper	7638533126	[RISCV] Move pack instructions to Zbp extension only. Zext.h will need to come back to Zbb, but that only uses specific encodings of pack. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94742	2021-01-22 12:49:10 -08:00
Craig Topper	736051b98b	[RISCV] Change zext.w to be an alias of add.uw rd, rs1, x0 instead of pack. This didn't make it into the published 0.93 spec, but it was the intention. But it is in the tex source as of this commit `d172f029c0` This means zext.w now requires Zba. Not sure if we should still use pack if Zbp is enabled and Zba isn't. I'll leave that for the future when pack is closer to being final. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94736	2021-01-22 12:49:10 -08:00
Craig Topper	839a30e12c	[RISCV] Modify add.uw patterns to put the masked operand in rs1 to match 0.93 bitmanip spec. The 0.93 spec has this implementation for add.uw uint_xlen_t adduw(uint_xlen_t rs1, uint_xlen_t rs2) { uint_xlen_t rs1u = (uint32_t)rs1; return rs1u + rs2; } The 0.92 spec had the usages of rs1 and rs2 swapped. Reviewed By: frasercrmck, asb Differential Revision: https://reviews.llvm.org/D95090	2021-01-22 12:49:10 -08:00
Craig Topper	df178dabc0	[RISCV] Rename Zbs instructions to start with just 'b' instead of 'sb' to match 0.93 bitmanip spec. Also renamed Zbe instructions to resolve name conflict even though that change is in the 0.94 draft. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94653	2021-01-22 12:49:10 -08:00
Craig Topper	8d1bb37e69	[RISCV] Move Shift Ones instructions from Zbb to Zbp to match 0.93 bitmanip spec. It's not really clear in the spec that these are in Zbp now, but that's what I've gathered from previous commits to the spec. I've filed an issue to get it documented properly. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94652	2021-01-22 12:49:10 -08:00
Craig Topper	95de5f50bc	[RISCV] Add SH*ADD(.UW) instructions to Zba extension based on 0.93 bitmanip spec. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94637	2021-01-22 12:49:10 -08:00
Craig Topper	62d6d5270d	[RISCV] Add Zba feature and move add.uw and slli.uw to it. Still need to add SH*ADD instructions. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94617	2021-01-22 12:49:10 -08:00
Craig Topper	58b716fa26	[RISCV] Rename mnemonics slliu.w->slli.uw and addu.w->add.uw to match 0.93 bitmanip spec. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94582	2021-01-22 12:49:10 -08:00
Craig Topper	a81cc05f03	[RISCV] Swap encodings of max and minu to match 0.93 bitmanip spec. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94580	2021-01-22 12:49:10 -08:00
Craig Topper	aa862d5e8e	[RISCV] Remove addiwu, addwu, subwu, subuw, clmulw, clmulrw, clmulhw to match 0.93 bitmanip spec. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94577	2021-01-22 12:49:10 -08:00
Craig Topper	f81e804d7b	[RISCV] Rename pcnt->cpop to match 0.93 bitmanip spec. This is the first of multiple patches to bring our 0.92 implementation up to 0.93. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94568	2021-01-22 12:49:10 -08:00
Nikita Popov	0bf4855341	[Tests] Add willreturn to libcalls in some tests Willreturn would be inferred by FuncAttrs for these. Annotate them to preserve test behavior in the future.	2021-01-22 21:47:35 +01:00
Arthur Eubanks	219650917f	[NewPM][AMDGPU] Skip adding CGSCCOptimizerLate callbacks at O0 The legacy PM's EP_CGSCCOptimizerLate was only used under not-O0. Fixes clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp under the new PM. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D95250	2021-01-22 12:29:39 -08:00
Julian Lettner	5970515ef3	Remove obsolete TODOs Remove a few of my own TODOs that I will not have time to fix from lit code.	2021-01-22 12:03:03 -08:00
Nikita Popov	42d75dcee4	[SimplifyLibCalls] Skip unused calls in sincos transform If the call result is unused, we should let it get DCEd rather than replacing it. Also, don't try to replace an existing sincos with another one (unless it's as part of combining sin and cos). This avoids an infinite combine loop if the calls are not DCEd as expected, which can happen with D94106 and lack of willreturn annotation in hand-crafted IR.	2021-01-22 20:57:13 +01:00
Abhina Sreeskantharajan	f2b2e1c1b6	[SystemZ][z/OS] Fix No such file or directory expression error matching in lit tests - continued This is a continuation of https://reviews.llvm.org/D94239. I missed some other spellings of the same error. Reviewed By: muiez Differential Revision: https://reviews.llvm.org/D95246	2021-01-22 13:54:25 -05:00
Sanjay Patel	7246890381	[InstCombine] narrow abs with sign-extended input In the motivating cases from https://llvm.org/PR48816 , we have a trailing trunc. But that is not required to reduce the abs width: https://alive2.llvm.org/ce/z/ECaz-p ...as long as we clear the int-min-is-poison bit (nsw). We have some existing tests that are affected, and I'm not sure what the overall implications are, but in general we favor narrowing operations over preserving nsw/nuw. If that causes problems, we could restrict this transform based on type (shouldChangeType() and/or vector vs. scalar). Differential Revision: https://reviews.llvm.org/D95235	2021-01-22 13:36:04 -05:00
Sanjay Patel	762a8136aa	[InstCombine] add tests for abs(sext X); NFC https://llvm.org/PR48816	2021-01-22 13:36:04 -05:00
Wolfgang Pieb	6d6459c7b0	[llvm-mca] Adding local lit config file for X86 targets	2021-01-22 09:52:57 -08:00
Yaxun (Sam) Liu	2e16ddcdc8	[HIP] Support __managed__ attribute This patch implements codegen for __managed__ variable attribute for HIP. Diagnostics will be added later. Differential Revision: https://reviews.llvm.org/D94814	2021-01-22 11:43:58 -05:00
Abhina Sreeskantharajan	ac97d5be5f	[SystemZ][z/OS] Fix No such file or directory expression error On z/OS, the following error message is not matched correctly in lit tests. This patch updates the CHECK expression to match the end period successfully. ``` EDC5129I No such file or directory. ``` Differential Revision: https://reviews.llvm.org/D94239	2021-01-22 11:41:40 -05:00
Simon Pilgrim	1534b21902	[X86][AVX] canonicalizeLaneShuffleWithRepeatedOps - handle vperm2x128(movddup(x),movddup(y)) cases Fold vperm2x128(movddup(x),movddup(y)) -> movddup(vperm2x128(x,y))	2021-01-22 16:05:19 +00:00
Simon Pilgrim	281f78cb65	[X86][AVX] canonicalizeLaneShuffleWithRepeatedOps - handle unary vperm2x128(permute/shift(x,c),undef) cases Fold vperm2x128(permute/shift(x,c),undef) -> permute/shift(vperm2x128(x,undef),c)	2021-01-22 15:47:23 +00:00
Simon Pilgrim	eae789a0fd	[X86][AVX] combineTargetShuffle - simplify the X86ISD::VPERM2X128 subvector matching Simplify vperm2x128(concat(X,Y),concat(Z,W)) folding. Use collectConcatOps / ISD::INSERT_SUBVECTOR to find the source subvectors instead of hardcoded immediate matching.	2021-01-22 15:47:22 +00:00
Florian Hahn	8a51053b69	[LoopUnswitch] Fix logic to avoid unswitching with atomic loads. The existing code did not deal with atomic loads correctly. Such loads are represented as MemoryDefs. Bail out on any MemoryAccess that is not a MemoryUse.	2021-01-22 15:10:12 +00:00
Florian Hahn	a9840f15ae	[LoopUnswitch] Add test cases with atomic loads & call	2021-01-22 15:10:12 +00:00
Arnold Schwaighofer	2d05bf4095	[coro.async] Make sure we process async coroutines Because we were not looking for the llvm.coro.id.async intrinsic in the early coro pass which triggers follow-up passes we relied on the llvm.coro.end intrinsic being present. This might not be the case in functions that end in unreachable code. Differential Revision: https://reviews.llvm.org/D95144	2021-01-22 07:04:01 -08:00
Roman Lebedev	ce83439d2e	Revert "[NFCI-ish][SimplifyCFG] FoldBranchToCommonDest(): really don't deal with uncond branches" Does not build in XCode: http://green.lab.llvm.org/green/job/clang-stage1-RA/17963/consoleFull#-1704658317a1ca8a51-895e-46c6-af87-ce24fa4cd561 This reverts commit aabed3718ae25476c0f6b7e70c83ba4658f00e5c.	2021-01-22 17:37:11 +03:00
Roman Lebedev	3f348fc827	[InstCombine] Fold `(~x) | y` --> `~(x & (~y))` iff it is free to do so Iff we know we can get rid of the inversions in the new pattern, we can thus get rid of the inversion in the old pattern, decreasing the instruction count. Note that we could position this transformation as just hoisting of the `not` (still, iff y is freely negatible), but the test changes show a number of regressions, so let's not do that.	2021-01-22 17:23:54 +03:00
Roman Lebedev	d6f3b62789	[InstCombine] Fold `(~x) & y` --> `~(x | (~y))` iff it is free to do so Iff we know we can get rid of the inversions in the new pattern, we can thus get rid of the inversion in the old pattern, decreasing the instruction count.	2021-01-22 17:23:54 +03:00
Roman Lebedev	cb398b05ee	[NFC][InstCombine] Add tests for `(~x) &/| y` --> `~(x |/& (~y))` fold Iff y is free to invert, and the users of the expression can be updated, we can undo De-Morgan fold, and immediately get rid of the `not` op.	2021-01-22 17:23:54 +03:00
Roman Lebedev	1cff9b2a83	[NFC][InstCombine] Extract freelyInvertAllUsersOf() out of canonicalizeICmpPredicate() I'd like to use it in an upcoming fold.	2021-01-22 17:23:53 +03:00
Roman Lebedev	80ea20bdfe	[NFC][SimplifyCFG] FoldBranchToCommonDest(): extract the actual transform into helper function I'm intentionally structuring it this way, so that the actual fold only does the fold, and no legality/correctness checks, all of which must be done by the caller. This allows for the fold code to be more compact and more easily grokable.	2021-01-22 17:23:53 +03:00
Roman Lebedev	9c198ed411	[NFC][SimplifyCFG] FoldBranchToCommonDest(): extract check for destination sharing into a helper function As a follow-up, i'll extract the actual transform into a function, and this helper will be called from both places, so this avoids code duplication.	2021-01-22 17:23:53 +03:00
Roman Lebedev	c98c76746d	[NFC][SimplifyCFG] FoldBranchToCommonDest(): somewhat better structure weight updating code Hoist the successor updating out of the code that deals with branch weight updating, and hoist the 'has weights' check from the latter, making code more consistent and easier to follow.	2021-01-22 17:23:41 +03:00
Roman Lebedev	dac9414f63	[NFC][SimplifyCFG] FoldBranchToCommonDest(): unclutter Cond/CondInPred handling We don't need those variables, we can just get the final value directly.	2021-01-22 17:23:11 +03:00
Roman Lebedev	c40f533654	[NFCI-ish][SimplifyCFG] FoldBranchToCommonDest(): really don't deal with uncond branches While we already ignore uncond branches, we could still potentially end up with conditional branches with identical destinations due to the visitation order, or because we were called as a utility. But if we have such a disguised uncond branch, we still probably shouldn't deal with it here.	2021-01-22 17:23:10 +03:00
Roman Lebedev	e607e559fc	[SimplifyCFG] FoldBranchToCommonDest(): don't deal with unconditional branches The case where BB ends with an unconditional branch, and has a single predecessor w/ conditional branch to BB and a single successor of BB is exactly the pattern SpeculativelyExecuteBB() transform deals with. (and in this case they both allow speculating only a single instruction) Well, or FoldTwoEntryPHINode(), if the final block has only those two predecessors. Here, in FoldBranchToCommonDest(), only a weird subset of that transform is supported, and it's glued on the side in a weird way. In particular, it took me a bit to understand that the Cond isn't actually a branch condition in that case, but just the value we allow to speculate (otherwise it reads as a miscompile to me). Additionally, this only supports for the speculated instruction to be an ICmp. So let's just unclutter FoldBranchToCommonDest(), and leave this transform up to SpeculativelyExecuteBB(). As far as i can tell, this shouldn't really impact optimization potential, but if it does, improving SpeculativelyExecuteBB() will be more beneficial anyways. Notably, this only affects a single test, but EarlyCSE should have run beforehand in the pipeline, and then FoldTwoEntryPHINode() would have caught it. This reverts commit rL158392 / commit d33f4efbfdef6ffccf212ab3e40a7673589085fd.	2021-01-22 17:22:49 +03:00
David Green	1fb0315b7d	[ARM] Disable sign extended SSAT pattern recognition. I may have given bad advice, and skipping sext_inreg when matching SSAT patterns is not valid on its own. It at least needs to sext_inreg the input again, but as far as I can tell is still only valid based on demanded bits. For the moment disable that part of the combine, hopefully reimplementing it in the future more correctly.	2021-01-22 14:07:48 +00:00
Moritz Sichert	64b62b62ba	Avoid fragile type lookups in GDB pretty printer Instead of using the type llvm::StringMapEntry<{stringified_value_type}> use only the base class llvm::StringMapEntryBase and calculate the offsets of the member variables manually. The approach with stringifying the name of the value type is pretty fragile as it can easily break with local and dependent types. Differential Revision: https://reviews.llvm.org/D94431	2021-01-22 14:56:32 +01:00
Florian Hahn	962b770084	[LTO] Add support for existing Config::Freestanding option. lto::Config has a field to control whether the build is "freestanding" (no builtins) or not, but it is not hooked up to the code actually running the passes. This patch adds support for the flag to both the code that runs optimization with the new and old pass managers, by explicitly adding a TargetLibraryInfo instance. If Freestanding is true, all library functions are disabled. Reviewed By: steven_wu Differential Revision: https://reviews.llvm.org/D94630	2021-01-22 13:45:39 +00:00
Simon Pilgrim	484ce58d87	[X86][AVX] combineX86ShufflesRecursively - attempt to constant fold before widening shuffle inputs combineX86ShufflesConstants/canonicalizeShuffleMaskWithHorizOp can both handle/earlyout shuffles with inputs of different widths, so delay widening as late as possible to make it easier to match constant folds etc. The plan is to eventually move the widening inside combineX86ShuffleChain so that we don't create any new nodes unless we successfully combine the shuffles.	2021-01-22 13:19:35 +00:00
Anton Rapetov	4ca1cdd110	[SLP] do not traverse constant uses Walking the use list of a Constant (particularly, ConstantData) is not scalable, since a given constant may be used by many instructions in many functions in many modules. Differential Revision: https://reviews.llvm.org/D94713	2021-01-22 08:14:09 -05:00
Simon Pilgrim	82a418b65b	[DAG] Commute shuffle(splat(A,u), shuffle(C,D)) -> shuffle'(shuffle(C,D), splat(A,u)) We only merge shuffles if the inner (LHS) shuffle is a non-splat, so commute these shuffles to improve merging of multiple shuffles.	2021-01-22 11:43:18 +00:00
Simon Pilgrim	bac2c0740e	[X86][SSE] Don't fold shuffle(binop(),binop()) -> binop(shuffle(),shuffle()) if the shuffles are splats rGbe69e66b1cd8 added the fold, but DAGCombiner.visitVECTOR_SHUFFLE doesn't merge shuffles if the inner shuffle is a splat, so we need to bail. The non-fast-horiz-ops paths see some minor regressions, we might be able to improve on this after lowering to target shuffles. Fix PR48823	2021-01-22 11:31:38 +00:00
David Green	8a26ebe126	[ARM] Adjust isSaturatingConditional to return a new SDValue. NFC This replaces the isSaturatingConditional function with LowerSaturatingConditional that directly returns a new SSAT or USAT SDValue, instead of returning true and the components of it.	2021-01-22 11:11:36 +00:00
David Green	977fa165e2	[ARM] Add new and regenerate SSAT tests. NFC Some of these new tests should be creating SSAT. They will be fixed in a followup.	2021-01-22 10:42:36 +00:00
Nikita Popov	8c54b2d6f0	[IR] Optimize adding attribute to AttributeList (NFC) When adding an enum attribute to an AttributeList, avoid going through an AttrBuilder and instead directly add the attribute to the correct set. Going through AttrBuilder is expensive, because it requires all string attributes to be reconstructed. This can be further improved by inserting the attribute at the right position and using the AttributeSetNode::getSorted() API. This recovers the small compile-time regression from D94633.	2021-01-22 11:30:21 +01:00
LLVM GN Syncbot	17857e2d95	[gn build] Port 8214982b5042	2021-01-22 10:24:45 +00:00
Sebastian Neubauer	5d9dc9db26	[AMDGPU] Implement mir parseCustomPseudoSourceValue Allow parsing generated mir with custom pseudo source value tokens. Also rename pseudo source values to have more meaningful names. Relands ba7dcd8542ab, which had memory leaks. Differential Revision: https://reviews.llvm.org/D95215	2021-01-22 11:24:08 +01:00
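
Commit 839a30e12c quotes the 0.93 definition of add.uw, and commit 736051b98b makes zext.w an alias of add.uw rd, rs1, x0. The following is a minimal C sketch of those two semantics, not LLVM code. It assumes RV64 (so `uint_xlen_t` is 64 bits); `zext_w` and `check_adduw` are hypothetical helper names introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: RV64, so XLEN is 64 bits. */
typedef uint64_t uint_xlen_t;

/* add.uw as quoted in commit 839a30e12c (0.93 bitmanip draft):
 * zero-extend the low 32 bits of rs1, then add rs2. */
uint_xlen_t adduw(uint_xlen_t rs1, uint_xlen_t rs2) {
    uint_xlen_t rs1u = (uint32_t)rs1;
    return rs1u + rs2;
}

/* zext.w modelled as add.uw rd, rs1, x0 (commit 736051b98b).
 * x0 always reads as zero, so rs2 is the constant 0 here. */
uint_xlen_t zext_w(uint_xlen_t rs1) {
    return adduw(rs1, 0);
}

/* Hypothetical self-check: the upper 32 bits of rs1 never reach the result. */
void check_adduw(void) {
    assert(adduw(0xFFFFFFFF00000001ULL, 2) == 3);
    assert(zext_w(0xFFFFFFFF80000000ULL) == 0x80000000ULL);
}
```

Because the masked operand is rs1, passing x0 as rs2 leaves only the zero-extension, which is why the alias works once the operand order matches the 0.93 draft.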

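The InstCombine folds in commits 3f348fc827 and d6f3b62789 rest on two De Morgan identities. As a rough sanity check, the C sketch below verifies both identities exhaustively over 8-bit values. It only checks the bit-level equivalence; the "iff it is free to do so" condition in the commits is about whether the inversions can be absorbed elsewhere, which this sketch does not model. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if both identities hold for every pair of 8-bit values:
 *   (~x) | y == ~(x & (~y))
 *   (~x) & y == ~(x | (~y))
 */
int demorgan_folds_hold(void) {
    for (unsigned x = 0; x < 256; ++x) {
        for (unsigned y = 0; y < 256; ++y) {
            uint8_t a = (uint8_t)x, b = (uint8_t)y;
            /* Casts truncate the promoted int results back to 8 bits. */
            uint8_t or_old  = (uint8_t)(~a | b);
            uint8_t or_new  = (uint8_t)~(a & ~b);
            uint8_t and_old = (uint8_t)(~a & b);
            uint8_t and_new = (uint8_t)~(a | ~b);
            if (or_old != or_new || and_old != and_new)
                return 0;
        }
    }
    return 1;
}

/* Hypothetical self-check wrapper. */
void check_demorgan(void) {
    assert(demorgan_folds_hold());
}
```

Each rewritten form still contains two `not` operations, so the fold only reduces the instruction count when the inner `~y` and the outer `~` can be eliminated, which matches the restriction stated in the commit messages.
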
210084 Commits