llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00

Author	SHA1	Message	Date
Matt Arsenault	590537b1fa	AMDGPU/GlobalISel: Fix generated wave64 checks This inexplicably managed to pass locally without the updated wave64 checks.	2020-01-22 22:05:54 -05:00
Matt Arsenault	2b8da4f459	AMDGPU/GlobalISel: Remove redundant or patterns These ended up with higher priority than or3 patterns in a future patch. This also fixes the using VOP2 forms.	2020-01-22 21:45:51 -05:00
Florian Hahn	fd7efcd785	[LV] Fix predication for branches with matching true and false succs. Currently due to the edge caching, we create wrong predicates for branches with matching true and false successors. We will cache the condition for the edge from the true successor, and then lookup the same edge (src and dst are the same) for the edge to the false successor. If both successors match, the condition should always be true. At the moment, we cannot really create constant VPValues, but we can just create a true condition as X | !X. Later passes will clean that up. Fixes PR44488. Reviewers: rengolin, hsaito, fhahn, Ayal, dorit, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D73079	2020-01-22 18:34:11 -08:00
Jonas Devlieghere	eca51d5b38	[llvm/Transforms] Fix warning: private field 'MSSA' is not used	2020-01-22 18:07:53 -08:00
James Clarke	cab1120454	[RISCV] Fix evaluating %pcrel_lo against global and weak symbols Summary: Previously, we would erroneously turn %pcrel_lo(label), where label has a %pcrel_hi against a weak symbol, into %pcrel_lo(label + offset), as evaluatePCRelLo would believe the target independent logic was going to fold it. Moreover, even if that were fixed, shouldForceRelocation lacks an MCAsmLayout and thus cannot evaluate the %pcrel_hi fixup to a value and check the symbol, so we would then erroneously constant-fold the %pcrel_lo whilst leaving the %pcrel_hi intact. After D72197, this same sequence also occurs for symbols with global binding, which is triggered in real-world code. Instead, as discussed in D71978, we introduce a new FKF_IsTarget flag to avoid these kinds of issues. All the resolution logic happens in one place, with no coordination required between RISCVAsmBackend and RISCVMCExpr to ensure they implement the same logic twice. Although the implementation of %pcrel_hi can be left as target independent, we make it target dependent to ensure that they are handled identically to %pcrel_lo, otherwise we risk one of them being constant folded but the other being preserved. This also allows us to properly support fixup pairs where the instructions are in different fragments. Reviewers: asb, lenary, efriedma Reviewed By: efriedma Subscribers: arichardson, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73211	2020-01-23 02:05:48 +00:00
Florian Hahn	bba5d4f164	[AArch64TTI] AArch64 supports NT vector stores through STNP. This patch adds a custom implementation of isLegalNTStore to AArch64TTI that supports vector types that can be directly stored by STNP. Note that the implementation may not catch all valid cases (e.g. because the vector is a multiple of 256 and could be broken down to multiple valid 256 bit stores), but it is good enough for LV to vectorize loops with NT stores, as LV only passes in a vector with 2 elements to check. LV seems to also be the only user of isLegalNTStore. We should also do the same for NT loads, but before that we need to ensure that we properly lower LDNP of vectors, similar to D72919. Reviewers: dmgreen, samparker, t.p.northover, ab Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D73158	2020-01-22 16:45:24 -08:00
Alina Sbirlea	9a6a527d6d	[IndVarSimplify] Teach IndVarSimplify to preserve MemorySSA.	2020-01-22 16:33:17 -08:00
Alina Sbirlea	62b7c992ce	[IndVarSimplify] Cleanup spaces and reduce variable scope [NFCI] Minor clean-ups + clang-format.	2020-01-22 15:32:20 -08:00
Alina Sbirlea	b7c31f38bb	[LoopIdiomRecognize] Reduce variable scope. [NFCI]	2020-01-22 15:30:08 -08:00
Nikita Popov	342aad9ec5	[InstCombine] Combine neg of shl of sub (PR44529) Fixes https://bugs.llvm.org/show_bug.cgi?id=44529. We already have a combine to sink a negation through a left-shift, but it currently only works if the shift operand is negatable without creating any instructions. This patch introduces freelyNegateValue() as a more powerful extension of dyn_castNegVal(), which allows negating a value as long as this doesn't end up increasing instruction count. Specifically, this patch adds support for negating A-B to B-A. This mechanism could in the future be extended to handle general negation chains that a) start at a proper 0-X negation and b) only require one operand to be freely negatable. This would end up as a weaker form of D68408 aimed at the most obviously profitable subset that eliminates a negation entirely. Differential Revision: https://reviews.llvm.org/D72978	2020-01-22 23:03:58 +01:00
Nikita Popov	0a60e06e3f	[InstCombine] Add test for PR44529; NFC	2020-01-22 23:03:58 +01:00
Nikita Popov	6d326a1c6b	[PatternMatch] Make m_c_ICmp swap the predicate (PR42801) This addresses https://bugs.llvm.org/show_bug.cgi?id=42801. The m_c_ICmp() matcher is changed to provide the swapped predicate if the operands are swapped. Existing uses of m_c_ICmp() fall in one of two categories: Working on equality predicates only, where swapping is irrelevant. Or performing a manual swap, in which case this patch removes it. The only exception is the foldICmpWithLowBitMaskedVal() fold, which does not swap the predicate, and instead reasons about whether a swap occurred or not for each predicate. Getting the swapped predicate allows us to merge the logic for pairs of predicates, instead of duplicating it. Differential Revision: https://reviews.llvm.org/D72976	2020-01-22 22:56:26 +01:00
Sean Fertile	6f14a89dbb	[PowerPC] Collect some CallLowering arguments into a struct. [NFC] Collect the calling convention and a number of boolean arguments into a structure to slightly reduce the number of arguments passed around between LowerCall_<Subtarget>, FinishCall and a few of the helpers. Also calculates if a call is indirect once using the existing helper and caches the result, replacing several instances where we duplicated the logic determining if a call is indirect.	2020-01-22 16:55:27 -05:00
Nikita Popov	4c0510dfc2	[PatternMatch] Add m_APInt/m_APFloat matchers accepting undef The current m_APInt() and m_APFloat() matchers do not accept splats that include undefs (unlike m_Zero() and other matchers for specific values). We can't simply change the default behavior, as there are existing transforms that would not be safe with undefs. For this reason, I'm introducing new m_APIntAllowUndef() and m_APFloatAllowUndef() matchers, that allow splats with undefs. Additionally, m_APIntForbidUndef() and m_APFloatForbidUndef() are added. These have the same behavior as the existing m_APInt() and m_APFloat(), but serve as an explicit indication that undefs were considered and found unsound for this transform. This helps distinguish them from existing uses of m_APInt() where we do not know whether undefs can or cannot be allowed without additional review. Differential Revision: https://reviews.llvm.org/D72975	2020-01-22 22:49:32 +01:00
Matt Arsenault	8e4d20c2f9	R600: Fix failing testcase	2020-01-22 16:01:35 -05:00
Keith Smiley	31fc1398fa	[llvm-cov] Add support for -skip-functions to lcov Summary: This flag was added for the json format to exclude functions from the output. This mirrors that behavior in lcov (where it was previously accepted but ignored). This makes the output file smaller which can be beneficial depending on how you consume it, especially if you don't use this data anyways. Patch by Keith Smiley (@keith). Reviewers: kastiglione, Dor1s, vsk, allevato Reviewed By: Dor1s, allevato Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73160	2020-01-22 12:49:00 -08:00
Sanjay Patel	2591fb635b	[x86] fold vperm2x128 to concat of 128-bit high half vectors vperm (ins ?, X, C), (ins ?, Y, C), 0x31 --> concat X, Y This is another shuffle problem seen with PR42024: https://bugs.llvm.org/show_bug.cgi?id=42024 We have this small crack in legalization/lowering/combining/demanded that allows forming a vperm2f128 of high halves with AVX1 when we could do better by peeking through the insert_subvector nodes. AFAICT, it requires IR as shown in the diffs - much larger than legal vectors - to avoid all of the usual folds. Another option would prevent forming the 256-bit vperm in lowering. Differential Revision: https://reviews.llvm.org/D73197	2020-01-22 15:35:50 -05:00
Chris Tetreault	d82a7e0d5a	[SVE] Pass Scalable argument to VectorType::get in Bitcode Reader Pass the Scalability test to VectorType::get in order to be able to deserialize bitcode that contains scalable vector operations Differential Revision: https://reviews.llvm.org/D73144	2020-01-22 12:29:25 -08:00
Alina Sbirlea	c178679b6e	[LoopDeletion] Teach LoopDeletion to preserve MemorySSA if available. If MemorySSA analysis is available, LoopDeletion now preserves it.	2020-01-22 11:38:38 -08:00
Jan Vesely	6999f6e94d	AMDGPU/R600: Emit rodata in text segment R600 relies on this behaviour. Fixes: 6e18266aa4dd78953557b8614cb9ff260bad7c65 ('Partially revert D61491 "AMDGPU: Be explicit about whether the high-word in SI_PC_ADD_REL_OFFSET is 0"') Fixes ~100 piglit regressions since 6e18266 Differential Revision: https://reviews.llvm.org/D72991	2020-01-22 14:31:51 -05:00
Aaron Ballman	047716a57a	Add LLVM_VALUE_FUNCTION to Optional::map(); NFC This is for future-proofing when compiling with MSVC once we drop support for 2017.	2020-01-22 14:21:08 -05:00
Nico Weber	068346b51e	[gn build] reformat all build files again Run `git ls-files '*.gn' '*.gni' | xargs llvm/utils/gn/gn.py format` after recent fixes to formatting of comments after single-element lists.	2020-01-22 14:04:20 -05:00
Aaron Ballman	08a193ec33	Add a comment about when we can remove this construct; NFC.	2020-01-22 13:17:38 -05:00
Simon Pilgrim	308dbd2e26	[X86][SSE] combineExtractWithShuffle - extract(bitcast(broadcast(x))) --> x Removes some unnecessary gpr<-->fpu traffic	2020-01-22 18:02:58 +00:00
David Green	2ac2dd62a3	[ARM] Mark MVE loads/store as not having side effects The hasSideEffect parameter is usually automatically inferred from instruction patterns. For some of our MVE instructions, we do not have patterns though, such as for the pre/post inc loads and stores. This instead specifies the flag manually on the base MVE_VLDRSTR_base tablegen class, making sure we get this correct. This can help with scheduling multiple loads more optimally. Here I've added a unittest as a more direct form of testing. Differential Revision: https://reviews.llvm.org/D73117	2020-01-22 17:56:55 +00:00
Nico Weber	948bd41ef1	Revert "[DA][TTI][AMDGPU] Add option to select GPUDA with TTI" This reverts commit a90a6502ab35d3c15c7d56772e409c5632ce6cfb. Broke tests on Windows: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/13808	2020-01-22 12:56:19 -05:00
Nico Weber	9a5d53ffb2	Revert "[gn build] [win] produce symbolized stack frames in release builds too" This reverts commit fd98eccf984f203e39452da238a142f83f61d368. Seems to have no effect, need to try it locally for a bit first.	2020-01-22 12:54:19 -05:00
Aaron Ballman	3be16c9bda	Revert "Unconditionally enable lvalue function designators; NFC" This reverts commit 968561bcdc34c7d74482fe3bb69a045abf08d2c1	2020-01-22 12:40:39 -05:00
Nico Weber	aa258c1b77	[gn build] [win] produce symbolized stack frames in release builds too	2020-01-22 12:36:38 -05:00
Florian Hahn	7c3fe23f5d	[AArch64] Don't rename registers with pseudo defs in Ld/St opt. If the root def for renaming is a noop-pseudo instruction like kill, we would end up without a correct def for the renamed register, causing miscompiles. This patch conservatively bails out on any pseudo instruction. This fixes https://bugs.chromium.org/p/chromium/issues/detail?id=1037912#c70	2020-01-22 09:26:25 -08:00
Matt Arsenault	2ff7e671d5	AMDGPU/GlobalISel: Handle 16-bank LDS llvm.amdgcn.interp.p1.f16 The pattern is also mishandled by the generated matcher, so workaround this as in the DAG path. The existing DAG tests aren't particularly targeted to just this one intrinsic. These also end up differing in scheduling from SGPR->VGPR operand constraint copies.	2020-01-22 12:10:59 -05:00
David Tenty	71acb7b4cf	[NFC][XCOFF] Refactor Csect creation into TargetLoweringObjectFile Summary: We create a number of standard types of control sections in multiple places for things like the function descriptors, external references and the TOC anchor among others, so it is possible for their properties to be defined inconsistently in different places. This refactor moves their creation and properties into functions in the TargetLoweringObjectFile class hierarchy, where functions for retrieving various special types of sections typically seem to reside. Note: There is one case in PPCISelLowering which is specific to function entry points which we don't address since we don't have access to the TLOF there. Reviewers: DiggerLin, jasonliu, hubert.reinterpretcast Reviewed By: jasonliu, hubert.reinterpretcast Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72347	2020-01-22 12:09:11 -05:00
Stanislav Mekhanoshin	b9c92f5ddd	Precommit NFC part of DAGCombiner change. NFC. This is NFC part of DAGCombiner::visitEXTRACT_SUBVECTOR() change in the D73132.	2020-01-22 09:01:22 -08:00
Stanislav Mekhanoshin	50d8a4f0bd	Regenerate test/CodeGen/ARM/vext.ll. NFC. This is to pre-commit whitespace only changes before D73132.	2020-01-22 08:56:08 -08:00
Matt Arsenault	a0fa14096f	AMDGPU/GlobalISel: Select llvm.amdgcn.mov.dpp This is deprecated, but easy to support.	2020-01-22 11:43:53 -05:00
Matt Arsenault	517f60a00f	AMDGPU/GlobalISel: Select llvm.amdgcn.mov.dpp8	2020-01-22 11:43:40 -05:00
Hiroshi Yamauchi	023bdc26f9	[PGO][PGSO] Update BFI in CodeGenPrepare::optimizeSelectInst. Summary: Without the BFI update, some hot blocks are incorrectly treated as cold code. This fixes a FDO perf regression in the TSVC benchmark from D71288. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73146	2020-01-22 08:36:54 -08:00
Pablo Barrio	aa9d997415	[AArch64] Add test for DWARF return address signing Summary: Patch by LukeCheeseman and pbarrio Reviewers: samparker, chill Subscribers: kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72835	2020-01-22 16:36:21 +00:00
Matt Arsenault	42c5f78e96	AMDGPU: Fix element size assertion The GlobalISel usage called this with bits, but the DAG usage was incorrectly using bytes.	2020-01-22 11:18:45 -05:00
Matt Arsenault	b889862fe2	AMDGPU/GlobalISel: Keep G_BITCAST out of waterfall loop The waterfall utility function blindly inserts a phi for every def in the loop. We don't need this one to be preserved for every iteration. Saves an extra phi and copy inside the loop body.	2020-01-22 11:16:19 -05:00
Zakk Chen	e95eb656e4	[RISCV] Support ABI checking with per function target-features 1. If users don't specify -mattr, the default target-feature comes from the IR attribute. 2. Fixed bug and re-landed this patch Reviewers: lenary, asb Reviewed By: lenary Tags: #llvm Differential Revision: https://reviews.llvm.org/D70837	2020-01-22 08:12:28 -08:00
Simon Pilgrim	822a5c1afe	[X86][SSE] combineExtractWithShuffle - extract(bitcast(scalar_to_vector(x))) --> x Removes some unnecessary gpr<-->fpu traffic	2020-01-22 16:11:08 +00:00
Matt Arsenault	ef09775aa0	AMDGPU/GlobalISel: Fold add of constant into G_INSERT_VECTOR_ELT Move the subregister base like in the extract case.	2020-01-22 11:09:15 -05:00
Nico Weber	7de08a66c5	[gn build] (manually) port a174f0da62f	2020-01-22 11:08:34 -05:00
Matt Arsenault	49336cf1f0	AMDGPU/GlobalISel: Select G_INSERT_VECTOR_ELT	2020-01-22 11:00:49 -05:00
Matt Arsenault	5341a3bed9	AMDGPU/GlobalISel: Fix RegBankSelect for G_INSERT_VECTOR_ELT The result and source vector are going to be tied, so these need to be the same bank. The inserted value also needs to be broken down based on the result bank, not the inserted value itself.	2020-01-22 10:57:50 -05:00
Matt Arsenault	45831d88f1	AMDGPU/GlobalISel: Fold constant offset vector extract indexes Handle dynamic vector extracts that use an index that's an add of a constant offset into moving the base subregister of the indexing operation. Force the add into the loop in regbankselect, which will be recognized when selected.	2020-01-22 10:50:59 -05:00
Kazushi (Jam) Marukawa	7acc2658ba	[VE] select and selectcc patterns Summary: select and selectcc isel patterns and tests for i32/i64 and fp32/fp64. Includes optimized selectcc patterns for fmin/fmax/maxs/mins. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D73195	2020-01-22 16:30:38 +01:00
Matt Arsenault	68415db929	AMDGPU: Fix typo	2020-01-22 10:17:46 -05:00
Matt Arsenault	9c41dd4ce6	AMDGPU: Look through casted selects to constant fold bin ops The promotion of the uniform select to i32 interfered with this fold.	2020-01-22 10:16:39 -05:00
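
The InstCombine entry above (342aad9ec5) describes sinking a negation through a left shift by rewriting A-B as B-A. As a rough illustration of why that rewrite is sound under wraparound arithmetic, the sketch below exhaustively checks the identity 0 - ((A - B) << C) == (B - A) << C at a small bit width. This is not LLVM code: the helper names (`neg`, `shl`, `sub`, `fold_holds`, `check_all`) and the 8-bit width are assumptions chosen for illustration only.

```python
# Illustrative model of fixed-width integer arithmetic; BITS is an assumed width.
BITS = 8
MASK = (1 << BITS) - 1


def neg(x):
    """Two's-complement negation modulo 2**BITS."""
    return (-x) & MASK


def shl(x, c):
    """Left shift that discards bits shifted out of the BITS-wide value."""
    return (x << c) & MASK


def sub(a, b):
    """Wrapping subtraction modulo 2**BITS."""
    return (a - b) & MASK


def fold_holds(a, b, c):
    """Check 0 - ((a - b) << c) == (b - a) << c for one set of operands."""
    return neg(shl(sub(a, b), c)) == shl(sub(b, a), c)


def check_all():
    """Exhaustively check the identity for every operand pair and shift amount."""
    return all(
        fold_holds(a, b, c)
        for a in range(1 << BITS)
        for b in range(1 << BITS)
        for c in range(BITS)
    )
```

Calling `check_all()` covers every 8-bit operand pair and every in-range shift amount. The rewritten form uses one subtraction and one shift where the original needed a subtraction, a shift, and a negation, which matches the commit's stated goal of not increasing instruction count.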


190510 Commits