llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

Author	SHA1	Message	Date
Florian Hahn	bf411cb019	[DSE] Hoist partial store merging code into function (NFC). Hoist the general logic into a new function, because it can be re-used by the MemorySSA backed DSE as well.	2020-06-15 17:44:24 +01:00
Jessica Paquette	36ed6c0076	[GlobalISel] Simplify G_ADD when it has (0-X) on the LHS or RHS This implements the following combines: ((0-A) + B) -> B-A (A + (0-B)) -> A-B Porting over the basic algebraic combines from the DAGCombiner. There are several combines which fold adds away into subtracts. This is just the simplest one. I noticed that add combines are some of the most commonly hit across CTMark, (via print statements when they fire), so I'm porting over some of the obvious ones. This gives some minor code size improvements on CTMark at -O3 on AArch64. Differential Revision: https://reviews.llvm.org/D77453	2020-06-15 09:43:24 -07:00
Francesco Petrogalli	68be5c9b07	[llvm][SVE] IR intrinsics for quadword permutation instructions. Summary: Adding intrinsics and codegen patterns for: * trn1 <Zd>.q, <Zm>.q, <Zn>.q * trn2 <Zd>.q, <Zm>.q, <Zn>.q * zip1 <Zd>.q, <Zm>.q, <Zn>.q * zip2 <Zd>.q, <Zm>.q, <Zn>.q * uzp1 <Zd>.q, <Zm>.q, <Zn>.q * uzp2 <Zd>.q, <Zm>.q, <Zn>.q These instructions are defined in Armv8.6-A. Reviewers: sdesmalen, efriedma, kmclaughlin Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80850	2020-06-15 16:21:56 +00:00
Matt Arsenault	6842150224	AMDGPU/GlobalISel: Fix 8-byte aligned, 96-bit scalar loads These are legal since we can do a 96-bit load on some subtargets, but this is only for vector loads. If we can't widen the load, it needs to be broken down once known scalar. For 16-byte alignment, widen to a 128-bit load.	2020-06-15 11:33:16 -04:00
Wouter van Oortmerssen	2a3fc0c8b2	[WebAssembly] Adding 64-bit versions of all load & store ops. Context: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md This is just a first step, adding the new instruction variants while keeping the existing 32-bit functionality working. Some of the basic load/store tests have new wasm64 versions that show that the basics of the target are working. Further features need implementation, but these will be added in followups to keep things reviewable. Differential Revision: https://reviews.llvm.org/D80769	2020-06-15 08:31:56 -07:00
Florian Hahn	32f1a435a4	[DSE,MSSA] Delete instructions after printing it. Also enables a now-passing test case, that exposed a crash caused by the wrong order.	2020-06-15 16:01:36 +01:00
Simon Pilgrim	7fc239ff4e	[X86][SSE] Add LowerVectorAllZero helper for checking if all bits of a vector are zero. Pull the lowering code out of LowerVectorAllZeroTest (and rename it MatchVectorAllZeroTest). We should be able to reuse this in combineVectorSizedSetCCEquality as well. Another cleanup to simplify D81547.	2020-06-15 15:54:38 +01:00
Stefan Pintilie	462120580a	[PowerPC] Do not add the relocation addend to the instruction encoding We should not be adding the relocation addend to the instruction encoding. This patch removes that and sets those bits to zero. Differential Revision: https://reviews.llvm.org/D81082	2020-06-15 09:51:34 -05:00
Florian Hahn	748ec7bf78	[DSE,MSSA] Add additional merging test cases (NFC). Additional tests added ahead of partial overlapping store merging.	2020-06-15 15:45:07 +01:00
Dominik Montada	e95e4419fe	[NFC] Remove unnecessary require global-isel from tests	2020-06-15 16:35:18 +02:00
Dominik Montada	c141103726	[NFC] Add braces to if-statement in MachineVerifier	2020-06-15 16:33:56 +02:00
Simon Pilgrim	085e5b1dfa	[X86][SSE] LowerVectorAllZeroTest - add support for >256-bit vectors Reduce by splitting the vector until we reach the target size for PTEST/MOVMSK_PCMPEQ. There might be some cases where AVX512 can perform this with 512-bit vectors but so far I haven't encountered any such pattern that reaches LowerVectorAllZeroTest. Prep work for D81547	2020-06-15 15:30:24 +01:00
Hans Wennborg	35f84c1504	Revert "[X86] Separate imm from relocImm handling." > relocImm was a complexPattern that handled both ConstantSDNode > and X86Wrapper. But it was only applied selectively because using > it would cause patterns to be not importable into FastISel or > GlobalISel. So it only got applied to flag setting instructions, > stores, RMW arithmetic instructions, and rotates. > > Most of the test changes are a result of making patterns available > to GlobalISel or FastISel. The absolute-cmp.ll change is due to > this fixing a pattern ordering issue to make an absolute symbol > match to an 8-bit immediate before trying a 32-bit immediate. > > I tried to use PatFrags to reduce the repetition, but I was getting > errors from TableGen. This caused "Invalid EmitNode" assertions, see the llvm-commits thread for discussion.	2020-06-15 16:14:59 +02:00
Simon Pilgrim	04f43be16a	[X86][SSE] LowerVectorAllZeroTest - remove unnecessary bitcasts matchScalarReduction should return all its source vectors with the same type, so we can safely perform the OR reduction with the original type. So we just need to bitcast for PTEST/PCMPEQB with the final reduced vector.	2020-06-15 15:13:13 +01:00
Yvan Roux	bd9277fece	[ARM][MachineOutliner] Fix no-lr-save testcase. Now that saving LR into a register is handled, some register constraints are needed to keep machine-outliner-no-lr-save.mir meaningful.	2020-06-15 16:09:31 +02:00
Kevin P. Neal	6090d8a44a	[strictfp] Replace dangling strictfp attrs with nobuiltin In preparation for a patch that will enforce new rules for the usage of the strictfp attribute, this patch introduces auto-upgrade behavior that will replace the strictfp attribute on callsites with nobuiltin if the enclosing function declaration doesn't also have the strictfp attribute. This auto-upgrade isn't being performed on .ll files because that would prevent us from writing a test for the forthcoming verifier behavior. Differential Revision: https://reviews.llvm.org/D70096	2020-06-15 10:05:35 -04:00
Yvan Roux	e595a6ec98	[ARM][MachineOutliner] Add LR RegSave mode. Outline chunks of code which need to save and restore the link register when a spare register can be used to it. Differential Revision: https://reviews.llvm.org/D80127	2020-06-15 15:22:08 +02:00
Daniel Kiss	6a4188bf64	[AArch64] Fix BTI instruction emission. Summary: SCTLR_EL1.BT[01] controls the PACI[AB]SP compatibility with PBYTE 11 (see [1]) This bit will be set to zero so PACI[AB]SP are equal to BTI C instruction only. [1] https://developer.arm.com/docs/ddi0595/b/aarch64-system-registers/sctlr_el1 Reviewers: chill, tamas.petz, pbarrio, ostannard Reviewed By: tamas.petz, ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81746	2020-06-15 15:04:36 +02:00
Matt Arsenault	192bd6b78c	AMDGPU/GlobalISel: Workaround some load/store type selection patterns The logic is written for what loads/stores should be selectable. There are a set of cases that should be selectable, but due to missing MVTs and/or selection patterns, will fail to select. I think eventually load/store select patterns should ignore the type and only look at the value size, but until that happens, bitcast these to equivalent i32 vectors.	2020-06-15 07:42:20 -04:00
Matt Arsenault	fdc405fac4	AMDGPU/GlobalISel: Use less artificial example to avoid abort=0 These were failing due to an unlegalizable G_CONCAT_VECTORS due to registers with types that are naturally illegal.	2020-06-15 07:37:15 -04:00
Matt Arsenault	e2f7032ceb	GlobalISel: Support lowering vector->vector G_BITCAST Extract subvectors and cast to the result element type before remerging.	2020-06-15 07:36:30 -04:00
James Henderson	2227cf704f	[DebugInfo] Report errors for truncated debug line standard opcode Standard opcodes usually have ULEB128 arguments, so it is generally not possible to recover from such errors. This patch causes the parser to stop parsing the table in such situations. Also don't emit the operands or add data to the table if there is an error reading these opcodes. Reviewed by: JDevlieghere Differential Revision: https://reviews.llvm.org/D81470	2020-06-15 11:50:12 +01:00
Georgii Rymar	5a6ecb6df8	[yaml2obj] - Introduce the "NoHeaders" key for "SectionHeaderTable" We have an issue currently. The following YAML piece just ignores the `Excluded` key. ``` SectionHeaderTable: Sections: [] Excluded: - Name: .foo ``` Currently the meaning is: exclude the whole table. The code checks that the `Sections` key is empty and doesn't catch/check invalid/duplicated/missed `Excluded` entries. Also there is no way to exclude all sections except the first null section, because `Sections: []` currently just excludes the whole the sections header table. To fix it, I suggest a change of the behavior. 1) A new `NoHeaders` key is added. It provides an explicit syntax to drop the whole table. 2) The meaning of the following is changed: ``` SectionHeaderTable: Sections: [] Excluded: - Name: .foo ``` Assuming there are 2 sections in the object (a null section and `.foo`), with this patch it means: exclude the `.foo` section, keep the null section. The null section is an implicit section and I think it is reasonable to make "Sections: []" to mean it is implicitly added. It will be consistent with the global "Sections" tag that is used to describe sections. 3) `SectionHeaderTable->Sections` is now optional. No `Sections` is the same as `Sections: []` (I think it avoids a confusion). 4) Using of `NoHeaders` together with `Sections`/`Excluded` is not allowed. 5) It is possible to use the `Excluded` key without the `Sections` key now (in this case `Excluded` must contain all sections). 6) `SectionHeaderTable:` or `SectionHeaderTable: []` is not allowed. 7) When the `SectionHeaderTable` key is present, we still require all sections to be present in `Sections` and `Excluded` lists. No changes here, we are still strict. Differential revision: https://reviews.llvm.org/D81655	2020-06-15 12:43:16 +03:00
Simon Pilgrim	9e1e48547c	[X86][SSE] Add tests for and/or reduction results compared to zero These should fold to memcmp/ptest/movmsk+cmpeq patterns	2020-06-15 10:40:45 +01:00
Kazushi (Jam) Marukawa	26169930ef	[VE] Support relocation information in MC layer Summary: Change VEAsmParser to support identification with relocation information in assembler. Change VEAsmBackend to support relocation information in MC layer. Change VEDisassembler and VEMCCodeEmitter to support binary generation of branch target operands. Add REFLONG fixup and variant kind to support new R_VE_REFLONG ELF symbol. And, add regression test in both MC and CodeGen to check binary generation with relocation information. Differential Revision: https://reviews.llvm.org/D81553	2020-06-15 11:24:53 +02:00
Dominik Montada	1fcb94df38	[MachineVerifier][GlobalISel] Check that branches have a MBB operand or are declared indirect. Add missing properties to G_BRJT, G_BRINDIRECT Summary: Teach MachineVerifier to check branches for MBB operands if they are not declared indirect. Add `isBarrier`, `isIndirectBranch` to `G_BRINDIRECT` and `G_BRJT`. Without these, `MachineInstr.isConditionalBranch()` was giving a false-positive for those instructions. Reviewers: aemerson, qcolombet, dsanders, arsenm Reviewed By: dsanders Subscribers: hiraditya, wdng, simoncook, s.egerton, arsenm, rovka, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81587	2020-06-15 11:17:09 +02:00
Max Kazantsev	ecfcde54b7	[Test] Add an example of unprofitable PR Phi insertion This test demonstrates weird behavior of SimplifyCFG: seems that bigger size of block leads to worse optimization choice.	2020-06-15 15:56:06 +07:00
Kirill Bobyrev	697d39cf6f	NFC: Make sure function arguments have the same name in declaration and definition This code generates Clang-Tidy warnings otherwise.	2020-06-15 10:45:08 +02:00
Sam Parker	8b9ad8cd92	[CostModel] getCFInstrCost in getUserCost. Have BasicTTI call the base implementation so that both agree on the default behaviour, which the default being a cost of '1'. This has required an X86 specific implementation as it seems to be very reliant on those instructions being free. Changes are also made to AMDGPU so that their implementations distinguish between cost kinds, so that the unrolling isn't affected. PowerPC also has its own implementation to prevent changes to the reg-usage vectorizer test. The cost model test changes now reflect that ret instructions are not generally free. Differential Revision: https://reviews.llvm.org/D79164	2020-06-15 09:28:46 +01:00
Kristina Bessonova	621bfa7a69	[CMake][runtimes] Skip adding 2nd set of the same variables for a generic target No need to parse and add the same variables twice if runtimes is being built for a generic target (i.e. w/o multilib). Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D81574	2020-06-15 09:59:27 +02:00
Sam Parker	0b700579c7	[NFCI][CostModel] Unify FNeg cost Enable TTIImpl::getUserCost to handle FNeg so that getInstructionThroughput can call that instead. This means we can remove the code in the AMDGPU backend too. Differential Revision: https://reviews.llvm.org/D81635	2020-06-15 08:33:04 +01:00
Nikita Popov	a6b0d19b37	[IR] Prefer hasFnAttribute() where possible (NFC) When checking for an enum function attribute, use hasFnAttribute() rather than hasAttribute() at FunctionIndex, because it is significantly faster (and more concise to boot).	2020-06-15 09:30:35 +02:00
Sam Parker	3cdcaa128d	[CostModel] Unify ExtractElement cost. Move the cost modelling, with the reduction pattern matching, from getInstructionThroughput into generic TTIImpl::getUserCost. The modelling in the AMDGPU backend can now be removed. Differential Revision: https://reviews.llvm.org/D81643	2020-06-15 08:27:14 +01:00
Max Kazantsev	386e5fca32	[NFC] Bail early simplifying unconditional branches	2020-06-15 13:59:53 +07:00
Sam Parker	00ff45a99f	Revert "Return "[InstCombine] Simplify compare of Phi with constant inputs against a constant"" This reverts commit 23291b9863c8af7ad348c4a7d85d8d784df88eb1. This caused performance regressions.	2020-06-15 07:46:28 +01:00
Vitaly Buka	22b9601a83	[SafeStack,NFC] Make StackColoring read-only Move core which removes markers out of StackColoring.	2020-06-14 23:05:43 -07:00
Vitaly Buka	3c4075434b	[SafeStack,NFC] Remove unneeded branch	2020-06-14 23:05:43 -07:00
Vitaly Buka	e294a694ad	[SafeStack,NFC] Fix naming style	2020-06-14 23:05:42 -07:00
Vitaly Buka	b33eb1350f	[SafeStack,NFC] Cleanup LiveRange interface	2020-06-14 23:05:42 -07:00
Vitaly Buka	31d8a45d02	[SafeStack,NFC] "const" cleanup	2020-06-14 23:05:42 -07:00
Vitaly Buka	ed2fa49015	[SafeStack,NFC] Add BlockLifetimeInfo constructor	2020-06-14 23:05:42 -07:00
Vitaly Buka	b833df85b9	[SafeStack,NFC] Use IntrinsicInst instead of Instruction	2020-06-14 23:05:41 -07:00
Vitaly Buka	6289daf91f	[SafeStack,NFC] Move ClColoring into SafeStack.cpp This allows to reuse the code in other components.	2020-06-14 23:05:41 -07:00
Vitaly Buka	caee47ac16	[SafeStack,NFC] Move unconditional code into constructor Prepare to move ClColoring from SafeStackCode to SafeStackLayout. This will allow to reuse the code in other components.	2020-06-14 23:05:41 -07:00
Max Kazantsev	83ea79c99d	[Test] Update test with check script, add two more motivating cases	2020-06-15 12:41:46 +07:00
Chen Zheng	34605b3847	[PowerPC] fma chain break to expose more ILP This patch tries to reassociate two patterns related to FMA to expose more ILP on PowerPC. // Pattern 1: // A = FADD X, Y (Leaf) // B = FMA A, M21, M22 (Prev) // C = FMA B, M31, M32 (Root) // --> // A = FMA X, M21, M22 // B = FMA Y, M31, M32 // C = FADD A, B // Pattern 2: // A = FMA X, M11, M12 (Leaf) // B = FMA A, M21, M22 (Prev) // C = FMA B, M31, M32 (Root) // --> // A = FMUL M11, M12 // B = FMA X, M21, M22 // D = FMA A, M31, M32 // C = FADD B, D Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D80175	2020-06-15 00:00:04 -04:00
Wenlei He	f6c8ca4756	[NewPM] Avoid redundant CGSCC run for updated SCC Summary: When an SCC got split due to inlining, we have two mechanisms for reprocessing the updated SCC, first is UR.UpdatedC that repeatedly rerun the new, current SCC; second is a worklist for all newly split SCCs. We can avoid rerun of the same SCC when the SCC is set to be processed by both mechanisms back to back. In pathological cases, such redundant rerun could cause exponential size growth due to inlining along cycles, even when there's no SCC mutation and hence convergence is not a problem. Note that it's ok to have SCC updated and rerun immediately, and also in the work list if we have actually moved an SCC to be topologically "below" the current one due to merging. In that case, we will need to revisit the current SCC after those moved SCCs. For that reason, the redundant avoidance here only targets back to back rerun of the same SCC - the case described by the now removed FIXME comment. Reviewers: chandlerc, wmi Subscribers: llvm-commits, hoy Tags: #llvm Differential Revision: https://reviews.llvm.org/D80589	2020-06-14 19:54:52 -07:00
Kang Zhang	e63e26dab3	[PowerPC] Add some InstAlias for mtspr/mfspr instructions Summary: We have defined MTSPR/MFSPR and MTSPR8/MFSPR8, but we only defined mtspr/mfspr InstAlias for some MTSPR/MFSPR. This patch is to add the InstAlias definitions for MTSPR8/MFSPR8, and add the some new mtspr/mfspr InstAlias we may use. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D77531	2020-06-15 02:43:13 +00:00
Chen Zheng	066d3ffc05	[PowerPC] fold a bug for rlwinm folding when with full mask. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D81006	2020-06-14 21:27:01 -04:00
Simon Pilgrim	1ecee09e7c	[X86][SSE] Fold BITOP(MOVMSK(X),MOVMSK(Y)) -> MOVMSK(BITOP(X,Y)) Reduce XMM->GPR traffic by performing bitops on the vectors, and using a single MOVMSK call. This requires us to use vectors of the same size and element width, but we can mix fp/int type equivalents with suitable bitcasting.	2020-06-14 21:37:58 +01:00
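
The fold in the last entry above, BITOP(MOVMSK(X),MOVMSK(Y)) -> MOVMSK(BITOP(X,Y)), rests on the fact that a bitwise operation acts on each bit position independently, so the sign bit of each result lane depends only on the sign bits of the input lanes. The following is a minimal sketch of that identity, not LLVM code: it models a vector as a plain list of unsigned 32-bit lane values, and the names `movmsk` and `lanewise` are hypothetical helpers introduced only for illustration.

```python
import operator

LANE_BITS = 32  # assumed lane width for this toy model


def movmsk(lanes, width=LANE_BITS):
    """Pack the top (sign) bit of every lane into an integer; lane 0 maps to bit 0."""
    mask = 0
    for index, lane in enumerate(lanes):
        sign_bit = (lane >> (width - 1)) & 1
        mask |= sign_bit << index
    return mask


def lanewise(op, x, y):
    """Apply a bitwise operator to corresponding lanes of two equal-length vectors."""
    return [op(a, b) for a, b in zip(x, y)]


x = [0x80000000, 0x00000001, 0xFFFFFFFF, 0x7FFFFFFF]
y = [0x80000000, 0x80000000, 0x00000000, 0x00000000]

for bitop in (operator.and_, operator.or_, operator.xor):
    two_movmsks = bitop(movmsk(x), movmsk(y))   # two mask extractions, then a scalar op
    one_movmsk = movmsk(lanewise(bitop, x, y))  # one vector op, then a single extraction
    assert two_movmsks == one_movmsk
```

Both forms give the same mask for AND, OR and XOR, while the second form needs only one mask extraction, which is the reduction in XMM->GPR traffic the commit message describes.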

198403 Commits