llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00

Author	SHA1	Message	Date
Roman Lebedev	942c801f82	[Reduce] Argument reduction: do properly handle invoke insts (PR46819) replaceFunctionCalls() is very non-exhaustive, it only handles CallInst's. Which means, by the time we drop old function, there may still be uses of it lurking around. Let's instead whack-a-mole them all by replacing with undef. I'm not sure this is the best handling, especially for calls, but IMO poorly reduced input is much better than crashing reduction tool. A (previously-crashing!) test added. Fixes https://bugs.llvm.org/show_bug.cgi?id=46819	2020-07-26 01:29:00 +03:00
Roman Lebedev	c62f97f67d	[Reduce] Basic block reduction: do properly handle invoke insts (PR46818) Terminator may have returned value, so we need to replace uses, and in general handle invoke as a branch inst. I'm not sure this is the best handling, but IMO poorly reduced input is much better than crashing reduction tool. A (previously-crashing!) test added. Fixes https://bugs.llvm.org/show_bug.cgi?id=46818	2020-07-26 01:28:59 +03:00
Lang Hames	ef607c6171	[ORC] Rename TargetProcessControl DynamicLibraryHandle and loadLibrary. The new names, DylibHandle and loadDylib, are more concise and make clear that these utilities are for loading dynamic libraries, not static ones.	2020-07-25 15:21:43 -07:00
Lang Hames	8dd38c568c	[ORC] Don't require PageSize or Triple during TargetProcessControl construction Subclasses will commonly gather that information from a remote during construction, in which case they won't have meaningful values to pass to TargetProcessControl's constructor.	2020-07-25 15:21:43 -07:00
Philip Reames	dfcbe0aef2	[Statepoints] Support lowering gc relocations to virtual registers (Disabled under flag for the moment) This is part of a larger project wherein we are finally integrating lowering of gc live operands with the register allocator. Today, we force spill all operands in SelectionDAG. The code to do so is distinctly non-optimal. The approach this patch is working towards is to instead lower the relocations directly into the MI form, and let the register allocator pick which ones get spilled and which stack slots they get spilled to. In terms of performance, the latter part is actually more important as it avoids redundant shuffling of values between stack slots. This particular change adds ISEL support to produce the variadic def STATEPOINT form required by the above. In particular, the first N are lowered to variadic tied def/use pairs. So new statepoint looks like this: reloc1,reloc2,... = STATEPOINT ..., base1, derived1<tied-def0>, base2, derived2<tied-def1>, ... N is limited by the maximal number of tied registers machine instruction can have (15 at the moment). The current patch is restricted to handling relocations within a single basic block. Cross block relocations (e.g. invokes) are handled via the legacy mechanism. This restriction will be relaxed in future patches. Patch By: dantrushin Differential Revision: https://reviews.llvm.org/D81648	2020-07-25 14:26:05 -07:00
Craig Topper	a376bad472	[X86] Add llvm.roundeven test cases. Add f80 test cases for constrained intrinsics that lower to libcalls. NFC	2020-07-25 13:29:47 -07:00
Craig Topper	12c5aa4515	[X86] Fix intrinsic names in strict fp80 tests to use f80 in their names instead of x86_fp80. The type is called x86_fp80, but when it is printed in the intrinsic name it should be f80. The parser doesn't seem to care that the name was wrong.	2020-07-25 13:12:49 -07:00
LLVM GN Syncbot	91051ca099	[gn build] Port 136c8f50e96	2020-07-25 18:51:58 +00:00
Roman Lebedev	f9cbb7144f	[Reduce] Try turning function definitions into declarations first, NFCI-ish ReduceFunctions could do it, but it also replaces all calls with undef, so if any of undef replacements makes reduction uninteresting, it won't work. ReduceBasicBlocks also could do it, but well, it may take many guesses for all the blocks of a function to happen to be out-of-chunk, which is not a very efficient way to go about it. So let's just do this first.	2020-07-25 21:43:36 +03:00
Florian Hahn	2f84723455	[X86] Remove stress-scheduledagrrlist.ll. This test seems to take quite a long time with EXPENSIVE_CHECKS. Remove it.	2020-07-25 15:45:24 +01:00
Nikita Popov	23a372d0f2	[LVI] Don't require operand number for range (NFC) Pass the Value* instead of the operand number, rename I to CxtI. This makes the function a bit more generally useful.	2020-07-25 16:33:45 +02:00
Matt Arsenault	4251a6ded6	AMDGPU/GlobalISel: Don't assert on G_INSERT > 128-bits Just fallback for now. Really tablegen needs to generate all of the subregister index handling we need.	2020-07-25 10:05:44 -04:00
Nikita Popov	004e042624	[SCCP] Add assume non null test (NFC)	2020-07-25 16:02:15 +02:00
Nikita Popov	86610c55e0	[SCCP] Restore the change reporting as well Reapply 5db5b4bc4394ca247c9eb665e03b851848aa2fbf.	2020-07-25 15:11:30 +02:00
Nikita Popov	74624d8649	Reapply [SCCP] Directly remove non-feasible edges Reapply with DTU update moved after CFG update, which is a requirement of the API. ----- Non-feasible control-flow edges are currently removed by replacing the branch condition with a constant and then calling ConstantFoldTerminator. This happens in a rather roundabout manner, by inspecting the users (effectively: predecessors) of unreachable blocks, and further complicated by the need to explicitly materialize the condition for "forced" edges. I would like to extend SCCP to discard switch conditions that are non-feasible based on range information, but this is incompatible with the current approach (as there is no single constant we could use.) Instead, this patch explicitly removes non-feasible edges. It currently only needs to handle the case where there is a single feasible edge. The llvm_unreachable() branch will need to be implemented for the aforementioned switch improvement. Differential Revision: https://reviews.llvm.org/D84264	2020-07-25 14:52:35 +02:00
Simon Pilgrim	b71064b07e	SimplifyLibCalls - remove unnecessary header and forward declaration. NFC. We include TargetLibraryInfo.h so don't need to forward declare it, and we don't need to include TargetLibraryInfo.h in SimplifyLibCalls.cpp as well.	2020-07-25 12:58:39 +01:00
Simon Pilgrim	85fa3b1299	[X86][SSE] combineX86ShufflesRecursively - move all Root node asserts to the same location. NFCI. Minor tidyup for some upcoming shuffle combine improvements.	2020-07-25 12:48:14 +01:00
Simon Pilgrim	a3ff65cfbb	SymbolRemappingReader.h - pass Twine by reference not value. NFCI.	2020-07-25 12:48:14 +01:00
Florian Hahn	c1dc5ad58a	[IPSCCP] Drop argmemonly after replacing pointer argument. This patch updates IPSCCP to drop argmemonly and inaccessiblemem_or_argmemonly if it replaces a pointer argument. Fixes PR46717. Reviewers: efriedma, davide, nikic, jdoerfert Reviewed By: efriedma, jdoerfert Differential Revision: https://reviews.llvm.org/D84432	2020-07-25 11:52:14 +01:00
Nathan James	edd124eebe	Fix C2975 error under MSVC Apparently a constexpr value isn't a compile time constant under certain versions of MSVC.	2020-07-25 11:03:59 +01:00
Simon Pilgrim	5a6feccd46	[X86][SSE] getFauxShuffle - ignore undemanded sources for PACKSS/PACKUS faux shuffles If we don't care about an entire LHS/RHS of the PACK op, then can just treat it the same as undef (we don't care if it saturates) and is safe to treat as a shuffle. This can happen if we attempt to decode as a faux shuffle before SimplifyDemandedVectorElts has been called on the PACK which should replace the source with UNDEF entirely.	2020-07-25 10:51:14 +01:00
Nathan James	8b5de07c6a	[ADT] Add a range-based version of std::move Adds a range-based version of `std::move`, the version that moves a range, not the one that creates r-value references. Reviewed By: dblaikie, gamesh411 Differential Revision: https://reviews.llvm.org/D83902	2020-07-25 10:37:34 +01:00
Jessica Paquette	0266a8b395	[AArch64][GlobalISel] Look through constants when selecting stores of 0 Very minor code size improvements (hits 8 times in Bullet at -O3), but still something. Also very minor NFC change to make sure we only search for a 0 constant when selecting a store. Before, we'd do this for loads as well. Differential Revision: https://reviews.llvm.org/D84573	2020-07-24 22:46:14 -07:00
Amy Kwan	2f09b60413	[PowerPC] Exploit the High Order Vector Multiply Instructions on Power10 This patch aims to exploit the following vector multiply high instructions on Power10. vmulhsw VRT, VRA, VRB vmulhsd VRT, VRA, VRB vmulhuw VRT, VRA, VRB vmulhud VRT, VRA, VRB Differential Revision: https://reviews.llvm.org/D82584	2020-07-24 20:57:57 -05:00
Rong Xu	f48156accc	[PGO] Fix incorrect function entry count Function entry count might be zero after the profile counts reset and before reentry to the function. Zero profile entry count is very bad as the profile count from BFI will be wrong. A simple fix is to set the profile entry count to 1 if there are non-zero profile counts in this function. Differential Revision: https://reviews.llvm.org/D84378	2020-07-24 17:39:55 -07:00
Rong Xu	873a89587d	[PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction Skip profile count promotion if any of the ExitBlocks contains a ret instruction. This is to prevent dumping of incomplete profile -- if the loop is a long running loop and dump is called in the middle of the loop, the result profile is incomplete. ExitBlocks containing a ret instruction is an indication of a long running loop -- early exit to error handling code. Differential Revision: https://reviews.llvm.org/D84379	2020-07-24 17:38:31 -07:00
Rong Xu	ab6b85942c	Revert "[PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction" This reverts commit 6fdc6f6c7d34af60c45c405f448370a684ef6f2a.	2020-07-24 17:35:44 -07:00
Rong Xu	84b04d6437	Revert "[PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction" This reverts commit 867ef4472d8e57384c929e4f06b74d1ac8883a99.	2020-07-24 17:33:49 -07:00
Rong Xu	ce1bcf4d0a	[PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction Forgot including the tests in the commit 6fdc6f6c7d34af60c4.	2020-07-24 17:23:33 -07:00
Amy Kwan	dedffec69e	[PowerPC] Implement Truncate and Store VSX Vector Builtins This patch implements the `vec_xst_trunc` function in altivec.h in order to utilize the Store VSX Vector Rightmost [byte \| half \| word \| doubleword] Indexed instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82467	2020-07-24 19:22:39 -05:00
Jessica Paquette	b6d2a38769	[AArch64][GlobalISel] Use wzr/xzr for 16 and 32 bit stores of zero We weren't performing this optimization on 16 and 32 bit stores. SDAG happily does this though. e.g. https://godbolt.org/z/cWocKr This saves about 0.2% in code size on CTMark at -O3. Differential Revision: https://reviews.llvm.org/D84568	2020-07-24 17:15:20 -07:00
Rong Xu	27560ce987	[PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction Skip profile count promotion if any of the ExitBlocks contains a ret instruction. This is to prevent dumping of incomplete profile -- if the loop is a long running loop and dump is called in the middle of the loop, the result profile is incomplete. ExitBlocks containing a ret instruction is an indication of a long running loop -- early exit to error handling code. Differential Revision: https://reviews.llvm.org/D84379	2020-07-24 17:13:58 -07:00
Matt Arsenault	16ca0e0369	GlobalISel: Define mulfix/divfix opcodes The full expansion involves the funnel shifts, which depend on another patch to expand those.	2020-07-24 20:02:20 -04:00
Amara Emerson	8f9bed84e6	[AArch64][GlobalISel] Promote G_UITOFP vector operands to same elt size as result. Fixes legalization failures.	2020-07-24 17:00:50 -07:00
Alina Sbirlea	0651a23551	Reapply "[DomTree] Replace ChildrenGetter with GraphTraits over GraphDiff." This is the part of the patch that's moving the Updates to a CFGDiff object. Splitting off from the clean-up work merging the two branches when BUI is null. Differential Revision: https://reviews.llvm.org/D77341	2020-07-24 14:10:50 -07:00
Matt Arsenault	26921b3bb5	AMDGPU: Skip other terminators before inserting s_cbranch_exec[n]z PHIElimination/createPHISourceCopy inserts non-branch terminators after the control flow pseudo if a successor phi reads a register defined by the control flow pseudo. If this happens, we need to split the expansion of the control flow pseudo to ensure all the branches are after all of the other mask management instructions. GlobalISel hit this in test cases that happened to be tail duplicated. The original testcase still does not work, since the same problem appears to be present in a later pass.	2020-07-24 16:51:59 -04:00
Eli Friedman	d2a7f965f0	[AArch64][SVE] Add "fast" fcmp operations. dacf8d3 added support for most fcmp operations, but there are some extra variations I hadn't considered: SelectionDAG supports float comparisons that are neither ordered nor unordered. Add support for the missing operations. Differential Revision: https://reviews.llvm.org/D84460	2020-07-24 13:22:41 -07:00
Johannes Doerfert	b8680170b1	[SROA] Teach promote to register about droppable instructions This is the second of two patches to address PR46753. We basically allow SROA to promote allocas that are used in droppable instructions, for now that means `llvm.assume`. The (transitive) uses are replaced by `undef` in the droppable instructions. See also D83976. Reviewed By: Tyker Differential Revision: https://reviews.llvm.org/D83978	2020-07-24 15:15:39 -05:00
Johannes Doerfert	ac3ceab3a2	[Mem2Reg] Teach promote to register about droppable instructions This is the first of two patches to address PR46753. We basically allow mem2reg to promote allocas that are used in droppable instructions, for now that means `llvm.assume`. The uses of the alloca (or a bitcast or zero offset GEP from there) are replaced by `undef` in the droppable instructions. Reviewed By: Tyker Differential Revision: https://reviews.llvm.org/D83976	2020-07-24 15:15:38 -05:00
Johannes Doerfert	e393443e6f	[SROA][Mem2Reg] Do not crash on alloca + addrspacecast SROA knows that it can look through addrspacecast but PromoteMemoryToRegister did not handle them. This caused an assertion error for the test case, exposed while running `Transforms/PhaseOrdering/inlining-alignment-assumptions.ll` with D83978 applied. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D84085	2020-07-24 15:15:38 -05:00
Gui Andrade	519182b1ba	[MSAN] Allow inserting array checks Flattens arrays by ORing together all their elements. Differential Revision: https://reviews.llvm.org/D84446	2020-07-24 20:12:58 +00:00
Nemanja Ivanovic	fe8cc25a97	[PowerPC] Fix computation of offset for load-and-splat for permuted loads Unfortunately this is another regression from my canonicalization patch (1fed131660b2). The patch contained two implicit assumptions: 1. That we would have a permuted load only if we are loading a partial vector 2. That a partial vector load would necessarily be as wide as the splat However, assumption 2 is not correct since it is possible to do a wider load and only splat a half of it. This patch corrects this assumption by simply checking if the load is permuted and adjusting the offset if it is.	2020-07-24 15:38:46 -04:00
Martin Storsjö	1d7926a981	[MC] [COFF] Make sure that weak external symbols are undefined symbols For comdats (e.g. caused by -ffunction-sections), Section is already set here; make sure it's null, for the weak external symbol to be undefined. This fixes PR46779. Differential Revision: https://reviews.llvm.org/D84507	2020-07-24 22:15:08 +03:00
Martin Storsjö	e8a94ed276	[llvm-lib] Support adding short import library objects with llvm-lib This fixes PR 42837. Differential Revision: https://reviews.llvm.org/D84465	2020-07-24 22:15:08 +03:00
Arthur Eubanks	b8cfb14c48	Rename scoped-noalias -> scoped-noalias-aa Summary: To match NewPM name. Also the new name is clearer and more consistent. Subscribers: jvesely, nhaehnle, hiraditya, asbirlea, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D84542	2020-07-24 12:14:27 -07:00
Valentin Clement	d92fc8cf98	[openmp] Clean up OMPKinds.def remove OMP_DIRECTIVE This patch removes the OMP_DIRECTIVE definition from OMPKinds.def since they are now defined in OMP.td and OMP_DIRECTIVE is not used anymore in the code. Reviewed By: jdenny Differential Revision: https://reviews.llvm.org/D84329	2020-07-24 15:06:54 -04:00
madhur13490	2e4d158121	[AMDGPU] Fix incorrect arch assert while setting up FlatScratchInit Reviewers: arsenm, foad, rampitec, scott.linder Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D84391	2020-07-24 18:19:04 +00:00
Craig Topper	a1cd135fcb	[X86] Move the implicit enabling of sse2 for 64-bit mode from X86Subtarget::initSubtargetFeatures to X86_MC::ParseX86Triple. ParseX86Triple already checks for 64-bit mode and produces a static string. We can just add +sse2 to the end of that static string. This avoids a potential reallocation when appending it to the std::string at runtime. This is a slight change to the behavior of tools that only use MC layer which weren't implicitly enabling sse2 before, but will now. I don't think we check for sse2 explicitly in any MC layer components so this shouldn't matter in practice. And if it did matter the new behavior is more correct.	2020-07-24 11:14:20 -07:00
Francesco Petrogalli	5f0a55fcfc	[llvm][sve] Reg + Imm addressing mode for ld1ro. Reviewers: kmclaughlin, efriedma, sdesmalen Subscribers: tschuett, hiraditya, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83357	2020-07-24 17:48:47 +00:00
Craig Topper	ab68f0ac7c	[X86] Use X86_MC::ParseX86Triple to add mode features to feature string in X86Subtarget::initSubtargetFeatures. Remove mode flags from constructor and remove calls to ToggleFeature for the mode bits. By adding them to the feature string we handle initializing the mode member variables in X86Subtarget and the feature bits in MCSubtargetInfo in one shot.	2020-07-24 10:48:22 -07:00


200851 Commits