llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00

Author	SHA1	Message	Date
Nico Weber	6d9082f392	[gn build] (port) 64bc44f5dd and f8de9aaef2f some more	2021-04-28 09:59:07 -04:00
Paul C. Anagnostopoulos	545a4d4a19	[TableGen] Add the !find bang operator !find searches a source string for a target string and returns the position. Differential Revision: https://reviews.llvm.org/D101318	2021-04-28 09:51:00 -04:00
Tres Popp	e6ae255924	Silence unused variable warning	2021-04-28 15:46:09 +02:00
Alexey Bataev	581f1564fc	[SLP]Try to vectorize tiny trees with shuffled gathers. If the first tree element is vectorized and the second is a gather, it still might be profitable to vectorize it if the gather node contains fewer scalars to vectorize than the original tree node. It might be profitable to use shuffles. Differential Revision: https://reviews.llvm.org/D101397	2021-04-28 06:35:31 -07:00
Roman Lebedev	607fd6e27f	[NFC][InlineCost] Add tests for D101228	2021-04-28 16:21:14 +03:00
Matt Arsenault	bb6d824b20	GlobalISel: Relax verification of physical register copy types This was picking a concrete size for a physical register, and enforcing exact match on the virtual register's type size. Some targets add multiple types to a register class, and some are smaller than the full bit width. For example x86 adds f32 to 128-bit xmm registers, and AMDGPU adds i16/f16 to 32-bit registers. It might be better to represent these cases as a copy of the full register and an extraction of the subpart, but a lot of code assumes you can directly copy. This will help fix the current usage of the DAG calling convention infrastructure which is incompatible with how GlobalISel is now using it. The API is somewhat cumbersome here, but I just mirrored the existing functions, except now with LLTs (and allow returning null on failure, unlike the MVT version). I think the concept of selecting register classes based on type is flawed to begin with, but I'm trying to keep this compatible with the existing handling.	2021-04-28 08:45:41 -04:00
David Sherwood	c7b7c36b44	[LoopVectorize] Simplify scalar cost calculation in getInstructionCost This patch simplifies the calculation of certain costs in getInstructionCost when isScalarAfterVectorization() returns a true value. There are a few places where we multiply a cost by a number N, i.e. unsigned N = isScalarAfterVectorization(I, VF) ? VF.getKnownMinValue() : 1; return N * TTI.getArithmeticInstrCost(... After some investigation it seems that there are only these cases that occur in practice: 1. VF is a scalar, in which case N = 1. 2. VF is a vector. We can only get here if: a) the instruction is a GEP/bitcast/PHI with scalar uses, or b) this is an update to an induction variable that remains scalar. I have changed the code so that N is assumed to always be 1. For GEPs the cost is always 0, since this is calculated later on as part of the load/store cost. PHI nodes are costed separately and were never previously multiplied by VF. For all other cases I have added an assert that none of the users needs scalarising, which didn't fire in any unit tests. Only one test required fixing and I believe the original cost for the scalar add instruction to have been wrong, since only one copy remains after vectorisation. I have also added a new test for the case when a pointer PHI feeds directly into a store that will be scalarised as we were previously never testing it. Differential Revision: https://reviews.llvm.org/D99718	2021-04-28 13:41:07 +01:00
Sander de Smalen	defd6742c0	[LV] Calculate max feasible scalable VF. This patch also refactors the way the feasible max VF is calculated, although this is NFC for fixed-width vectors. After this change scalable VF hints are no longer truncated/clamped to a shorter scalable VF, nor does it drop the 'scalable flag' from the suggested VF to vectorize with a similar VF that is fixed. Instead, the hint is ignored which means the vectorizer is free to find a more suitable VF, using the CostModel to determine the best possible VF. Reviewed By: c-rhodes, fhahn Differential Revision: https://reviews.llvm.org/D98509	2021-04-28 12:30:00 +01:00
Alex Richardson	80024975b4	[llvm-objdump] Fix dumping dynamic relative relocations for SHT_REL Previously printing R_386_RELATIVE relocations would trigger `error: can't read an entry at 0x40: it goes past the end of the section (0x40)` I found this while writing a test case for LLD (D100490). This also includes some minor cleanup in the elf-dynamic-relcos.test llvm-objdump test based on the newly added test. Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D100489	2021-04-28 12:23:00 +01:00
Alex Richardson	14442d2503	[update_(llc_)test_checks.py] Support pre-processing commands This has been rather useful in our downstream CHERI target where we want to run tests both with addrspace(0) and addrspace(200) pointers. With this patch we can prefix the opt command with `sed -e 's/addrspace(200)/addrspace(0)/g' -e 's/-A200-P200-G200//g'` to test both cases using the same IR input. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95137	2021-04-28 12:19:19 +01:00
Tres Popp	f893fcdd67	Revert "[loop-idiom] Hoist loop memcpys to loop preheader" This reverts commit 75d6b8bb4056d518d06b72e6411ce3749455e2e3. The reasoning is mentioned in https://reviews.llvm.org/D97667	2021-04-28 13:16:34 +02:00
Roman Lebedev	ea38d76270	[NFC][SimplifyCFG] Move sink-common-code.ll into X86 There are post-commit notes for e4c61d5 that suggest the test is failing on certain bots. It looks like the code there isn't being moved, which suggests cost-model involvement, which suggests that we need to hardcode the target triple. Hopefully this helps?	2021-04-28 14:10:25 +03:00
Roman Lebedev	ea880acf5f	[NFC][Verifier] Split token1.ll into two, assert/non-assert versions	2021-04-28 13:58:38 +03:00
Kerry McLaughlin	d7d3aead75	[LoopVectorize] Prevent multiple Phis being generated with in-order reductions When using the -enable-strict-reductions flag where UF>1 we generate multiple Phi nodes, though only one of these is used as an input to the vector.reduce.fadd intrinsics. The unused Phi nodes are removed later by instcombine. This patch changes widenPHIInstruction/fixReduction to only generate one Phi, and adds an additional test for unrolling to strict-fadd.ll Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D100570	2021-04-28 11:29:01 +01:00
Jingu Kang	8d3d46f61b	[IRCE] Add tests for conservative bound check Prevent cases in which the start value of IV is bigger than bound for increasing. Prevent cases in which the start value of IV is smaller than bound for decreasing. Differential Revision: https://reviews.llvm.org/D101174	2021-04-28 11:14:21 +01:00
Benjamin Kramer	aa22c800d0	[ADT] Make TrackingStatistic's ctor constexpr This lets clang diagnose unused statistics, so remove them.	2021-04-28 12:00:17 +02:00
Qiu Chaofan	162a7c0f22	[PowerPC] Fix SELECT_CC with i64 operand on PPC32 This patch fixes the infinite loop in legalization of PPC32 SELECT_CC with 64-bit operand.	2021-04-28 17:48:33 +08:00
Stephen Tozer	eb1b6f2103	[DebugInfo] Drop DBG_VALUE_LISTs with an excessive number of debug operands This patch fixes a crash in LiveDebugVariables for inputs where a DBG_VALUE_LIST had 64 or more debug operands. This was triggering an assert, which was added under the assumption that only bad CodeGen would result in such a limit being hit, but relatively simple source files that result in these incredibly long debug values have been found, so this assert has been changed to a condition that drops the debug value if it is not met. Differential Revision: https://reviews.llvm.org/D101373	2021-04-28 10:39:02 +01:00
Joe Ellis	6dcca4e712	[AArch64] Add missing UINT_TO_FP promotions for v16i8 Differential Revision: https://reviews.llvm.org/D101042	2021-04-28 08:49:15 +00:00
Wang, Pengfei	448cd98138	[X86][AMX][NFC] Add more comments and remove unnecessary check found by Klocwork	2021-04-28 16:35:17 +08:00
Hans Wennborg	be614cd1fc	Require asserts for llvm/test/Verifier/token1.ll The test expects an assert, and that only works in asserts-enabled builds.	2021-04-28 09:58:36 +02:00
RamNalamothu	31bdab3eb7	[NFC] Refactor how CFI section types are represented in AsmPrinter In terms of readability, the `enum CFIMoveType` didn't better document what it intends to convey i.e. the type of CFI section that gets emitted. Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D76519	2021-04-28 09:04:04 +05:30
Nico Weber	1d5c16abd4	[clang/Basic] Make TargetInfo.h not use DataLayout again Reverts parts of https://reviews.llvm.org/D17183, but keeps the resetDataLayout() API and adds an assert that checks that datalayout string and user label prefix are in sync. Approach 1 in https://reviews.llvm.org/D17183#2653279 Reduces number of TUs built for 'clang-format' from 689 to 575. I also implemented approach 2 in D100764. If someone feels motivated to make us use DataLayout more, it's easy to revert this change here and go with D100764 instead. I don't plan on doing more work in this area though, so I prefer going with the smaller, more self-consistent change. Differential Revision: https://reviews.llvm.org/D100776	2021-04-27 22:26:10 -04:00
Nico Weber	9b37a570c6	[gn build] (manually) port 82d3c0759fa0	2021-04-27 22:25:55 -04:00
Jim Radford	ba54030614	[CMake][llvm] add missing include to LLVMCheckLinkerFlag Differential Revision: https://reviews.llvm.org/D101417	2021-04-27 18:48:52 -07:00
Dávid Bolvanský	4d8906fa83	[DSE] Eliminate zero memset after calloc Solves PR11896 As noted, this can be improved further (calloc -> malloc) in some cases. But for now, this is the first step. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101391	2021-04-28 03:30:52 +02:00
Hongtao Yu	92586868ec	[CSSPGO] Fix an AV caused by a block that has only pseudo pseudo instructions. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D101415	2021-04-27 17:54:34 -07:00
Han Zhu	41672a71c6	[loop-idiom] Hoist loop memcpys to loop preheader For a simple loop like: ``` struct S { int x; int y; char b; }; unsigned foo(S* __restrict__ a, S* b, int n) { for (int i = 0; i < n; i++) a[i] = b[i]; return sizeof(a[0]); } ``` We could eliminate the loop and convert it to a large memcpy of 12n bytes. Currently this is not handled. Output of `opt -loop-idiom -S < memcpy_before.ll` ``` %struct.S = type { i32, i32, i8 } define dso_local i32 @_Z3fooP1SS0_i(%struct.S noalias nocapture %a, %struct.S* nocapture readonly %b, i32 %n) local_unnamed_addr { entry: %cmp7 = icmp sgt i32 %n, 0 br i1 %cmp7, label %for.body.preheader, label %for.cond.cleanup for.body.preheader: ; preds = %entry br label %for.body for.cond.cleanup.loopexit: ; preds = %for.body br label %for.cond.cleanup for.cond.cleanup: ; preds = %for.cond.cleanup.loopexit, %entry ret i32 12 for.body: ; preds = %for.body, %for.body.preheader %i.08 = phi i32 [ %inc, %for.body ], [ 0, %for.body.preheader ] %idxprom = zext i32 %i.08 to i64 %arrayidx = getelementptr inbounds %struct.S, %struct.S* %b, i64 %idxprom %arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %a, i64 %idxprom %0 = bitcast %struct.S* %arrayidx2 to i8* %1 = bitcast %struct.S* %arrayidx to i8* call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 4 dereferenceable(12) %0, i8* nonnull align 4 dereferenceable(12) %1, i64 12, i1 false) %inc = add nuw nsw i32 %i.08, 1 %cmp = icmp slt i32 %inc, %n br i1 %cmp, label %for.body, label %for.cond.cleanup.loopexit } ; Function Attrs: argmemonly nofree nosync nounwind willreturn declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg) #0 attributes #0 = { argmemonly nofree nosync nounwind willreturn } ``` The loop idiom pass currently only handles load and store instructions. Since struct S is too big to fit in a register, the loop body contains a memcpy intrinsic. 
With this change, re-run `opt -loop-idiom -S < memcpy_before.ll`. The loop memcpy is promoted to loop preheader. For this trivial case, the loop is dead and will be removed by another pass. ``` %struct.S = type { i32, i32, i8 } define dso_local i32 @_Z3fooP1SS0_i(%struct.S* noalias nocapture %a, %struct.S* nocapture readonly %b, i32 %n) local_unnamed_addr { entry: %a1 = bitcast %struct.S* %a to i8* %b2 = bitcast %struct.S* %b to i8* %cmp7 = icmp sgt i32 %n, 0 br i1 %cmp7, label %for.body.preheader, label %for.cond.cleanup for.body.preheader: ; preds = %entry %0 = zext i32 %n to i64 %1 = mul nuw nsw i64 %0, 12 call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %a1, i8* align 4 %b2, i64 %1, i1 false) br label %for.body for.cond.cleanup.loopexit: ; preds = %for.body br label %for.cond.cleanup for.cond.cleanup: ; preds = %for.cond.cleanup.loopexit, %entry ret i32 12 for.body: ; preds = %for.body, %for.body.preheader %i.08 = phi i32 [ %inc, %for.body ], [ 0, %for.body.preheader ] %idxprom = zext i32 %i.08 to i64 %arrayidx = getelementptr inbounds %struct.S, %struct.S* %b, i64 %idxprom %arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %a, i64 %idxprom %2 = bitcast %struct.S* %arrayidx2 to i8* %3 = bitcast %struct.S* %arrayidx to i8* %inc = add nuw nsw i32 %i.08, 1 %cmp = icmp slt i32 %inc, %n br i1 %cmp, label %for.body, label %for.cond.cleanup.loopexit } ; Function Attrs: argmemonly nofree nosync nounwind willreturn declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg) #0 attributes #0 = { argmemonly nofree nosync nounwind willreturn } ``` Reviewed By: zino Differential Revision: https://reviews.llvm.org/D97667	2021-04-27 17:37:51 -07:00
David Tenty	1a6d5eb0da	[AIX] Add %pluginext and update tests to use proper pluginext As a follow on to D96282, since bugpoint passes is built as a module the proper file extension to use is LLVM_PLUGIN_EXT, rather than SHLIBEXT. Using SHLIBEXT causes the tests to load a non-existent file on AIX. We also adjust the PluginsTest unittest to use LLVM_PLUGIN_EXT for similar reasons. This change should hopefully make little difference to other platforms, since generally `SHLIBEXT=LTDL_SHLIB_EXT=CMAKE_SHARED_LIBRARY_SUFFIX` and `LLVM_PLUGIN_EXT=CMAKE_SHARED_LIBRARY_SUFFIX` on every platform except AIX. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D101412	2021-04-27 20:34:54 -04:00
Jim Radford	8377f04809	[CMake][llvm] avoid conflict w/ (and use when available) new builtin check_linker_flag Match the API for the new check_linker_flag and use it directly when available, leaving the old code as a fallback. Differential Revision: https://reviews.llvm.org/D100901	2021-04-27 16:41:28 -07:00
Alexander Shaposhnikov	8d7527eb07	Revert "[llvm-objcopy][MachO] Add support for LC_THREAD/LC_UNIXTHREAD" This reverts commit 4dfddf715b94857998601aa79c25e4f327d44dfa since it breaks some build bots (e.g. clang-ppc64be-linux)	2021-04-27 16:19:59 -07:00
Heejin Ahn	0a60d3f451	[WebAssembly] Error when wasm EH is used with Emscripten EH/SjLj - Error out when both Emscripten EH and wasm EH are used together, i.e., both `-enable-emscripten-cxx-exceptions` and `-exception-model=wasm` are given together. This will not happen if you use Emscripten, but this can happen when you call `llc` manually with wrong set of arguments. - Currently we don't yet support using wasm EH with Emscripten SjLj. Unlike `-enable-emscripten-cxx-exceptions` which is turned on only when you use `emcc -s DISABLE_EXCEPTION_CATCHING=0`, `-enable-emscripten-sjlj` is turned on by Emscripten by default. So we error out only when it is turned on and `setjmp` or `longjmp` is actually used. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D101403	2021-04-27 16:07:53 -07:00
Alexander Shaposhnikov	501b460768	[llvm-objcopy][MachO] Add support for LC_THREAD/LC_UNIXTHREAD Add support for LC_THREAD/LC_UNIXTHREAD (these load commands can be copied over without any modifications). Test plan: make check-all Differential revision: https://reviews.llvm.org/D101384	2021-04-27 15:54:51 -07:00
Joseph Huber	b73245a627	[OpenMP] Remove legacy pass manager run lines Summary: Two tests in OpenMPOpt currently fail using the legacy pass manager. Remove these run lines to prevent tests from failing.	2021-04-27 18:03:28 -04:00
Craig Topper	50539da1c1	[SelectionDAG] Use a VTSDNode to store the saturation width for FP_TO_SINT_SAT/FP_TO_UINT_SAT Previously we used an i32 constant to store the saturation width, but i32 isn't legal on RISCV64. This wasn't a big deal to fix, but it is extra work for the type legalizer. This patch uses a VTSDNode to store the type similar to SEXT_INREG. This makes it opaque to the type legalizer. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101262	2021-04-27 14:38:42 -07:00
Craig Topper	d473172d42	[RISCV] Select 5 bit immediate for VSETIVLI during isel rather than peepholing in the custom inserter. This adds a special operand type that is allowed to be either an immediate or register. By giving it a unique operand type the machine verifier will ignore it. This perturbs a lot of tests but mostly it is just slightly different instruction orders. Something bad did happen to some min/max reduction tests. We're spilling vector registers when we weren't before. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D101246	2021-04-27 14:38:16 -07:00
Reid Kleckner	2cd68dac41	[NFC][SimplifyCFG] Precommit SimplifyCFG tests from D29428	2021-04-28 00:35:44 +03:00
Roman Lebedev	6624460bfc	[NFC][SimplifyCFG] Autogenerate check lines in few more tests	2021-04-28 00:35:44 +03:00
Arthur Eubanks	6c8d16d78d	[ConstFold] Use const-folded operands in more places Previously we were const folding operands but not passing them. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101394	2021-04-27 14:30:19 -07:00
Dávid Bolvanský	5efc03c236	[DSE] Added testcases for 11896, NFC	2021-04-27 22:56:10 +02:00
Han Zhu	3d67f27527	[loop-idiom][NFC] Extract processLoopStoreOfLoopLoad into a helper function Differential Revision: https://reviews.llvm.org/D100979	2021-04-27 13:42:30 -07:00
Nikita Popov	e7bbddfa9f	[SCEV] Handle uge/ugt predicates in applyLoopGuards() These can be handled the same way as ule/ult, just using umax instead of umin. This is useful in cases where the umax prevents the upper bound from overflowing. Differential Revision: https://reviews.llvm.org/D101196	2021-04-27 22:41:05 +02:00
Alexey Bataev	2f428c693f	[SLP]Add a test for possibly vectorized tiny tree, NFC.	2021-04-27 13:39:02 -07:00
Nikita Popov	0fd5fddf4c	[SCEV] Improve loop guard tests (NFC) Invert the branch order to make the predicate more obvious. Add tests with two predicates, to show that rewrites are combined.	2021-04-27 22:34:56 +02:00
Arthur Eubanks	08189b14ea	[test] Fix some func-attrs tests under the legacy PM The new PM doesn't visit declarations in CGSCC passes. These tests aren't testing that detail, so just run them against the new PM.	2021-04-27 13:07:56 -07:00
Sanjay Patel	a7d424d173	[InstCombine] fold clamp to 2 values from min/max intrinsics The "select" versions of these folds is also missing and can cause infinite loops as shown in: https://llvm.org/PR48900 ...but it seems easier to match these as max/min as a first fix. https://alive2.llvm.org/ce/z/wv-_dT	2021-04-27 15:35:49 -04:00
Sanjay Patel	c9aa2ef02f	[InstCombine] add tests for clamp patterns using min/max intrinsics; NFC	2021-04-27 15:35:49 -04:00
Andy Kaylor	64bce9007c	[Dependence Analysis] Fix ExactSIV producing wrong analysis Patch by Artem Radzikhovskyy! Symptom: ExactSIV test produced incorrect analysis of dependencies, see LIT tests. Bug: At the end of the algorithm, when determining dependence direction, the original author forgot to divide intermediate results by gcd and round the result toward zero. Although this bug can be fixed with significantly fewer changes, I opted to write the code in such a way that reflects the original algorithm that Banerjee proposed, for easier reference in the future. This surprisingly results in shorter code, and fewer quotient and max/min calculations. Changes Summary: - Fixed findGCD to return valid x and y so that they match the function description where: ax - by = gcd(a,b) - Fixed ExactSIV test, to produce proper results - Documented the extension of Banerjee's algorithm that the original code author introduced. Banerjee's original algorithm only tested whether Dst depends on Src, the extension also allows us to test whether Src depends on Dst, in one pass. - ExactRDIV test worked fine. Since it uses findGCD(), it needed to be updated. Since ExactRDIV test has very few changes from the core algorithm of ExactSIV I modified the test to have consistent format as ExactSIV. - Updated the LIT tests to be testing for correct values. Differential Revision: https://reviews.llvm.org/D100331	2021-04-27 12:24:00 -07:00
Jay Foad	63b9806743	[AMDGPU] GCNHazardRecognizer: ignore all meta instructions This is hopefully NFC, but should be more robust in ignoring all instructions that should be ignored, instead of just some of them. Differential Revision: https://reviews.llvm.org/D101372	2021-04-27 20:17:15 +01:00
Roman Lebedev	c2509ff480	[NFC][SimplifyCFG] Autogenerate check lines in many test files These are potentially being affected by an upcoming patch.	2021-04-27 22:05:42 +03:00
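
Commit 64bce9007c above says findGCD should return x and y that satisfy ax - by = gcd(a,b). The following is a minimal Python sketch of that contract only, not LLVM's implementation. The name `find_gcd`, the returned tuple shape, and the restriction to positive integers are assumptions made for illustration.

```python
def find_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, x, y) with g = gcd(a, b) and a*x - b*y == g.

    Hypothetical helper: assumes a > 0 and b > 0.
    """
    # Extended Euclid keeps the invariant a*x0 + b*y0 == r0 at every step.
    r0, r1 = a, b
    x0, x1 = 1, 0
    y0, y1 = 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    # The loop ends with a*x0 + b*y0 == gcd; negate y to get the
    # "a*x - b*y == gcd" form quoted in the commit message.
    return r0, x0, -y0


if __name__ == "__main__":
    g, x, y = find_gcd(6, 4)
    print(g, x, y, 6 * x - 4 * y == g)
```

Returning the coefficients in the subtractive form means a caller can check the identity directly, which is the property the commit message describes as previously not holding.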

214881 Commits