llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 12:12:47 +01:00

Author	SHA1	Message	Date
Lang Hames	9ee7d4ffa3	[JITLink] Suppress expect-death test in release mode.	2021-05-24 22:57:10 -07:00
Max Kazantsev	701a7ca441	[LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration This patch handles one particular case of one-iteration loops for which SCEV cannot straightforwardly prove BECount = 1. The idea of the optimization is to symbolically execute conditional branches on the 1st iteration, moving in topological order, and only visiting blocks that may be reached on the first iteration. If we find out that we never reach header via the latch, then the backedge can be broken. Differential Revision: https://reviews.llvm.org/D102615 Reviewed By: reames	2021-05-25 12:43:31 +07:00
Max Kazantsev	bebc3be883	[Test] Add test for unreachable backedge with duplicating predecessors	2021-05-25 12:43:31 +07:00
Christudasan Devadasan	022be2495f	AMDGPU/GlobalISel: Legalize G_[SU]DIVREM instructions Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D100726	2021-05-25 10:51:07 +05:30
Lang Hames	5ce7249a28	[JITLink] Enable creation and management of mutable block content. This patch introduces new operations on jitlink::Blocks: setMutableContent, getMutableContent and getAlreadyMutableContent. The setMutableContent method will set the block content data and size members and flag the content as mutable. The getMutableContent method will return a mutable copy of the existing content value, auto-allocating and populating a new mutable copy if the existing content is marked immutable. The getAlreadyMutableContent method asserts that the existing content is already mutable and returns it. setMutableContent should be used when updating the block with totally new content backed by mutable memory. It can be used to change the size of the block. The argument value should not be shared with any other block. getMutableContent should be used when clients want to modify the existing content and are unsure whether it is mutable yet. getAlreadyMutableContent should be used when clients want to modify the existing content and know from context that it must already be mutable. These operations reduce copy-modify-update boilerplate and unnecessary copies introduced when clients couldn't be sure whether the existing content was mutable or not.	2021-05-24 22:09:36 -07:00
Arthur Eubanks	05ebdcd268	Making Instrumentation aware of LoopNest Pass Instrumentation callbacks are not made aware of LoopNest passes. From the loop pass manager, we can pass the outermost loop of the LoopNest to instrumentation in case of LoopNest passes. The current patch made the change in two places in StandardInstrumentation.cpp. I will submit a proper patch where the OuterMostLoop is passed from the LoopPassManager to the call backs. That way we will avoid making changes at multiple places in StandardInstrumentation.cpp. A testcase also will be submitted. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D102463	2021-05-24 20:25:52 -07:00
maekawatoshiki	b5c1f57fc4	Revert "[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass" This reverts commit d65c32fb41b03a35a2a16330ba1ea15cf6818f04.	2021-05-25 11:39:49 +09:00
David Blaikie	8e3f8bcb4e	Add a range-based wrapper for std::unique(begin, end, binary_predicate)	2021-05-24 17:26:46 -07:00
Vitaly Buka	cebde3ea57	[NFC][OMP] Fix 'unused' warning	2021-05-24 17:14:38 -07:00
Jonas Devlieghere	8e74c099a0	[dsymutil] Emit an error when the Mach-O exceeds the 4GB limit. The Mach-O object file format is limited to 4GB because of its use of 32-bit offsets in the header. It is possible for dsymutil to (silently) emit an invalid binary. Instead of having consumers deal with this, emit an error.	2021-05-24 16:29:06 -07:00
Jonas Devlieghere	8f5c83a0a1	[dsymutil] Use EXIT_SUCCESS and EXIT_FAILURE (NFC)	2021-05-24 16:29:05 -07:00
Jonas Devlieghere	5e6dceac4f	[dsymutil] Compute the output location once per input file (NFC) Compute the location of the output file just once outside the loop over the different architectures.	2021-05-24 16:29:05 -07:00
Anton Afanasyev	114349058c	[SLP] Fix "gathering" of insertelement instructions For rare exceptional case vector tree node (insertelements for now only) is marked as `NeedToGather`, this case is processed by patch. Follow-up of D98714 to fix bug reported here https://reviews.llvm.org/D98714#2764135. Differential Revision: https://reviews.llvm.org/D102675	2021-05-25 01:35:43 +03:00
Hongtao Yu	68eaf84013	[NFC][CSSPGO][llvm-profgen] Fix build warning due to an attribute usage.	2021-05-24 12:59:02 -07:00
Hongtao Yu	9a044d4e9b	[CSSPGO][llvm-profgen] Report samples for untrackable frames. Fixing an issue where samples collected for an untrackable frame are not reported. An untrackable frame refers to a frame whose caller is untrackable due to missing debug info or pseudo probe. Though the frame is connected to its parent frame through the frame pointer chain at runtime, the compiler cannot build the connection without debug info or pseudo probe. In such a case we just need to report the untrackable frame as the base frame and all of its child frames. With more samples reported I'm seeing this improves the performance of an internal benchmark by 2.5%. Reviewed By: wenlei, wlei Differential Revision: https://reviews.llvm.org/D102961	2021-05-24 12:39:12 -07:00
Nick Desaulniers	6d2f5ef7fe	fix up test from D102742 In D102742, I mistakenly put the split file designator above a bunch of CHECK lines, which unintentionally removed the CHECKs from actually being verified. This can be verified by observing: <build dir>/test/CodeGen/X86/Output/stack-protector-3.ll.tmp/main.ll	2021-05-24 12:09:02 -07:00
LLVM GN Syncbot	604164d943	[gn build] Port b510e4cf1b96	2021-05-24 18:48:17 +00:00
Craig Topper	9ed3d57a44	[RISCV] Add a vsetvli insert pass that can be extended to be aware of incoming VL/VTYPE from other basic blocks. This is a replacement for D101938 for inserting vsetvli instructions where needed. This new version changes how we track the information in such a way that we can extend it to be aware of VL/VTYPE changes in other blocks. Given how much it changes the previous patch, I've decided to abandon the previous patch and post this from scratch. For now the pass consists of a single phase that assumes the incoming state from other basic blocks is unknown. A follow up patch will extend this with a phase to collect information about how VL/VTYPE change in each block and a second phase to propagate this information to the entire function. This will be used by a third phase to do the vsetvli insertion. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D102737	2021-05-24 11:47:27 -07:00
LLVM GN Syncbot	90d57fab4e	[gn build] Port a64ebb863727	2021-05-24 18:36:50 +00:00
Heejin Ahn	9c4b79c3d5	[WebAssembly] Add NullifyDebugValueLists pass `WebAssemblyDebugValueManager` does not currently handle `DBG_VALUE_LIST`, which is a recent addition to LLVM. We tried to nullify them within the constructor of `WebAssemblyDebugValueManager` in D102589, but it made the class error-prone to use because it deletes instructions within the constructor and thus invalidates existing iterators within the BB, so the user of the class should take special care not to use invalidated iterators. This actually caused a bug in ExplicitLocals pass. Instead of trying to fix ExplicitLocals pass to make the iterator usage correct, which is possible but error-prone, this adds NullifyDebugValueLists pass that nullifies all `DBG_VALUE_LIST` instructions before we run WebAssembly specific passes in the backend. We can remove this pass after we implement handlers for `DBG_VALUE_LIST`s in `WebAssemblyDebugValueManager` and elsewhere. Fixes https://github.com/emscripten-core/emscripten/issues/14255. Reviewed By: dschuff Differential Revision: https://reviews.llvm.org/D102999	2021-05-24 11:36:01 -07:00
serge-sans-paille	73bc91a5e6	Revert "[NFC] remove explicit default value for strboolattr attribute in tests" This reverts commit bda6e5bee04c75b1f1332b4fd1ac4e8ef6c3c247. See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance	2021-05-24 19:43:40 +02:00
serge-sans-paille	1f63b26006	[NFC] remove explicit default value for strboolattr attribute in tests Since d6de1e1a71406c75a4ea4d5a2fe84289f07ea3a1, no attribute is equivalent to setting attribute to false. This is a preliminary commit for https://reviews.llvm.org/D99080	2021-05-24 19:31:04 +02:00
Craig Topper	42cfce04a7	[X86] Call insertDAGNode on trunc/zext created in tryShiftAmountMod. This puts the new nodes in the proper place in the topologically sorted list of nodes. Fixes PR50431, which was introduced recently in D101944.	2021-05-24 10:23:22 -07:00
LLVM GN Syncbot	1a857f7158	[gn build] Port 095e91c9737b	2021-05-24 17:18:43 +00:00
Jon Roelofs	66a4976b23	[Remarks] Add analysis remarks for memset/memcpy/memmove lengths Re-landing now that the crasher this patch previously uncovered has been fixed in: https://reviews.llvm.org/D102935 Differential revision: https://reviews.llvm.org/D102452	2021-05-24 10:10:44 -07:00
Roman Lebedev	7cf7fe8908	[X86][Costmodel] getMaskedMemoryOpCost(): don't scalarize non-power-of-two vectors with legal element type This follows in steps of similar `getMemoryOpCost()` changes, D100099/D100684. Intel SDM, `VPMASKMOV — Conditional SIMD Integer Packed Loads and Stores`: ``` Faults occur only due to mask-bit required memory accesses that caused the faults. Faults will not occur due to referencing any memory location if the corresponding mask bit for that memory location is 0. For example, no faults will be detected if the mask bits are all zero. ``` I.e., if mask is all-zeros, any address is fine. Masked load/store's prime use-case is e.g. tail masking the loop remainder, where for the last iteration, only first some few elements of a vector exist. So much similarly, i don't see why must we scalarize non-power-of-two vectors, iff the element type is something we can masked- store/load. We simply need to legalize it, widen the mask, and be done with it. And we even already count the cost of widening the mask. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D102990	2021-05-24 20:09:54 +03:00
luxufan	8f940c64cb	[RISCV] Optimize getVLENFactoredAmount function. If the local variable `NumOfVReg` isPowerOf2_32(NumOfVReg - 1) or isPowerOf2_32(NumOfVReg + 1), the ADDI and MUL instructions can be replaced with SLLI and ADD(or SUB) instructions. Based on original patch by StephenFan. Reviewed By: frasercrmck, StephenFan Differential Revision: https://reviews.llvm.org/D100577	2021-05-24 10:04:37 -07:00
Jon Roelofs	7b288b1820	[Remarks] Look through inttoptr/ptrtoint for -ftrivial-auto-var-init remarks. The crasher is a related problem that @aemerson found broke spec2k6/403.gcc when I landed https://reviews.llvm.org/D102452. It has been reduced & modified to reproduce without that patch. Differential revision: https://reviews.llvm.org/D102935	2021-05-24 09:23:22 -07:00
Adrian Prantl	b831b39c12	CoroSplit: Replace ad-hoc implementation of reachability with API from CFG.h The current ad-hoc implementation used to determine whether a basic block is unreachable doesn't work correctly in the general case (for example it won't detect successors of unreachable blocks as unreachable). This patch replaces it with the correct API that uses a DominatorTree to answer the question correctly and quickly. rdar://77181156 Differential Revision: https://reviews.llvm.org/D102963	2021-05-24 09:18:33 -07:00
Steven Wu	8faff52e7f	[llvm] Revert align attr test in test/Bitcode/attribute-3.3.ll Revert testcase changed in D87304 now that the upgrader can correctly handle the align attribute. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D102880	2021-05-24 09:15:27 -07:00
Nikita Popov	1868c30048	[CVP] Add additional test for phi common val transform (NFC)	2021-05-24 17:28:38 +02:00
Nikita Popov	3a4e6d31f4	[LoopUnroll] Add additional trip multiple test (NFC) This uses a trip multiple on a (unique) non-latch exit.	2021-05-24 17:26:07 +02:00
Nikita Popov	e8e8fcc47f	[LoopUnroll] Regenerate test checks (NFC)	2021-05-24 17:26:07 +02:00
Simon Pilgrim	f196122112	[CostModel][X86] Add missing SSE41 v2iX sext/zext costs Also fix existing v4i8->v4i16 sext cost to match the equivalents	2021-05-24 15:53:43 +01:00
thomasraoux	4be5038918	[NVPTX] Fix lowering of frem for negative values to match fmod frem result must have the dividend sign. Previous implementation had the wrong sign when passing negative numbers. For ex: frem(-16, 7) was returning 5 instead of -2. We should just use ftrunc instead of floor when lowering to get the right behavior. Differential Revision: https://reviews.llvm.org/D102528	2021-05-24 07:45:03 -07:00
Simon Pilgrim	2618dfbd6b	[CostModel][X86] Regenerate sse-itoi.ll test checks	2021-05-24 15:41:01 +01:00
Sanjay Patel	f29a31526e	[ConstProp] propagate poison from vector reduction element(s) to result This follows from the underlying logic for binops and min/max. Although it does not appear that we handle this for min/max intrinsics currently. https://alive2.llvm.org/ce/z/Kq9Xnh	2021-05-24 10:34:40 -04:00
Sanjay Patel	72e90c5ead	[ConstProp] add tests for vector reductions with poison elements; NFC	2021-05-24 10:34:40 -04:00
Florian Hahn	45eacd3210	[VPlan] Add first VPlan version of sinkScalarOperands. This patch adds a first VPlan-based implementation of sinking of scalar operands. The current version traverses a VPlan once and processes all operands of a predicated REPLICATE recipe. If one of those operands can be sunk, it is moved to the block containing the predicated REPLICATE recipe. Continue with processing the operands of the sunk recipe. The initial version does not re-process candidates after other recipes have been sunk. It also cannot partially sink induction increments at the moment. The VPlan only contains WIDEN-INDUCTION recipes and if the induction is used for example in a GEP, only the first lane is used and in the lowered IR the adds for the other lanes can be sunk into the predicated blocks. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D100258	2021-05-24 15:29:58 +01:00
Simon Pilgrim	6129236a8f	[CostModel][X86] Improve accuracy of vector non-uniform shift costs on XOP/AVX2 targets By llvm-mca analysis, Haswell/Broadwell has a non-uniform vector shift recip-throughput cost of the AVX2 targets at 2 for both 128 and 256-bit vectors - XOP capable targets have better 128-bit vector shifts so improve the fallback in those cases.	2021-05-24 14:18:21 +01:00
Florian Hahn	4a3bc8358c	[VectorCombine] Fix load extract scalarization tests with assumes. The input IR for @load_extract_idx_var_i64_known_valid_by_assume and @load_extract_idx_var_i64_not_known_valid_by_assume_after_load has been swapped. This patch fixes the test so that @load_extract_idx_var_i64_known_valid_by_assume has the assume before the load and the other test has it after.	2021-05-24 13:14:13 +01:00
Florian Hahn	db4bd974f6	[VPlan] Add mayReadOrWriteMemory & friends. This patch adds initial implementation of mayReadOrWriteMemory, mayReadFromMemory and mayWriteToMemory to VPRecipeBase. Used by D100258.	2021-05-24 13:11:32 +01:00
Bradley Smith	55f27a8faa	[AArch64][SVE] Add fixed length codegen for FP_ROUND/FP_EXTEND Depends on D102498 Differential Revision: https://reviews.llvm.org/D102607	2021-05-24 13:02:30 +01:00
Bradley Smith	88c1d64450	[AArch64][SVE] Improve codegen for fixed length vector concat Differential Revision: https://reviews.llvm.org/D102498	2021-05-24 12:56:02 +01:00
David Green	35e013cb3d	[ARM] Allow findLoopPreheader to return headers with multiple loop successors The findLoopPreheader function will currently not find a preheader if it branches to multiple different loop headers. This patch adds an option to relax that, allowing ARMLowOverheadLoops to process more loops successfully. This helps with WhileLoopStart setup instructions that can branch/fallthrough to the low overhead loop and to branch to a separate loop from the same preheader (but I don't believe it is possible for both loops to be low overhead loops). Differential Revision: https://reviews.llvm.org/D102747	2021-05-24 12:22:15 +01:00
Florian Hahn	5f9b78cce6	Recommit "[VectorCombine] Scalarize vector load/extract." This reverts commit 94d54155e2f38b56171811757044a3e6f643c14b. This fixes a sanitizer failure by moving scalarizeLoadExtract(I) before foldSingleElementStore(I), which may remove instructions.	2021-05-24 11:35:07 +01:00
David Green	56d0e7bedd	[ARM] Ensure WLS preheader blocks have branches during memcpy lowering This makes sure that the blocks created for lowering memcpy to loops end up with branches, even if they fall through to the successor. Otherwise IfCvt is getting confused with unanalyzable branches and creating invalid block layouts. The extra branches should be removed as the tail predicated loop is finalized in almost all cases.	2021-05-24 11:26:45 +01:00
David Green	5c433e70b6	[ARM] Fix inline memcpy trip count sequence The trip count for a memcpy/memset will be n/16 rounded up to the nearest integer. So (n+15)>>4. The old code was including a BIC too, to clear one of the bits, which does not seem correct. This removes the extra BIC. Note that ideally this would never actually be generated, as in the creation of a tail predicated loop we will DCE that setup code, letting the WLSTP perform the trip count calculation. So this doesn't usually come up in testing (and apparently the ARMLowOverheadLoops pass does not do any sort of validation on the tripcount). Only if the generation of the WLSTP fails will it use the incorrect BIC instructions. Differential Revision: https://reviews.llvm.org/D102629	2021-05-24 11:01:58 +01:00
Fraser Cormack	f5b495a5df	[RISCV] Prevent store combining from infinitely looping RVV code generation does not successfully custom-lower BUILD_VECTOR in all cases. When it resorts to default expansion it may, on occasion, be expanded to scalar stores through the stack. Unfortunately these stores may then be picked up by the post-legalization DAGCombiner which merges them again. The merged store uses a BUILD_VECTOR which is then expanded, and so on. This patch addresses the issue by overriding the `mergeStoresAfterLegalization` hook. A lack of granularity in this method (being passed the scalar type) means we opt out in almost all cases when RVV fixed-length vector support is enabled. The only exception to this rule are mask vectors, which are always either custom-lowered or are expanded to a load from a constant pool. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D102913	2021-05-24 10:19:32 +01:00
Roman Lebedev	53a13ec861	[NFCI][LoopIdiom] 'left-shift until bittest': assert that BaseX is loop-invariant Given that BaseX is an incoming value when coming from the preheader, it should be loop-invariant, but let's just document this assumption.	2021-05-24 12:15:06 +03:00
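
The sign fix described in commit 4be5038918 ([NVPTX] frem lowering) can be checked with plain arithmetic. The sketch below assumes the lowering computes x - round(x / y) * y, where the rounding step was floor before the patch and trunc after it. That formula and the function names are illustrative assumptions and are not taken from the NVPTX backend.

```python
import math


def frem_with_floor(x: float, y: float) -> float:
    """Assumed pre-patch shape: the quotient is rounded toward -infinity."""
    return x - math.floor(x / y) * y


def frem_with_trunc(x: float, y: float) -> float:
    """Assumed post-patch shape: the quotient is rounded toward zero."""
    return x - math.trunc(x / y) * y


if __name__ == "__main__":
    x, y = -16.0, 7.0
    print(frem_with_floor(x, y))  # 5.0, sign follows the divisor
    print(frem_with_trunc(x, y))  # -2.0, sign follows the dividend
    print(math.fmod(x, y))        # -2.0, the C fmod reference result
```

Rounding toward zero keeps the remainder on the same side of zero as the dividend, which is why the trunc variant agrees with fmod for the example in the commit message.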

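The trip count formula quoted in commit 5c433e70b6 ([ARM] inline memcpy trip count) is the usual round-up division by 16. A minimal sketch that checks (n+15)>>4 against a ceiling division follows; the function names are hypothetical, and the removed BIC is not modelled because the commit message does not say which bit it cleared.

```python
def trip_count(n: int) -> int:
    """Number of 16-byte iterations needed to cover n bytes: (n + 15) >> 4."""
    return (n + 15) >> 4


def ceil_div_16(n: int) -> int:
    """Reference ceiling division by 16 using integer arithmetic only."""
    return -(-n // 16)


if __name__ == "__main__":
    for n in (0, 1, 15, 16, 17, 32, 33):
        print(n, trip_count(n), ceil_div_16(n))
```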

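The strength reduction described in commit 8f940c64cb ([RISCV] getVLENFactoredAmount) relies on the identities x * (2^k + 1) = (x << k) + x and x * (2^k - 1) = (x << k) - x. The sketch below illustrates that arithmetic only. The helper names are hypothetical, the exact power-of-two branch is an added assumption that the commit message does not mention, and nothing here reflects the actual instruction emission in the backend.

```python
def is_power_of_2(v: int) -> bool:
    """True for 1, 2, 4, 8, ... (mirrors the intent of isPowerOf2_32)."""
    return v > 0 and (v & (v - 1)) == 0


def log2_exact(v: int) -> int:
    """Base-2 logarithm of a value already known to be a power of two."""
    return v.bit_length() - 1


def scale_by_num_vregs(x: int, num_vregs: int) -> int:
    """Compute x * num_vregs, preferring shift plus add/sub over multiply."""
    if is_power_of_2(num_vregs):        # assumption: plain shift case
        return x << log2_exact(num_vregs)
    if is_power_of_2(num_vregs - 1):    # SLLI + ADD
        return (x << log2_exact(num_vregs - 1)) + x
    if is_power_of_2(num_vregs + 1):    # SLLI + SUB
        return (x << log2_exact(num_vregs + 1)) - x
    return x * num_vregs                # fallback: general multiply


if __name__ == "__main__":
    for n in (3, 5, 7, 6):
        print(n, scale_by_num_vregs(16, n))
```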
216272 Commits