llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

Author	SHA1	Message	Date
Adrian Prantl	dc45fffd37	Support stripping indirectly referenced DILocations from !llvm.loop metadata in stripDebugInfo(). This patch fixes an oversight in https://reviews.llvm.org/D96181 and also takes into account loop metadata pointing to other MDNodes that point into the debug info. rdar://78487175 Differential Revision: https://reviews.llvm.org/D103220	2021-05-27 13:23:33 -07:00
Craig Topper	ef10193abb	[RISCV] Add a test showing missed opportunity to avoid a vsetvli in a loop. This is another case we need to look through a phi to prove.	2021-05-27 11:30:25 -07:00
Saleem Abdulrasool	20ebf3a03d	MC: mark `dump` with `LLVM_DUMP_METHOD` Mark the `ELFRelocationEntry::dump` method as `LLVM_DUMP_METHOD` to annotate it properly as used to prevent the function being dead stripped away. This allows use of `dump` in the debugger. This is purely to improve the developer experience.	2021-05-27 10:47:39 -07:00
Roman Lebedev	3714bd50d1	[NFC][X86][Codegen] Re-autogenerate check lines in a few tests to remove noise from future changes	2021-05-27 20:29:50 +03:00
Simon Pilgrim	0c78aae183	[CostModel][X86] Improve accuracy of sext/zext to 256-bit vector costs on AVX1 targets Determined from llvm-mca analysis (btver2 vs bdver2 vs sandybridge), the split+extends+concat sequence on AVX1-capable targets is cheaper than the #ops that the cost was previously based on.	2021-05-27 18:17:50 +01:00
Craig Topper	4734aad183	[RISCV] Teach vsetvli insertion to use vsetvl x0, x0 form when we can tell that VLMAX and AVL haven't changed. This can help avoid needing a virtual register for the vsetvl output when the AVL is X0. For other register AVLs it can shorten the live range of the AVL register if it isn't needed later. There's probably no advantage when AVL is a 5 bit immediate that can use vsetivli. But do it anyway for consistency. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D103215	2021-05-27 10:11:38 -07:00
Craig Topper	3180a39f2d	[X86] Fold (shift undef, X)->0 for vector shifts by immediate. We could previously do this by accident through the later call to getTargetConstantBitsFromNode I think, but that only worked if N0 had a single use. This patch makes it explicit for undef and doesn't have a use count check. I think this is needed to move the (shl X, 1)->(add X, X) fold to isel for PR50468. We need to be sure X won't be IMPLICIT_DEF which might prevent the same vreg from being used for both operands. Differential Revision: https://reviews.llvm.org/D103192	2021-05-27 09:31:47 -07:00
Craig Topper	aa51b77f4e	[X86] Pre-commit tests for D103192. NFC	2021-05-27 09:31:47 -07:00
maekawatoshiki	45a93367af	[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass This patch changes LoopUnrollAndJamPass from FunctionPass to LoopNest pass. The next patch will utilize LoopNest to effectively handle loop nests. Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D99149	2021-05-28 01:17:23 +09:00
Fraser Cormack	56f2b0fa7f	[RISCV] Add a test case showing incorrect call-conv lowering @HsiangKai helped find a bug in the lowering of indirect split scalable-vector types in our calling convention. An imminent patch will fix this.	2021-05-27 16:55:48 +01:00
Matt Arsenault	6a84ed648e	GlobalISel: Do not change register types in lowerLoad Adjusting the load register type is a widenScalar type action, not a lowering. lowerLoad should be reserved for operations that change the memory access size, such as unaligned load decomposition. With this trying to adjust the register type, it was hard to avoid infinite loops in the legalizer. Adds a bandaid to avoid regressing a few AArch64 tests, but I'm not sure what the exact condition is and there's probably a cleaner way to do this. For AMDGPU this regresses handling of some cases for unaligned loads, but the way this is currently working is a pretty ugly hack.	2021-05-27 11:49:37 -04:00
Nico Weber	7a5ca2da76	Revert "Emit correct location lists with basic block sections." Breaks check-llvm on non-linux, see comments on https://reviews.llvm.org/D85085 This reverts commit caae570978c490a137921b9516162a382831209e and follow-up commit 1546c52d971292ed4145b6d41aaca0d02229ebff.	2021-05-27 11:42:04 -04:00
Matt Arsenault	2756663c2a	AMDGPU/GlobalISel: Use IncomingValueAssigner for implicit return This makes no real difference since we assign the same register either way.	2021-05-27 11:28:52 -04:00
Matt Arsenault	70444fac16	AMDGPU/GlobalISel: Fix broken test run line	2021-05-27 11:28:51 -04:00
Simon Pilgrim	154058d3b3	[CostModel][X86] AVX512 truncation ops are slower than cost models indicate. The SkylakeServer model (and later IceLake/TigerLake targets according to Agner) have the PMOV truncations as uops=2, rthroughput=2 instructions. Noticed while trying to reduce the diffs between cost tables and llvm-mca analysis.	2021-05-27 16:07:42 +01:00
Simon Pilgrim	1ae38dad2f	[X86][SSE] Regenerate some tests to expose the rip relative vector/broadcast loads	2021-05-27 16:07:42 +01:00
Matt Arsenault	776d715ebe	VirtRegMap: Preserve LiveDebugVariables This avoids recomputing it between regalloc runs when allocation is split, and also avoids a debug info test regression.	2021-05-27 10:40:14 -04:00
Fraser Cormack	58a6d02787	[VP][SelectionDAG] Add a target-configurable EVL operand type This patch adds a way for the target to configure the type it uses for the explicit vector length operands of VP SDNodes. The type must be a legal integer type (there is still no target-independent legalization of this operand) and must currently be at least as big as i32, the type used by the IR intrinsics. An implicit zero-extension takes place on targets which choose a larger type. All VP nodes should be created with this type used for the EVL operand. This allows 64-bit RISC-V to avoid custom legalization of all VP nodes, keeping them in their target-independent form for that bit longer. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D103027	2021-05-27 15:27:36 +01:00
Mats Petersson	caa14ae743	[OpenMP]Add support for workshare loop modifier in lowering When lowering the dynamic, guided, auto and runtime types of scheduling, there is an optional monotonic or non-monotonic modifier. This patch adds support in the OMP IR Builder to pass this down to the runtime functions. Also implements tests for the variants. Differential Revision: https://reviews.llvm.org/D102008	2021-05-27 15:33:05 +01:00
Jamie Schmeiser	5abbd606d8	Reuse temporary files for print-changed=diff Summary: Make the file name and descriptors static so that they are reused by print-changed=diff. This avoids errors about being unable to create temporary files when doing the later comparisons in a large compile. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: aeubanks (Arthur Eubanks) Differential Revision: https://reviews.llvm.org/D100116	2021-05-27 10:19:13 -04:00
Matt Arsenault	25c57425bc	AMDGPU/GlobalISel: Lower constant-32-bit zextload/sextload consistently We were accidentally leaning on code in lowerLoad which expands extending loads which should be removed.	2021-05-27 09:49:13 -04:00
Matt Arsenault	30d04a5923	AMDGPU/GlobalISel: Remove redundant parameter from function	2021-05-27 09:49:13 -04:00
Fraser Cormack	5a44777de2	[DAGCombine][RISCV] Don't try to trunc-store combined vector stores DAGCombine's `mergeStoresOfConstantsOrVecElts` optimization is told whether it's to use vector types and also whether it's to issue a truncating store. However, the truncating store code path assumes a scalar integer `ConstantSDNode`, and when using vector types it creates either a `BUILD_VECTOR` or `CONCAT_VECTORS` to store: neither of which is a constant. The `riscv64` target is able to expose a crash here because it switches on both code paths at the same time. The `f32` is stored as `i32` which must be promoted to `i64`, necessitating a truncating store. It also decides later that it prefers a vector store of `v2f32`. While vector truncating stores are legal, this combine is not able to emit them. We also don't have a test case. This patch adds an assert to catch this case more gracefully, and updates one of the function's callers to turn off the use of truncating stores when preferring vectors. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103173	2021-05-27 14:16:32 +01:00
Fraser Cormack	3491156985	[RISCV] Allow passing fixed-length vectors via the stack The vector calling convention dictates that when the vector argument registers are exhausted, GPRs are used to pass the address via the stack. When the GPRs themselves are exhausted, at best we would previously crash with an assertion, and at worst we'd generate incorrect code. This patch addresses this issue by passing fixed-length vectors via the stack with their full fixed-length size and aligned to their element type size. Since the calling convention lowering can't yet handle scalable vector types, this patch adds a fatal error to make it clear that we are lacking in this regard. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D102422	2021-05-27 14:14:07 +01:00
Florian Hahn	d6200896bf	[VPlan] Do not sink uniform recipes in sinkScalarOperands. For uniform ReplicateRecipes, only the first lane should be used, so sinking them would mean we have to compute the value of the first lane multiple times. Also, at the moment, sinking them causes a crash because the value of the first lane is re-used by all users. Reported post-commit for D100258.	2021-05-27 14:07:48 +01:00
Simon Giesecke	f35a5b956e	Add --quiet option to llvm-gsymutil to suppress output of warnings. Differential Revision: https://reviews.llvm.org/D102829	2021-05-27 12:36:34 +00:00
Mats Petersson	ffafbe5131	Revert "[OpenMP]Add support for workshare loop modifier in lowering" This reverts commit ea4c5fb04c6d9618d451fb2d2c360dc95c6d9131.	2021-05-27 13:09:47 +01:00
Mats Petersson	ae07366301	[OpenMP]Add support for workshare loop modifier in lowering When lowering the dynamic, guided, auto and runtime types of scheduling, there is an optional monotonic or non-monotonic modifier. This patch adds support in the OMP IR Builder to pass this down to the runtime functions. Also implements tests for the variants. Differential Revision: https://reviews.llvm.org/D102008	2021-05-27 12:28:27 +01:00
David Green	fd59827ea3	[ARM] Extra test for reverted WLS memset. NFC	2021-05-27 12:20:19 +01:00
Benjamin Kramer	bcc28eb207	Add triples to a bunch of x86-specific tests that currently fail on PPC	2021-05-27 12:32:04 +02:00
James Henderson	5c474e1dad	[lit][test] Improve testing of use_llvm_tool Reviewed by: MaskRay Differential Revision: https://reviews.llvm.org/D103154	2021-05-27 11:25:43 +01:00
Florian Hahn	20c74f6c60	[Matrix] Include matrix pipeline for new PM in new-pm-defaults.ll. -enable-matrix just adds a single pass, so it's easier to just check in new-pm-defaults.ll rather than duplicating the full checks for -O3 with the new pass manager. Suggested post-commit by @aeubanks.	2021-05-27 10:57:39 +01:00
Fraser Cormack	ab12d83795	[SelectionDAG][RISCV] Don't unroll 0/1-type bool VSELECTs This patch extends the cases in which the legalizer is able to express VSELECT in terms of XOR/AND/OR. When dealing with a VSELECT between boolean vector types, the mask itself is an all-zeros or all-ones value of the operand type, so a 0/1 boolean type behaves identically to a 0/-1 type. This greatly helps RISC-V which relies on expansion for these nodes. It also allows scalable-vector bool VSELECTs to use the default expansion, where before it would crash in SelectionDAG::UnrollVectorOp. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103147	2021-05-27 10:08:57 +01:00
Sebastian Neubauer	79ec3080a2	[AMDGPU][GlobalISel] Allow amdgpu_gfx calling conv Calling functions from shaders already works with the SelectionDAG. Differential Revision: https://reviews.llvm.org/D103183	2021-05-27 10:41:40 +02:00
Max Kazantsev	fb6b06d635	[NFCI][LoopDeletion] Do not call complex analysis for known non-zero BTC	2021-05-27 15:29:37 +07:00
Max Kazantsev	8529f1644f	[NFC] Reuse existing variables instead of re-requesting successors	2021-05-27 15:29:37 +07:00
Amara Emerson	d5383816bc	[GlobalISel] Implement splitting of G_SHUFFLE_VECTOR. This is a port from the DAG legalization. We're still missing some of the canonicalizations of shuffles but it's a start. Differential Revision: https://reviews.llvm.org/D102828	2021-05-27 00:28:38 -07:00
Fangrui Song	06613cf382	[docs] llvm-objdump: Mention -M no-aliases is supported on AArch64	2021-05-26 23:57:32 -07:00
Max Kazantsev	b201cc808f	[NFCI] Lazily evaluate SCEVs of PHIs Eager evaluation has cost of compile time. Only query them if they are required for proving predicates.	2021-05-27 13:35:31 +07:00
Max Kazantsev	fe711d6dd1	[NFC] Formatting fix	2021-05-27 12:50:54 +07:00
Max Kazantsev	bb1a1653ea	[NFCI][LoopDeletion] Only query SCEV about loop successor if another successor is also in loop	2021-05-27 12:44:22 +07:00
Esme-Yi	54468e15fa	[llvm-objdump] Print the DEBUG type under `--section-headers`. Summary: Under the option --section-headers, we can only print the section types of TEXT, DATA, and BSS for now. This patch adds the DEBUG type. Reviewed By: jhenderson, Higuoxing Differential Revision: https://reviews.llvm.org/D102603	2021-05-27 04:53:14 +00:00
LLVM GN Syncbot	75be5c89d2	[gn build] Port 857fa7b7b187	2021-05-27 04:42:56 +00:00
LLVM GN Syncbot	878593c3e2	[gn build] Port 0dc7fd1bc167	2021-05-27 04:42:55 +00:00
Hasyimi Bahrudin	aa98e6ea8a	Fix non-global-value-max-name-size not considered by LLParser `non-global-value-max-name-size` is used by `Value` to cap the length of local value names. However, this flag is not considered by `LLParser`, which leads to an unexpected `use of undefined value` error. The fix is to move the responsibility of capping the length to `ValueSymbolTable`. The test is the one provided by Mikael in the bug report (https://bugs.llvm.org/show_bug.cgi?id=45899). Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D102707	2021-05-27 04:20:03 +00:00
Yevgeny Rouban	0a4cf978a7	[RS4GC] Introduce intrinsics to get base ptr and offset There can be a need for some optimizations to get (base, offset) for any GC pointer. The base can be calculated by generating needed instructions as it is done by the RewriteStatepointsForGC::findBasePointer() function. The offset can be calculated in the same way. Though to not expose the base calculation and to make the offset calculation as simple as ptrtoint(derived_ptr) - ptrtoint(base_ptr), which is illegal outside RS4GC, this patch introduces 2 intrinsics: @llvm.experimental.gc.get.pointer.base(%derived_ptr) @llvm.experimental.gc.get.pointer.offset(%derived_ptr) These intrinsics are inlined by RS4GC along with generation of statepoint sequences. With these new intrinsics the GC parseable lowering for atomic memcpy intrinsics (6ec2c5e402a724ba99bce82a9cac7a3006d660f4) could be implemented as a separate pass. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D100445	2021-05-27 09:14:14 +07:00
Jessica Paquette	044ed9b7c9	Fix unit test after 324af79dbc6066 Needed to add in an extra parameter to calls to `libcall`.	2021-05-26 17:50:53 -07:00
Jessica Paquette	d821abe3ce	[GlobalISel] Don't emit lost debug location remarks when legalizing tail calls There were a bunch of lost debug location remarks that show up when legalizing tail calls on AArch64. This would happen because we drop the return in the block where we emit the tail call. So, we end up dropping the debug location, which makes the LostDebugLocObserver report a missing debug location. Although it's true that we lose these debug locations, this isn't a particularly useful remark. We expect to drop these debug locations when emitting tail calls. Suppressing remarks in this case is preferable, since the amount of noise could hide actual debug location related bugs. To do this, I just plumbed the LostDebugLocObserver through the relevant LegalizerHelper functions. This is the only case I can think of where we need the LostDebugLocObserver in the LegalizerHelper. So, rather than storing it in the LegalizerHelper proper and mucking around with the constructors, I figured it'd be cleanest to take the simplest path for now. This clears up ~20 noisy lost debug location remarks on CTMark in AArch64 at -Os. Differential Revision: https://reviews.llvm.org/D103128	2021-05-26 17:16:11 -07:00
Sriraman Tallam	40b2a440c5	Emit correct location lists with basic block sections. This patch addresses multiple things: 1) It ensures that const_value is emitted when possible with basic block sections. 2) It emits location lists such that the labels are always within the section boundary. 3) It fixes a bug when the parameter is first used in a non-entry block which is in a different section from the entry block. Differential Revision: https://reviews.llvm.org/D85085	2021-05-26 17:12:31 -07:00
Amara Emerson	5e0b929619	[AArch64][GlobalISel] Legalize non-power-of-2 vector elements for G_STORE. The rules were already there, it just needed re-ordering so the odd case didn't bail out too early.	2021-05-26 17:01:02 -07:00

216464 Commits