llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00

Author	SHA1	Message	Date
Bill Wendling	77dae6a102	Revert "Allow output constraints on "asm goto"" This reverts commit 52366088a8e42c2f1e96e8430b84b8b65ec3f7bc. I accidentally pushed this before supporting changes.	2020-01-07 13:44:08 -08:00
Bill Wendling	1e81c3e696	Allow output constraints on "asm goto" Summary: Remove the restrictions that prevent "asm goto" from returning non-void values. The values returned by "asm goto" are only valid on the "fallthrough" path. Reviewers: jyknight, nickdesaulniers, hfinkel Reviewed By: jyknight, nickdesaulniers Subscribers: rsmith, hiraditya, llvm-commits, cfe-commits, craig.topper, rnk Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69876	2020-01-07 13:40:26 -08:00
Matt Arsenault	bd58701287	AMDGPU/GlobalISel: Fix scalar G_SELECT for arbitrary pointers 4e85ca9562a588eba491e44bcbf73cb2f419780f missed updating the legal condition type set for pointers with any unrecognized address space.	2020-01-07 16:36:31 -05:00
Matt Arsenault	36122a39f4	AMDGPU/GlobalISel: Add some missing G_SELECT testcases	2020-01-07 16:36:31 -05:00
Matt Arsenault	3553840899	AMDGPU/GlobalISel: Fix missing test for s16 icmp	2020-01-07 16:36:31 -05:00
Matt Arsenault	b591d25615	AMDGPU: Apply i16 add->sub pattern with zext to i32 This was only applying the deeper nested zext pattern, and missing the special case code size fold.	2020-01-07 16:36:31 -05:00
Craig Topper	c9ee289053	[X86] Enable v2i64->v2f32 uint_to_fp code in ReplaceNodeResults on SSE4.1 target Now that we generate decent code for (v2i64 (setlt zero, X)) on pre-sse4.2 targets I think we can use this now. Differential Revision: https://reviews.llvm.org/D72354	2020-01-07 13:25:29 -08:00
Daniel Sanders	1119adcb10	[gicombiner] Correct 64f1bb5cd2c to account for MSVC's %p format	2020-01-07 12:50:05 -08:00
Bill Wendling	c185d37c5e	Remove extraneous semicolon.	2020-01-07 12:49:09 -08:00
Sanjay Patel	1d67a5be15	[x86] add tests for extract-of-concat; NFC	2020-01-07 15:48:54 -05:00
Matt Arsenault	90a59ac8d4	AMDGPU: Add baseline test for missing pattern The optimization to turn an add into a sub isn't triggering when the pattern to use the zeroed high bits is used.	2020-01-07 15:10:08 -05:00
Matt Arsenault	b1381cb95e	AMDGPU: Remove VOP3Mods0Clamp0OMod Now that overridable default operands work, there's no reason to use complex patterns to just produce 0s.	2020-01-07 15:10:08 -05:00
Matt Arsenault	7940691f1f	AMDGPU: Fix misleading, misplaced end block comments	2020-01-07 15:10:08 -05:00
Matt Arsenault	f098ecdbb2	AMDGPU: Use ImmLeaf	2020-01-07 15:10:07 -05:00
Matt Arsenault	9b420937e1	AMDGPU: Fix not using v_cvt_f16_[iu]16 We weren't treating i16->f16 casts as legal on targets with these instructions, and always using a pair of casts through i32.	2020-01-07 15:10:07 -05:00
Michael Kruse	bcbdac6c54	[cmake] Use relative cmake binary dir for processing pass plugins. https://reviews.llvm.org/D61446 introduced a new function to process pass plugins that used CMAKE_BINARY_DIR. This is problematic when LLVM is a subproject. Instead use LLVM_BINARY_DIR to get the right relative directory for cmake. Patch by Alan Baker <alanbaker@google.com> Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D72109	2020-01-07 20:42:35 +01:00
Fangrui Song	3cceefd641	[PowerPC][Triple] Use elfv2 on freebsd>=13 and linux-musl Summary: Every powerpc64le platform uses elfv2. For powerpc64, the environments "elfv1" and "elfv2" were added for FreeBSD ELFv1->ELFv2 migration in D61950. FreeBSD developers have decided to use OS versions to select ABI, and no one is relying on the environments. Also use elfv2 on powerpc64-linux-musl. Users can always use -mabi=elfv1 and -mabi=elfv2 to override the default ABI. Reviewed By: adalava Differential Revision: https://reviews.llvm.org/D72352	2020-01-07 11:40:56 -08:00
Jessica Paquette	7d3262ad06	[MachineOutliner][AArch64] Save + restore LR in noreturn functions Conservatively always save + restore LR in noreturn functions. These functions do not end in a RET, and so they aren't guaranteed to have an instruction which uses LR in any way. So, as a result, you can end up in unfortunate situations where you can't backtrace out of these functions in a debugger. Remove the old noreturn test, and add a new one which is more descriptive. Remove the restriction that we can't outline from noreturn functions as well since we now do the right thing.	2020-01-07 11:27:25 -08:00
Craig Topper	f60f03ecae	[X86] Improve lowering of (v2i64 (setgt X, -1)) on pre-SSE2 targets. Enable v2i64 in foldVectorXorShiftIntoCmp. Similar to D72302 but for the canonical form for the opposite case. I've changed foldVectorXorShiftIntoCmp to form a target independent setcc node instead of PCMPGT now and enabled it for v2i64 on pre-SSE4.2 targets. The setcc should eventually get lowered to PCMPGT or the new v2i64 sequence. Differential Revision: https://reviews.llvm.org/D72318	2020-01-07 11:22:04 -08:00
Craig Topper	8548edec3f	[X86] Improve lowering of v2i64 sign bit tests on pre-sse4.2 targets Without sse4.2 a v2i64 setlt needs to expand into a pcmpgtd, pcmpeqd, 3 shuffles, and 2 logic ops. But if we're only interested in the sign bit of the i64 elements, we can just use one pcmpgtd and shuffle the odd elements to the even elements. Differential Revision: https://reviews.llvm.org/D72302	2020-01-07 11:22:03 -08:00
LLVM GN Syncbot	f330417d5f	[gn build] Port 1d94fb21118	2020-01-07 19:13:41 +00:00
Daniel Sanders	728c2ac3b1	[gicombiner] Add GIMatchTree and use it for the code generation Summary: GIMatchTree's job is to build a decision tree by zipping all the GIMatchDag's together. Each DAG is added to the tree builder as a leaf and partitioners are used to subdivide each node until there are no more partitioners to apply. At this point, the code generator is responsible for testing any untested predicates and following any unvisited traversals (there shouldn't be any of the latter as the getVRegDef partitioner handles them all). Note that the leaves don't always fit into partitions cleanly and the partitions may overlap as a result. This is resolved by cloning the leaf into every partition it belongs to. One example of this is a rule that can match one of N opcodes. The leaf for this rule would end up in N partitions when processed by the opcode partitioner. A similar example is the getVRegDef partitioner where having rules (add $a, $b), and (add ($a, $b), $c) will result in the former being in the partition for successfully following the vreg-def and failing to do so as it doesn't care which happens. Depends on D69151 Fixed the issues with the windows bots which were caused by stdout/stderr interleaving. Reviewers: bogner, volkan Reviewed By: volkan Subscribers: lkail, mgorny, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69152	2020-01-07 11:12:53 -08:00
Alexandre Ganea	68b66f4353	Fix issues reported by -Wrange-loop-analysis when building with latest Clang (trunk). NFC. Fixes warning: loop variable 'E' of type 'const llvm::StringRef' creates a copy from type 'const llvm::StringRef' [-Wrange-loop-analysis]	2020-01-07 13:58:26 -05:00
Simon Pilgrim	4d8923b959	[ARM] Regenerate bfi.ll test cases	2020-01-07 16:51:11 +00:00
Simon Pilgrim	3a85fc99b9	[X86] Pull out repeated SrcVT.getVectorNumElements() call. NFCI.	2020-01-07 16:51:10 +00:00
diggerlin	b64ae01f8a	[AIX][XCOFF]Implement mergeable const SUMMARY: In this patch, we map mergeable const objects to the read-only section in the same manner as const objects that are not mergeable. Reviewers: hubert.reinterpretcast,jasonliu Subscribers: wuzish, nemanjai, hiraditya Differential Revision: https://reviews.llvm.org/D71551	2020-01-07 11:20:51 -05:00
Sjoerd Meijer	def1541b7f	[ARM][MVE] Renamed VPT Block tests and files to something more informative. NFC	2020-01-07 16:16:54 +00:00
Matt Arsenault	ea85aa62ab	AMDGPU/GlobalISel: Fix readfirstlane pattern import The imm folding optimization pattern failed to import. The instruction pattern was already working, but failing to fail on SGPR inputs.	2020-01-07 11:07:08 -05:00
Sanjay Patel	9486d02241	[InstCombine] try to pull 'not' of select into compare operands not (select ?, (cmp TPred, ?, ?), (cmp FPred, ?, ?)) --> select ?, (cmp TPred', ?, ?), (cmp FPred', ?, ?) If both sides of the select are cmps, we can remove an instruction. The case where only one side is a cmp is deferred to a possible follow-on patch. We have a more general 'isFreeToInvert' analysis, but I'm not seeing a way to use that more widely without inducing infinite looping (opposing transforms). Here, we flip the compare predicates directly, so we should not have any danger by creating extra intermediate 'not' ops. Alive proofs: https://rise4fun.com/Alive/jKa Name: both select values are compares - invert predicates %tcmp = icmp sle i32 %x, %y %fcmp = icmp ugt i32 %z, %w %sel = select i1 %cond, i1 %tcmp, i1 %fcmp %not = xor i1 %sel, true => %tcmp_not = icmp sgt i32 %x, %y %fcmp_not = icmp ule i32 %z, %w %not = select i1 %cond, i1 %tcmp_not, i1 %fcmp_not Name: false val is compare - invert/not %fcmp = icmp ugt i32 %z, %w %sel = select i1 %cond, i1 %tcmp, i1 %fcmp %not = xor i1 %sel, true => %tcmp_not = xor i1 %tcmp, -1 %fcmp_not = icmp ule i32 %z, %w %not = select i1 %cond, i1 %tcmp_not, i1 %fcmp_not Differential Revision: https://reviews.llvm.org/D72007	2020-01-07 10:44:23 -05:00
Matt Arsenault	b41c50fc25	AMDGPU/GlobalISel: Fix import of s_abs_i32 pattern	2020-01-07 10:32:07 -05:00
Matt Arsenault	6c48c02faf	AMDGPU/GlobalISel: Select llvm.amdgcn.wqm.vote	2020-01-07 10:15:29 -05:00
Tim Northover	4f6226ff36	OpaquePtr: print byval types containing anonymous types correctly. Attribute::getAsString doesn't have enough information to print anonymous Module-level types correctly, so they come back as "%type 0xabcd". This results in broken IR when printing as text. Instead, print type-attributes (currently just byval) using the TypePrinting infrastructure available in AsmWriter. This only applies to function argument attributes.	2020-01-07 15:11:43 +00:00
Matt Arsenault	2346ede1ad	llc: Change behavior of -mcpu with existing attribute Don't overwrite existing target-cpu attributes. I've often found the replacement behavior annoying, and this is inconsistent with how the fast math command line flags interact with the function attributes. Does not yet change target-features, since I think that should behave as a concatenation.	2020-01-07 10:10:25 -05:00
Matt Arsenault	a26ffcf2f6	AMDGPU/GlobalISel: Partially fix llvm.amdgcn.kill pattern import Tests deferred since the existing DAG test depends on some other operations, but isn't far from working as-is.	2020-01-07 10:09:59 -05:00
Hans Wennborg	c7ebd85525	[docs] NFC: Fix typos in documents "the the" -> "the" "an" -> "a" Patch by Kazuaki Ishizaki <ishizaki@jp.ibm.com>! Differential revision: https://reviews.llvm.org/D72091	2020-01-07 16:06:14 +01:00
Sam Parker	e8c8bfd37b	[TypePromotion] Use SetVectors instead of PtrSets Remove the chance of non-deterministic insertion of zexts of the sources by using a SetVector instead of SmallPtrSet. Do the same for sinks for consistency and to negate the small issue from possibly happening. The SafeWrap instructions are now also stored in a SmallVector. The IRPromoter members of these structures have been changed to references. Differential Revision: https://reviews.llvm.org/D72322	2020-01-07 14:51:54 +00:00
Sanjay Patel	e06518d1cc	[DAGCombiner] reduce shuffle of concat of same vector This is possibly a small part towards solving PR42024: https://bugs.llvm.org/show_bug.cgi?id=42024 The vectorizer is creating shuffles of concat like this: %63 = shufflevector <4 x i64> %x, <4 x i64> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %64 = shufflevector <8 x i64> %63, <8 x i64> undef, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> That might be fixable in the vectorizers, but we're not allowed to fold that into a single shuffle in instcombine, so we should have a backend backstop to convert that into the likely simpler form: %64 = shufflevector <4 x i64> %x, <4 x i64> undef, <8 x i32> <i32 0, i32 0, i32 1, i32 1, i32 2, i32 2, i32 3, i32 3> Differential Revision: https://reviews.llvm.org/D72300	2020-01-07 09:48:59 -05:00
Sjoerd Meijer	038287e275	[ARM][MVE] VPT Blocks: findVCMPToFoldIntoVPS This is a recommit of D71330, but with a few things fixed and changed: 1) ReachingDefAnalysis: this was not running with optnone as it was checking skipFunction(), which other analysis passes don't do. I guess this is a copy-paste from a codegen pass. 2) VPTBlockPass: here I've added skipFunction(), because like most/all optimisations, we don't want to run this with optnone. This fixes the issues with the initial/previous commit: the VPTBlockPass was running with optnone, but ReachingDefAnalysis wasn't, and so VPTBlockPass was crashing querying ReachingDefAnalysis. I've added test case mve-vpt-block-optnone.mir to check that we don't run VPTBlock with optnone. Differential Revision: https://reviews.llvm.org/D71470	2020-01-07 13:54:47 +00:00
Simon Pilgrim	71c2db1510	[X86] Standardize shuffle match/lowering function names. NFC. We mainly use lowerShuffle/matchShuffle - replace the (few) lowerVectorShuffle/matchVectorShuffle cases to be consistent.	2020-01-07 13:41:52 +00:00
Victor Campos	3c58da7fc9	[ARM] Improve codegen of volatile load/store of i64 Summary: Instead of generating two i32 instructions for each load or store of a volatile i64 value (two LDRs or STRs), now emit LDRD/STRD. These improvements cover architectures implementing ARMv5TE or Thumb-2. Reviewers: dmgreen, efriedma, john.brawn, nickdesaulniers Reviewed By: efriedma, nickdesaulniers Subscribers: nickdesaulniers, vvereschaka, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70072	2020-01-07 13:16:18 +00:00
Simon Pilgrim	2583e8dbb5	Fix "use of uninitialized variable" static analyzer warning. NFCI.	2020-01-07 12:06:54 +00:00
Ulrich Weigand	784552367c	[SystemZ] Extend fp-strict-alias test case Explicitly add test for fpexcept.maytrap intrinsics.	2020-01-07 12:44:51 +01:00
LLVM GN Syncbot	6bd74663bc	[gn build] Port c69ae835d0e	2020-01-07 11:41:46 +00:00
Luís Marques	9c9c4dfafa	[RISCV][Docs] Add RISC-V asm template argument modifiers Adds the RISC-V asm template argument modifiers currently supported by LLVM. Additional ones supported by GCC will be added to the documentation when we start supporting them.	2020-01-07 11:06:46 +00:00
Simon Pilgrim	1e0278fea9	Fix Wdocumentation warnings. NFCI.	2020-01-07 10:55:38 +00:00
Simon Pilgrim	dd43b4ee29	Fix "use of uninitialized variable" static analyzer warnings. NFCI.	2020-01-07 10:55:38 +00:00
Simon Pilgrim	39a6803ba2	Fix "use of uninitialized variable" static analyzer warnings. NFCI.	2020-01-07 10:55:37 +00:00
James Henderson	9361ef9442	[DebugInfo] Fix infinite loop caused by reading past debug_line end If the claimed unit length of a debug line program is such that the line table would finish past the end of the .debug_line section, an infinite loop occurs because the data extractor will continue to "read" zeroes without changing the offset. This previously didn't hit an error because the line table program handles a series of zeroes as a bad extended opcode. This patch fixes the infinite loop and adds a warning if the program doesn't fit in the available data. Reviewed by: JDevlieghere Differential Revision: https://reviews.llvm.org/D72279	2020-01-07 10:22:35 +00:00
Jim Lin	dbb448542d	[NFC] Use isX86() instead of getArch() Summary: This is a clean up for https://reviews.llvm.org/D72247. Reviewers: MaskRay, craig.topper, jhenderson Reviewed By: MaskRay Subscribers: hiraditya, rupprecht, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72320	2020-01-07 17:35:44 +08:00
Ulrich Weigand	cf9231a7a8	[SystemZ] Fix python failure in test case With recent Python the Large/spill-02.py test failed with an error: TypeError: can't multiply sequence by non-int of type 'float'	2020-01-07 10:26:37 +01:00
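
The TypeError quoted in commit cf9231a7a8 is what Python 3 raises when a sequence is repeated by a float. The sketch below shows one common way to trigger it, namely true division producing a float repeat count. This is an assumption for illustration: the commit message does not say what the actual cause in Large/spill-02.py was, and the function names here are hypothetical.

```python
def repeat_half_broken(items, count):
    # In Python 3, "/" is true division and always returns a float,
    # so repeating a list by the result raises TypeError.
    return items * (count / 2)


def repeat_half_fixed(items, count):
    # "//" is floor division; with int operands the result stays an int.
    return items * (count // 2)


try:
    repeat_half_broken(["x"], 4)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(repeat_half_fixed(["x"], 4))
```

Under Python 2 the first function would have worked for int arguments, because "/" on two ints performed integer division there.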

189723 Commits
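
The shuffle-of-concat fold described in commit e06518d1cc can be checked with a small model. The sketch below is not LLVM code; it models shufflevector as indexing into the concatenation of its two operands, represents undef lanes as None (an assumption of this model), and confirms that the two-shuffle sequence from the commit message yields the same lanes as the single shuffle.

```python
def shuffle(a, b, mask):
    # Model of shufflevector: mask index i selects lane i of a + b.
    src = a + b
    return [src[i] for i in mask]


x = ["x0", "x1", "x2", "x3"]
undef4 = [None] * 4  # undef operand, modelled as None lanes
undef8 = [None] * 8

# The two shuffles created by the vectorizer.
concat_mask = [0, 1, 2, 3, 0, 1, 2, 3]
interleave_mask = [0, 4, 1, 5, 2, 6, 3, 7]
v63 = shuffle(x, undef4, concat_mask)
v64 = shuffle(v63, undef8, interleave_mask)

# The single shuffle the backend fold produces.
single = shuffle(x, undef4, [0, 0, 1, 1, 2, 2, 3, 3])

# Composing the masks directly gives the folded mask, because the
# second shuffle only reads lanes that come from the first operand.
composed_mask = [concat_mask[i] for i in interleave_mask]

print(v64 == single, composed_mask)
```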