llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00

Author	SHA1	Message	Date
Sanjay Patel	aa51f59c1a	[DAGCombiner] clean up extract-of-concat fold; NFC This hopes to improve readability and adds an assert. The functional change noted by the TODO comment is proposed in: D72361	2020-01-08 10:15:33 -05:00
Kazu Hirata	787afd8fb4	[JumpThreading] Thread jumps through two basic blocks Summary: This patch teaches JumpThreading.cpp to thread through two basic blocks like: bb3: %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 by duplicating basic blocks like bb3 above. Once we duplicate bb3 as bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have: bb3: %var = phi i32* [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb3.dup: %var = phi i32* [ null, %bb1 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 Then the existing code in JumpThreading.cpp can thread edge bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5. Reviewers: wmi Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70247	2020-01-08 06:57:36 -08:00
Simon Tatham	0f41bbd87f	[ARM,MVE] Intrinsics for variable shift instructions. This batch of intrinsics fills in all the shift instructions that take a variable shift distance in a register, instead of an immediate. Some of these instructions take a single shift distance in a scalar register and apply it to all lanes; others take a vector of per-lane distances. These instructions are all basically one family, varying in whether they saturate out-of-range values, and whether they round when bits are shifted off the bottom. I've implemented them at the IR level by a much smaller family of IR intrinsics, which take flag parameters to indicate saturating and/or rounding (along with the usual one to specify signed/unsigned integers). An oddity is that all of them are //left// shift instructions – but if you pass a negative shift count, they'll shift right. So the vector shift distances are always vectors of //signed// integers, regardless of whether you're considering the other input vector to be of signed or unsigned. Also, even the simplest `vshlq` instruction in this family (neither saturating nor rounding) has to be implemented as an IR intrinsic, because the ordinary LLVM IR `shl` operation would consider an out-of-range shift count to be undefined behavior. Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72329	2020-01-08 14:42:24 +00:00
Simon Tatham	5371ba4ab1	[ARM,MVE] Intrinsics for partial-overwrite imm shifts. This batch of intrinsics covers two sets of immediate shift instructions, which have in common that they only overwrite part of their output register and so they need an extra input giving its previous value. The VSLI and VSRI instructions shift each lane of the input vector left or right just as if they were normal immediate VSHL/VSHR, but then they only overwrite the output bits that correspond to actual shifted bits of the input. So VSLI will leave the low n bits of each output lane unchanged, and VSRI the same with the top n bits. The V[Q][R]SHR[U]N family are all narrowing shifts: they take an input vector of 2n-bit integers, shift each lane right by a constant, and then narrow the shifted result to only n bits. So they only overwrite half of the n-bit lanes in the output register, and the B/T suffix indicates whether it's the bottom or top half of each 2n-bit lane. I've implemented the whole of the latter family using a single IR intrinsic `vshrn`, which takes a lot of i32 parameters indicating which instruction it expands to (by specifying signedness of the input and output types, whether it saturates and/or rounds, etc). Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72328	2020-01-08 14:42:24 +00:00
Bevin Hansson	21be0de34d	[Intrinsic] Add fixed point division intrinsics. Summary: This patch adds intrinsics and ISelDAG nodes for signed and unsigned fixed-point division: llvm.sdiv.fix.* llvm.udiv.fix.* These intrinsics perform scaled division on two integers or vectors of integers. They are required for the implementation of the Embedded-C fixed-point arithmetic in Clang. Patch by: ebevhan Reviewers: bjope, leonardchan, efriedma, craig.topper Reviewed By: craig.topper Subscribers: Ka-Ka, ilya, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70007	2020-01-08 15:17:46 +01:00
Qiu Chaofan	e9f8f15265	[NFC] Move InPQueue into arguments of releaseNode This patch moves `InPQueue` into function arguments instead of template arguments of `releaseNode`, which is a cleaner approach. Differential Revision: https://reviews.llvm.org/D72125	2020-01-08 22:15:32 +08:00
LLVM GN Syncbot	4861c0838e	[gn build] Port 346f6b54bd1	2020-01-08 13:43:29 +00:00
Anna Welker	4b7c61b6c9	[ARM][MVE] Enable masked gathers from vector of pointers Adds a pass to the ARM backend that takes a v4i32 gather and transforms it into a call to MVE's masked gather intrinsics. Differential Revision: https://reviews.llvm.org/D71743	2020-01-08 13:43:12 +00:00
Nico Weber	4710b1b396	[gn build] (manually) merge 1cf11a4c67a15	2020-01-08 07:44:33 -05:00
Alexey Lapshin	076ea9fc78	[Dsymutil][Debuginfo][NFC] Reland: Refactor dsymutil to separate DWARF optimizing part. #2 . Summary: This patch relands D71271. The problem with D71271 is that it has a cyclic dependency: CodeGen->AsmPrinter->DebugInfoDWARF->CodeGen. To avoid the cyclic dependency, this patch puts the implementation of DWARFOptimizer into a separate library: lib/DWARFLinker. Thus the difference between this patch and D71271 is that DWARFOptimizer is renamed to DWARFLinker and its files are put into lib/DWARFLinker. Reviewers: JDevlieghere, friss, dblaikie, aprantl Reviewed By: JDevlieghere Subscribers: thegameg, merge_guards_bot, probinson, mgorny, hiraditya, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D71839	2020-01-08 14:15:31 +03:00
Sam Parker	dbbd596280	[NFC][ARM] Update tests Run the update_mir_test on some of the low-overhead loop tests.	2020-01-08 06:08:30 -05:00
Xuanda Yang	a5bc6713d8	[llvm-symbolizer]Fix printing of malformed address values not passed via stdin Summary: relates https://bugs.llvm.org/show_bug.cgi?id=44443 Adding missing newline when printing bad input values. Fix testcase Reviewers: jhenderson Reviewed By: jhenderson Subscribers: rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72313	2020-01-08 18:37:41 +08:00
Kadir Cetinkaya	95287dad6e	Revert "[InstCombine] fold zext of masked bit set/clear" This reverts commit a041c4ec6f7aa659b235cb67e9231a05e0a33b7d. This looks like a non-trivial change and there have been no code reviews (at least there were no phabricator revisions attached to the commit description). It is also causing a regression in one of our downstream integration tests; we haven't been able to come up with a minimal reproducer yet.	2020-01-08 11:21:21 +01:00
Tim Northover	0916132710	AArch64: add missing Apple CPU names and use them by default. Apple's CPUs are called A7-A13 in official communication, occasionally with weird suffixes which we probably don't need to care about. This adds each one and describes its features. It also switches the default CPU to the canonical name for Cyclone, but leaves legacy support in so that existing bitcode still compiles.	2020-01-08 09:24:06 +00:00
QingShan Zhang	5b470c2b5a	[NFC][Test] Add the option -enable-no-signed-zeros-fp-math for test fma-combine.ll	2020-01-08 06:48:51 +00:00
Wang, Pengfei	424f235504	[X86] Adding fp128 support for strict fcmp Summary: Adding fp128 support for strict fcmp Reviewers: craig.topper, LiuChen3, andrew.w.kaylor, RKSimon, uweigand Subscribers: hiraditya, llvm-commits, LuoYuanke Tags: #llvm Differential Revision: https://reviews.llvm.org/D71897	2020-01-08 12:59:31 +08:00
James Clarke	0e3b53e64e	[RISCV] Fix evaluatePCRelLo for symbols at the end of a fragment Summary: This is analogous to D58943, which correctly finds the corresponding fixup. However, when linker relaxations are disabled and we evaluate the fixup, we need to also ensure we use an offset of 0 rather than the size of the previous fragment. Reviewers: asb, efriedma, lenary Reviewed By: efriedma Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71978	2020-01-08 04:32:06 +00:00
Matt Arsenault	6aa7378d41	AMDGPU: Annotate EXTRACT_SUBREGs with source register classes This partially fixes GlobalISel import of the patterns, but removes a lot of entries from the end of the skipped pattern log.	2020-01-07 21:56:16 -05:00
Jim Lin	d805544c8c	[docs] Fix duplicate explicit target name: developer policy	2020-01-08 10:44:44 +08:00
czhengsz	d7d21ed611	[SCEV] get more accurate range for AddExpr with wrap flag. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D64869	2020-01-07 20:58:04 -05:00
Jim Lin	f984a8d893	[docs] Improve HowTo commit changes from git Summary: As a novice here I tried to `git push` my changes for a while before figuring out the correct workflow which is described on other pages. This small change doesn't reduce redundancy between those pages, but at least readers can follow the links now. Reviewers: Kokan, Jim Reviewed By: Kokan, Jim Subscribers: riccibruno, kiszk, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72077	2020-01-08 09:48:01 +08:00
Eric Christopher	c61e6c8784	XFAIL load_extension.ll for all targets currently - it's failing on more platforms than just darwin.	2020-01-07 17:00:55 -08:00
Philip Reames	dbfbd50aa3	[GVN/FP] Consolidate logic for reasoning about equality vs equivalence for floats Factor out common logic into some reasonably commented helper functions. In the process, ensure that the in-block vs cross-block cases are handled the same. They previously weren't. Differential Revision: https://reviews.llvm.org/D67126	2020-01-07 16:05:04 -08:00
Daniel Sanders	518303cb4e	Fix warnings as errors that occur on sanitizer-x86_64-linux	2020-01-07 16:02:31 -08:00
Amara Emerson	52c3b98b3d	[AArch64][GlobalISel] Fold a chain of two G_PTR_ADDs of constant offsets. E.g. %addr1 = G_PTR_ADD %base, G_CONSTANT 20 %addr2 = G_PTR_ADD %addr1, G_CONSTANT 8 --> %addr2 = G_PTR_ADD %base, G_CONSTANT 28 Differential Revision: https://reviews.llvm.org/D72351	2020-01-07 14:12:42 -08:00
Craig Topper	23e0f7572d	[X86] Add SSE4.1 command lines to vec-strict-inttofp-128.ll to cover the v2i64->v2f32 strict_uitofp codegen. NFC	2020-01-07 13:53:38 -08:00
Bill Wendling	77dae6a102	Revert "Allow output constraints on "asm goto"" This reverts commit 52366088a8e42c2f1e96e8430b84b8b65ec3f7bc. I accidentally pushed this before supporting changes.	2020-01-07 13:44:08 -08:00
Bill Wendling	1e81c3e696	Allow output constraints on "asm goto" Summary: Remove the restrictions preventing "asm goto" from returning non-void values. The values returned by "asm goto" are only valid on the "fallthrough" path. Reviewers: jyknight, nickdesaulniers, hfinkel Reviewed By: jyknight, nickdesaulniers Subscribers: rsmith, hiraditya, llvm-commits, cfe-commits, craig.topper, rnk Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69876	2020-01-07 13:40:26 -08:00
Matt Arsenault	bd58701287	AMDGPU/GlobalISel: Fix scalar G_SELECT for arbitrary pointers 4e85ca9562a588eba491e44bcbf73cb2f419780f missed updating the legal condition type set for pointers with any unrecognized address space.	2020-01-07 16:36:31 -05:00
Matt Arsenault	36122a39f4	AMDGPU/GlobalISel: Add some missing G_SELECT testcases	2020-01-07 16:36:31 -05:00
Matt Arsenault	3553840899	AMDGPU/GlobalISel: Fix missing test for s16 icmp	2020-01-07 16:36:31 -05:00
Matt Arsenault	b591d25615	AMDGPU: Apply i16 add->sub pattern with zext to i32 This was only applying the deeper nested zext pattern, and missing the special case code size fold.	2020-01-07 16:36:31 -05:00
Craig Topper	c9ee289053	[X86] Enable v2i64->v2f32 uint_to_fp code in ReplaceNodeResults on SSE4.1 target Now that we generate decent code for (v2i64 (setlt zero, X)) on pre-sse4.2 targets I think we can use this now. Differential Revision: https://reviews.llvm.org/D72354	2020-01-07 13:25:29 -08:00
Daniel Sanders	1119adcb10	[gicombiner] Correct 64f1bb5cd2c to account for MSVC's %p format	2020-01-07 12:50:05 -08:00
Bill Wendling	c185d37c5e	Remove extraneous semicolon.	2020-01-07 12:49:09 -08:00
Sanjay Patel	1d67a5be15	[x86] add tests for extract-of-concat; NFC	2020-01-07 15:48:54 -05:00
Matt Arsenault	90a59ac8d4	AMDGPU: Add baseline test for missing pattern The optimization to turn an add into a sub isn't triggering when the pattern to use the zeroed high bits is used.	2020-01-07 15:10:08 -05:00
Matt Arsenault	b1381cb95e	AMDGPU: Remove VOP3Mods0Clamp0OMod Now that overridable default operands work, there's no reason to use complex patterns to just produce 0s.	2020-01-07 15:10:08 -05:00
Matt Arsenault	7940691f1f	AMDGPU: Fix misleading, misplaced end block comments	2020-01-07 15:10:08 -05:00
Matt Arsenault	f098ecdbb2	AMDGPU: Use ImmLeaf	2020-01-07 15:10:07 -05:00
Matt Arsenault	9b420937e1	AMDGPU: Fix not using v_cvt_f16_[iu]16 We weren't treating i16->f16 casts as legal on targets with these instructions, and always using a pair of casts through i32.	2020-01-07 15:10:07 -05:00
Michael Kruse	bcbdac6c54	[cmake] Use relative cmake binary dir for processing pass plugins. https://reviews.llvm.org/D61446 introduced a new function to process pass plugins that used CMAKE_BINARY_DIR. This is problematic when LLVM is a subproject. Instead use LLVM_BINARY_DIR to get the right relative directory for cmake. Patch by Alan Baker <alanbaker@google.com> Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D72109	2020-01-07 20:42:35 +01:00
Fangrui Song	3cceefd641	[PowerPC][Triple] Use elfv2 on freebsd>=13 and linux-musl Summary: Every powerpc64le platform uses elfv2. For powerpc64, the environments "elfv1" and "elfv2" were added for FreeBSD ELFv1->ELFv2 migration in D61950. FreeBSD developers have decided to use OS versions to select ABI, and no one is relying on the environments. Also use elfv2 on powerpc64-linux-musl. Users can always use -mabi=elfv1 and -mabi=elfv2 to override the default ABI. Reviewed By: adalava Differential Revision: https://reviews.llvm.org/D72352	2020-01-07 11:40:56 -08:00
Jessica Paquette	7d3262ad06	[MachineOutliner][AArch64] Save + restore LR in noreturn functions Conservatively always save + restore LR in noreturn functions. These functions do not end in a RET, and so they aren't guaranteed to have an instruction which uses LR in any way. So, as a result, you can end up in unfortunate situations where you can't backtrace out of these functions in a debugger. Remove the old noreturn test, and add a new one which is more descriptive. Remove the restriction that we can't outline from noreturn functions as well since we now do the right thing.	2020-01-07 11:27:25 -08:00
Craig Topper	f60f03ecae	[X86] Improve lowering of (v2i64 (setgt X, -1)) on pre-SSE4.2 targets. Enable v2i64 in foldVectorXorShiftIntoCmp. Similar to D72302 but for the canonical form for the opposite case. I've changed foldVectorXorShiftIntoCmp to form a target independent setcc node instead of PCMPGT now and enabled it for v2i64 on pre-SSE4.2 targets. The setcc should eventually get lowered to PCMPGT or the new v2i64 sequence. Differential Revision: https://reviews.llvm.org/D72318	2020-01-07 11:22:04 -08:00
Craig Topper	8548edec3f	[X86] Improve lowering of v2i64 sign bit tests on pre-sse4.2 targets Without sse4.2 a v2i64 setlt needs to expand into a pcmpgtd, pcmpeqd, 3 shuffles, and 2 logic ops. But if we're only interested in the sign bit of the i64 elements, we can just use one pcmpgtd and shuffle the odd elements to the even elements. Differential Revision: https://reviews.llvm.org/D72302	2020-01-07 11:22:03 -08:00
LLVM GN Syncbot	f330417d5f	[gn build] Port 1d94fb21118	2020-01-07 19:13:41 +00:00
Daniel Sanders	728c2ac3b1	[gicombiner] Add GIMatchTree and use it for the code generation Summary: GIMatchTree's job is to build a decision tree by zipping all the GIMatchDag's together. Each DAG is added to the tree builder as a leaf and partitioners are used to subdivide each node until there are no more partitioners to apply. At this point, the code generator is responsible for testing any untested predicates and following any unvisited traversals (there shouldn't be any of the latter as the getVRegDef partitioner handles them all). Note that the leaves don't always fit into partitions cleanly and the partitions may overlap as a result. This is resolved by cloning the leaf into every partition it belongs to. One example of this is a rule that can match one of N opcodes. The leaf for this rule would end up in N partitions when processed by the opcode partitioner. A similar example is the getVRegDef partitioner where having rules (add $a, $b), and (add ($a, $b), $c) will result in the former being in the partition for successfully following the vreg-def and failing to do so as it doesn't care which happens. Depends on D69151 Fixed the issues with the windows bots which were caused by stdout/stderr interleaving. Reviewers: bogner, volkan Reviewed By: volkan Subscribers: lkail, mgorny, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69152	2020-01-07 11:12:53 -08:00
Alexandre Ganea	68b66f4353	Fix issues reported by -Wrange-loop-analysis when building with latest Clang (trunk). NFC. Fixes warning: loop variable 'E' of type 'const llvm::StringRef' creates a copy from type 'const llvm::StringRef' [-Wrange-loop-analysis]	2020-01-07 13:58:26 -05:00
Simon Pilgrim	4d8923b959	[ARM] Regenerate bfi.ll test cases	2020-01-07 16:51:11 +00:00
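
The partial-overwrite behaviour described in commit 5371ba4ab1 (VSLI leaves the low n bits of each output lane unchanged, VSRI the top n bits) can be illustrated with a small per-lane model. This is only a sketch under stated assumptions, not the MVE implementation: the function names `vsli_lane` and `vsri_lane`, the 16-bit default lane width, and the treatment of lanes as unsigned integers are hypothetical choices made for illustration.

```python
def vsli_lane(prev, src, n, bits=16):
    """Model of one shift-left-and-insert lane (hypothetical helper).

    src is shifted left by n; the low n bits of the result come from
    prev, the previous value of the output lane. Assumes 0 <= n < bits.
    """
    mask = (1 << bits) - 1
    keep = (1 << n) - 1                  # low n bits are not overwritten
    return ((src << n) & mask & ~keep) | (prev & keep)


def vsri_lane(prev, src, n, bits=16):
    """Model of one shift-right-and-insert lane (hypothetical helper).

    src is shifted right by n (treated as unsigned); the top n bits of
    the result come from prev. Assumes 1 <= n <= bits.
    """
    mask = (1 << bits) - 1
    written = mask >> n                  # bits that receive shifted input
    return ((src & mask) >> n) | (prev & mask & ~written)


if __name__ == "__main__":
    print(hex(vsli_lane(0xFFFF, 0x0001, 4)))
    print(hex(vsri_lane(0xFFFF, 0x8000, 4)))
```

In both helpers the extra `prev` argument plays the role of the "extra input giving its previous value" that the commit message mentions; a shift that overwrote the whole lane would not need it.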


189749 Commits