llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00

Author	SHA1	Message	Date
Kazushi (Jam) Marukawa	6a98aebfc4	[VE] udiv/sdiv/urem/srem/mul isel patterns Summary: udiv/sdiv/urem/srem/mul integer isel patterns and tests. Pretend for now that integer division were always cheap in HW. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D73623	2020-01-29 15:59:50 +01:00
Matt Arsenault	45d2ea4dc2	AMDGPU/GlobalISel: Manually select scalar f64 G_FNEG This should be no problem to support with a pattern, but it turns out there are just too many yaks to shave. The main problem is in the DAG emitter, which I have no desire to sink effort into fixing. If we had a bit to disable patterns in the DAG importer, fixing the GlobalISelEmitter is more manageable.	2020-01-29 06:49:16 -08:00
Matt Arsenault	f3955d7cae	Analysis: Add max recursison to isDereferenceableAndAlignedPointer Fixes stack overflow in test/CodeGen/X86/large-gep-chain.ll when store lowering starts adding dereferenceable flags.	2020-01-29 06:48:24 -08:00
Matt Arsenault	fbec789204	GlobalISel: Lower G_WRITE_REGISTER	2020-01-29 06:48:24 -08:00
Connor Abbott	f758df6c01	AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns Summary: The code was assuming in a few places that if there was only one exit from the function that it was a normal return, which is invalid. It could be an infinite loop, in which case we still need to insert the usual fake edge so that the null export happens. This fixes shaders that end with an infinite loop that discards. Reviewers: arsenm, nhaehnle, critson Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71192	2020-01-29 15:08:46 +01:00
Connor Abbott	b3f994414f	AMDGPU: Fix handling of infinite loops in fragment shaders Summary: Due to the fact that kill is just a normal intrinsic, even though it's supposed to terminate the thread, we can end up with provably infinite loops that are actually supposed to end successfully. The AMDGPUUnifyDivergentExitNodes pass breaks up these loops, but because there's no obvious place to make the loop branch to, it just makes it return immediately, which skips the exports that are supposed to happen at the end and hangs the GPU if all the threads end up being killed. While it would be nice if the fact that kill terminates the thread were modeled in the IR, I think that the structurizer as-is would make a mess if we did that when the kill is inside control flow. For now, we just add a null export at the end to make sure that it always exports something, which fixes the immediate problem without penalizing the more common case. This means that we sometimes do two "done" exports when only some of the threads enter the discard loop, but from tests the hardware seems ok with that. This fixes dEQP-VK.graphicsfuzz.while-inside-switch with radv. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70781	2020-01-29 15:08:46 +01:00
Sanjay Patel	32ce6ddd90	[InstCombine] canonicalize splat shuffle after cmp cmp (splat V1, M), SplatC --> splat (cmp V1, SplatC'), M As discussed in PR44588: https://bugs.llvm.org/show_bug.cgi?id=44588 ...we try harder to push shuffles after binops than after compares. This patch handles the special (but presumably most common case) of splat shuffles. If both operands are splats, then we can do the comparison on the non-splat inputs followed by splat of the compare. That should take care of the regression noted in D73411. There's another potential fold requested in PR37463 to scalarize the compare, but that's another patch (and it's not clear if we can do that without the ability to undo it later): https://bugs.llvm.org/show_bug.cgi?id=37463 Differential Revision: https://reviews.llvm.org/D73575	2020-01-29 08:34:29 -05:00
Sanne Wouda	5c00a1f121	[AArch64] Add IR intrinsics for sq(r)dmulh_lane(q) Summary: Currently, sqdmulh_lane and friends from the ACLE (implemented in arm_neon.h), are represented in LLVM IR as a (by vector) sqdmulh and a vector of (repeated) indices, like so: %shuffle = shufflevector <4 x i16> %v, <4 x i16> undef, <4 x i32> <i32 3, i32 3, i32 3, i32 3> %vqdmulh2.i = tail call <4 x i16> @llvm.aarch64.neon.sqdmulh.v4i16(<4 x i16> %a, <4 x i16> %shuffle) When %v's values are known, the shufflevector is optimized away and we are no longer able to select the lane variant of sqdmulh in the backend. This defeats a (hand-coded) optimization that packs several constants into a single vector and uses the lane intrinsics to reduce register pressure and trade-off materialising several constants for a single vector load from the constant pool, like so: int16x8_t v = {2,3,4,5,6,7,8,9}; a = vqdmulh_laneq_s16(a, v, 0); b = vqdmulh_laneq_s16(b, v, 1); c = vqdmulh_laneq_s16(c, v, 2); d = vqdmulh_laneq_s16(d, v, 3); [...] In one microbenchmark from libjpeg-turbo this accounts for a 2.5% to 4% performance difference. We could teach the compiler to recover the lane variants, but this would likely require its own pass. (Alternatively, "volatile" could be used on the constants vector, but this is a bit ugly.) This patch instead implements the following LLVM IR intrinsics for AArch64 to maintain the original structure through IR optmization and into instruction selection: - sqdmulh_lane - sqdmulh_laneq - sqrdmulh_lane - sqrdmulh_laneq. These 'lane' variants need an additional register class. The second argument must be in the lower half of the 64-bit NEON register file, but only when operating on i16 elements. Note that the existing patterns for shufflevector and sqdmulh into sqdmulh_lane (etc.) remain, so code that does not rely on NEON intrinsics to generate these instructions is not affected. This patch also changes clang to emit these IR intrinsics for the corresponding NEON intrinsics (AArch64 only). Reviewers: SjoerdMeijer, dmgreen, t.p.northover, rovka, rengolin, efriedma Reviewed By: efriedma Subscribers: kristof.beyls, hiraditya, jdoerfert, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71469	2020-01-29 13:25:23 +00:00
Sjoerd Meijer	d34c0f6368	[MVE][MC] evaluateBranch: add missing MVE opcode This adds some missing MVE opcodes to evaluateBranch, which results in llvm-objdump being able to print the PC relative branch target as an annotation. Differential Revision: https://reviews.llvm.org/D73553	2020-01-29 13:19:45 +00:00
Kazushi (Jam) Marukawa	ec15ab302a	[VE] Isel patterns for fp32/64 and i32/64 conversion Summary: fp32/64 <> signed/unsigned i32/64 conversion isel patterns and tests (This patch depends on `fsub` implemented by https://reviews.llvm.org/D73540 ) Reviewers: arsenm, craig.topper, rengolin, k-ishizaka Reviewed By: arsenm Subscribers: merge_guards_bot, wdng, hiraditya, llvm-commits Tags: #ve, #llvm Differential Revision: https://reviews.llvm.org/D73544	2020-01-29 14:10:22 +01:00
Georgii Rymar	f4677aee8d	[yaml2obj][obj2yaml] - Add lost test cases. It is a part of https://reviews.llvm.org/D71872 which was lost somehow during relanding after being reverted: https://reviews.llvm.org/rG7570d387c21935b58afa67cb9ee17250e38721fa	2020-01-29 15:40:35 +03:00
Kerry McLaughlin	4248c8f170	[AArch64][SVE] Add SVE2 intrinsics for uniform DSP operations Summary: Implements the following intrinsics: - sqrdmlah, sqrdmlsh, sqrdmulh & sqdmulh - [s\|u]hadd, [s\|u]hsub, [s\|u]rhadd & [s\|u]hsubr - urecpe, ursqrte, sqabs & sqneg Reviewers: sdesmalen, efriedma, dancgr, cameron.mcinally Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73493	2020-01-29 12:03:15 +00:00
Sam Parker	8ae8a390a3	[NFC][ARM] Add test	2020-01-29 06:59:21 -05:00
Kerry McLaughlin	163db16495	[AArch64][SVE] Add SVE2 intrinsics for pairwise arithmetic Summary: Implements the following intrinsics: - addp - smaxp, sminp, umaxp & uminp - sadalp & uadalp Reviewers: dancgr, efriedma, sdesmalen, c-rhodes Reviewed By: c-rhodes Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cameron.mcinally, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73347	2020-01-29 10:31:31 +00:00
James Henderson	0edcf38c7a	[DebugInfo] Make most debug line prologue errors non-fatal to parsing Many of the debug line prologue errors are not inherently fatal. In most cases, we can make reasonable assumptions and carry on. This patch does exactly that. In the case of length problems, the approach of "assume stated length is correct" is taken which means the offset might need adjusting. This is a relanding of b94191fe, fixing an LLD test and the LLDB build. Reviewed by: dblaikie, labath Differential Revision: https://reviews.llvm.org/D72158	2020-01-29 10:23:41 +00:00
Kazushi (Jam) Marukawa	9f87edf4d9	[VE] fp32/64 fadd/fsub/fdiv/fmul isel patterns Summary: fp32/64 fadd/fsub/fdiv/fmul isel patterns and tests. Reviewers: arsenm, craig.topper, rengolin, k-ishizaka Subscribers: merge_guards_bot, wdng, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D73540	2020-01-29 11:00:56 +01:00
David Stenberg	c38f7a01f2	[ARM64] Debug info for structure argument missing DW_AT_location Summary: Prevent eliminating dbg_val due to COPY. Fixes this https://bugs.llvm.org/show_bug.cgi?id=40709 Patch by: Kamlesh Kumar (kamleshbhalui) Reviewers: aprantl, dblaikie, vsk, dsanders Reviewed By: dsanders Subscribers: dstenb, kristof.beyls, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D73159	2020-01-29 10:56:23 +01:00
Simon Moll	838c174864	[VE][fix] (more) explicit StringRef to std::string	2020-01-29 10:46:59 +01:00
Jay Foad	014967c65f	[AMDGPU] Simplify DS and SM cases in getMemOperandsWithOffset Summary: This removes a couple of unnecessary isReg checks, now that memOpsHaveSameBasePtr can handle FI operands, but is otherwise NFC. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73485	2020-01-29 09:43:24 +00:00
Simon Moll	33ffba9b05	[VE][fix] Explicit StringRef to std::string conversion Adapt to changes of "[ADT] Make StringRef's std::string conversion operator explicit" (777180a32).	2020-01-29 10:34:28 +01:00
Fangrui Song	f4344ccb7e	[ARC] Fix ARCTargetMachine after 777180a32b6107	2020-01-29 00:59:16 -08:00
Sam Parker	15040d710f	[RDA][ARM] Move functionality into RDA Add several new helpers to RDA: - hasLocalDefBefore - isRegDefinedAfter - isSafeToDefRegAt And move two bits of logic from ARMLowOverheadLoops into RDA: - isSafeToMove - isSafeToRemove Both of these have some wrappers too to make them more convienent to use. Differential Revision: https://reviews.llvm.org/D73460	2020-01-29 03:27:47 -05:00
Fangrui Song	f3bb2fa95f	[X86] matchAdd: don't fold a large offset into a %rip relative address For `ret i64 add (i64 ptrtoint (i32* @foo to i64), i64 1701208431)`, ``` X86DAGToDAGISel::matchAdd ... // AM.setBaseReg(CurDAG->getRegister(X86::RIP, MVT::i64)); if (!matchAddressRecursively(N.getOperand(0), AM, Depth+1) && // Try folding offset but fail; there is a symbolic displacement, so offset cannot be too large !matchAddressRecursively(Handle.getValue().getOperand(1), AM, Depth+1)) return false; ... // Try again after commuting the operands. // AM.Disp = Val; foldOffsetIntoAddress() does not know there will be a symbolic displacement if (!matchAddressRecursively(Handle.getValue().getOperand(1), AM, Depth+1) && // AM.setBaseReg(CurDAG->getRegister(X86::RIP, MVT::i64)); !matchAddressRecursively(Handle.getValue().getOperand(0), AM, Depth+1)) // Succeeded! Produced leaq sym+disp(%rip),... return false; ``` `foldOffsetIntoAddress()` currently does not know there is a symbolic displacement and can fold a large offset. The produced `leaq sym+disp(%rip), %rax` instruction is relocated by an R_X86_64_PC32. If disp is large and sym+disp-rip>=2**31, there will be a relocation overflow. This approach is still not elegant. Unfortunately the isRIPRelative interface is a bit clumsy. I tried several solutions and eventually picked this one. Differential Revision: https://reviews.llvm.org/D73606	2020-01-28 22:30:52 -08:00
Johannes Doerfert	fe2a59a6a5	[Attributor][Fix] Initialize unused but loaded variable This hopefully un-breaks: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/38333	2020-01-28 23:52:16 -06:00
Johannes Doerfert	853ea66edb	[Attributor] Reuse existing logic to avoid duplication There was a TODO in AAValueConstantRangeArgument to reuse AAArgumentFromCallSiteArguments. We now do this by allowing new States to be build from the bestState.	2020-01-28 23:45:59 -06:00
Johannes Doerfert	9f299369d5	[Attributor][FIX] Treat invalidated attributes as changed If we invalidate an attribute we need to inform all dependent ones even if the fixpoint state is not invalid. Before we only continued invalidation if the fixpoint state was invalid, now we signal a change in case the fixpoint state is valid. The test case was already included in D71620 but the problem was hiding because it only manifested with the old PM (for that input).	2020-01-28 23:40:41 -06:00
Johannes Doerfert	99e7762288	[Attributor] Modularize AANoAliasCallSiteArgument to simplify extensions This patch modularizes the way we check for no-alias call site arguments by putting the existing logic into helper functions. The reasoning was not changed but special cases for readonly/readnone were added.	2020-01-28 23:39:29 -06:00
Johannes Doerfert	c2c1ca3581	[Attributor] Mark a non-defined `null` pointer as `noalias` If `null` is not defined we cannot access it, hence the pointer is `noalias`. While this is not helpful on it's own it simplifies later deductions that can skip over already known `noalias` pointers in certain situations.	2020-01-28 23:09:37 -06:00
Johannes Doerfert	9de3518b35	[Attributor][NFC] Remove ugly and unneeded cast	2020-01-28 22:54:31 -06:00
Johannes Doerfert	809a626b9a	[Attributor][NFC] Improve debug messages	2020-01-28 22:53:19 -06:00
Johannes Doerfert	43f214c985	[Attributor][NFC] Internalize helper function	2020-01-28 22:50:34 -06:00
Benjamin Kramer	c0bcd6fe73	One more bugpoitn fix for GCC5	2020-01-29 03:42:02 +01:00
Benjamin Kramer	9bf5dbee1c	Try harder to fix bugpoint with GCC5	2020-01-29 03:30:47 +01:00
Benjamin Kramer	2d00d6fd92	Make bugpoint work with gcc5 again.	2020-01-29 03:11:00 +01:00
Benjamin Kramer	915a3355f8	Fix more implicit conversions. Getting closer to having clang working with gcc 5 again	2020-01-29 02:57:59 +01:00
Benjamin Kramer	36307eb072	Fix conversions in clang and examples	2020-01-29 02:48:15 +01:00
Nate Voorhies	7d8a8e7641	[NFC] Fix unused variable warning. Reviewers: dschuff Reviewed By: dschuff Subscribers: hiraditya, aheejin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73591	2020-01-28 17:19:23 -08:00
Vedant Kumar	7bbbffcf9e	[CodeExtractor] Remove stale llvm.assume calls from extracted region During extraction, stale llvm.assume handles may be retained in the original function. The setup is: 1) CodeExtractor unregisters assumptions in the blocks that are to be extracted. 2) Extraction happens. There are now two functions: f1 and f1.extracted. 3) Leftover assumptions in f1 (/not/ removed as they were not in the set of blocks to be extracted) now have affected-value llvm.assume handles in f1.extracted. When assumptions for a value used in f1 are looked up, ValueTracking can assert as some of the handles are in the wrong function. To fix this, simply erase the llvm.assume calls in the extracted function. Alternatives include flushing the assumption cache in the original function, or walking all values used in the original function to prune stale affected-value handles. Both seem more expensive. Testing: check-llvm, LNT run with -mllvm -hot-cold-split enabled rdar://58460728	2020-01-28 17:18:01 -08:00
Benjamin Kramer	9b427c8e8d	Another stab at making the gold plugin compile again	2020-01-29 02:12:53 +01:00
Benjamin Kramer	76555832b5	Another round of GCC5 fixes.	2020-01-29 02:09:24 +01:00
Derek Schuff	0cf7fafc53	[WebAssembly] Preserve debug frame base information through register coloring 2 fixes: Register coloring can re-assign virtual registers. When the frame base register is colored, update the DwarfFrameBase accordingly When the frame base register is stackified, do not attempt to encode DW_AT_frame_base as a local In the future we will presumably want to handle this case better but for now we can emit worse debug info rather than crashing. Differential Revision: https://reviews.llvm.org/D73581	2020-01-28 16:58:15 -08:00
Benjamin Kramer	3e6e191872	Fix one round of implicit conversions found by g++5.	2020-01-29 01:52:48 +01:00
Craig Topper	3b91c078ed	[X86] Use SelectionDAG::getZExtOrTrunc to simplify some code. NFCI	2020-01-28 16:27:59 -08:00
Craig Topper	4e605f5fdf	[X86] Add test case for llvm.flt.rounds	2020-01-28 16:27:59 -08:00
Nico Weber	0af58442ef	Fix AVR build after 777180a32b6107	2020-01-28 19:22:22 -05:00
Benjamin Kramer	633cdf3dc1	Address implicit conversions detected by g++ 5 only.	2020-01-29 01:01:09 +01:00
Benjamin Kramer	5313b54c46	One more batch of things found by g++ 6	2020-01-29 00:50:34 +01:00
Eli Friedman	b7dd99544e	[AliasAnalysis] Add missing FMRB_* enums. Previously, the enums didn't account for all the possible cases, which could cause misleading results (particularly for a "switch" on FunctionModRefBehavior). Fixes regression in polly from recent patch to add writeonly to memset. While I'm here, also fix a few dubious uses of the FMRB_* enum values. Differential Revision: https://reviews.llvm.org/D73154	2020-01-28 15:47:08 -08:00
Benjamin Kramer	8d50d8820b	Fix a couple more implicit conversions that Clang doesn't diagnose.	2020-01-29 00:42:56 +01:00
Benjamin Kramer	5805fcea4b	A bunch more implicit string conversions that my Clang didn't detect.	2020-01-29 00:30:16 +01:00

1 2 3 4 5 ...

190891 Commits