1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00
Commit Graph

190884 Commits

Author SHA1 Message Date
Sanne Wouda
5c00a1f121 [AArch64] Add IR intrinsics for sq(r)dmulh_lane(q)
Summary:
Currently, sqdmulh_lane and friends from the ACLE (implemented in arm_neon.h),
are represented in LLVM IR as a (by vector) sqdmulh and a vector of (repeated)
indices, like so:

   %shuffle = shufflevector <4 x i16> %v, <4 x i16> undef, <4 x i32> <i32 3, i32 3, i32 3, i32 3>
   %vqdmulh2.i = tail call <4 x i16> @llvm.aarch64.neon.sqdmulh.v4i16(<4 x i16> %a, <4 x i16> %shuffle)

When %v's values are known, the shufflevector is optimized away and we are no
longer able to select the lane variant of sqdmulh in the backend.

This defeats a (hand-coded) optimization that packs several constants into a
single vector and uses the lane intrinsics to reduce register pressure and
trade-off materialising several constants for a single vector load from the
constant pool, like so:

   int16x8_t v = {2,3,4,5,6,7,8,9};
   a = vqdmulh_laneq_s16(a, v, 0);
   b = vqdmulh_laneq_s16(b, v, 1);
   c = vqdmulh_laneq_s16(c, v, 2);
   d = vqdmulh_laneq_s16(d, v, 3);
   [...]

In one microbenchmark from libjpeg-turbo this accounts for a 2.5% to 4%
performance difference.

We could teach the compiler to recover the lane variants, but this would likely
require its own pass.  (Alternatively, "volatile" could be used on the constants
vector, but this is a bit ugly.)

This patch instead implements the following LLVM IR intrinsics for AArch64 to
maintain the original structure through IR optmization and into instruction
selection:
- sqdmulh_lane
- sqdmulh_laneq
- sqrdmulh_lane
- sqrdmulh_laneq.

These 'lane' variants need an additional register class.  The second argument
must be in the lower half of the 64-bit NEON register file, but only when
operating on i16 elements.

Note that the existing patterns for shufflevector and sqdmulh into sqdmulh_lane
(etc.) remain, so code that does not rely on NEON intrinsics to generate these
instructions is not affected.

This patch also changes clang to emit these IR intrinsics for the corresponding
NEON intrinsics (AArch64 only).

Reviewers: SjoerdMeijer, dmgreen, t.p.northover, rovka, rengolin, efriedma

Reviewed By: efriedma

Subscribers: kristof.beyls, hiraditya, jdoerfert, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71469
2020-01-29 13:25:23 +00:00
Sjoerd Meijer
d34c0f6368 [MVE][MC] evaluateBranch: add missing MVE opcode
This adds some missing MVE opcodes to evaluateBranch, which results in
llvm-objdump being able to print the PC relative branch target as an
annotation.

Differential Revision: https://reviews.llvm.org/D73553
2020-01-29 13:19:45 +00:00
Kazushi (Jam) Marukawa
ec15ab302a [VE] Isel patterns for fp32/64 and i32/64 conversion
Summary:
fp32/64 <> signed/unsigned i32/64 conversion isel patterns and tests

(This patch depends on `fsub` implemented by https://reviews.llvm.org/D73540 )

Reviewers: arsenm, craig.topper, rengolin, k-ishizaka

Reviewed By: arsenm

Subscribers: merge_guards_bot, wdng, hiraditya, llvm-commits

Tags: #ve, #llvm

Differential Revision: https://reviews.llvm.org/D73544
2020-01-29 14:10:22 +01:00
Georgii Rymar
f4677aee8d [yaml2obj][obj2yaml] - Add lost test cases.
It is a part of https://reviews.llvm.org/D71872 which
was lost somehow during relanding after being reverted:

https://reviews.llvm.org/rG7570d387c21935b58afa67cb9ee17250e38721fa
2020-01-29 15:40:35 +03:00
Kerry McLaughlin
4248c8f170 [AArch64][SVE] Add SVE2 intrinsics for uniform DSP operations
Summary:
Implements the following intrinsics:
 - sqrdmlah, sqrdmlsh, sqrdmulh & sqdmulh
 - [s|u]hadd, [s|u]hsub, [s|u]rhadd & [s|u]hsubr
 - urecpe, ursqrte, sqabs & sqneg

Reviewers: sdesmalen, efriedma, dancgr, cameron.mcinally

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73493
2020-01-29 12:03:15 +00:00
Sam Parker
8ae8a390a3 [NFC][ARM] Add test 2020-01-29 06:59:21 -05:00
Kerry McLaughlin
163db16495 [AArch64][SVE] Add SVE2 intrinsics for pairwise arithmetic
Summary:
Implements the following intrinsics:
 - addp
 - smaxp, sminp, umaxp & uminp
 - sadalp & uadalp

Reviewers: dancgr, efriedma, sdesmalen, c-rhodes

Reviewed By: c-rhodes

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cameron.mcinally, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73347
2020-01-29 10:31:31 +00:00
James Henderson
0edcf38c7a [DebugInfo] Make most debug line prologue errors non-fatal to parsing
Many of the debug line prologue errors are not inherently fatal. In most
cases, we can make reasonable assumptions and carry on. This patch does
exactly that. In the case of length problems, the approach of "assume
stated length is correct" is taken which means the offset might need
adjusting.

This is a relanding of b94191fe, fixing an LLD test and the LLDB build.

Reviewed by: dblaikie, labath

Differential Revision: https://reviews.llvm.org/D72158
2020-01-29 10:23:41 +00:00
Kazushi (Jam) Marukawa
9f87edf4d9 [VE] fp32/64 fadd/fsub/fdiv/fmul isel patterns
Summary: fp32/64 fadd/fsub/fdiv/fmul isel patterns and tests.

Reviewers: arsenm, craig.topper, rengolin, k-ishizaka

Subscribers: merge_guards_bot, wdng, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D73540
2020-01-29 11:00:56 +01:00
David Stenberg
c38f7a01f2 [ARM64] Debug info for structure argument missing DW_AT_location
Summary:
Prevent eliminating dbg_val due to COPY.

Fixes this
https://bugs.llvm.org/show_bug.cgi?id=40709

Patch by: Kamlesh Kumar (kamleshbhalui)

Reviewers: aprantl, dblaikie, vsk, dsanders

Reviewed By: dsanders

Subscribers: dstenb, kristof.beyls, hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D73159
2020-01-29 10:56:23 +01:00
Simon Moll
838c174864 [VE][fix] (more) explicit StringRef to std::string 2020-01-29 10:46:59 +01:00
Jay Foad
014967c65f [AMDGPU] Simplify DS and SM cases in getMemOperandsWithOffset
Summary:
This removes a couple of unnecessary isReg checks, now that
memOpsHaveSameBasePtr can handle FI operands, but is otherwise NFC.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73485
2020-01-29 09:43:24 +00:00
Simon Moll
33ffba9b05 [VE][fix] Explicit StringRef to std::string conversion
Adapt to changes of "[ADT] Make StringRef's std::string conversion
operator explicit" (777180a32).
2020-01-29 10:34:28 +01:00
Fangrui Song
f4344ccb7e [ARC] Fix ARCTargetMachine after 777180a32b6107 2020-01-29 00:59:16 -08:00
Sam Parker
15040d710f [RDA][ARM] Move functionality into RDA
Add several new helpers to RDA:
- hasLocalDefBefore
- isRegDefinedAfter
- isSafeToDefRegAt

And move two bits of logic from ARMLowOverheadLoops into RDA:
- isSafeToMove
- isSafeToRemove

Both of these have some wrappers too to make them more convienent to
use.

Differential Revision: https://reviews.llvm.org/D73460
2020-01-29 03:27:47 -05:00
Fangrui Song
f3bb2fa95f [X86] matchAdd: don't fold a large offset into a %rip relative address
For `ret i64 add (i64 ptrtoint (i32* @foo to i64), i64 1701208431)`,

```
X86DAGToDAGISel::matchAdd
  ...
// AM.setBaseReg(CurDAG->getRegister(X86::RIP, MVT::i64));
  if (!matchAddressRecursively(N.getOperand(0), AM, Depth+1) &&
// Try folding offset but fail; there is a symbolic displacement, so offset cannot be too large
      !matchAddressRecursively(Handle.getValue().getOperand(1), AM, Depth+1))
    return false;
  ...
  // Try again after commuting the operands.
// AM.Disp = Val; foldOffsetIntoAddress() does not know there will be a symbolic displacement
  if (!matchAddressRecursively(Handle.getValue().getOperand(1), AM, Depth+1) &&
// AM.setBaseReg(CurDAG->getRegister(X86::RIP, MVT::i64));
      !matchAddressRecursively(Handle.getValue().getOperand(0), AM, Depth+1))
// Succeeded! Produced leaq sym+disp(%rip),...
    return false;
```

`foldOffsetIntoAddress()` currently does not know there is a symbolic
displacement and can fold a large offset.

The produced `leaq sym+disp(%rip), %rax` instruction is relocated by
an R_X86_64_PC32. If disp is large and sym+disp-rip>=2**31, there
will be a relocation overflow.

This approach is still not elegant. Unfortunately the isRIPRelative
interface is a bit clumsy. I tried several solutions and eventually
picked this one.

Differential Revision: https://reviews.llvm.org/D73606
2020-01-28 22:30:52 -08:00
Johannes Doerfert
fe2a59a6a5 [Attributor][Fix] Initialize unused but loaded variable
This hopefully un-breaks:
  http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/38333
2020-01-28 23:52:16 -06:00
Johannes Doerfert
853ea66edb [Attributor] Reuse existing logic to avoid duplication
There was a TODO in AAValueConstantRangeArgument to reuse
AAArgumentFromCallSiteArguments. We now do this by allowing new States
to be build from the bestState.
2020-01-28 23:45:59 -06:00
Johannes Doerfert
9f299369d5 [Attributor][FIX] Treat invalidated attributes as changed
If we invalidate an attribute we need to inform all dependent ones even
if the fixpoint state is not invalid. Before we only continued
invalidation if the fixpoint state was invalid, now we signal a change
in case the fixpoint state is valid.

The test case was already included in D71620 but the problem was hiding
because it only manifested with the old PM (for that input).
2020-01-28 23:40:41 -06:00
Johannes Doerfert
99e7762288 [Attributor] Modularize AANoAliasCallSiteArgument to simplify extensions
This patch modularizes the way we check for no-alias call site arguments
by putting the existing logic into helper functions. The reasoning was
not changed but special cases for readonly/readnone were added.
2020-01-28 23:39:29 -06:00
Johannes Doerfert
c2c1ca3581 [Attributor] Mark a non-defined null pointer as noalias
If `null` is not defined we cannot access it, hence the pointer is
`noalias`. While this is not helpful on it's own it simplifies later
deductions that can skip over already known `noalias` pointers in
certain situations.
2020-01-28 23:09:37 -06:00
Johannes Doerfert
9de3518b35 [Attributor][NFC] Remove ugly and unneeded cast 2020-01-28 22:54:31 -06:00
Johannes Doerfert
809a626b9a [Attributor][NFC] Improve debug messages 2020-01-28 22:53:19 -06:00
Johannes Doerfert
43f214c985 [Attributor][NFC] Internalize helper function 2020-01-28 22:50:34 -06:00
Benjamin Kramer
c0bcd6fe73 One more bugpoitn fix for GCC5 2020-01-29 03:42:02 +01:00
Benjamin Kramer
9bf5dbee1c Try harder to fix bugpoint with GCC5 2020-01-29 03:30:47 +01:00
Benjamin Kramer
2d00d6fd92 Make bugpoint work with gcc5 again. 2020-01-29 03:11:00 +01:00
Benjamin Kramer
915a3355f8 Fix more implicit conversions. Getting closer to having clang working with gcc 5 again 2020-01-29 02:57:59 +01:00
Benjamin Kramer
36307eb072 Fix conversions in clang and examples 2020-01-29 02:48:15 +01:00
Nate Voorhies
7d8a8e7641 [NFC] Fix unused variable warning.
Reviewers: dschuff

Reviewed By: dschuff

Subscribers: hiraditya, aheejin, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73591
2020-01-28 17:19:23 -08:00
Vedant Kumar
7bbbffcf9e [CodeExtractor] Remove stale llvm.assume calls from extracted region
During extraction, stale llvm.assume handles may be retained in the
original function. The setup is:

1) CodeExtractor unregisters assumptions in the blocks that are to be
   extracted.

2) Extraction happens. There are now two functions: f1 and f1.extracted.

3) Leftover assumptions in f1 (/not/ removed as they were not in the set of
   blocks to be extracted) now have affected-value llvm.assume handles in
   f1.extracted.

When assumptions for a value used in f1 are looked up, ValueTracking can assert
as some of the handles are in the wrong function. To fix this, simply erase the
llvm.assume calls in the extracted function.

Alternatives include flushing the assumption cache in the original function, or
walking all values used in the original function to prune stale affected-value
handles. Both seem more expensive.

Testing: check-llvm, LNT run with -mllvm -hot-cold-split enabled

rdar://58460728
2020-01-28 17:18:01 -08:00
Benjamin Kramer
9b427c8e8d Another stab at making the gold plugin compile again 2020-01-29 02:12:53 +01:00
Benjamin Kramer
76555832b5 Another round of GCC5 fixes. 2020-01-29 02:09:24 +01:00
Derek Schuff
0cf7fafc53 [WebAssembly] Preserve debug frame base information through register coloring
2 fixes:

Register coloring can re-assign virtual registers. When the frame base register
is colored, update the DwarfFrameBase accordingly When the frame base register
is stackified, do not attempt to encode DW_AT_frame_base as a local In the
future we will presumably want to handle this case better but for now we can
emit worse debug info rather than crashing.

Differential Revision: https://reviews.llvm.org/D73581
2020-01-28 16:58:15 -08:00
Benjamin Kramer
3e6e191872 Fix one round of implicit conversions found by g++5. 2020-01-29 01:52:48 +01:00
Craig Topper
3b91c078ed [X86] Use SelectionDAG::getZExtOrTrunc to simplify some code. NFCI 2020-01-28 16:27:59 -08:00
Craig Topper
4e605f5fdf [X86] Add test case for llvm.flt.rounds 2020-01-28 16:27:59 -08:00
Nico Weber
0af58442ef Fix AVR build after 777180a32b6107 2020-01-28 19:22:22 -05:00
Benjamin Kramer
633cdf3dc1 Address implicit conversions detected by g++ 5 only. 2020-01-29 01:01:09 +01:00
Benjamin Kramer
5313b54c46 One more batch of things found by g++ 6 2020-01-29 00:50:34 +01:00
Eli Friedman
b7dd99544e [AliasAnalysis] Add missing FMRB_* enums.
Previously, the enums didn't account for all the possible cases, which
could cause misleading results (particularly for a "switch" on
FunctionModRefBehavior).

Fixes regression in polly from recent patch to add writeonly to memset.

While I'm here, also fix a few dubious uses of the FMRB_* enum values.

Differential Revision: https://reviews.llvm.org/D73154
2020-01-28 15:47:08 -08:00
Benjamin Kramer
8d50d8820b Fix a couple more implicit conversions that Clang doesn't diagnose. 2020-01-29 00:42:56 +01:00
Benjamin Kramer
5805fcea4b A bunch more implicit string conversions that my Clang didn't detect. 2020-01-29 00:30:16 +01:00
Benjamin Kramer
608c5ac650 [tblgen] Fix implicit conversion only diagnosed by g++ 6 2020-01-29 00:25:44 +01:00
Jonas Devlieghere
c3e5ca8b31 Fix more implicit conversions 2020-01-28 15:19:27 -08:00
Benjamin Kramer
e25100ea2b Fix implicit conversions in example code. 2020-01-29 00:17:35 +01:00
Benjamin Kramer
6ec02ae10e [Support] Fix implicit std::string conversions on Win32. 2020-01-29 00:02:26 +01:00
Benjamin Kramer
1d02418b3c [ADT] Make StringRef's std::string conversion operator explicit
This has the same behavior as converting std::string_view to
std::string. This is an expensive conversion, so explicit conversions
are helpful for avoiding unneccessary string copies.
2020-01-28 23:47:07 +01:00
Shoaib Meenai
cfc8ee6256 [runtimes] Fix passing lists to runtimes configures
We have to replace the ";" with "|" (since LLVMExternalProjectUtils uses
"|" as the `LIST_SEPARATOR` when invoking `ExternalProject_Add`) in
order for lists to be passed correctly to the runtimes CMake configures.
Remove the special case for `LLVM_ENABLE_RUNTIMES`, since it'll just get
handled by the general logic now.

Differential Revision: https://reviews.llvm.org/D73512
2020-01-28 14:36:14 -08:00
Benjamin Kramer
87d13166c7 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00