mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00
Commit Graph

204355 Commits

Author SHA1 Message Date
Amara Emerson
635632451e [GlobalISel] Add support for lowering of vector G_SELECT and use for AArch64.
The lowering is a port of the SDAG expansion.

Differential Revision: https://reviews.llvm.org/D88364
2020-09-28 14:00:46 -07:00
David Tenty
c82257db65 [CMake][AIX] Limit tools in external project build
This is a follow on to D85329 which disabled some llvm tools in the
runtimes build due to XCOFF64 limitations. This change disables them
in other external project builds as well, when no list of tools is
specified in the arguments.

Reviewed By: hubert.reinterpretcast, stevewan

Differential Revision: https://reviews.llvm.org/D88310
2020-09-28 16:59:25 -04:00
Nico Weber
cce683e16e [gn build] Re-run CompletionModelCodegen when input json files change 2020-09-28 16:58:00 -04:00
Amara Emerson
34b690d29f Revert "Revert "[AArch64][GlobalISel] Add selection support for <8 x s16> G_INSERT_VECTOR_ELT with GPR scalar.""
This isn't a real issue with the codegen; it's a previously known bug in clang which
causes non-deterministic failures due to garbage bits in undef registers being
used in saturating instructions.

I'm disabling the result checking for the test until this issue is resolved.

This reverts commit 6c8168324b5329c94fe7e8f9a1619802091b9bec.
2020-09-28 13:44:51 -07:00
Craig Topper
1804ba7f5b [X86] Use inlineasm flag output for the _bittest* intrinsics.
Instead of explicitly emitting a setc in the inline asm instructions,
we can use flag output. This allows the backend to use the flag
directly if it is needed by a branch. Previously we needed a test
instruction to convert the register back to a flag.

If the flag can't be used directly, the backend will emit a setcc.

Differential Revision: https://reviews.llvm.org/D87888
2020-09-28 13:33:22 -07:00
Simon Pilgrim
af960b5500 [InstCombine] Regenerate cast tests. NFC. 2020-09-28 21:32:12 +01:00
Eric Astor
858c1cc868 [COFF] Aliases resolve directly to defined external targets
Avoid introducing unnecessary indirection for weak-external references.

We only need to introduce ".weak.<SYMBOL>.default" when referencing a
symbol that is defined, but not external.

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D88305
2020-09-28 16:12:45 -04:00
Benjamin Kramer
46460daf85 [wasm] Move WasmTraits.h to BinaryFormat
There's no dependency on Object in there and this avoids a cyclic
dependency between libMC and libObject.
2020-09-28 22:07:28 +02:00
Sanjay Patel
068a0e3768 [CostModel] remove hack for intrinsic cost based on cost type
This hack seems to only have been necessary because of the
constructor bug noted in 33125cffd.

Once again, it's hard to prove NFC, but that's the hope...
2020-09-28 15:58:42 -04:00
Baptiste Saleil
3dca9af6d2 [PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types
This patch legalizes the v256i1 and v512i1 types that will be used for MMA.

It implements loads and stores of these types.
v256i1 is a pair of VSX registers, so for this type, we load/store the two
underlying registers. v512i1 is used for MMA accumulators. So in addition to
loading and storing the 4 associated VSX registers, we generate instructions to
prime (copy the VSX registers to the accumulator) after loading and unprime
(copy the accumulator back to the VSX registers) before storing.

This patch also adds the UACC register class that is necessary to implement the
loads and stores. This class represents accumulators in their unprimed form and
allows the distinction between primed and unprimed accumulators to avoid invalid
copies of the VSX registers associated with primed accumulators.

Differential Revision: https://reviews.llvm.org/D84968
2020-09-28 14:39:37 -05:00
Sanjay Patel
1fffe5761d [CostModel] fill in arguments as part of intrinsic attribute constructor
This appears to be an error of code duplication - instead of
one constructor variant calling another, we have N similar
but not identical versions.

I think this is 'NFC' based on the current callers, but it's
hard to tell or guess the intent in all cases.
2020-09-28 15:27:45 -04:00
Jon Roelofs
9b0c93e6aa [AArch64] reuse another map iterator. NFC 2020-09-28 11:30:21 -07:00
Amara Emerson
188cec631b Revert "[AArch64][GlobalISel] Add selection support for <8 x s16> G_INSERT_VECTOR_ELT with GPR scalar."
This reverts commit b5e87c9ef2243ecd65e0ef87a1bf303c0c26db04 as it seems to have
broken a bot.
2020-09-28 11:25:19 -07:00
Dominic Chen
12ebbafac0 [AddressSanitizer] Copy type metadata to prevent miscompilation
When ASan and e.g. Dead Virtual Function Elimination are enabled, the
latter will rely on type metadata to determine if certain virtual calls can be
removed. However, ASan currently does not copy type metadata, which can cause
virtual function calls to be incorrectly removed.

Differential Revision: https://reviews.llvm.org/D88368
2020-09-28 13:56:05 -04:00
Simon Pilgrim
60566b9591 [InstCombine] Add trunc(shr(trunc(x),c)) non-uniform vector tests 2020-09-28 18:53:38 +01:00
Heejin Ahn
88d1852eae [WebAssembly] Use wasm::Signature for in ObjectWriter (NFC)
There are two `WasmSignature` structs, one in
include/llvm/BinaryFormat/Wasm.h and the other in
lib/MC/WasmObjectWriter.cpp. I don't know why they got separated in this
way in the first place, but it seems we can unify them to use the one in
Wasm.h for all cases.

Reviewed By: dschuff, sbc100

Differential Revision: https://reviews.llvm.org/D88428
2020-09-28 10:46:55 -07:00
Jessica Paquette
2dcd508332 [AArch64][GlobalISel] Infer whether G_PHI is going to be a FPR in regbankselect
Some instructions (G_LOAD, G_SELECT, G_UNMERGE_VALUES) check if their uses
will define/use FPRs (using `onlyUsesFP` and `onlyDefinesFP`).

The register bank of a use isn't necessarily known when an instruction asks for
this.

Teach `hasFPConstraints` to look at the instructions feeding into a G_PHI when
its destination bank is unknown. If any of them are FPR, assume the entire
G_PHI will also be assigned a FPR.

Since a phi can have many inputs, and those inputs can in turn be phis,
restrict the search depth to a very low number.
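
The depth-limited search described above can be sketched as follows. This is only an illustration and not the regbankselect code: the `Inst` struct, `phiLikelyFP`, and the limit of 2 are hypothetical stand-ins chosen for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a machine instruction.
struct Inst {
  bool IsPhi = false;               // true for a G_PHI-like node
  bool DefinesFP = false;           // true if the result lives in an FPR
  std::vector<const Inst *> Inputs; // incoming values (only used for phis)
};

// Assumed, arbitrary limit; the commit only says "a very low number".
constexpr unsigned MaxSearchDepth = 2;

// Returns true if any value feeding the phi, within the depth limit,
// is known to define an FPR. Beyond the limit we give up and say no.
inline bool phiLikelyFP(const Inst &I, unsigned Depth = 0) {
  if (Depth > MaxSearchDepth)
    return false;
  if (!I.IsPhi)
    return I.DefinesFP;
  for (const Inst *In : I.Inputs)
    if (phiLikelyFP(*In, Depth + 1))
      return true;
  return false;
}

// Small usage demo: a phi fed by an integer add and an FP add.
inline void demoPhiSearch() {
  Inst Add, FAdd, Phi;
  FAdd.DefinesFP = true;
  Phi.IsPhi = true;
  Phi.Inputs = {&Add, &FAdd};
  assert(phiLikelyFP(Phi));
}
```

Because phis can feed each other in cycles, the depth limit also guarantees that the recursion terminates.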

Also improve the docs for `hasFPConstraints` and friends a little.

This is a 0.3% code size improvement on CTMark/Bullet at -O3, and a 0.2% code
size improvement on CTMark/pairlocalalign at -O3.

Differential Revision: https://reviews.llvm.org/D88177
2020-09-28 10:37:09 -07:00
Sanjay Patel
1d7afc6eeb [CostModel] move early exit for free intrinsics
This should be NFC unless some target was expecting that
some form of cttz/ctlz/memcpy is free in terms of size/latency
but not free in throughput cost.
2020-09-28 13:30:55 -04:00
Sanjay Patel
2915eafc46 [CostModel] split handling of intrinsics from other calls
This should be close to NFC (no-functional-change), but I
can't completely rule out that some call on some target
travels down a different path. There's an especially large
amount of code spaghetti in this part of the cost model.

The goal is to clean up the intrinsic cost handling so
we can canonicalize to the new min/max intrinsics without
causing regressions.
2020-09-28 13:30:55 -04:00
Jessica Paquette
7ee5835082 [AArch64][GlobalISel] Support shifted register form in emitTST
Support emitting ANDSXrs and ANDSWrs in `emitTST`. Update opt-fold-compare.mir
to show that it works.

Differential Revision: https://reviews.llvm.org/D87530
2020-09-28 10:13:47 -07:00
Jessica Paquette
7a97485533 [GlobalISel] Combine (xor (and x, y), y) -> (and (not x), y)
When we see this:

```
%and = G_AND %x, %y
%xor = G_XOR %and, %y
```

Produce this:

```
%not = G_XOR %x, -1
%new_and = G_AND %not, %y
```

as long as we are guaranteed to eliminate the original G_AND.

Also matches all commuted forms. E.g.

```
%and = G_AND %y, %x
%xor = G_XOR %y, %and
```

will be matched as well.
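
As a quick sanity check of the identity behind this combine, the two forms can be compared exhaustively on 8-bit values. This is a standalone sketch on plain integers, not GlobalISel code; the function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Original pattern: %and = G_AND %x, %y ; %xor = G_XOR %and, %y
constexpr uint8_t xorOfAnd(uint8_t X, uint8_t Y) {
  return static_cast<uint8_t>((X & Y) ^ Y);
}

// Rewritten pattern: %not = G_XOR %x, -1 ; %new_and = G_AND %not, %y
constexpr uint8_t andOfNot(uint8_t X, uint8_t Y) {
  return static_cast<uint8_t>((X ^ 0xFF) & Y);
}

// Compare both forms over every pair of 8-bit inputs.
inline bool formsAgreeOn8Bit() {
  for (unsigned X = 0; X <= 0xFF; ++X)
    for (unsigned Y = 0; Y <= 0xFF; ++Y)
      if (xorOfAnd(static_cast<uint8_t>(X), static_cast<uint8_t>(Y)) !=
          andOfNot(static_cast<uint8_t>(X), static_cast<uint8_t>(Y)))
        return false;
  return true;
}

// Small usage demo.
inline void demoXorAndCombine() {
  assert(formsAgreeOn8Bit());
}
```

Per bit, the reasoning is: where `y` is 0 both forms yield 0, and where `y` is 1 both forms yield the complement of `x`.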

Differential Revision: https://reviews.llvm.org/D88104
2020-09-28 10:08:14 -07:00
Simon Pilgrim
135593d1be [InstCombine] Add basic trunc(shr(trunc(x),c)) tests
Helps improve the minor regressions noticed on D88316
2020-09-28 18:00:28 +01:00
Jon Roelofs
95dada364f [AArch64] Reuse map iterator instead of double lookup. NFC 2020-09-28 09:47:00 -07:00
Mikhail Maltsev
48709cc3db [unittests] Preserve LD_LIBRARY_PATH in crash recovery test
We need to preserve the LD_LIBRARY_PATH environment variable when
spawning a child process (certain setups rely on non-standard paths
for e.g. libstdc++). In order to achieve this, set
LLVM_CRC_UNIXCRCRETURNCODE in the parent process instead of creating
the child's environment from scratch.

Reviewed By: aganea

Differential Revision: https://reviews.llvm.org/D88308
2020-09-28 17:46:03 +01:00
Jay Foad
f4f70547af [AMDGPU] Reformat AMDGPUTargetLowering::isSDNodeAlwaysUniform. NFC. 2020-09-28 16:24:16 +01:00
Sam Parker
5a6cdf9c3b [ARM][LowOverheadLoops] Cleanup and re-arrange
Rename and reorganise how we decide where to put the LoopStart
instruction.
2020-09-28 16:06:30 +01:00
Tres Popp
352980fbe9 [llvm] Fix unused variable in non-debug configurations 2020-09-28 17:04:08 +02:00
Meera Nakrani
54f9731add [ARM] Added more patterns to generate SSAT/USAT with shift
Added patterns to generate an SSAT or USAT with shift for
SSAT/USAT instructions that are matched from IR patterns.

Differential Revision: https://reviews.llvm.org/D88145
2020-09-28 14:50:19 +00:00
Cameron McInally
a43296f995 [SVE] Lower fixed length VECREDUCE_[UMAX|UMIN] to Scalable
Essentially the same as the signed variants from D88259. Also includes a clean up of the lowering function.

Differential Revision: https://reviews.llvm.org/D88317
2020-09-28 09:29:00 -05:00
Juneyoung Lee
31a4179ce0 [ValueTracking] Fix analyses to update CxtI to be phi's incoming edges' terminators
It was mentioned in D88276 that when a phi node is visited, the terminators at its incoming edges should be used for CtxI.
This patch makes two functions (ComputeNumSignBitsImpl, isGuaranteedNotToBeUndefOrPoison) do so.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D88360
2020-09-28 23:24:20 +09:00
Paul C. Anagnostopoulos
90d4bf8784 [TableGen] Improved messages in PseudoLoweringEmitter. 2020-09-28 10:18:22 -04:00
Simon Pilgrim
a9b5f69419 [InstCombine] matchRotate - force splat of uniform constant rotation amounts (PR46895)
Fixes minor bug in D88402 where we were using the original shift constant (with undefs) instead of one with the splat values (re)splatted to all elements.
2020-09-28 15:12:41 +01:00
Sam Parker
db9206f64f [NFC][ARM] Factor out some logic for LoLoops.
Create a DCE function that accepts an instruction.
2020-09-28 14:51:52 +01:00
Jay Foad
f7c8caa309 [AMDGPU] Reformat SITargetLowering::isSDNodeSourceOfDivergence. NFC. 2020-09-28 14:42:05 +01:00
Georgii Rymar
c050c97e5e [llvm-readobj/elf] - Fix the PREL31 relocation computation used for dumping arm32 unwind info (-u).
This is a part of https://bugs.llvm.org/show_bug.cgi?id=47581.

We have the following computation:
```
(1) uint64_t Location = Address & 0x7fffffff;
(2) if (Location & 0x04000000)
(3)   Location |= (uint64_t) ~0x7fffffff;
(4) return Location + Place;
```

Line 2 contains a typo. The constant should be `0x40000000`,
not `0x04000000`, because the intention here is to sign-extend `Location`,
which is a 31-bit signed value.
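
To make the effect of the corrected constant concrete, here is a minimal standalone sketch of the computation. It is not the llvm-readobj source; `decodePREL31` is a name assumed for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low 31 bits of Address and add them to Place.
// Bit 30 (mask 0x40000000) is the sign bit of a 31-bit value.
constexpr uint64_t decodePREL31(uint32_t Address, uint64_t Place) {
  uint64_t Location = Address & 0x7fffffffu;
  if (Location & 0x40000000u)
    Location |= ~uint64_t{0x7fffffff}; // set bits 31..63
  return Location + Place;
}

// A negative offset (-4) relative to Place 0x1000.
static_assert(decodePREL31(0x7ffffffcu, 0x1000) == 0x0ffc, "negative offset");
// A positive value with bit 26 set must not be sign-extended.
static_assert(decodePREL31(0x04000000u, 0) == 0x04000000, "positive offset");

// Small usage demo: bit 31 of the input word is ignored.
inline void demoPREL31() {
  assert(decodePREL31(0x80000010u, 0x1000) == 0x1010);
}
```

With the mistyped mask `0x04000000`, the second case would be treated as negative, and negative offsets with bit 26 clear would stay unextended.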

Differential revision: https://reviews.llvm.org/D88407
2020-09-28 16:22:56 +03:00
Sjoerd Meijer
f26e5a332c [ARM][MVE] Enable tail-predication by default
We have been running tests/benchmarks downstream with tail-predication enabled
for some time now and this behaves as expected: we are not aware of any
correctness issues, and this performs better across the board than with
tail-predication disabled. Time to flip the switch!

Differential Revision: https://reviews.llvm.org/D88093
2020-09-28 14:01:23 +01:00
Simon Pilgrim
a75755b2d9 [InstCombine] matchRotate - allow undef in uniform constant rotation amounts (PR46895)
As an extension to D87452, we can safely permit undefs in the uniform/splat detection.

https://alive2.llvm.org/ce/z/nT-ptN

Differential Revision: https://reviews.llvm.org/D88402
2020-09-28 13:36:13 +01:00
Florian Hahn
08b000a04c [SCEV] Also use info from assumes in applyLoopGuards.
Similar to collecting information from branches guarding a loop, we can
also collect information from assumes dominating the loop header.

Fixes PR47247.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D87854
2020-09-28 13:14:24 +01:00
Daniel Kiss
85d6c62b2c [AArch64] Generate .note.gnu.property based on module flags.
The module flags are derived exclusively from the compiler flag `-mbranch-protection`.
The note is generated based on the module flags accordingly.
After this change, a compile unit without functions won't have
the .note.gnu.property if the compiler flag is not present [1].

[1] https://bugs.llvm.org/show_bug.cgi?id=46480

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D80791
2020-09-28 14:14:04 +02:00
Simon Pilgrim
1b71f5f90f [X86] Flip isShuffleEquivalent argument order to match isTargetShuffleEquivalent
A while ago, we converted isShuffleEquivalent/isTargetShuffleEquivalent to both use IsElementEquivalent internally.

This allows us to make the shuffle args optional like isTargetShuffleEquivalent and update foldShuffleOfHorizOp to use isShuffleEquivalent (which it should, as it's using an ISD::VECTOR_SHUFFLE mask).
2020-09-28 12:53:56 +01:00
Simon Pilgrim
b93147ffac [X86] Simplify broadcast mask detection with isUndefOrEqual helper.
Add an additional isUndefOrEqual variant that matches an entire mask, not just a single value.
2020-09-28 12:53:56 +01:00
LLVM GN Syncbot
9643d50ff5 [gn build] Port 018066d9475 2020-09-28 11:38:04 +00:00
Qiu Chaofan
a3362da3ea [PowerPC] Clean-up mayRaiseFPException bits
According to the POWER ISA, floating point instructions altering exception
bits in FPSCR should be 'may raise FP exception' (excluding those that
read or write the whole FPSCR directly, like mffs/mtfsf). We need to
model FPSCR well in future patches to handle the special case properly.

Instructions added mayRaiseFPException:
- fre(s)/frsqrte(s)
- fmadd(s)/fmsub(s)/fnmadd(s)/fnmsub(s)
- xscmpoqp/xscmpuqp/xscmpeqdp/xscmpgedp/xscmpgtdp
- xscvdphp/xscvhpdp/xvcvhpsp/xvcvsphp/xsrqpxp
- xsmaxcdp/xsmincdp/xsmaxjdp/xsminjdp

Instructions removed mayRaiseFPException:
- xstdivdp/xvtdiv(d|s)p/xstsqrtdp/xvtsqrt(d|s)p
- xsabsdp/xsnabsdp/xvabs(d|s)p/xvnabs(d|s)p
- xsnegdp/xscpsgndp/xvneg(d|s)p/xvcpsgn(d|s)p
- xvcvsxwdp/xvcvuxwdp
- xscvdpspn/xscvspdpn

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D87738
2020-09-28 18:22:12 +08:00
Jay Foad
3f3da716fa [AMDGPU] Add bfi immediate pattern
Differential Revision: https://reviews.llvm.org/D88246
2020-09-28 10:16:51 +01:00
Jay Foad
905b53ab6b [AMDGPU] Make bfi patterns divergence-aware
This tends to increase code size but more importantly it reduces vgpr
usage, and could avoid costly readfirstlanes if the result needs to be
in an sgpr.

Differential Revision: https://reviews.llvm.org/D88245
2020-09-28 10:16:51 +01:00
Jay Foad
343d947d8d [AMDGPU] Split R600 and GCN bfi patterns
This is in preparation for making the GCN patterns divergence-aware.
NFC.

Differential Revision: https://reviews.llvm.org/D88244
2020-09-28 10:16:51 +01:00
Simon Pilgrim
4d69aec2be [InstCombine] Add tests for vector rotate by constants with undefs. 2020-09-28 09:55:43 +01:00
Georgii Rymar
b76a0e7b80 [yaml2obj][obj2yaml] - Add a support for SHT_ARM_EXIDX section.
This adds support for SHT_ARM_EXIDX sections to the obj2yaml/yaml2obj tools.

SHT_ARM_EXIDX is an ARM-specific index table filled with entries.
Each entry consists of two 4-byte values (words).
(https://developer.arm.com/documentation/ihi0038/c/?lang=en#index-table-entries)

Differential revision: https://reviews.llvm.org/D88228
2020-09-28 11:45:49 +03:00
Georgii Rymar
e88d56b496 [obj2yaml][yaml2obj] - Stop recognizing SHT_MIPS_ABIFLAGS on non-MIPS targets.
Currently we are always recognizing the `SHT_MIPS_ABIFLAGS` section,
even on non-MIPS targets.

The problem with doing this is briefly discussed in D88228, which does the same for `SHT_ARM_EXIDX`:

"The problem is that `SHT_ARM_EXIDX` shares the value with `SHT_X86_64_UNWIND (0x70000001U)`.
We might have other machine specific conflicts, e.g.
`SHT_ARM_ATTRIBUTES` vs `SHT_MSP430_ATTRIBUTES` vs `SHT_RISCV_ATTRIBUTES (0x70000003U)`."

I think we should only recognize target specific sections when the machine type
matches. I.e. `SHT_MIPS_*` should be recognized only on `MIPS`, `SHT_ARM_*`
only on `ARM` etc.

This patch stops recognizing `SHT_MIPS_ABIFLAGS` on `non-MIPS` targets.
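
The rule proposed above (interpret a target-specific section type only when the machine type matches) can be sketched as follows. This is an illustration, not the obj2yaml/yaml2obj implementation: the `Machine` enum, the constant names, and `sectionTypeName` are hypothetical, while the numeric values are the standard ELF ones.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// ELF e_machine values (EM_MIPS, EM_ARM, EM_X86_64).
enum class Machine : uint16_t { Mips = 8, Arm = 40, X86_64 = 62 };

// Section types in the processor-specific range; note the shared value.
constexpr uint32_t ShtArmExidx = 0x70000001;     // SHT_ARM_EXIDX
constexpr uint32_t ShtX86_64Unwind = 0x70000001; // SHT_X86_64_UNWIND
constexpr uint32_t ShtMipsAbiflags = 0x7000002a; // SHT_MIPS_ABIFLAGS

// Name a section type, consulting only the table for the given machine.
inline std::string sectionTypeName(Machine M, uint32_t Type) {
  switch (M) {
  case Machine::Arm:
    if (Type == ShtArmExidx)
      return "SHT_ARM_EXIDX";
    break;
  case Machine::X86_64:
    if (Type == ShtX86_64Unwind)
      return "SHT_X86_64_UNWIND";
    break;
  case Machine::Mips:
    if (Type == ShtMipsAbiflags)
      return "SHT_MIPS_ABIFLAGS";
    break;
  }
  return "unknown";
}

// Small usage demo: the same value means different things per machine.
inline void demoSectionTypes() {
  assert(sectionTypeName(Machine::Arm, 0x70000001) == "SHT_ARM_EXIDX");
  assert(sectionTypeName(Machine::X86_64, 0x70000001) == "SHT_X86_64_UNWIND");
}
```

Keying the lookup on the machine first is what keeps `0x70000001` unambiguous and keeps `SHT_MIPS_ABIFLAGS` from being reported for an ARM or x86-64 object.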

Note: I had to update `ScalarEnumerationTraits<ELFYAML::MIPS_ISA>::enumeration`, because
otherwise the test crashes, calling `llvm_unreachable`.

Differential revision: https://reviews.llvm.org/D88294
2020-09-28 11:28:53 +03:00
Benjamin Kramer
d83ca05fea [Coroutines] Remove unused includes. NFC. 2020-09-28 10:27:23 +02:00