1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00
Commit Graph

204297 Commits

Author SHA1 Message Date
Sanjay Patel
068a0e3768 [CostModel] remove hack for intrinsic cost based on cost type
This hack seems to only have been necessary because of the
constructor bug noted in 33125cffd.

Once again, it's hard to prove NFC, but that's the hope...
2020-09-28 15:58:42 -04:00
Baptiste Saleil
3dca9af6d2 [PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types
This patch legalizes the v256i1 and v512i1 types that will be used for MMA.

It implements loads and stores of these types.
v256i1 is a pair of VSX registers, so for this type, we load/store the two
underlying registers. v512i1 is used for MMA accumulators. So in addition to
loading and storing the 4 associated VSX registers, we generate instructions to
prime (copy the VSX registers to the accumulator) after loading and unprime
(copy the accumulator back to the VSX registers) before storing.

This patch also adds the UACC register class that is necessary to implement the
loads and stores. This class represents accumulator in their unprimed form and
allow the distinction between primed and unprimed accumulators to avoid invalid
copies of the VSX registers associated with primed accumulators.

Differential Revision: https://reviews.llvm.org/D84968
2020-09-28 14:39:37 -05:00
Sanjay Patel
1fffe5761d [CostModel] fill in arguments as part of intrinsic attribute constructor
This appears to be an error of code duplication - instead of
one constructor variant calling another, we have N similar
but not identical versions.

I think this is 'NFC' based on the current callers, but it's
hard to tell or guess the intent in all cases.
2020-09-28 15:27:45 -04:00
Jon Roelofs
9b0c93e6aa [AArch64] reuse another map iterator. NFC 2020-09-28 11:30:21 -07:00
Amara Emerson
188cec631b Revert "[AArch64][GlobalISel] Add selection support for <8 x s16> G_INSERT_VECTOR_ELT with GPR scalar."
This reverts commit b5e87c9ef2243ecd65e0ef87a1bf303c0c26db04 as it seems to have
broken a bot.
2020-09-28 11:25:19 -07:00
Dominic Chen
12ebbafac0 [AddressSanitizer] Copy type metadata to prevent miscompilation
When ASan and e.g. Dead Virtual Function Elimination are enabled, the
latter will rely on type metadata to determine if certain virtual calls can be
removed. However, ASan currently does not copy type metadata, which can cause
virtual function calls to be incorrectly removed.

Differential Revision: https://reviews.llvm.org/D88368
2020-09-28 13:56:05 -04:00
Simon Pilgrim
60566b9591 [InstCombine] Add trunc(shr(trunc(x),c)) non-uniform vector tests 2020-09-28 18:53:38 +01:00
Heejin Ahn
88d1852eae [WebAssembly] Use wasm::Signature for in ObjectWriter (NFC)
There are two `WasmSignature` structs, one in
include/llvm/BinaryFormat/Wasm.h and the other in
lib/MC/WasmObjectWriter.cpp. I don't know why they got separated in this
way in the first place, but it seems we can unify them to use the one in
Wasm.h for all cases.

Reviewed By: dschuff, sbc100

Differential Revision: https://reviews.llvm.org/D88428
2020-09-28 10:46:55 -07:00
Jessica Paquette
2dcd508332 [AArch64][GlobalISel] Infer whether G_PHI is going to be a FPR in regbankselect
Some instructions (G_LOAD, G_SELECT, G_UNMERGE_VALUES) check if their uses
will define/use FPRs (using `onlyUsesFP` and `onlyDefinesFP`).

The register bank of a use isn't necessarily known when an instruction asks for
this.

Teach `hasFPConstraints` to look at the instructions feeding into a G_PHI when
its destination bank is unknown. If any of them are FPR, assume the entire
G_PHI will also be assigned a FPR.

Since a phi can have many inputs, and those inputs can in turn be phis,
restrict the search depth to a very low number.

Also improve the docs for `hasFPConstraints` and friends a little.

This is a 0.3% code size improvement on CTMark/Bullet at -O3, and a 0.2% code
size improvement at CTMark/pairlocalalign at -O3.

Differential Revision: https://reviews.llvm.org/D88177
2020-09-28 10:37:09 -07:00
Sanjay Patel
1d7afc6eeb [CostModel] move early exit for free intrinsics
This should be NFC unless some target was expecting that
some form of cttz/ctlz/memcpy is free in terms of size/latency
but not free in throughput cost.
2020-09-28 13:30:55 -04:00
Sanjay Patel
2915eafc46 [CostModel] split handling of intrinsics from other calls
This should be close to NFC (no-functional-change), but I
can't completely rule out that some call on some target
travels down a different path. There's an especially large
amount of code spaghetti in this part of the cost model.

The goal is to clean up the intrinsic cost handling so
we can canonicalize to the new min/max intrinsics without
causing regressions.
2020-09-28 13:30:55 -04:00
Jessica Paquette
7ee5835082 [AArch64][GlobalISel] Support shifted register form in emitTST
Support emitting ANDSXrs and ANDSWrs in `emitTST`. Update opt-fold-compare.mir
to show that it works.

Differential Revision: https://reviews.llvm.org/D87530
2020-09-28 10:13:47 -07:00
Jessica Paquette
7a97485533 [GlobalISel] Combine (xor (and x, y), y) -> (and (not x), y)
When we see this:

```
%and = G_AND %x, %y
%xor = G_XOR %and, %y
```

Produce this:

```
%not = G_XOR %x, -1
%new_and = G_AND %not, %y
```

as long as we are guaranteed to eliminate the original G_AND.

Also matches all commuted forms. E.g.

```
%and = G_AND %y, %x
%xor = G_XOR %y, %and
```

will be matched as well.

Differential Revision: https://reviews.llvm.org/D88104
2020-09-28 10:08:14 -07:00
Simon Pilgrim
135593d1be [InstCombine] Add basic trunc(shr(trunc(x),c)) tests
Helps improve the minor regressions noticed on D88316
2020-09-28 18:00:28 +01:00
Jon Roelofs
95dada364f [AArch64] Reuse map iterator instead of double lookup. NFC 2020-09-28 09:47:00 -07:00
Mikhail Maltsev
48709cc3db [unittests] Preserve LD_LIBRARY_PATH in crash recovery test
We need to preserve the LD_LIBRARY_PATH environment variable when
spawning a child process (certain setups rely on non-standard paths
for e.g. libstdc++). In order to achieve this, set
LLVM_CRC_UNIXCRCRETURNCODE in the parent process instead of creating
the child's environment from scratch.

Reviewed By: aganea

Differential Revision: https://reviews.llvm.org/D88308
2020-09-28 17:46:03 +01:00
Jay Foad
f4f70547af [AMDGPU] Reformat AMDGPUTargetLowering::isSDNodeAlwaysUniform. NFC. 2020-09-28 16:24:16 +01:00
Sam Parker
5a6cdf9c3b [ARM][LowOverheadLoops] Cleanup and re-arrange
Rename and reorganise how we decide where to put the LoopStart
instruction.
2020-09-28 16:06:30 +01:00
Tres Popp
352980fbe9 [llvm] Fix unused variable in non-debug configurations 2020-09-28 17:04:08 +02:00
Meera Nakrani
54f9731add [ARM] Added more patterns to generate SSAT/USAT with shift
Added patterns to generate an SSAT or USAT with shift for
SSAT/USAT instructions that are matched from IR patterns.

Differential Revision: https://reviews.llvm.org/D88145
2020-09-28 14:50:19 +00:00
Cameron McInally
a43296f995 [SVE] Lower fixed length VECREDUCE_[UMAX|UMIN] to Scalable
Essentially the same as the signed variants from D88259. Also includes a clean up of the lowering function.

Differential Revision: https://reviews.llvm.org/D88317
2020-09-28 09:29:00 -05:00
Juneyoung Lee
31a4179ce0 [ValueTracking] Fix analyses to update CxtI to be phi's incoming edges' terminators
It was mentioned that D88276 that when a phi node is visited, terminators at their incoming edges should be used for CtxI.
This is a patch that makes two functions (ComputeNumSignBitsImpl, isGuaranteedNotToBeUndefOrPoison) to do so.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D88360
2020-09-28 23:24:20 +09:00
Paul C. Anagnostopoulos
90d4bf8784 [TableGen] Improved messages in PseudoLoweringEmitter. 2020-09-28 10:18:22 -04:00
Simon Pilgrim
a9b5f69419 [InstCombine] matchRotate - force splat of uniform constant rotation amounts (PR46895)
Fixes minor bug in D88402 where we were using the original shift constant (with undefs) instead of one with the splat values (re)splatted to all elements.
2020-09-28 15:12:41 +01:00
Sam Parker
db9206f64f [NFC][ARM] Factor out some logic for LoLoops.
Create a DCE function that accepts an instruction.
2020-09-28 14:51:52 +01:00
Jay Foad
f7c8caa309 [AMDGPU] Reformat SITargetLowering::isSDNodeSourceOfDivergence. NFC. 2020-09-28 14:42:05 +01:00
Georgii Rymar
c050c97e5e [llvm-readobj/elf] - Fix the PREL31 relocation computation used for dumping arm32 unwind info (-u).
This is a part of https://bugs.llvm.org/show_bug.cgi?id=47581.

We have the following computation:
```
(1) uint64_t Location = Address & 0x7fffffff;
(2) if (Location & 0x04000000)
(3)   Location |= (uint64_t) ~0x7fffffff;
(4) return Location + Place;
```

At line 2 there is a mistype. The constant should be `0x40000000`,
not `0x04000000`, because the intention here is to sign extend the `Location`,
which is the 31 bit signed value.

Differential revision: https://reviews.llvm.org/D88407
2020-09-28 16:22:56 +03:00
Sjoerd Meijer
f26e5a332c [ARM][MVE] Enable tail-predication by default
We have been running tests/benchmarks downstream with tail-predication enabled
for some time now and this behaves as expected: we are not aware of any
correctness issues, and this performs better across the board than with
tail-predication disabled. Time to flip the switch!

Differential Revision: https://reviews.llvm.org/D88093
2020-09-28 14:01:23 +01:00
Simon Pilgrim
a75755b2d9 [InstCombine] matchRotate - allow undef in uniform constant rotation amounts (PR46895)
An extension to D87452, we can safely permit undefs in the uniform/splat detection

https://alive2.llvm.org/ce/z/nT-ptN

Differential Revision: https://reviews.llvm.org/D88402
2020-09-28 13:36:13 +01:00
Florian Hahn
08b000a04c [SCEV] Also use info from assumes in applyLoopGuards.
Similar to collecting information from branches guarding a loop, we can
also collect information from assumes dominating the loop header.

Fixes PR47247.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D87854
2020-09-28 13:14:24 +01:00
Daniel Kiss
85d6c62b2c [AArch64] Generate .note.gnu.property based on module flags.
Flags of the module derived exclusively from the compiler flag `-mbranch-protection`.
The note is generated based on the module flags accordingly.
After this change in case of compile unit without function won't have
the .note.gnu.property if the compiler flag is not present [1].

[1] https://bugs.llvm.org/show_bug.cgi?id=46480

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D80791
2020-09-28 14:14:04 +02:00
Simon Pilgrim
1b71f5f90f [X86] Flip isShuffleEquivalent argument order to match isTargetShuffleEquivalent
A while ago, we converted isShuffleEquivalent/isTargetShuffleEquivalent to both use IsElementEquivalent internally.

This allows us to make the shuffle args optional like isTargetShuffleEquivalent and update foldShuffleOfHorizOp to use isShuffleEquivalent (which it should as its using a ISD::VECTOR_SHUFFLE mask).
2020-09-28 12:53:56 +01:00
Simon Pilgrim
b93147ffac [X86] Simplify broadcast mask detection with isUndefOrEqual helper.
Add an additional isUndefOrEqual variant that matches an entire mask, not just a single value.
2020-09-28 12:53:56 +01:00
LLVM GN Syncbot
9643d50ff5 [gn build] Port 018066d9475 2020-09-28 11:38:04 +00:00
Qiu Chaofan
a3362da3ea [PowerPC] Clean-up mayRaiseFPException bits
According to POWER ISA, floating point instructions altering exception
bits in FPSCR should be 'may raise FP exception'. (excluding those
read or write the whole FPSCR directly, like mffs/mtfsf) We need to
model FPSCR well in future patches to handle the special case properly.

Instructions added mayRaiseFPException:
- fre(s)/frsqrte(s)
- fmadd(s)/fmsub(s)/fnmadd(s)/fnmsub(s)
- xscmpoqp/xscmpuqp/xscmpeqdp/xscmpgedp/xscmpgtdp
- xscvdphp/xscvhpdp/xvcvhpsp/xvcvsphp/xsrqpxp
- xsmaxcdp/xsincdp/xsmaxjdp/xsminjdp

Instructions removed mayRaiseFPException:
- xstdivdp/xvtdiv(d|s)p/xstsqrtdp/xvtsqrt(d|s)p
- xsabsdp/xsnabsdp/xvabs(d|s)p/xvnabs(d|s)p
- xsnegdp/xscpsgndp/xvneg(d|s)p/xvcpsgn(d|s)p
- xvcvsxwdp/xvcvuxwdp
- xscvdpspn/xscvspdpn

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D87738
2020-09-28 18:22:12 +08:00
Jay Foad
3f3da716fa [AMDGPU] Add bfi immediate pattern
Differential Revision: https://reviews.llvm.org/D88246
2020-09-28 10:16:51 +01:00
Jay Foad
905b53ab6b [AMDGPU] Make bfi patterns divergence-aware
This tends to increase code size but more importantly it reduces vgpr
usage, and could avoid costly readfirstlanes if the result needs to be
in an sgpr.

Differential Revision: https://reviews.llvm.org/D88245
2020-09-28 10:16:51 +01:00
Jay Foad
343d947d8d [AMDGPU] Split R600 and GCN bfi patterns
This is in preparation for making the GCN patterns divergence-aware.
NFC.

Differential Revision: https://reviews.llvm.org/D88244
2020-09-28 10:16:51 +01:00
Simon Pilgrim
4d69aec2be [InstCombine] Add tests for vector rotate by constants with undefs. 2020-09-28 09:55:43 +01:00
Georgii Rymar
b76a0e7b80 [yaml2obj][obj2yaml] - Add a support for SHT_ARM_EXIDX section.
This adds the support for SHT_ARM_EXIDX sections to obj2yaml/yaml2obj tools.

SHT_ARM_EXIDX is a ARM specific index table filled with entries.
Each entry consists of two 4-bytes values (words).
(https://developer.arm.com/documentation/ihi0038/c/?lang=en#index-table-entries)

Differential revision: https://reviews.llvm.org/D88228
2020-09-28 11:45:49 +03:00
Georgii Rymar
e88d56b496 [obj2yaml][yaml2obj] - Stop recognizing SHT_MIPS_ABIFLAGS on non-MIPS targets.
Currently we are always recognizing the `SHT_MIPS_ABIFLAGS` section,
even on non-MIPS targets.

The problem of doing this is briefly discussed in D88228 which does the same for `SHT_ARM_EXIDX`:

"The problem is that `SHT_ARM_EXIDX` shares the value with `SHT_X86_64_UNWIND (0x70000001U)`.
We might have other machine specific conflicts, e.g.
`SHT_ARM_ATTRIBUTES` vs `SHT_MSP430_ATTRIBUTES` vs `SHT_RISCV_ATTRIBUTES (0x70000003U)`."

I think we should only recognize target specific sections when the machine type
matches. I.e. `SHT_MIPS_*` should be recognized only on `MIPS`, `SHT_ARM_*`
only on `ARM` etc.

This patch stops recognizing `SHT_MIPS_ABIFLAGS` on `non-MIPS` targets.

Note: I had to update `ScalarEnumerationTraits<ELFYAML::MIPS_ISA>::enumeration`, because
otherwise test crashes, calling `llvm_unreachable`.

Differential revision: https://reviews.llvm.org/D88294
2020-09-28 11:28:53 +03:00
Benjamin Kramer
d83ca05fea [Coroutines] Remove unused includes. NFC. 2020-09-28 10:27:23 +02:00
Sjoerd Meijer
72cf3e7d38 [ARM][MVE] tail-predication: overflow checks for elementcount, cont'd
This is a reimplementation of the overflow checks for the elementcount,
i.e. the 2nd argument of intrinsic get.active.lane.mask. The element
count is lowered in each iteration of the tail-predicated loop, and
we must prove that this expression doesn't overflow.

Many thanks to Eli Friedman and Sam Parker for all their help with
this work.

Differential Revision: https://reviews.llvm.org/D88086
2020-09-28 09:20:51 +01:00
David Green
82ad955cee [ARM] Expand cannotInsertWDLSTPBetween to the last instruction
9d9a11c7be037 added this check for predicatable instructions between the
D/WLSTP and the loop's start, but it was missing the last instruction in
the block. Change it to use some iterators instead.

Differential Revision: https://reviews.llvm.org/D88354
2020-09-28 09:14:40 +01:00
Chuanqi Xu
5802e5931e [Coroutines] Reuse storage for local variables with non-overlapping lifetimes
bug 45566 shows the process of building coroutine frame won't consider
that the lifetimes of different local variables are not overlapped,
which means the compiler could generates smaller frame.

This patch calculate the lifetime range of each alloca by StackLifetime
class. Then the patch build non-overlapped sets for allocas whose
lifetime ranges are not overlapped. We use the largest type in a
non-overlapped set as the field type in the frame. In insertSpills
process, if we find the type of field is not the same with the alloca,
we cast the pointer to the field type to the pointer to the alloca type.
Since the lifetime range of alloca in one non-overlapped set is not
overlapped with each other, it should be ok to reuse the storage space
in the frame.

Test plan: check-llvm, check-clang, cppcoro, folly

Reviewers: junparser, lxfind, modocache

Differential Revision: https://reviews.llvm.org/D87596
2020-09-28 15:48:00 +08:00
David Sherwood
0927cfa9f6 [SVE] Replace / operator in TypeSize/ElementCount with divideCoefficientBy
After some recent upstream discussion we decided that it was best
to avoid having the / operator for both ElementCount and TypeSize,
since this could give the impression that these classes can be used
in the same way as basic integer integer types. However, division
for scalable types is a bit odd because we are only dividing the
minimum quantity by a value, as opposed to something like:

  (MinSize * Vscale) / SomeValue

This is why when performing division it's important the caller
first establishes whether the operation makes sense, perhaps by
calling isKnownMultipleOf() prior to division. The caller must now
explictly call divideCoefficientBy() on the class to perform the
operation.

Differential Revision: https://reviews.llvm.org/D87700
2020-09-28 08:03:00 +01:00
Kai Luo
e87c3b7d7a [PowerPC] Add tests for select patterns. NFC. 2020-09-28 06:11:40 +00:00
Arthur Eubanks
546a7d793e Revert "Reland [CodeGen] emit CG profile for COFF object file"
This reverts commit 506b6170cb513f1cb6e93a3b690c758f9ded18ac.

This still causes link errors, see https://crbug.com/1130780.
2020-09-27 22:43:14 -07:00
Max Kazantsev
6281760e76 [Test] Add tests where we can replace condition with invariants 2020-09-28 12:04:20 +07:00
Dávid Bolvanský
1c26e35afd [BuildLibCalls] Add noalias for strcat and stpcpy
strcat:
destination and source shall not overlap. (http://www.cplusplus.com/reference/cstring/strcat/)

stpcpy:
The strings may not overlap, and the destination string dest must be  large enough to receive the copy. (https://man7.org/linux/man-pages/man3/stpcpy.3.html)

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D88335
2020-09-27 21:37:09 +02:00