1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

6434 Commits

Author SHA1 Message Date
Craig Topper
b58d6a4b20 [TargetLowering] Use Align in allowsMisalignedMemoryAccesses.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96097
2021-02-04 19:22:06 -08:00
Chen Zheng
af3a851dc4 [PowerPC] [NFC] fix wording typos
Post commit comments address for D92071.
2021-02-02 21:03:17 -05:00
Stefan Pintilie
020aa3c460 [PowerPC] Materialize 34 bit constants with pli on Power 10.
NOTE: This patch was originally written by Anil Mahmud. His code has been
rebased but otherwise left mostly unchanged.

A new instructon on Power 10 allows for the materialization of 34 bit
immediate values. This patch allows the compiler to take advantage of
the new instruction in this situation.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D92879
2021-02-02 09:49:22 -06:00
Craig Topper
92b45d2660 [PowerPC] Remove vnot_ppc and replace with the standard vnot.
immAllOnesV has special support for looking through bitcasts
automatically so isel patterns don't need to explicitly look
for the bitconvert.
2021-01-31 19:41:33 -08:00
Kazu Hirata
d4906cf2ad [llvm] Add missing header guards (NFC)
Identified with llvm-header-guard.
2021-01-30 09:53:42 -08:00
Kazu Hirata
ab8cd6c8fd [llvm] Use isa instead of dyn_cast (NFC) 2021-01-29 23:23:37 -08:00
Kazu Hirata
425d87c917 [llvm] Populate SmallVector at construction time (NFC) 2021-01-28 22:21:14 -08:00
Albion Fung
d6b6754937 [PowerPC][Power10] Fix XXSPLI32DX not correctly exploiting specific cases
Some cases may be transformed into 32 bit splats before hitting the boolean statement, which may cause incorrect behaviour and provide XXSPLTI32DX with the incorrect values of splat. The condition was reversed so that the shortcut prevents this problem.

Differential Revision: https://reviews.llvm.org/D95634
2021-01-28 15:17:32 -05:00
Nemanja Ivanovic
a8e6b38859 [PowerPC] Do not emit XXSPLTI32DX for sub 64-bit constants
If the APInt returned by BuildVectorSDNode::isConstantSplat() is narrower than
64 bits, the result produced by XXSPLTI32DX is incorrect. The result returned
by the function appears to be incorrect and we'll investigate/fix it in a
follow-up commit. However, since this causes miscompiles, we must
temporarily disable emitting this instruction for such values.
2021-01-28 04:16:48 -06:00
Nemanja Ivanovic
ba61008cc3 [PowerPC] Do not emit HW loop with half precision operations
If a loop has any operations on half precision values, there will be calls to
library functions on Power8. Even on Power9, there is a small subset of
instructions that are actually supported for the type.

This patch disables HW loops whenever any operations on the type are found
(other than the handfull of supported ones when compiling for Power9). Fixes a
few PR's opened by Julia:

https://bugs.llvm.org/show_bug.cgi?id=48785
https://bugs.llvm.org/show_bug.cgi?id=48786
https://bugs.llvm.org/show_bug.cgi?id=48519

Differential revision: https://reviews.llvm.org/D94980
2021-01-25 20:55:56 -06:00
Nemanja Ivanovic
e9b75e2e14 [PowerPC] Add missing negate for VPERMXOR on little endian subtargets
This intrinsic is supposed to have the permute control vector complemented on
little endian systems (as the ABI specifies and GCC implements). With the
current code gen, the result vector is byte-reversed.

Differential revision: https://reviews.llvm.org/D95004
2021-01-25 12:23:33 -06:00
QingShan Zhang
d5b70bbb38 [NFC] [DAGCombine] Correct the result for sqrt even the iteration is zero
For now, we correct the result for sqrt if iteration > 0. This doesn't make
sense as they are not strict relative.

Reviewed By: dmgreen, spatel, RKSimon

Differential Revision: https://reviews.llvm.org/D94480
2021-01-25 04:02:44 +00:00
Chen Zheng
db58448497 [PowerPC] support register pressure reduction in machine combiner.
Reassociating some patterns to generate more fma instructions to
reduce register pressure.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D92071
2021-01-24 21:28:21 -05:00
Kazu Hirata
0ec2908ce9 [llvm] Use pop_back_val (NFC) 2021-01-24 12:18:57 -08:00
Kazu Hirata
0bd8f7e194 [Target] Use llvm::append_range (NFC) 2021-01-24 12:18:56 -08:00
Kazu Hirata
ad24e08f1c Revert "[Target] Use llvm::append_range (NFC)"
This reverts commit cc7a23828657f35f706343982cf96bb6583d4d73.

The X86WinEHState.cpp hunk seems to break certain builds.
2021-01-23 11:25:27 -08:00
Kazu Hirata
f3e18e1dd6 [Target] Use llvm::append_range (NFC) 2021-01-23 10:56:31 -08:00
Stanislav Mekhanoshin
dd3332e2e5 Change materializeFrameBaseRegister() to return register
The only caller of this function is in the LocalStackSlotAllocation
and it creates base register of class returned by the target's
getPointerRegClass(). AMDGPU wants to use a different reg class
here so let materializeFrameBaseRegister to just create and return
whatever it wants.

Differential Revision: https://reviews.llvm.org/D95268
2021-01-22 15:51:06 -08:00
Kazu Hirata
ff389465ca [llvm] Don't include StringSwitch.h where unnecessary (NFC) 2021-01-21 19:59:48 -08:00
Qiu Chaofan
87ee2e6422 [PowerPC] Duplicate inherited heuristic from base scheduler
PowerPC has its custom scheduler heuristic. It calls parent classes'
tryCandidate in override version, but the function returns void, so this
way doesn't actually help. This patch duplicates code from base scheduler
into PPC machine scheduler class, which does what we wanted.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D94464
2021-01-22 10:11:03 +08:00
Albion Fung
c84c2e366c [PowerPC][Power10] Exploit splat instruction xxsplti32dx in Power10
Exploits the instruction xxsplti32dx.

It can be used to materialize any 64 bit scalar/vector splat by using two instances, one for the upper 32 bits and the other for the lower 32 bits. It should not materialize the cases which can be materialized by using the instruction xxspltidp.

Differential Revision: https://https://reviews.llvm.org/D90173
2021-01-20 12:55:52 -05:00
Kazu Hirata
f118370581 [llvm] Use llvm::find (NFC) 2021-01-19 20:19:14 -08:00
Victor Huang
c88264fd68 [PowerPC] Fix the check for the instruction using FRSP/XSRSP output register
When performing peephole optimization to simplify the code, after removing
passed FPSP/XSRSP instruction we will set any uses of that FRSP/XSRSP to the
source of the FRSP/XSRSP.

We are finding the machine instruction using virtual register holding FRSP/XSRSP
results by searching all following instructions and encountering an issue
that the first use of the virtual register is a debug MI causing:
1. virtual register in the debug MI removed unexpectedly.
2. virtual register used in non-debug MI not replaced with the source of
  FRSP/XSRSP. which stays in a undef status.

This patch fix the issue by only searching non-debug machine instruction using
virtual register holding FRSP/XSRSP results when the vr only has one non debug
usage.

Differential Revisien: https://reviews.llvm.org/D94711
Reviewed by: nemanjai
2021-01-19 09:20:03 -06:00
Nemanja Ivanovic
b4252224e7 [PowerPC] Sign extend comparison operand for signed atomic comparisons
As of 8dacca943af8a53a23b1caf3142d10fb4a77b645, we sign extend the atomic loaded
operand for signed subword comparisons. However, the assumption that the other
operand is correctly sign extended doesn't always hold. This patch sign extends
the other operand if it needs to be sign extended.

This is a second fix for https://bugs.llvm.org/show_bug.cgi?id=30451

Differential revision: https://reviews.llvm.org/D94058
2021-01-18 21:19:25 -06:00
Sean Fertile
bb42a307cd [PowerPC][AIX]Do not emit xxspltd mnemonic on AIX.
A bug in the system assembler can assemble the xxspltd extended
menemonic into the wrong instruction (extracting the wrong element).
Emit the full xxpermdi with all operands to work around the problem.

Differential Revision: https://reviews.llvm.org/D94419
2021-01-18 09:25:31 -05:00
Tres Popp
3fdf369051 Revert "[PowerPC] support register pressure reduction in machine combiner."
This reverts commit 26a396c4ef481cb159bba631982841736a125a9c.

See https://reviews.llvm.org/D92071 for a description of the issue.
2021-01-18 12:01:57 +01:00
Chen Zheng
6f751d75c9 [PowerPC] support register pressure reduction in machine combiner.
Reassociating some patterns to generate more fma instructions to
reduce register pressure.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D92071
2021-01-17 23:56:13 -05:00
Kazu Hirata
39185b091b [llvm] Remove redundant return and continue statements (NFC)
Identified with readability-redundant-control-flow.
2021-01-14 20:30:34 -08:00
Jinsong Ji
1232463119 [PowerPC] Only use some extend mne if assembler is modern enough
Legacy AIX assembly might not support all extended mnes,
add one feature bit to control the generation in MC,
and avoid generating them by default on AIX.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D94458
2021-01-14 20:36:10 +00:00
Esme-Yi
7c215e1404 [PowerPC] Try to fold sqrt/sdiv test results with the branch.
Summary: The patch tries to fold sqrt/sdiv test node, i.g FTSQRT, XVTDIVDP, and the branch, i.e br_cc if they meet these patterns:
(br_cc seteq, (truncateToi1 SWTestOp), 0) -> (BCC PRED_NU, SWTestOp)
(br_cc seteq, (and SWTestOp, 2), 0) -> (BCC PRED_NE, SWTestOp)
(br_cc seteq, (and SWTestOp, 4), 0) -> (BCC PRED_LE, SWTestOp)
(br_cc seteq, (and SWTestOp, 8), 0) -> (BCC PRED_GE, SWTestOp)
(br_cc setne, (truncateToi1 SWTestOp), 0) -> (BCC PRED_UN, SWTestOp)
(br_cc setne, (and SWTestOp, 2), 0) -> (BCC PRED_EQ, SWTestOp)
(br_cc setne, (and SWTestOp, 4), 0) -> (BCC PRED_GT, SWTestOp)
(br_cc setne, (and SWTestOp, 8), 0) -> (BCC PRED_LT, SWTestOp)

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D94054
2021-01-14 02:15:19 +00:00
Jinsong Ji
1fa0f655fd [PowerPC][NFCI] PassSubtarget to ASMWriter
Subtarget feature bits are needed to change instprinter's behavior based
on feature bits.

Most of the other popular targets were updated back in 2015,
in https://reviews.llvm.org/rGb46d0234a6969
we should update it too.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D94449
2021-01-12 16:25:35 +00:00
Nemanja Ivanovic
dac23cf4a3 [PowerPC] Add support for embedded devices with EFPU2
PowerPC cores like e200z759n3 [1] using an efpu2 only support single precision
hardware floating point instructions. The single precision instructions efs*
and evfs* are identical to the spe float instructions while efd* and evfd*
instructions trigger a not implemented exception.

This patch introduces a new command line option -mefpu2 which leads to
single-hardware / double-software code generation.

[1] Core reference:
  https://www.nxp.com/files-static/32bit/doc/ref_manual/e200z759CRM.pdf

Differential revision: https://reviews.llvm.org/D92935
2021-01-12 09:47:00 -06:00
Kazu Hirata
0452f12eb6 [llvm] Simplify string comparisons (NFC)
Identified with readability-string-compare.
2021-01-11 18:48:09 -08:00
Kazu Hirata
13eee3533c [llvm] Ensure newlines at the end of files (NFC)
This patch eliminates pesky "No newline at end of file" messages from
git diff.
2021-01-10 09:24:57 -08:00
Kazu Hirata
ea45e4cb34 [Target, Transforms] Use *Set::contains (NFC) 2021-01-08 18:39:54 -08:00
Fangrui Song
9d72576890 [PowerPC] Delete dead Lower* 2021-01-06 21:58:40 -08:00
Fangrui Song
74fe4086d5 [PowerPC] Delete remnant Darwin ISelLowering code 2021-01-06 21:40:40 -08:00
Fangrui Song
fba3075f39 [PowerPC] Delete remnant isOSDarwin references 2021-01-06 21:18:35 -08:00
Kit Barton
52191ae5f0 [PPC] Remove old PPCSubTarget variable.
The PPCSubTarget variable has been replaced with the Subtarget variable. This
removes the remaining instances of PPCSubTarget as they are no longer necessary.
2021-01-06 17:44:07 -06:00
Stefan Pintilie
a955465035 [PowerPC] Fix issue where vsrq is given incorrect shift vector
The new Power10 instruction vsrq was being given the wrong shift vector.
The original code assumed that the shift would be found in bits 121 to 127.
This is not correct. The shift is found in bits 57 to 63.
This can be fixed by swaping the first and second double words.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D94113
2021-01-06 05:56:09 -06:00
Christudasan Devadasan
f0d6cc1d99 [GlobalISel] Base implementation for sret demotion.
If the return values can't be lowered to registers
SelectionDAG performs the sret demotion. This patch
contains the basic implementation for the same in
the GlobalISel pipeline.

Furthermore, targets should bring relevant changes
during lowerFormalArguments, lowerReturn and
lowerCall to make use of this feature.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D92953
2021-01-06 10:30:50 +05:30
Qiu Chaofan
53d9081fa4 [NFC] [PowerPC] Remove dead code in BUILD_VECTOR peephole
The piece of code tries to use splat+shift to lower build_vector with
repeating bit pattern. And immediate field of vector splat is only 5
bits (-16~15). It iterates over them one by one to find which
shifts/rotates to number in build_vector.

This patch removes code to try matching constant with algebraic
right-shift because that's meaningless - any negative number's algebraic
right-shift won't produce result smaller than itself. Besides, code
(int)((unsigned)i >> j) means logical shift-right in C.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D93937
2021-01-05 11:35:00 +08:00
Kai Luo
ab0a19a8de [PowerPC] Do not fold cmp(d|w) and subf instruction to subf. if nsw is not present
In `PPCInstrInfo::optimizeCompareInstr` we seek opportunities to fold `cmp(d|w)` and `subf` as an `subf.`. However, if `subf.` gets overflow, `cr0` can't reflect the correct order, violating the semantics of `cmp(d|w)`.

Fixed https://bugs.llvm.org/show_bug.cgi?id=47830.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D90156
2021-01-04 07:54:15 +00:00
Brandon Bergren
3adc8af0ca [PowerPC] Add the LLVM triple for powerpcle [1/5]
Add a triple for powerpcle-*-*.

This is a little-endian encoding of the 32-bit PowerPC ABI, useful in certain niche situations:

1) A loader such as the FreeBSD loader which will be loading a little endian kernel. This is required for PowerPC64LE to load properly in pseries VMs.
Such a loader is implemented as a freestanding ELF32 LSB binary.

2) Userspace emulation of a 32-bit LE architecture such as x86 on 64-bit hosts such as PowerPC64LE with tools like box86 requires having a 32-bit LE toolchain and library set, as they operate by translating only the main binary and switching to native code when making library calls.

3) The Void Linux for PowerPC project is experimenting with running an entire powerpcle userland.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D93918
2021-01-02 12:17:22 -06:00
Kai Luo
72298af4c5 [PowerPC] Remaining KnownBits should be constant when performing non-sign comparison
In `PPCTargetLowering::DAGCombineTruncBoolExt`, when checking if it's correct to perform the transformation for non-sign comparison, as the comment says
```
      // This is neither a signed nor an unsigned comparison, just make sure
      // that the high bits are equal.
```
Origin check
```
      if (Op1Known.Zero != Op2Known.Zero || Op1Known.One != Op2Known.One)
        return SDValue();
```
is not strong enough. For example,
```
Op1Known = 111x000x;
Op2Known = 111x000x;
```
Bit 4, besides bit 0, is still unknown and affects the final result.

This patch fixes https://bugs.llvm.org/show_bug.cgi?id=48388.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D93092
2020-12-30 02:00:47 +00:00
Nemanja Ivanovic
3dd84ac1da [PowerPC] Provide patterns for permuted scalar to vector for pre-P8
We will emit these permuted nodes on all VSX little endian subtargets
but don't have the patterns available to match them on subtargets
that don't have direct moves.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=47916
2020-12-29 06:49:25 -06:00
Nemanja Ivanovic
8f5a28bb71 [PowerPC] Disable CTR loops containing operations on half-precision
On subtargets prior to Power9, conversions to/from half precision
are lowered to libcalls. This makes loops containing such operations
invalid candidates for HW loops.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=48519
2020-12-29 05:12:50 -06:00
Nemanja Ivanovic
d147200a9e [PowerPC] Do not emit HW loop when TLS var accessed in PHI of loop exit
If any PHI nodes in loop exit blocks have incoming values from the
loop that are accesses of TLS variables with local dynamic or general
dynamic TLS model, the address will be computed inside the loop. Since
this includes a call to __tls_get_addr, this will in turn cause the
CTR loops verifier to complain.
Disable CTR loops in such cases.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=48527
2020-12-28 20:36:16 -06:00
Fangrui Song
5935d9acd7 [PowerPC] Parse and ignore .machine
glibc/sysdeps/powerpc/powerpc64 has .machine
{altivec,power4,power5,power6,power7,power8} (.machine power9 is planned in
sysdeps/powerpc/powerpc64/power9/strcmp.S).
The diagnostic is not useful anyway so just delete it.
2020-12-28 12:20:40 -08:00
Nemanja Ivanovic
bb2bb5b29c [PowerPC] Remove redundant COPY_TO_REGCLASS introduced by 8a58f21f5b6c 2020-12-28 09:26:51 -06:00