mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00
3003 Commits

Author SHA1 Message Date
Jinsong Ji
068fb4deaa [AIX] Print printable byte list as quoted string
.byte supports strings, so if the whole byte list is printable,
we can print it as a string for readability and easier LIT test maintenance.

        .byte 'H,'e,'l,'l,'o,',,' ,'w,'o,'r,'l,'d
->
        .byte "Hello, world"

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D102814
2021-05-21 02:37:55 +00:00
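The choice between the two `.byte` spellings described in the commit above can be sketched in Python. This is an illustration only: the function name `format_byte_directive`, the printable range 0x20..0x7E, and the doubling of an embedded double quote are assumptions made for the example, not details taken from D102814.

```python
def is_printable(b: int) -> bool:
    """Treat ASCII 0x20..0x7E as printable (an assumption for this sketch)."""
    return 0x20 <= b <= 0x7E


def format_byte_directive(data: bytes) -> str:
    """Return a .byte directive for data.

    If every byte is printable, emit one quoted string; otherwise fall back
    to a comma-separated list, using 'c for printable bytes and hex otherwise.
    """
    if data and all(is_printable(b) for b in data):
        # Assumption: an embedded double quote is written as two double quotes.
        text = data.decode("ascii").replace('"', '""')
        return f'.byte "{text}"'
    items = [f"'{chr(b)}" if is_printable(b) else f"0x{b:02x}" for b in data]
    return ".byte " + ",".join(items)
```

With this sketch, `format_byte_directive(b"Hello, world")` produces the quoted form, while any input containing a non-printable byte keeps the per-byte list.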
Jessica Clarke
8c42ad8897 [SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.

Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits,
where zero-extending atomics were incorrectly returning 0 rather than
the (slightly confusing) required return value of 1.

Re-landed again after D102819 fixed PowerPC to correctly zero-extend all
of its atomics as it claimed to do, since the combination of that bug
and this optimisation caused buildbot regressions.

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342
2021-05-20 20:34:23 +01:00
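The `Tmp == VTBits` corner case mentioned in the commit above can be illustrated with a small Python model. It is a sketch under assumptions: the function name and the string values for `extend` are hypothetical, and the real logic lives in C++ inside SelectionDAG. The point it shows is that a value always has at least one sign bit, so the zero-extending formula must not be allowed to return 0.

```python
def atomic_load_sign_bits(vt_bits: int, mem_bits: int, extend: str) -> int:
    """Lower bound on the number of sign bits of an extending atomic load.

    vt_bits  : width of the result type in bits
    mem_bits : width of the value loaded from memory ("Tmp" in the message)
    extend   : "sign", "zero", or anything else for "unknown" (hypothetical)
    """
    if mem_bits > vt_bits:
        raise ValueError("memory type cannot be wider than the result type")
    if mem_bits == vt_bits:
        # No extension took place; the minimum answer is 1, never 0.
        return 1
    if extend == "sign":
        # The top (vt_bits - mem_bits) bits all copy the loaded sign bit.
        return vt_bits - mem_bits + 1
    if extend == "zero":
        # The top (vt_bits - mem_bits) bits are all zero.
        return vt_bits - mem_bits
    return 1  # nothing is known about the upper bits
```

For a 32-bit atomic load into a 64-bit register this gives 33 sign bits when sign-extending and 32 when zero-extending.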
Stefan Pintilie
7cef6d55d4 [PowerPC] Add fix to partword atomic operations
Partword atomic binaries are not zero extended as they should be.
This patch fixes them to ensure that they are zero extended.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D102819
2021-05-20 12:36:37 -05:00
Chen Zheng
3988f183b6 [PowerPC] only check the load instruction result number 0.
Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D102596
2021-05-18 00:49:37 -04:00
Nemanja Ivanovic
8655d76475 [PowerPC] Add patterns for vselect of v1i128
These patterns are missing even though the underlying instruction
doesn't really care about the type. Added these patterns to resolve
https://bugs.llvm.org/show_bug.cgi?id=50084
2021-05-17 06:37:46 -05:00
Nemanja Ivanovic
5fe52009aa [PowerPC] Do not emit dssall on AIX
This instruction is a nop on all server cores (certainly on all
cores that AIX supports) so it is fine to emit a nop instead of it.
In fact, that is exactly what XL emits. So we emit a nop on AIX
and we leave the codegen as is on other platforms since there may
indeed be cores out there for which this actually does some prefetching.
2021-05-17 06:08:06 -05:00
Chen Zheng
9ccc0ec15a [PowerPC] add a testcase for reverse memory op; nfc
2021-05-17 03:29:14 -04:00
Stefan Pintilie
20d928df72 [PowerPC] Add ROP Protection to prologue and epilogue
Added hashst to the prologue and hashchk to the epilogue.
The hash for the prologue and epilogue must always be stored as the first
element in the local variable space on the stack.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D99377
2021-05-13 12:54:44 -05:00
Stefan Pintilie
da8040707c [PowerPC] Handle inline assembly clobber of link register
This patch adds the handling of clobbers of the link register LR for inline
assembly.

This patch is to fix:
https://bugs.llvm.org/show_bug.cgi?id=50147

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D101657
2021-05-13 07:43:37 -05:00
Nemanja Ivanovic
9adf3b1a1e [PowerPC] Provide doubleword vector predicate form comparisons on Power7
There are two reasons this shouldn't be restricted to Power8 and up:
1. For XL compatibility
2. Because clang will expand comparison operators to these intrinsics*

*Without this patch, the following causes a selection error:

int test(vector signed long a, vector signed long b) {
  return a < b;
}

This patch provides the handling for the intrinsics in the back
end and removes the Power8 guards from the predicate functions
(vec_{all|any}_{eq|ne|gt|ge|lt|le}).
2021-05-13 04:56:56 -05:00
Stefan Pintilie
3947a17f41 Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics"
This reverts commit 6c80361b8474535852afb2f7201370fb5f410091.
Breaks PowerPC Big Endian buildbots.
2021-05-12 09:46:18 -05:00
Albion Fung
64e5b1ce09 [PowerPC] Improve codegen for int-to-fp conversion of subword vector extract
When an integer is converted into floating point in a subword vector extract,
it can be done in 2 instructions instead of the 3+ instructions generated
right now. This patch removes the unnecessary instructions.

Differential: https://reviews.llvm.org/D100604
2021-05-11 15:00:11 -05:00
Stefan Pintilie
8ededaf80d [PowerPC][Bug] Fix Bug in Stack Frame Update Code
The stack frame update code does not take into consideration spilling
to registers for callee saved registers. The option -ppc-enable-pe-vector-spills
turns on spilling to registers for callee saved registers and may expose a bug
in the code that moves a stack frame pointer update instruction.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D101366
2021-05-11 05:54:07 -05:00
Arthur Eubanks
e2b61d2be5 [test] Put aix-xcoff-huge-relocs.ll under expensive checks
It is an order of magnitude slower than the second slowest test
according to obj/llvm/test/.lit_test_times.txt.

The two slowest are:
 2.870437e+02 CodeGen/PowerPC/aix-xcoff-huge-relocs.ll
 2.850697e+01 tools/llvm-readobj/ELF/file-header-machine-types.test

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D102190
2021-05-10 13:44:29 -07:00
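The "order of magnitude" claim in the commit above can be checked from the two quoted timings. The Python sketch below assumes each line of the timing file has the form `<seconds> <test path>`, as in the excerpt; the helper name `slowest_tests` is hypothetical.

```python
def slowest_tests(lines, count=2):
    """Parse '<seconds> <test path>' lines and return the slowest entries."""
    entries = []
    for line in lines:
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        entries.append((float(parts[0]), parts[1].strip()))
    entries.sort(reverse=True)  # largest time first
    return entries[:count]


sample = [
    " 2.870437e+02 CodeGen/PowerPC/aix-xcoff-huge-relocs.ll",
    " 2.850697e+01 tools/llvm-readobj/ELF/file-header-machine-types.test",
]
(first_time, first_name), (second_time, second_name) = slowest_tests(sample)
ratio = first_time / second_time  # slightly above 10
```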
Stefan Pintilie
3408e9e758 [PowerPC] Spilling to registers does not require frame index scavenging
If spills are to registers instead of to the stack then a copy will be used
and frame index scavenging is not required.

This patch adds debug info to frame index scavenging and makes sure that
spilling to registers does not cause frame index scavenging.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D101360
2021-05-10 14:42:39 -05:00
Zarko Todorovski
b797f9d9ad [PowerPC] Enable safe for 32bit vins* P10 instructions
Correctly emit `vins` instructions that are safe in 32bit mode.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D101383
2021-05-10 10:13:13 -04:00
Chen Zheng
7189f2452c [XCOFF] handle string constants generation for AIX
This follows https://www.ibm.com/docs/en/aix/7.2?topic=constants-string

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D101280
2021-05-07 06:43:36 +00:00
Qiu Chaofan
f654201998 [PowerPC] Remove extra swap for extract+vperm on LE
This is a simple fix on LE. On BE, vector shuffles are categorized into
different ops. We may need more work to eliminate these in
tablegen/pre-isel.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D101605
2021-05-07 13:48:08 +08:00
Victor Huang
328fd59846 [AIX][TLS] Add support for TLSGD relocations to XCOFF objects
- Add branch absolute relocation R_RBA, R_TLS relocation for the variable offset
  for the tlsgd model and R_TLSM for the region handle for the tlsgd model
- Properly set the relocation fixed values for R_TLS and R_TLSM
- Emit the TCEntry with the variant kind in the XCOFFStreamer

Reviewed by: sfertile, nemanjai, DiggerLin

Differential Revision: https://reviews.llvm.org/D100214
2021-05-06 09:01:47 -05:00
Jessica Clarke
cebf1171e8 [SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.

Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits,
where zero-extending atomics were incorrectly returning 0 rather than
the (slightly confusing) required return value of 1.

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342
2021-05-06 04:01:20 +01:00
Jessica Clarke
3018d0bd19 Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics"
This seems to have broken sanitizers, giving lots of

  Assertion `NumBits <= MAX_INT_BITS && "bitwidth too large"' failed.

failures across multiple targets (currently X86 and PowerPC). Reverting
until I have a chance to reproduce and debug.

This reverts commit 6e876f9dedf00b24a96b8781e3b39d5282c43e91.
2021-05-05 17:02:05 +01:00
Jessica Clarke
d5dcd075dc [SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342
2021-05-05 16:34:45 +01:00
Ahsan Saghir
46a30bedec [PowerPC] Prevent argument promotion of types with size greater than 128 bits
This patch prevents argument promotion of types having
type size greater than 128 bits.

Fixes Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=49952

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D101188
2021-05-04 12:09:25 -05:00
Zarko Todorovski
1bccdc2f48 [AIX] Remove unused vector registers from allocation order in the default AltiVec ABI
The previous implementation of the default AltiVec ABI marked registers V20-V31
as reserved.  This failed to prevent reserved VFRC registers being allocated.
In this patch, instead of marking the registers reserved, we remove disallowed
registers from the allocation order completely.

This is a slight rework of an implementation by @nemanjai

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D100050
2021-05-03 13:50:51 -04:00
Nemanja Ivanovic
315bbd239b [PowerPC] Add missing requirement to test case
Commit 70c433a184a54819835e54c62c3e6891e7069861 added this
test case that has -stop-before that mentions a pass that is
only added for non-release builds. Add the requirement for asserts.
2021-04-30 19:36:58 -05:00
Jon Roelofs
738d110269 [EarlyIfConversion] Avoid producing selects with identical operands
This extends the early-ifcvt pass to avoid a few more cases where the resulting
select instructions would have matching operands.  Additionally, we now use TII
to determine "sameness" of the operands so that as TII gets smarter, so too
will ifcvt.

The attached test case was bugpoint-reduced down from CINT2000/252.eon in the
test-suite. See: https://clang.godbolt.org/z/WvnrcrGEn

Differential Revision: https://reviews.llvm.org/D101508
2021-04-30 15:51:14 -07:00
Jon Roelofs
0386626f22 [PowerPC] modernize test via update_llc_test_checks.py. NFC
2021-04-30 15:51:13 -07:00
Amy Kwan
f189e0c45f [PowerPC] Add new infrastructure to select load/store instructions, update P8/P9 load/store patterns.
This patch introduces a new infrastructure that is used to select the load and
store instructions in the PPC backend.

The primary motivation is that the current implementation of selecting load/stores
is dependent on the ordering of patterns in TableGen. Given this limitation, we
are not able to easily and reliably generate the P10 prefixed load and store
instructions (such as when the immediates fit within 34 bits). This
refactoring is meant to provide us with more control over the patterns/different
forms to exploit, as well as eliminating dependency of pattern declaration in TableGen.

The idea of this refactoring is that it introduces a set of addressing modes that
correspond to different instruction formats of a particular load and store
instruction, along with a set of common flags that describes a load/store.
Whenever a load/store instruction is being selected, we analyze the instruction
and compute a set of flags for it. The computed flags are then used to
select the most optimal load/store addressing mode.

This patch is the first of a series of patches to be committed - it contains the
initial implementation of the refactored load/store selection infrastructure and
also updates P8/P9 patterns to adopt this infrastructure. The idea is that
incremental patches will add more implementation and support, and eventually
the old implementation will be removed.

Differential Revision: https://reviews.llvm.org/D93370
2021-04-30 09:53:19 -05:00
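The "compute flags, then pick an addressing mode" idea from the commit above can be sketched as follows. Everything named here (`MemFlag`, `AddrMode`, `select_addr_mode`) is hypothetical and chosen for illustration; the patch defines its own flags and modes in C++. The sketch assumes the usual PowerPC constraints: a 16-bit signed displacement for D-Form, displacements that are multiples of 4 (DS-Form) or 16 (DQ-Form), a 34-bit signed displacement for prefixed forms, and register + register (X-Form) as the fallback.

```python
from enum import Enum, Flag, auto


class MemFlag(Flag):
    """Hypothetical properties computed for one load/store."""
    NONE = 0
    HAS_PREFIX_INSNS = auto()   # subtarget offers prefixed instructions
    NEEDS_MULT_OF_4 = auto()    # chosen instruction requires disp % 4 == 0
    NEEDS_MULT_OF_16 = auto()   # chosen instruction requires disp % 16 == 0


class AddrMode(Enum):
    D_FORM = "reg + 16-bit displacement"
    DS_FORM = "reg + 16-bit displacement, multiple of 4"
    DQ_FORM = "reg + 16-bit displacement, multiple of 16"
    PREFIX_D_FORM = "reg + 34-bit displacement"
    X_FORM = "reg + reg"


def fits_signed(value: int, bits: int) -> bool:
    """True if value is representable as a signed integer of the given width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def select_addr_mode(flags: MemFlag, disp) -> AddrMode:
    """Pick an addressing mode from the flags and a constant displacement.

    disp is None when the address has no constant displacement.
    """
    if disp is not None and fits_signed(disp, 16):
        if MemFlag.NEEDS_MULT_OF_16 in flags:
            if disp % 16 == 0:
                return AddrMode.DQ_FORM
        elif MemFlag.NEEDS_MULT_OF_4 in flags:
            if disp % 4 == 0:
                return AddrMode.DS_FORM
        else:
            return AddrMode.D_FORM
    if (disp is not None and MemFlag.HAS_PREFIX_INSNS in flags
            and fits_signed(disp, 34)):
        return AddrMode.PREFIX_D_FORM
    return AddrMode.X_FORM  # materialize the offset in a register
```

Because the decision is made by one function over explicit flags, the result does not depend on the order in which patterns happen to be declared, which is the property the commit message is after.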
Sidharth Baveja
f8a477d8bf [XCOFF][AIX] Add Global Variables Directly to TOC for 32 bit AIX
Summary:
This patch implements the backend implementation of adding global variables
directly to the table of contents (TOC), rather than adding the address of the
variable to the TOC.
Currently, this patch will look for the "toc-data" attribute on symbols in the
IR, and then add those symbols to the TOC.
ATM, this is implemented for 32 bit AIX.

Reviewers: sfertile
Differential Revision: https://reviews.llvm.org/D101178
2021-04-30 14:48:02 +00:00
Qiu Chaofan
85b7a12c42 Pre-commit test for PPC vector extraction test
2021-04-30 12:02:37 +08:00
jasonliu
f6fc5dd17d [XCOFF] Handle the case when personality routine is an alias
Summary:
A personality routine could be an alias to another personality routine.
Fix the case where the file being compiled contains the personality
routine and also has functions that need to refer to the
personality routine.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D101401
2021-04-29 22:03:30 +00:00
Victor Huang
de9b762793 [AIX][TLS] Add ASM portion changes to support TLSGD relocations to XCOFF objects
- Add new variantKinds for the symbol's variable offset and region handle
- Print the proper relocation specifier @gd in the asm streamer when emitting
  the TC Entry for the variable offset for the symbol
- Fix the switch section failure between the TC Entry of variable offset and
  region handle
- Put .__tls_get_addr symbol in the ProgramCodeSects with XTY_ER property

Reviewed by: sfertile

Differential Revision: https://reviews.llvm.org/D100956
2021-04-29 13:18:59 -05:00
Qiu Chaofan
c9d96287d5 [SPE] Support constrained float operations on SPE
This patch enables support on SPE for constrained arithmetic and
comparison operations. This fixes bugzilla 50070.

One thing not covered is fcmp vs. fcmps on SPE. Some condition codes
generate signaling comparisons while others do not. In this patch, all are
treated as signaling, so there might still be some issues when
compiling from C code.

Reviewed By: jhibbits

Differential Revision: https://reviews.llvm.org/D101282
2021-04-29 16:34:10 +08:00
Qiu Chaofan
162a7c0f22 [PowerPC] Fix SELECT_CC with i64 operand on PPC32
This patch fixes the infinite loop in legalization of PPC32 SELECT_CC
with 64-bit operand.
2021-04-28 17:48:33 +08:00
Victor Huang
cca8b9f1d5 [AIX][Power10] Restrict prefixed instructions from crossing the 64byte boundary
This patch adds support for preventing prefixed instructions from
crossing the 64 byte boundary:
- Add the infrastructure to register a custom XCOFF streamer
- Add a custom XCOFF streamer for PowerPC to allow us to
  intercept instructions as they are being emitted and align all 8 byte
  instructions to a 64 byte boundary if required by adding a 4 byte nop.

Reviewed By: stefanp

Differential Revision: https://reviews.llvm.org/D101107
2021-04-27 11:55:18 -05:00
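The padding rule described in the commit above can be modelled in a few lines of Python. This is a sketch, not the streamer's code: the function name `layout` and the tuple-based output are assumptions, and offsets are assumed to stay 4-byte aligned, so an 8-byte instruction can only straddle a 64-byte boundary when exactly 4 bytes remain before it.

```python
NOP_SIZE = 4        # size of an ordinary instruction, including a nop
PREFIXED_SIZE = 8   # size of a prefixed instruction
BOUNDARY = 64       # prefixed instructions must not cross this boundary


def layout(sizes, start=0):
    """Assign offsets to instructions, padding before straddling ones.

    sizes is a sequence of 4 (ordinary) or 8 (prefixed) byte sizes.
    Returns a list of (offset, kind), kind in {"insn", "prefixed", "nop"}.
    """
    out = []
    offset = start
    for size in sizes:
        if size == PREFIXED_SIZE and offset % BOUNDARY == BOUNDARY - NOP_SIZE:
            # Only 4 bytes remain before the boundary: fill them with a nop.
            out.append((offset, "nop"))
            offset += NOP_SIZE
        out.append((offset, "prefixed" if size == PREFIXED_SIZE else "insn"))
        offset += size
    return out
```

For example, fifteen 4-byte instructions followed by one prefixed instruction place a nop at offset 60 and the prefixed instruction at offset 64; with only fourteen leading instructions the prefixed one fits at offset 56 and no nop is needed.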
Zarko Todorovski
04b8e84be8 [AIX] Allow safe for 32bit P9 VSX extract and insert pattern matches
In https://reviews.llvm.org/D92789 PPC64 checks were added that disallowed most
VSX pattern matching.  We enable some safe ones for 32bit in this patch.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D97503
2021-04-27 07:27:43 -04:00
Chen Zheng
611f6b8043 [XCOFF] make .file directive have directory info
The .file directive was changed to contain only the basename in D36018 for
ELF.

But on AIX, we require the .file directive to also contain the
directory info. This aligns with other AIX compilers like XLC and is
required by some AIX tools like DBX.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D99785
2021-04-27 00:15:23 -04:00
Nemanja Ivanovic
572869bea6 [PowerPC] Add vec_ctsl and vec_ctul to altivec.h
These are added for compatibility with XLC. They are similar to
vec_cts and vec_ctu except that the result is a doubleword vector
regardless of the parameter type.
2021-04-23 11:03:38 -05:00
Nemanja Ivanovic
422f9c37c7 [PowerPC] Improve codegen for vector fp to int widening conversions
We currently do not utilize instructions that convert single
precision vectors to doubleword integer vectors. These conversions
come up in code occasionally and this improvement allows us to
open code some functions that need to be added to altivec.h.
2021-04-22 05:04:06 -05:00
Nemanja Ivanovic
9a29810062 [PowerPC] Canonicalize shuffles on big endian targets as well
Extend shuffle canonicalization and conversion of shuffles fed by vectorized
scalars to big endian subtargets. For big endian subtargets, loads and direct
moves of scalars into vector registers put the data in the correct element for
SCALAR_TO_VECTOR if the data type is 8 bytes wide. However, if the data type is
narrower, the value still ends up in the wrong place - although a different
wrong place than on little endian targets.

This patch extends the combine that keeps values where they are if they feed a
shuffle to big endian targets.

Differential revision: https://reviews.llvm.org/D100478
2021-04-20 07:29:47 -05:00
Qiu Chaofan
ae1dd01644 [PowerPC] Use mtvsrdd to put callee-saved GPR into VSR
This patch exploits mtvsrdd instruction (available in ISA3.0+) to save
two callee-saved GPR registers into a single VSR, making it more
efficient.

Reviewed By: jsji, nemanjai

Differential Revision: https://reviews.llvm.org/D62565
2021-04-20 16:43:24 +08:00
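The saving described in the commit above comes from one 128-bit VSR holding two 64-bit GPR values. The Python model below is only an illustration: the helper names are hypothetical, and representing doubleword 0 as the high 64 bits of a Python integer is a modelling convention chosen for the sketch.

```python
MASK64 = (1 << 64) - 1


def pack_gprs(ra: int, rb: int) -> int:
    """Model a 128-bit VSR with ra in doubleword 0 and rb in doubleword 1."""
    return ((ra & MASK64) << 64) | (rb & MASK64)


def unpack_gprs(vsr: int):
    """Recover the two 64-bit values stored by pack_gprs."""
    return (vsr >> 64) & MASK64, vsr & MASK64


def assign_spill_slots(gprs):
    """Pair callee-saved GPRs so that each pair shares one VSR.

    An odd register left at the end gets a VSR of its own.
    """
    return [tuple(gprs[i:i + 2]) for i in range(0, len(gprs), 2)]
```

Pairing halves the number of VSRs needed for an even number of callee-saved GPRs, for example `assign_spill_slots(["r14", "r15", "r16"])` uses two VSRs instead of three.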
Qiu Chaofan
d26cbdc561 [PowerPC] Support f128 under VSX
This patch is the last one in backend to support fp128 type in
pre-POWER9 subtargets with VSX, removing temporary option and updating
remaining tests.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D92374
2021-04-20 15:49:52 +08:00
Nemanja Ivanovic
d506c40420 [PowerPC] Minor improvement for insert_vector_elt codegen
For v2f64, all VSX subtargets can insert an element with a single
XXPERMDI.
2021-04-16 18:52:37 -05:00
Zarko Todorovski
e32ee2f48c [AIX] Allow safe for 32bit P8 VSX pattern matching
Pull some of the safe for 32bit pattern matching for Pwr8 and above.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D97909
2021-04-14 08:12:48 -04:00
Nemanja Ivanovic
5112edb603 [PowerPC] Use correct node to get a super register from a subreg
The VSX tablegen file has some rather egregious uses of
COPY_TO_REGCLASS even in situations where it needs to use
SUBREG_TO_REG. While this produces correct code, it often doesn't
allow the register coalescer to coalesce copies and the resulting
code ends up being suboptimal. This patch just changes over
patterns that should use SUBREG_TO_REG.
2021-04-13 19:52:21 -05:00
Chen Zheng
4b5d0c05f7 [PowerPC] stop reverse mem op generation for some cases.
We should consider the number of users of the feeder when we do the reverse
memory operation transformation. Otherwise, it may have a negative impact.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D100166
2021-04-12 22:41:28 -04:00
Qiu Chaofan
bb04f59476 [PowerPC] Lower f128 SETCC/SELECT_CC as libcall if p9vector disabled
XSCMPUQP is not available for pre-P9 subtargets. This patch will lower
them into libcall for correct behavior on power7/power8.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D92083
2021-04-12 10:33:32 +08:00
Stefan Pintilie
2a9cd5ea44 Add correct types to the xxsplti32dx pattern.
Register types for xxsplti32dx in two td file patterns were incorrect.
Fixed the two types and added a test case that was reduced from a larger
failing test.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D100223
2021-04-09 14:11:34 -05:00
Thomas Preud'homme
c47d28913a [PowerPC, test] Fix use of undef FileCheck var
LLVM test CodeGen/PowerPC/ctrloops-softfloat.ll tries to check for the
absence of sequences of instructions with several CHECK-NOT with one of
those directives using a variable defined in another. However, CHECK-NOT
directives are checked independently, so this uses a variable defined in a
pattern that should not occur in the input.

This commit replaces the occurrence of the variable with the regex used in its
definition, thereby making each CHECK-NOT independent.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D99881
2021-04-09 12:55:02 +01:00
Thomas Preud'homme
2ff342dd16 [PowerPC, test] Fix use of undef FileCheck var
Commit 6ad3d05b681b36f6ecc98523257d154053e4116d disables the definition
of CSR that a follow-up CHECK-NOT directive depends on. This commit
replaces the use of the undefined CSR variable with the regex used to define it.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D99870
2021-04-09 12:54:35 +01:00