1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00
Commit Graph

39181 Commits

Author SHA1 Message Date
Krzysztof Parzyszek
8406e1b3bc [RDF] Ignore undef use operands
llvm-svn: 280717
2016-09-06 17:03:13 +00:00
Chris Dewhurst
763d8f9aaa [Sparc][Leon] Corrected supported atomics size for processors supporting Leon CASA instruction back to 32 bits.
This was erroneously checked-in for 64 bits while trying to find if there was a way to get 64 bit atomicity in Leon processors. There is not and this change should not have been checked-in. There is no unit test for this as the existing unit tests test for behaviour to 32 bits, which was the original intention of the code.

llvm-svn: 280710
2016-09-06 14:41:09 +00:00
Simon Dardis
6429b834f8 [mips] Tighten FastISel restrictions
LLVM PR/29052 highlighted that FastISel for MIPS attempted to lower
arguments assuming that it was using the paired 32bit registers to
perform operations for f64. This mode of operation is not supported
for MIPSR6.

This patch resolves the reported issue by adding additional checks
for unsupported floating point unit configuration.

Thanks to mike.k for reporting this issue!

Reviewers: seanbruno, vkalintiris

Differential Review: https://reviews.llvm.org/D23795

llvm-svn: 280706
2016-09-06 12:36:24 +00:00
Krzysztof Parzyszek
dcb86018ea [PPC] Claim stack frame before storing into it, if no red zone is present
Unlike PPC64, PPC32/SVRV4 does not have red zone. In the absence of it 
there is no guarantee that this part of the stack will not be modified 
by any interrupt. To avoid this, make sure to claim the stack frame first
before storing into it.

This fixes https://llvm.org/bugs/show_bug.cgi?id=26519.

Differential Revision: https://reviews.llvm.org/D24093

llvm-svn: 280705
2016-09-06 12:30:00 +00:00
Craig Topper
5da343caae [AVX-512] Fix masked VPERMI2PS isel when the index comes from a bitcast.
We need to bitcast the index operand to a floating point type so that it matches the result type. If not then the passthru part of the DAG will be a bitcast from the index's original type to the destination type. This makes it very difficult to match. The other option would be to add 5 sets of patterns for every other possible type.

llvm-svn: 280696
2016-09-06 06:56:59 +00:00
Craig Topper
8f98df0f37 [X86] Remove unused encoding from IntrinsicType enum.
llvm-svn: 280694
2016-09-06 05:45:24 +00:00
Craig Topper
8000aefe35 [X86] Fix indentation. NFC
llvm-svn: 280693
2016-09-06 05:45:21 +00:00
Saleem Abdulrasool
4fbf1a0ae8 ARM: workaround bundled operation predication
This is a Windows ARM specific issue.  If the code path in the if conversion
ends up using a relocation which will form a IMAGE_REL_ARM_MOV32T, we end up
with a bundle to ensure that the mov.w/mov.t pair is not split up.  This is
normally fine, however, if the branch is also predicated, then we end up trying
to predicate the bundle.

For now, report a bundle as being unpredicatable.  Although this is false, this
would trigger a failure case previously anyways, so this is no worse.  That is,
there should not be any code which would previously have been if converted and
predicated which would not be now.

Under certain circumstances, it may be possible to "predicate the bundle".  This
would require scanning all bundle instructions, and ensure that the bundle
contains only predicatable instructions, and converting the bundle into an IT
block sequence.  If the bundle is larger than the maximal IT block length (4
instructions), it would require materializing multiple IT blocks from the single
bundle.

llvm-svn: 280689
2016-09-06 04:00:12 +00:00
Craig Topper
4680f2d028 [AVX-512] Fix v8i64 shift by immediate lowering on 32-bit targets.
llvm-svn: 280684
2016-09-06 00:31:10 +00:00
Craig Topper
b3116ccd3a [AVX-512] Teach fastisel load/store handling to use EVEX encoded instructions for 128/256-bit vectors and scalar single/double.
Still need to fix the register classes to allow the extended range of registers.

llvm-svn: 280682
2016-09-05 23:58:40 +00:00
Craig Topper
e5f780472a [AVX-512] Integrate mask register copying more completely into X86InstrInfo::copyPhysReg and simplify. No functional change intended.
The code is now written in terms of source and dest classes with feature checks inside each type of copy instead of having separate functions for each feature set.

llvm-svn: 280673
2016-09-05 20:34:50 +00:00
Benjamin Kramer
48d2159c40 [WebAssembly] Unbreak the build.
Not sure why ADL isn't working here.

llvm-svn: 280656
2016-09-05 12:06:47 +00:00
Valery Pykhtin
616e586d42 [AMDGPU] Refactor FLAT TD instructions
Differential revision: https://reviews.llvm.org/D24072

llvm-svn: 280655
2016-09-05 11:22:51 +00:00
James Molloy
a0cf4d86b2 [Thumb1] Add relocations for fixups fixup_arm_thumb_{br,bcc}
These need to be mapped through to R_ARM_THM_JUMP{11,8} respectively.

Fixes PR30279.

llvm-svn: 280651
2016-09-05 08:29:15 +00:00
Igor Breger
ba087d916d [AVX512] Fix v8i1 /v16i1 zext + bitcast lowering pattern. Explicitly zero upper bits.
Differential Revision: http://reviews.llvm.org/D23983

llvm-svn: 280650
2016-09-05 08:26:51 +00:00
Craig Topper
4ac30e51b2 [X86] Make some static arrays of opcodes const and shrink to uint16_t. NFC
llvm-svn: 280649
2016-09-05 07:14:21 +00:00
Craig Topper
b6c15620f8 [AVX-512] Simplify X86InstrInfo::copyPhysReg for 128/256-bit vectors with AVX512, but not VLX. We should use the VEX opcodes and trust the register allocator to not use the extended XMM/YMM register space.
Previously we were extending to copying the whole ZMM register. The register allocator shouldn't use XMM16-31 or YMM16-31 in this configuration as the instructions to spill them aren't available.

llvm-svn: 280648
2016-09-05 06:43:06 +00:00
Craig Topper
fef4374983 [X86] Remove FsVMOVAPSrm/FsVMOVAPDrm/FsMOVAPSrm/FsMOVAPDrm. Due to their placement in the td file they had lower precedence than (V)MOVSS/SD and could almost never be selected.
The only way to select them was in AVX512 mode because EVEX VMOVSS/SD was below them and the patterns weren't qualified properly for AVX only. So if you happened to have an aligned FR32/FR64 load in AVX512 you could get a VEX encoded VMOVAPS/VMOVAPD.

I tried to search back through history and it seems like these instructions were probably unselectable for at least 5 years, at least to the time the VEX versions were added. But I can't prove they ever were.

llvm-svn: 280644
2016-09-05 02:20:49 +00:00
Craig Topper
8d44c07194 [AVX-512] Add EVEX encoded scalar FMA intrinsic instructions to isNonFoldablePartialRegisterLoad.
llvm-svn: 280636
2016-09-04 19:33:47 +00:00
Craig Topper
e6bea68a97 [AVX-512] Remove 128-bit and 256-bit masked floating point add/sub/mul/div intrinsics and upgrade to native IR.
llvm-svn: 280633
2016-09-04 18:13:33 +00:00
Hal Finkel
c40902b979 [PowerPC] During branch relaxation, recompute padding offsets before each iteration
We used to compute the padding contributions to the block sizes during branch
relaxation only at the start of the transformation. As we perform branch
relaxation, we change the sizes of the blocks, and so the amount of inter-block
padding might change. Accordingly, we need to recompute the (alignment-based)
padding in between every iteration on our way toward the fixed point.

Unfortunately, I don't have a test case (and none was provided in the bug
report), and while this obviously seems needed, algorithmically, I don't have
any way of generating a small and/or non-fragile regression test.

llvm-svn: 280626
2016-09-04 14:18:29 +00:00
Igor Breger
a37a45a263 revert r279960.
https://llvm.org/bugs/show_bug.cgi?id=30249

llvm-svn: 280625
2016-09-04 14:03:52 +00:00
Simon Pilgrim
ddb7b99be6 Strip trailing whitespace
llvm-svn: 280623
2016-09-04 13:28:46 +00:00
Hal Finkel
bf0592c975 [PowerPC] Zero-extend constants in FastISel
As it turns out, whether we zero-extend or sign-extend i8/i16 constants, which
are illegal types promoted to i32 on PowerPC, is a choice constrained by
assumptions within the infrastructure. Specifically, the logic in
FunctionLoweringInfo::ComputePHILiveOutRegInfo assumes that constant PHI
operands will be zero extended, and so, at least when materializing constants
that are PHI operands, we must do the same.

The rest of our fast-isel implementation does not appear to depend on the fact
that we were sign-extending i8/i16 constants, and all other targets also appear
to zero-extend small-bitwidth constants in fast-isel; we'll now do the same (we
had been doing this only for i1 constants, and sign-extending the others).

Fixes PR27721.

llvm-svn: 280614
2016-09-04 06:07:19 +00:00
Craig Topper
7bf68ae691 [AVX-512] Remove masked integer add/sub/mull intrinsics and upgrade to native IR.
llvm-svn: 280611
2016-09-04 02:09:53 +00:00
Simon Pilgrim
9eb9c00241 Strip trailing whitespace
llvm-svn: 280598
2016-09-03 20:36:05 +00:00
Matt Arsenault
ecba57a1e7 AMDGPU: Set sizes of spill pseudos
llvm-svn: 280595
2016-09-03 17:25:44 +00:00
Matt Arsenault
2751d90018 AMDGPU: Fix adding duplicate implicit exec uses
I'm not sure if this should be considered a bug in
copyImplicitOps or not, but implicit operands that are part
of the static instruction definition should not be copied.

llvm-svn: 280594
2016-09-03 17:25:39 +00:00
Craig Topper
b662036043 [AVX-512] Add integer ADD/SUB instructions to load folding tables. Add an AVX512 stack folding test.
llvm-svn: 280593
2016-09-03 17:20:07 +00:00
Craig Topper
8b7f2911fc [AVX-512] Mark EVEX encoded vpcmpeq as commutable just like its AVX and SSE equivalent.
llvm-svn: 280592
2016-09-03 16:28:03 +00:00
Nicolai Haehnle
bfd5fc8a84 AMDGPU: Reduce the duration of whole-quad-mode
Summary:
This contains two changes that reduce the time spent in WQM, with the
intention of reducing bandwidth required by VMEM loads:

1. Sampling instructions by themselves don't need to run in WQM, only their
   coordinate inputs need it (unless of course there is a dependent sampling
   instruction). The initial scanInstructions step is modified accordingly.

2. When switching back from WQM to Exact, switch back as soon as possible.
   This affects the logic in processBlock.

This should always be a win or at best neutral.

There are also some cleanups (e.g. remove unused ExecExports) and some new
debugging output.

Reviewers: arsenm, tstellarAMD, mareko

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D22092

llvm-svn: 280590
2016-09-03 12:26:38 +00:00
Nicolai Haehnle
472bdb08df AMDGPU: Fix an interaction between WQM and polygon stippling
Summary:
This fixes a rare bug in polygon stippling with non-monolithic pixel shaders.

The underlying problem is as follows: the prolog part contains the polygon
stippling sequence, i.e. a kill. The main part then enables WQM based on the
_reduced_ exec mask, effectively undoing most of the polygon stippling.

Since we cannot know whether polygon stippling will be used, the main part
of a non-monolithic shader must always return to exact mode to fix this
problem.

Reviewers: arsenm, tstellarAMD, mareko

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D23131

llvm-svn: 280589
2016-09-03 12:26:32 +00:00
Matt Arsenault
7f6114dde5 AMDGPU: Fix spilling of m0
readlane/writelane do not support using m0 as the output/input.
Constrain the register class of spill vregs to try to avoid this,
but also handle spilling of the physreg when necessary by inserting
an additional copy to a normal SGPR.

llvm-svn: 280584
2016-09-03 06:57:55 +00:00
Craig Topper
5a9e876f61 [AVX-512] Add EVEX encoded VPCMPEQ and VPCMPGT to the load folding tables.
llvm-svn: 280581
2016-09-03 04:37:50 +00:00
Hal Finkel
68985b72e3 [PowerPC] Support asm parsing for bc[l][a][+-] mnemonics
PowerPC assembly code in the wild, so it seems, has things like this:

  bc+     12, 28, .L9

This is a bit odd because the '+' here becomes part of the BO field, and the BO
field is otherwise the first operand. Nevertheless, the ISA specification does
clearly say that the +- hint syntax applies to all conditional-branch mnemonics
(that test either CTR or a condition register, although not the forms which
check both), both basic and extended, so this is supposed to be valid.

This introduces some asm-parser-only definitions which take only the upper
three bits from the specified BO value, and the lower two bits are implied by
the +- suffix (via some associated aliases).

Fixes PR23646.

llvm-svn: 280571
2016-09-03 02:31:44 +00:00
Hal Finkel
a0698b2905 [PowerPC] Add asm parser/disassembler support for hrfid,nap,slbmfev
These few book-III instructions are used by the Linux kernel.

Partially fixes PR24796.

llvm-svn: 280560
2016-09-02 23:42:01 +00:00
Hal Finkel
4e3ca08f33 [PowerPC] Add support for the extended dcbf form and mnemonics
dcbf has an optional hint-like field, add support for the extended form and the
associated mnemonics (dcbfl and dcbflp).

Partially fixes PR24796.

llvm-svn: 280559
2016-09-02 23:41:54 +00:00
Ron Lieberman
bb7aee52af Make sure to maintain register liveness when generating predicated instructions.
Author: Krzysztof Parzyszek <kparzysz@codeaurora.org>

Differential Revision: https://reviews.llvm.org/D24209

llvm-svn: 280552
2016-09-02 22:56:24 +00:00
Hal Finkel
3b1203e54e [PowerPC] For larger offsets, when possible, fold offset into addis toc@ha
When we have an offset into a global, etc. that is accessed relative to the TOC
base pointer, and the offset is larger than the minimum alignment of the global
itself and the TOC base pointer (which is 8-byte aligned), we can still fold
the @toc@ha into the memory access, but we must update the addis instruction's
symbol reference with the offset as the symbol addend. When there is only one
use of the addi to be folded and only one use of the addis that would need its
symbol's offset adjusted, then we can make the adjustment and fold the @toc@l
into the memory access.

llvm-svn: 280545
2016-09-02 21:37:07 +00:00
James Y Knight
0f37385b9a [Sparc] Mark i128 shift libcalls unavailable in 32-bit mode.
Recently, llvm wants to emit calls to these functions, while it didn't
seem to be an issue before. Not sure why. Nor do I know why only these
three are important to disable, out of all of the i128 libcalls.

Nevertheless, many other targets have this snippet of code, so, just
copying it to sparc as well, to unbreak things.

llvm-svn: 280537
2016-09-02 20:29:11 +00:00
Jan Vesely
4f3509de92 AMDGPU/R600: EXTRACT_VECT_ELT should only bypass BUILD_VECTOR if the vectors have the same number of elements.
Fixes R600 piglit regressions since r280298

Differential Revision: https://reviews.llvm.org/D24174

llvm-svn: 280535
2016-09-02 20:13:19 +00:00
Sjoerd Meijer
45855e4ce6 Setting fp trapping mode and denormal type: this an improvement of
r280246 and calculates compatibility of functions attributes in 
a better way.

Differential Revision: https://reviews.llvm.org/D24070

llvm-svn: 280534
2016-09-02 19:51:34 +00:00
Jan Vesely
8cecd17db6 AMDGPU/R600: Expand unaligned writes to local and global AS
LOCAL and GLOBAL AS only
PRIVATE needs special treatment

Differential Revision: https://reviews.llvm.org/D23971

llvm-svn: 280526
2016-09-02 19:07:06 +00:00
Wei Mi
6f063606d0 Split the store of a wide value merged from an int-fp pair into multiple stores.
For the store of a wide value merged from a pair of values, especially int-fp pair,
sometimes it is more efficent to split it into separate narrow stores, which can
remove the bitwise instructions or sink them to colder places.

Now the feature is only enabled on x86 target, and only store of int-fp pair is
splitted. It is possible that the application scope gets extended with perf evidence
support in the future.

Differential Revision: https://reviews.llvm.org/D22840

llvm-svn: 280505
2016-09-02 17:17:04 +00:00
Derek Schuff
69e2fd648f [WebAssembly] Update known test failures
Fixed an issue with the experimental C headers

llvm-svn: 280498
2016-09-02 16:26:24 +00:00
Craig Topper
a8fc395658 [AVX-512] Remove floating point logical operation instrinsics and replace them with native IR.
llvm-svn: 280466
2016-09-02 05:29:17 +00:00
Craig Topper
e8b71640c2 [AVX-512] Add more patterns for masked and broadcasted logical operations where the select or broadcast has a floating point type.
These are needed in order to remove the masked floating point logical operation intrinsics and use native IR.

llvm-svn: 280465
2016-09-02 05:29:13 +00:00
Craig Topper
e56c6e74a7 [AVX-512] Add execution domain fixing for logical operations with broadcast loads. This builds on the handling of masked ops since we need to keep element size the same.
llvm-svn: 280464
2016-09-02 05:29:09 +00:00
Craig Topper
48a385b049 [X86] Strengthen some SDNode type constraints.
llvm-svn: 280463
2016-09-02 04:25:33 +00:00
Craig Topper
c1f4ecba29 [AVX-512] Add NoVLX Predicates to some patterns so they don't rely on pattern ordering to be lower priority than their equivalent VLX pattern.
llvm-svn: 280462
2016-09-02 04:25:30 +00:00