mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 14:33:02 +02:00
239 Commits

Author SHA1 Message Date
Marek Olsak
d46640b0e5 AMDGPU/SI: handle undef for llvm.SI.packf16
llvm-svn: 251632
2015-10-29 15:29:09 +00:00
Marek Olsak
7916819368 AMDGPU/SI: use S_OR for fneg (fabs f32)
llvm-svn: 251631
2015-10-29 15:29:05 +00:00
Marek Olsak
8b37a7065a AMDGPU/SI: use S_AND for i1 trunc
llvm-svn: 251630
2015-10-29 15:05:03 +00:00
Matt Arsenault
e48224a128 AMDGPU: Print modifiers when dumping AMDGPUOperand
llvm-svn: 251160
2015-10-24 00:12:56 +00:00
Matt Arsenault
c3af29c80c AMDGPU: Fix parsing of 32-bit literals with sign bit set
llvm-svn: 251132
2015-10-23 18:07:58 +00:00
Matt Arsenault
20a866624d AMDGPU: Fix adding redundant m0 uses
BuildMI already adds these since they are defined correctly now.

llvm-svn: 250961
2015-10-21 22:37:51 +00:00
Matt Arsenault
34e6b29b92 AMDGPU: Fix verifier error in SIFoldOperands
There may be other use operands that also need their kill flags cleared.

This happens in a few tests when SIFoldOperands is moved after
PeepholeOptimizer.

PeepholeOptimizer rewrites cases that look like:
%vreg0 = ...
%vreg1 = COPY %vreg0
use %vreg1<kill>
%vreg2 = COPY %vreg0
use %vreg2<kill>

to use the earlier source to
%vreg0 = ...
use %vreg0
use %vreg0

Currently SIFoldOperands sees the copied registers, so there is
only one use. So far I haven't managed to come up with a test
that currently has multiple uses of a foldable VGPR -> VGPR copy.

llvm-svn: 250960
2015-10-21 22:37:50 +00:00
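
The kill-flag problem described in the SIFoldOperands commit above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Use`, `clearKillFlags`, and `forwardCopies` are hypothetical names for this example, not LLVM's MachineOperand or pass code. Once two copies of the same source are forwarded, the source register has two uses, so neither rewritten use may keep its kill flag.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a register use operand (hypothetical, not LLVM's type).
struct Use {
  std::string Reg;
  bool Kill;
};

// Clear the kill flag on every use of Reg.
void clearKillFlags(std::vector<Use> &Uses, const std::string &Reg) {
  for (Use &U : Uses)
    if (U.Reg == Reg)
      U.Kill = false;
}

// Rewrite uses of copied registers to read the copy source directly.
// CopySrc maps each copy's destination register to its source register.
void forwardCopies(std::vector<Use> &Uses,
                   const std::map<std::string, std::string> &CopySrc) {
  for (Use &U : Uses) {
    auto It = CopySrc.find(U.Reg);
    if (It == CopySrc.end())
      continue;
    assert(It->second != U.Reg && "a copy must not name itself as its source");
    U.Reg = It->second;
    // The source may now have other uses, so no use of it can be
    // assumed to be the last one: clear all of its kill flags.
    clearKillFlags(Uses, U.Reg);
  }
}
```

Applied to the example in the commit message, both `use %vreg1<kill>` and `use %vreg2<kill>` become `use %vreg0` without a kill flag, while uses of unrelated registers are left alone.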
Matt Arsenault
7b241d838d AMDGPU: Split DiagnosticInfoUnsupported into its own file
llvm-svn: 250959
2015-10-21 22:37:46 +00:00
Matt Arsenault
d1baf0fb57 AMDGPU: Simplify VOP3 operand legalization.
This was checking for a variety of situations that should
never happen. This saves a tiny bit of compile time.

We should not be selecting instructions with invalid operands in the
first place. Most of the time, for registers, copies are inserted
to the correct operand register class.

For VOP3, since all operand types are supported and literal
constants never are, we just need to verify the constant bus
requirements (all immediates should be legal inline ones).

The only possibly tricky case to maybe worry about is when
legalizing operands in moveToVALU with s_add_i32 and similar
instructions. If the original s_add_i32 had a literal constant
and we need to replace it with v_add_i32_e64 we would have an
unsupported literal operand.  However, I don't think we should worry
about that because SIFoldOperands should handle folding literal
constant operands into the SALU instructions based on the uses.
At SIFoldOperands time, the legality and profitability of
operand types is a bit different.

llvm-svn: 250951
2015-10-21 21:51:02 +00:00
Matt Arsenault
a710403d18 AMDGPU: Fix not checking implicit operands in verifyInstruction
When verifying constant bus restrictions, this wasn't catching
uses in implicit operands.

llvm-svn: 250948
2015-10-21 21:15:01 +00:00
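
The constant bus check mentioned in the two commits above can be sketched as follows. This is a simplified model, not the code in verifyInstruction: the `Operand` struct, the `isLegalVOP3` name, and the limit of one constant bus read per instruction are assumptions made for this example. Literal constants are rejected outright, inline constants and VGPRs are free, and implicit operands are counted alongside explicit ones.

```cpp
#include <cassert>
#include <set>
#include <vector>

enum class OperandKind { VGPR, SGPR, InlineConstant, Literal };

// Toy operand description (hypothetical, not LLVM's MachineOperand).
struct Operand {
  OperandKind Kind;
  unsigned Reg;   // register number; only meaningful for VGPR and SGPR
  bool Implicit;  // true for implicit uses such as vcc or m0
};

// Assumption for this sketch: one constant bus read per instruction.
constexpr unsigned ConstantBusLimit = 1;

bool isLegalVOP3(const std::vector<Operand> &Ops) {
  assert(!Ops.empty() && "instruction has no operands");
  std::set<unsigned> SGPRsRead;
  for (const Operand &Op : Ops) {
    // VOP3 never accepts a literal constant.
    if (Op.Kind == OperandKind::Literal)
      return false;
    // Explicit and implicit SGPR reads both use the constant bus.
    // Reading the same SGPR twice is counted once in this model.
    if (Op.Kind == OperandKind::SGPR)
      SGPRsRead.insert(Op.Reg);
  }
  return SGPRsRead.size() <= ConstantBusLimit;
}
```

Because the loop does not skip operands marked `Implicit`, an instruction with one explicit SGPR source and a different implicit SGPR use is rejected, which is the case the verifier fix above was meant to catch.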
Matt Arsenault
aa9e5394b5 AMDGPU: Add MachineInstr overloads for instruction format tests
llvm-svn: 250797
2015-10-20 04:35:43 +00:00
Matt Arsenault
6e5d4b912c AMDGPU: Stop reserving v[254:255]
This wasn't doing anything useful. They weren't explicitly used
anywhere, and the RegScavenger ignores reserved registers.

This for some reason caused a random scheduling change in the test.
Getting the check lines to pass is too frustrating, and there's probably
not too much value in checking the vector case's operands N times.

llvm-svn: 250794
2015-10-20 03:59:58 +00:00
Craig Topper
c8409f4435 Make a bunch of static arrays const.
llvm-svn: 250642
2015-10-18 05:15:34 +00:00
Artyom Skrobov
fef2a04838 Don't pretend AMDGPU backend knows how to custom-lower UDIVREM for vector types; it can't
Reviewers: arsenm, jvesely, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13734

llvm-svn: 250384
2015-10-15 09:18:47 +00:00
Duncan P. N. Exon Smith
95eb9cab96 AMDGPU: Remove implicit ilist iterator conversions, NFC
One of the changes in lib/Target/AMDGPU/AMDGPUMCInstLower.cpp was a new
one.  Previously, bundle iterators and single-instruction iterators
could be compared to each other (comparing on underlying pointers).
I changed a comparison from using `MBB->end()` to using
`MBB->instr_end()`, since both end iterators should point at the same
place anyway.

I don't think the implicit conversion between the two iterator types is
a good idea since it's fairly easy to accidentally compare to the wrong
thing (they aren't always end iterators).  Otherwise I would have just
added the conversion.

Even with that, there should be no functionality change here.

llvm-svn: 250218
2015-10-13 20:07:10 +00:00
Matt Arsenault
303e42b592 AMDGPU: Refactor isVGPRToSGPRCopy
It should now correctly handle physical registers and make
it easier to identify the other direction.

llvm-svn: 250132
2015-10-13 00:07:54 +00:00
Matt Arsenault
85dd075020 DAGCombiner: Combine extract_vector_elt from build_vector
This basic combine was surprisingly missing.
AMDGPU legalizes many operations in terms of 32-bit vector components,
so not doing this results in many extra copies and subregister extracts
that need to be cleaned up later.

InstCombine already does this for the hasOneUse case. The target hook
is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn
from a vector materialize repeated immediate instruction to a constant
vector load with more scalar copies from it.

llvm-svn: 250129
2015-10-12 23:59:50 +00:00
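
The combine described in the commit above replaces an extract from a freshly built vector with the element that was inserted at that position. A minimal sketch on a toy expression graph follows; `Node`, `makeValue`, `makeBuildVector`, `makeExtract`, and `combineExtract` are hypothetical names for this example, not the SelectionDAG API, and the index is assumed to be a known constant.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Node;
using NodeRef = std::shared_ptr<Node>;

enum class Opcode { Value, BuildVector, ExtractVectorElt };

// Toy DAG node (hypothetical, not LLVM's SDNode).
struct Node {
  Opcode Op;
  std::vector<NodeRef> Operands; // BuildVector: elements; Extract: the vector
  unsigned Index;                // constant index, ExtractVectorElt only
};

NodeRef makeValue() {
  return std::make_shared<Node>(Node{Opcode::Value, {}, 0});
}

NodeRef makeBuildVector(std::vector<NodeRef> Elts) {
  return std::make_shared<Node>(Node{Opcode::BuildVector, std::move(Elts), 0});
}

NodeRef makeExtract(NodeRef Vec, unsigned Idx) {
  return std::make_shared<Node>(
      Node{Opcode::ExtractVectorElt, {std::move(Vec)}, Idx});
}

// extract_vector_elt (build_vector a, b, ...), i  ->  operand i.
// Any other node is returned unchanged.
NodeRef combineExtract(const NodeRef &N) {
  if (N->Op != Opcode::ExtractVectorElt)
    return N;
  const NodeRef &Vec = N->Operands[0];
  if (Vec->Op != Opcode::BuildVector)
    return N;
  assert(N->Index < Vec->Operands.size() && "extract index out of range");
  return Vec->Operands[N->Index];
}
```

The sketch folds unconditionally. The commit message notes that the real combine also consults a target hook, because on some targets the fold made a handful of tests worse.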
Matt Arsenault
a39d32ec2d AMDGPU: Register some more passes so -print-before works
llvm-svn: 250071
2015-10-12 17:43:59 +00:00
Justin Bogner
43aff57984 CodeGen: print and verify after TargetPassConfig::insertPass by default
In r224059, we started verifying after addPass, but missed doing so on
insertPass. There isn't a good reason for the discrepancy, and
skipping the verifier in these cases causes bugs.

This also exposes a verifier error that was introduced in r249087, but
the verifier doesn't run until after the register coalescer, when the
issue happens to have been resolved. I've skipped the verifier after
SIFixSGPRLiveRangesID to avoid the failures for now and will follow up
with Matt for a proper fix.

llvm-svn: 249643
2015-10-08 00:36:22 +00:00
Matt Arsenault
628b85dd0f AMDGPU: Fix missing implicit m0 uses on movrel instructions
llvm-svn: 249577
2015-10-07 17:46:32 +00:00
Matt Arsenault
304de9ef09 AMDGPU: Add comment for VOP2b operand class
Because of the constant bus requirement, it is never legal to
use a literal constant for these instructions despite the encoding
allowing it. This was already doing the right thing, but note why.

llvm-svn: 249500
2015-10-07 01:36:00 +00:00
Matt Arsenault
1bde052e2e AMDGPU: Properly register passes
llvm-svn: 249495
2015-10-07 00:42:53 +00:00
Matt Arsenault
28c28a361a AMDGPU: Use explicit register size indirect pseudos
This stops using an unknown reg class operand.

Currently build_vector selection has a broken looking check
where it tries to use a VGPR reg class and an SGPR one if it
sees an SGPR use.

When the source operand has an explicit VGPR class,
illegal copies will be inserted that SIFixSGPRCopies will take care
of normally later, which will allow removing the weird check
of build_vector users. Without this, when removed v_movrels_b32 would
still be emitted even though all of the values were only stored in
SGPRs.

llvm-svn: 249494
2015-10-07 00:42:51 +00:00
Matt Arsenault
3e5538c9ff AMDGPU: Remove inferRegClassFromUses / inferRegClassFromDefs
I'm not sure why this would be necessary, and no tests fail with
them removed. Looking at the uses is suspect as well because
the use reg classes will likely change when the users are moved
as a result of moving this instruction.

llvm-svn: 249493
2015-10-07 00:42:31 +00:00
Tom Stellard
6c21c7bcdf AMDGPU/SI: Remove calling convention assertion from LowerFormalArguments()
Summary:
We currently ignore the calling convention, so there is no real reason to
assert on the calling convention of functions.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13367

llvm-svn: 249468
2015-10-06 21:16:34 +00:00
Tom Stellard
0610aa5644 AMDGPU/SI: Add 64-bit versions of v_nop and v_clrexcp
Summary:
The assembly printing of these is still missing the encoding size
suffix, but this will be fixed in a later commit.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13436

llvm-svn: 249424
2015-10-06 15:57:53 +00:00
Tom Stellard
bcc4205948 AMDGPU/SI: Add a helper for creating aliases for the _e32 instructions
Summary:
We are currently only using these aliases for VOPC instructions,
but this helper will make it easier to use them everywhere.

These aliases allow for the automatic matching of instructions
with forced 32-bit encoding.  Eventually, we should be able to remove
the custom C++ logic we have for this in the assembler.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13396

llvm-svn: 249330
2015-10-05 17:57:39 +00:00
Tom Stellard
d683063aaa AMDGPU/SI: Remove unused tablegen multiclass
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13395

llvm-svn: 249221
2015-10-03 00:29:50 +00:00
Matt Arsenault
93549f5707 AMDGPU/SI: Add verifier check for exec reads
Make sure we aren't accidentally not setting
these in the instruction definitions.

llvm-svn: 249170
2015-10-02 18:58:37 +00:00
Matt Arsenault
27753ff832 AMDGPU: Fix unused variable warning in release build
llvm-svn: 249091
2015-10-01 22:40:35 +00:00
Matt Arsenault
0410cd21d2 AMDGPU: Move SIFixSGPRLiveRanges to be a regalloc pass
Replace LiveInterval usage with LiveVariables. LiveIntervals
computes far more information than is needed for this pass
which just needs to find if an SGPR is live out of the
defining block.

LiveIntervals are not usually available that early, requiring
computing them twice which is very expensive. The extra run of
LiveIntervals/LiveVariables/SlotIndexes was costing in total
about 5% of compile time.

Continuing to use LiveIntervals is problematic. It seems
there is an option (early-live-intervals) to run the analysis
about where it should go to avoid recomputing LiveVariables,
but it seems to be completely broken with subreg liveness
enabled. There are also problems from trying to recompute
LiveIntervals since this seems to undo LiveVariables
and clearing kill flags, causing TwoAddressInstructions
to make bad decisions.

Insert the pass right after live variables and preserve it.
The tricky case to worry about might be phis since
LiveVariables doesn't count a register as live out if
in the successor block it is only used in a phi,
but I don't think this is a concern right now
because SIFixSGPRCopies replaces SGPR phis.

llvm-svn: 249087
2015-10-01 22:10:03 +00:00
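
The query that the commit above says the pass needs, whether a register is live out of its defining block, can be modeled very roughly as below. This is a toy illustration and not the LiveVariables analysis: `RegUse` and `isLiveOutOfDefBlock` are hypothetical names, blocks are plain integers, and phi uses are skipped to mirror the caveat in the commit message.

```cpp
#include <cassert>
#include <vector>

// Toy record of one use of a register (hypothetical type).
struct RegUse {
  int Block;   // index of the block containing the use
  bool IsPhi;  // true if the using instruction is a phi
};

// Returns true if some non-phi use lies outside the defining block.
bool isLiveOutOfDefBlock(int DefBlock, const std::vector<RegUse> &Uses) {
  assert(DefBlock >= 0 && "block indices are non-negative in this model");
  for (const RegUse &U : Uses) {
    // Mirrors the caveat above: a use that is only a phi in a
    // successor block does not make the register live out here.
    if (U.IsPhi)
      continue;
    if (U.Block != DefBlock)
      return true;
  }
  return false;
}
```

Under this model, a register whose only use outside the defining block is a phi is reported as not live out, which is the case the commit message considers harmless because SIFixSGPRCopies replaces SGPR phis.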
Matt Arsenault
696eb45b72 AMDGPU: Merge if and switch
llvm-svn: 249082
2015-10-01 21:51:59 +00:00
Matt Arsenault
2a2d4a2b53 AMDGPU: Remove dead code
There's no point in checking VReg_1 because all uses
of it should already have been removed by SILowerI1Copies.

llvm-svn: 249081
2015-10-01 21:51:57 +00:00
Matt Arsenault
168a5152ad AMDGPU: Make SIInsertWaits about a factor of 4 faster
This was the slowest target custom pass and was spending 80%
of the time in getMinimalPhysRegClass which was called
for every register operand.

Try to use the statically known register class when possible from
the instruction's MCOperandInfo. There are a few pseudo instructions
which are not well behaved with unknown register classes which still
require the expensive physical register class search.

There are a few other possibilities for making this even faster,
such as not inspecting implicit operands. For now those are checked
because it is technically possible to have a scalar load into
exec or vcc which can be implicitly used.

llvm-svn: 249079
2015-10-01 21:43:15 +00:00
Tom Stellard
597d1f5f9b AMDGPU/SI: Remove assert from AMDGPUOpenCLImageTypeLowering pass
Summary:
Instead of asserting when the kernel metadata is different than we expect,
we should just skip lowering that function.  This fixes assertion
failures with OpenCL argument metadata from older LLVM releases.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13356

llvm-svn: 249073
2015-10-01 21:16:05 +00:00
Tom Stellard
e835f56682 AMDGPU: Add MEM_RAT STORE_TYPED.
v2: Add test (Matt).
    Fix capitalization of isEOP (Matt).
    Move pattern to class parameter (Matt).
    Make the instruction available to Cayman (Matt).
    Change name from MEM_RAT WRITE_TYPED to MEM_RAT STORE_TYPED.

Patch by: Zoltan Gilian

llvm-svn: 249042
2015-10-01 17:51:34 +00:00
Tom Stellard
2ee4df3f9e AMDGPU: Factor out EOP query.
v2: Fix brace placement and capitalization (Matt).

Patch by: Zoltan Gilian

llvm-svn: 249041
2015-10-01 17:51:29 +00:00
Tom Stellard
b6581787e9 AMDGPU/SI: Re-order PreloadedValue enum and number entries based on init order
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12451

llvm-svn: 248978
2015-10-01 02:02:46 +00:00
Marek Olsak
074d497be0 AMDGPU/SI: Don't set DATA_FORMAT if ADD_TID_ENABLE is set
to prevent setting a huge stride, because DATA_FORMAT has a different
meaning if ADD_TID_ENABLE is set.

This is a candidate for stable llvm 3.7.

Tested-and-Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 248858
2015-09-29 23:37:32 +00:00
Matt Arsenault
9ca4652ae6 AMDGPU: Factor switch into separate function
llvm-svn: 248742
2015-09-28 20:54:57 +00:00
Matt Arsenault
0376c2dc85 AMDGPU: Fix splitting x16 SMRD loads
When used recursively, this would set the kill flag
on the intermediate step from first splitting
x16 to x8.

llvm-svn: 248741
2015-09-28 20:54:52 +00:00
Matt Arsenault
f3f42b5b21 AMDGPU: Fix moving SMRD loads with literal offsets on CI
llvm-svn: 248740
2015-09-28 20:54:46 +00:00
Matt Arsenault
28211ed700 AMDGPU: Fix splitting SMRD with large offset
The splitting of > 4 dword SMRD instructions
if using an offset in an SGPR instead of an immediate
was not setting the destination register,
resulting in an instruction missing an operand
which would assert later.

Test will be included in a following commit
which fixes a related issue.

llvm-svn: 248739
2015-09-28 20:54:42 +00:00
Andrew Kaylor
8d27e2d077 Improved the interface of methods commuting operands, improved X86-FMA3 mem-folding&coalescing.
Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com)

Differential Revision: http://reviews.llvm.org/D11370

llvm-svn: 248735
2015-09-28 20:33:22 +00:00
Matt Arsenault
fb1ff93ba4 AMDGPU: Remove hasPostISelHook from most instructions
Since this is only needed for VOP3 and a few other special
case instructions, stop setting it on everything.

llvm-svn: 248657
2015-09-26 05:06:48 +00:00
Matt Arsenault
16b445f6b4 AMDGPU: Switch over reg class size instead of checking all super classes
This gets isSGPRClass out of my profile of SIFixSGPRCopies.

llvm-svn: 248656
2015-09-26 04:59:04 +00:00
Matt Arsenault
21b183d12c AMDGPU: Don't handle invalid reg classes in helper functions
No tests hit these and it would be better to have checks like
this explicit where they are used.

llvm-svn: 248655
2015-09-26 04:53:30 +00:00
Saleem Abdulrasool
167c693a73 AMDGPU: address -Winconsistent-missing-override
Add missing override.  NFC.

llvm-svn: 248652
2015-09-26 04:34:52 +00:00
Matt Arsenault
8248804482 AMDGPU: Set CopyCost of register classes
These require multiple mov instructions to copy,
but the default value is that 1 instruction is needed.
I'm not sure if this actually changes anything.

llvm-svn: 248651
2015-09-26 04:09:34 +00:00
Matt Arsenault
eb0d6b9ea5 AMDGPU: VOP3b definition cleanups
llvm-svn: 248647
2015-09-26 02:25:48 +00:00