1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00
Commit Graph

47647 Commits

Author SHA1 Message Date
George Burgess IV
547d3b9eb0 Replace AA's uses of uint64_t with LocationSize; NFC.
The uint64_ts that we pass around AA to represent MemoryLocation sizes
are logically an Optional<uint64_t>. In D44748, we want to add an extra
'imprecise' bit to this Optional<uint64_t> to represent whether a given
MemoryLocation size is an upper-bound or an exact size. For more context
on why, please see D44748.

That patch is quite large, but reviewers seem to be OK with the
approach. In D45581 (my first attempt to split 'noise' out of D44748),
reames asked that I land a precursor that is solely replacing uint64_t
with LocationSize, which starts out as `using LocationSize = uint64_t;`.
He also gave me the OK to submit this rename without further review.

llvm-svn: 333314
2018-05-25 21:16:58 +00:00
Mark Searles
38d5d10a14 [AMDGPU][Waitcnt] Remove obsolete waitcnt option
With the removal of the old waitcnt pass, the '-enable-si-insert-waitcnts' option is obsolete. Remove it.

Differential Revision: https://reviews.llvm.org/D47378

llvm-svn: 333303
2018-05-25 20:24:08 +00:00
Stanislav Mekhanoshin
9cfeb10772 [AMDGPU] Fixed test failure with AMDGPUPerfHint
We shall not keep iterator to a map while map is modified,
this leads to a broken map.

llvm-svn: 333298
2018-05-25 18:46:58 +00:00
Reid Kleckner
8697c9c142 Fix -Winconsistent-missing-overrides in AMDGPU code
llvm-svn: 333291
2018-05-25 17:46:24 +00:00
Stanislav Mekhanoshin
c29c4efaa7 [AMDGPU] Add perf hints to functions
This is adoption of HSAIL perfhint pass. Two types of hints are produced:

1. Function is memory bound.
2. Kernel can use wave limiter.

Currently these hints are used in the scheduler. If a function is suspected
to be memory bound we allow occupancy to decrease to 4 waves in the course
of scheduling.

Differential Revision: https://reviews.llvm.org/D46992

llvm-svn: 333289
2018-05-25 17:25:12 +00:00
Simon Dardis
7995ddd2b6 [mips] Fix the definitions of lwp, swp
Rather than using a regpair operand of these instructions, use two seperate
operands and a custom converter to handle the implicit second register operand.

Additionally, remove the microMIPS32R6 definition as its redundant.

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D47255

llvm-svn: 333288
2018-05-25 16:15:48 +00:00
Krzysztof Parzyszek
977520b723 [Hexagon] Fix packing source vectors in shufflevector selection
When the shuffle mask selected a subvector of the second input vector,
and aligning of the source was performed, the shuffle mask was updated
incorrectly, resulting in an ICE further in the selection process.

llvm-svn: 333279
2018-05-25 14:53:14 +00:00
Simon Pilgrim
9761cc5cf6 [X86][SNB] Fix differences between vex/non-vex XMM vector moves (PR37286)
As confirmed by llvm-exegesis, there is no scheduler difference between MOVDQA/MOVDQU and VMOVDQA/VMOVDQU xmm reg-reg moves

Another chapter in the never ending crusade to remove useless InstRW overrides from the x86 scheduler models......

llvm-svn: 333271
2018-05-25 12:18:11 +00:00
Sander de Smalen
1dcbd3929f Fix ubsan errors introduced by r333263 re. left-shifting negative values.
llvm-svn: 333270
2018-05-25 11:41:04 +00:00
Sander de Smalen
9b5c7781f7 [AArch64][SVE] Asm: Support for DUP (immediate) instructions.
Unpredicated copy of optionally-shifted immediate to SVE vector,
along with MOV-aliases.

This patch contains parsing and printing support for
cpy_imm8_opt_lsl_(i8|i16|i32|i64). This operand allows a signed value in
the range -128 to +127. For element widths of 16 bits or higher it may
also be a signed multiple of 256 in the range -32768 to +32512.
For element-width of 8 bits a range of -128 to 255 is accepted, since a copy
of a byte can be considered either signed/unsigned.

Note: This patch renames tryParseAddSubImm() -> tryParseImmWithOptionalShift()
and moves the behaviour of trying to shift a plain immediate by an allowed
shift-value to its addImmWithOptionalShiftOperands() method, so that the
parsing itself is generic and allows immediates from multiple shifted operands.
This is done because an immediate can be divisible by both shifted operands.

Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D47309

llvm-svn: 333263
2018-05-25 09:47:52 +00:00
Jonas Paulsson
adc691d8dc [SystemZ] Bugfix in combineSTORE().
Remember to check if store is truncating before calling
combineTruncateExtract().

Review: Ulrich Weigand
llvm-svn: 333262
2018-05-25 09:01:23 +00:00
Tim Renouf
6428f13232 [AMDGPU] Fixed incorrect break from loop
Summary:
Lower control flow did not correctly handle the case that a loop break
in if/else was on a condition that was not guaranteed to be masked by
exec. The first test kernel shows an example of this going wrong; after
exiting the loop, exec is all ones, even if it was not before the loop.

The fix is for lowering of if-break and else-break to insert an
S_AND_B64 to mask the break condition with exec. This commit also
includes the optimization of not inserting that S_AND_B64 if it is
obviously not needed because the break condition is the result of a
V_CMP in the same basic block.

V2: Addressed some review comments.
V3: Test fixes.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D44046

Change-Id: I0fc56a01209a9e99d1d5c9b0ffd16f111caf200c
llvm-svn: 333258
2018-05-25 07:55:04 +00:00
Gabor Buella
311b0dd621 [x86] invpcid LLVM intrinsic
Re-add the feature flag for invpcid, which was removed in r294561.
Add an intrinsic, which always uses a 32 bit integer as first argument,
while the instruction actually uses a 64 bit register in 64 bit mode
for the INVPCID_TYPE argument.

Reviewers: craig.topper

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D47141

llvm-svn: 333255
2018-05-25 06:32:05 +00:00
Tom Stellard
eee57c60ef AMDGPU: Remove AMDGPUMCInstLower.h
Summary:
The AMDGPUMCInstLower class is not used outside AMDGPUMCInstLower.cpp,
so we don't need a header file.

Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D47264

llvm-svn: 333254
2018-05-25 04:57:02 +00:00
Tom Stellard
094abf2283 AMDGPU: Split R600 AsmPrinter code into its own class
Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D47245

llvm-svn: 333219
2018-05-24 20:02:01 +00:00
Eli Friedman
381e806df7 [AArch64] Improve orr+movk sequences for MOVi64imm.
The existing code has three different ways to try to lower a 64-bit
immediate to the sequence ORR+MOVK.  The result is messy: it misses
some possible sequences, and the order of the checks means we sometimes
emit two MOVKs when we only need one.

Instead, just use a simple loop to try all possible two-instruction
ORR+MOVK sequences.

Differential Revision: https://reviews.llvm.org/D47176

llvm-svn: 333218
2018-05-24 19:38:23 +00:00
Geoff Berry
33474b29b1 [AArch64] Take advantage of variable shift/rotate amount implicit mod operation.
Summary:
Optimize code generated for variable shifts/rotates by taking advantage
of the implicit and/mod done on the variable shift amount register.

Resolves bug 27582 and bug 37421.

Reviewers: t.p.northover, qcolombet, MatzeB, javed.absar

Subscribers: rengolin, kristof.beyls, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D46844

llvm-svn: 333214
2018-05-24 18:29:42 +00:00
Simon Pilgrim
fb8b3278fa [X86][SSE] Pull out (AND (XOR X, -1), Y) matching into a helper function. NFC.
llvm-svn: 333201
2018-05-24 16:16:42 +00:00
Simon Pilgrim
624afd356c Fix unused variable warnings. NFCI.
llvm-svn: 333195
2018-05-24 15:34:50 +00:00
Simon Pilgrim
f9c0aecf43 [X86][SSE] Pull out OR(AND(~MASK,X),AND(MASK,Y)) matching into a helper function. NFC.
First stage towards matching more variants of the bitselect pattern for combineLogicBlendIntoPBLENDV (PR37549)

llvm-svn: 333191
2018-05-24 15:12:48 +00:00
Simon Pilgrim
cf784a1a98 [X86][BtVer2] Added Jaguar cpu cycle counter to permit llvm-exegesis latency testing
Ideally we'd be able to test a CPU by using __builtin_readcyclecounter()/RDTSC instead (PR37193) if a model/cycle-counter is not specified.

NOTE: Jaguar PMCs don't give good coverage of resource pipes specified in the model (at the macro-vs-micro-op levels) but we should be able to cover at least a few resources.
llvm-svn: 333190
2018-05-24 14:54:32 +00:00
Simon Atanasyan
b05e850f6d [mips] Remove duplicated code from the expandLoadInst. NFC
llvm-svn: 333164
2018-05-24 07:36:18 +00:00
Simon Atanasyan
1e2bc551fd [mips] Remove redundant argument from expandLoadInst/expandStoreInst. NFC
llvm-svn: 333163
2018-05-24 07:36:11 +00:00
Simon Atanasyan
8630b238fa [mips] Add precondition asserts to the expandLoadInst/expandStoreInst. NFC
llvm-svn: 333162
2018-05-24 07:36:06 +00:00
Simon Atanasyan
5b2855c72c [mips] Cleanup the code a bit. NFC
llvm-svn: 333161
2018-05-24 07:36:00 +00:00
Shiva Chen
c0b05a0956 [RISCV] Support linker relax function call from auipc and jalr to jal
To do this:
1. Add fixup_riscv_relax fixup types which eventually will
   transfer to R_RISCV_RELAX relocation types.

2. Insert R_RISCV_RELAX relocation types to auipc function call
   expression when linker relaxation enabled.

Differential Revision: https://reviews.llvm.org/D44886

llvm-svn: 333158
2018-05-24 06:21:23 +00:00
Tom Stellard
4b5731b4b0 AMDGPU/R600: Remove code for handling AMDGPUISD::CLAMP
Summary:
We don't generate AMDGPUISD::CLAMP for R600 now that llvm.AMDGPU.clamp
is gone.

Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D47181

llvm-svn: 333153
2018-05-24 05:28:34 +00:00
Lei Huang
861e8a0eb7 [PowerPC] Remove the match pattern in the definition of LXSDX/STXSDX
The match pattern in the definition of LXSDX is xoaddr, so the Pseudo
instruction XFLOADf64 never gets selected. XFLOADf64 expands to LXSDX/LFDX post
RA based on the register pressure. To avoid ambiguity, we need to remove the
select pattern for LXSDX, same as what was done for LXSD. STXSDX also have
the same issue.

Patch by Qing Shan Zhang (steven.zhang).

Differential Revision: https://reviews.llvm.org/D47178

llvm-svn: 333150
2018-05-24 03:20:28 +00:00
Mandeep Singh Grang
0d47a2dcc0 [RISCV] Lower the tail pseudoinstruction
This patch lowers the tail pseudoinstruction. This has been modeled after ARM's
tail call opt.

llvm-svn: 333137
2018-05-23 22:44:08 +00:00
Sameer AbuAsal
078a9b9c6c [RISCV] Set CostPerUse for registers
Summary:
 Set CostPerUse higher for registers that are not used in the compressed
 instruction set. This will influence the greedy register allocator to reduce
 the use of registers that can't be encoded in 16 bit instructions. This
 affects register allocation even when compressed instruction isn't targeted,
 we see no major negative codegen impact.

Reviewers: asb

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, apazos, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang

Differential Revision: https://reviews.llvm.org/D47039

llvm-svn: 333132
2018-05-23 21:34:30 +00:00
Lei Huang
ad6f9fcebb [Power9]Legalize and emit code for W vector extract and convert to QP
Implemente patterns to extract [Un]signed Word vector element and convert to
quad-precision.

Differential Revision: https://reviews.llvm.org/D46536

llvm-svn: 333115
2018-05-23 19:31:54 +00:00
Lei Huang
ee7bf6cb03 [Power9]Legalize and emit code for DW vector extract and convert to QP
Implemente patterns to extract [Un]signed DWord vector element and convert to
quad-precision.

Differential Revision: https://reviews.llvm.org/D46333

llvm-svn: 333112
2018-05-23 18:36:51 +00:00
Chad Rosier
2e93b5ba53 [CodeGen][AArch64] Use RegUnits to track register aliases. (NFC)
Use RegUnits to track register aliases in AArch64RedundantCopyElimination.

Differential Revision: https://reviews.llvm.org/D47269

llvm-svn: 333107
2018-05-23 17:49:38 +00:00
Petar Jovanovic
beb5e46405 Silence warnings introduced with r333093
r333093 introduced several warnings (-Wlogical-not-parentheses,
-Wbool-compare).
Adding parentheses in MipsSEInstrInfo::isCopyInstr() to silence it.

llvm-svn: 333097
2018-05-23 16:27:51 +00:00
Petar Jovanovic
8c311a61bc [X86][MIPS][ARM] New machine instruction property 'isMoveReg'
This property is needed in order to follow values movement between
registers. This property is used in TII to implement method that
returns true if simple copy like instruction is recognized, along
with source and destination machine operands.

Patch by Nikola Prica.

Differential Revision: https://reviews.llvm.org/D45204

llvm-svn: 333093
2018-05-23 15:28:28 +00:00
Nicola Zaghen
80a94d5002 Remove DEBUG macro.
Now that the LLVM_DEBUG() macro landed on the various sub-projects
the DEBUG macro can be removed.
Also change the new uses of DEBUG to LLVM_DEBUG.

Differential Revision: https://reviews.llvm.org/D46952

llvm-svn: 333091
2018-05-23 15:09:29 +00:00
Alex Bradbury
ebc93587cb [RISCV] Add symbol diff relocation support for RISC-V
For RISC-V it is desirable to have relaxation happen in the linker once 
addresses are known, and as such the size between two instructions/byte 
sequences in a section could change.

For most assembler expressions, this is fine, as the absolute address results 
in the expression being converted to a fixup, and finally relocations. 
However, for expressions such as .quad .L2-.L1, the assembler folds this down 
to a constant once fragments are laid out, under the assumption that the 
difference can no longer change, although in the case of linker relaxation the 
differences can change at link time, so the constant is incorrect. One place 
where this commonly appears is in debug information, where the size of a 
function expression is in a form similar to the above.

This patch extends the assembler to allow an AsmBackend to declare that it 
does not want the assembler to fold down this expression, and instead generate 
a pair of relocations that allow the linker to carry out the calculation. In 
this case, the expression is not folded, but when it comes to emitting a 
fixup, the generic FK_Data_* fixups are converted into a pair, one for the 
addition half, one for the subtraction, and this is passed to the relocation 
generating methods as usual. I have named these FK_Data_Add_* and 
FK_Data_Sub_* to indicate which half these are for.

For RISC-V, which supports this via e.g. the R_RISCV_ADD64, R_RISCV_SUB64 pair 
of relocations, these are also set to always emit relocations relative to 
local symbols rather than section offsets. This is to deal with the fact that 
if relocations were calculated on e.g. .text+8 and .text+4, the result 12 
would be stored rather than 4 as both addends are added in the linker.

Differential Revision: https://reviews.llvm.org/D45181
Patch by Simon Cook.

llvm-svn: 333079
2018-05-23 12:36:18 +00:00
Alex Bradbury
f6582865f4 [Sparc] Use addAliasForDirective to support data directives
The Sparc asm parser currently has custom parsing logic for .half, .word, 
.nword and .xword. Rather than use this custom logic, we can just use 
addAliasForDirective to enable the reuse of AsmParser::parseDirectiveValue.

https://reviews.llvm.org/D47003

llvm-svn: 333078
2018-05-23 11:20:28 +00:00
Alex Bradbury
4baebba9fe [AArch64] Use addAliasForDirective to support data directives
The AArch64 asm parser currently has custom parsing logic for .hword, .word, 
and .xword. Rather than use this custom logic, we can just use 
addAliasForDirective to enable the reuse of AsmParser::parseDirectiveValue.

Differential Revision: https://reviews.llvm.org/D47000

llvm-svn: 333077
2018-05-23 11:17:20 +00:00
Alex Bradbury
261ef87b82 [RISCV] Correctly report sizes for builtin fixups
This is a different approach to fixing the problem described in D46746. 
RISCVAsmBackend currently depends on the getSize helper function returning the 
number of bytes a fixup may change (note: some other backends have a similar 
helper named getFixupNumKindBytes). As noted in that review, this doesn't 
return the correct size for FK_Data_1, FK_Data_2, or FK_Data_8 meaning that 
too few bytes will be written in the case of FK_Data_8, and there's the 
potential of writing outside the Data array for the smaller fixups.

D46746 extends getSize to recognise some of the builtin fixup types. Rather 
than having a function that needs to be kept up to date as new builtin or 
target-specific fixups are added, We can calculate an appropriate bound on the 
number of bytes that might be touched using Info.TargetSize and 
Info.TargetOffset.

Differential Revision: https://reviews.llvm.org/D46965

llvm-svn: 333076
2018-05-23 10:53:56 +00:00
Daniel Cederman
271c0cb6c0 [Sparc] Add mnemonic aliases for flush, stb, stba, sth, and stha
Reviewers: jyknight

Reviewed By: jyknight

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D47140

llvm-svn: 333068
2018-05-23 08:26:49 +00:00
Roman Tereshin
2d9d4134d1 [GlobalISel][ARM] Adding HPR and QPR regclasses to FPRB regbank
Also bringing ARMRegisterBankInfo::getRegBankFromRegClass
implementation up to speed with the *.td-definition.

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D43982

llvm-svn: 333056
2018-05-23 02:59:31 +00:00
Matt Arsenault
b6457804d0 AMDGPU: Fix v2f16 fneg/fabs pattern
The integer operation convertion for some reason only happens
if the source is a bitcast from an integer, which happens to
always be the situation when the result is loaded. Add
an additional pattern for when the source operation is really
an FP operation.

llvm-svn: 333019
2018-05-22 20:13:34 +00:00
Eli Friedman
e9e2822fc3 Delete unused variable from r333015.
(The assertion suppressed the unused variable warning on
Release+Asserts builds, so I didn't notice.)

llvm-svn: 333018
2018-05-22 19:38:07 +00:00
Tom Stellard
6955081e45 AMDGPU: Move AMDGPUTargetLowering::isFPExtFoldable() into SITargetLowering
Summary: This is always false for R600.

Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D47180

llvm-svn: 333016
2018-05-22 19:37:55 +00:00
Eli Friedman
f156ca7ef2 [MachineOutliner] Add "thunk" outlining for AArch64.
When we're outlining a sequence that ends in a call, we can save up to
three instructions in the outlined function by turning the call into
a tail-call. I refer to this as thunk outlining because the resulting
outlined function looks like a thunk; suggestions welcome for a better
name.

In addition to making the outlined function shorter, thunk outlining
allows outlining calls which would otherwise be illegal to outline:
we don't need to save/restore LR, so we don't need to prove anything
about the stack access patterns of the callee.

To make this work effectively, I also added
MachineOutlinerInstrType::LegalTerminator to the generic MachineOutliner
code; this allows treating an arbitrary instruction as a terminator in
the suffix tree.

Differential Revision: https://reviews.llvm.org/D47173

llvm-svn: 333015
2018-05-22 19:11:06 +00:00
Krzysztof Parzyszek
309f570ea8 [Hexagon] Add patterns for accumulating HVX compares
llvm-svn: 333009
2018-05-22 18:27:02 +00:00
Aleksandar Beserminji
6d75c08f47 [mips] Merge MipsLongBranch and MipsHazardSchedule passes
MipsLongBranchPass and MipsHazardSchedule passes are joined to one pass
because of mutual conflict. When MipsHazardSchedule inserts 'nop's, it
potentially breaks some jumps, so they have to be expanded to long
branches. When some branch is expanded to long branch, it potentially
creates a hazard situation, which should be fixed by adding nops.
New pass is called MipsBranchExpansion, it combines these two passes,
and runs them alternately until one of them reports no changes were made.

Differential Revision: https://reviews.llvm.org/D46641

llvm-svn: 332977
2018-05-22 13:24:38 +00:00
Simon Dardis
e3c6da9e00 [mips] Correct the predicates of the cache and pref instructions
Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D46949

llvm-svn: 332970
2018-05-22 10:55:05 +00:00
Simon Pilgrim
c7f0b62bae [TTI] Add uniform/non-uniform constant Pow2 detection to TargetTransformInfo::getInstructionThroughput
This enables us to detect more fast path sdiv cases under cost analysis.

This patch also enables us to handle non-uniform-constant pow2 cases for X86 SDIV costs.

Found while working on D46276

Future patches can then extend the vectorizers to more fully support non-uniform pow2 cases.

Differential Revision: https://reviews.llvm.org/D46637

llvm-svn: 332969
2018-05-22 10:40:09 +00:00