1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00
Commit Graph

47227 Commits

Author SHA1 Message Date
Simon Atanasyan
6b86c30a0b [mips] Accept 32-bit offsets for lb and lbu commands
`lb` and `lbu` commands accepts 16-bit signed offsets. But GAS accepts
larger offsets for these commands. If an offset does not fit in 16-bit
range, `lb` command is translated into lui/lb or lui/addu/lb series.
It's interesting that initially LLVM assembler supported this feature,
but later it was broken.

This patch restores support for 32-bit offsets. It replaces `mem_simm16`
operand for `LB` and `LBu` definitions by the new `mem_simmptr` operand.
This operand is intended to check that offset fits to the same size as
using for pointers. Later we will be able to extend this rule and
accepts 64-bit offsets when it is possible.

Some issues remain:
- The regression also affects LD, SD, LH, LHU commands. I'm going
  to fix them by a separate patch.

- GAS accepts any 32-bit values as an offset. Now LLVM accepts signed
  16-bit values and this patch extends the range to signed 32-bit offsets.
  In other words, the following code accepted by GAS and still triggers
  an error by LLVM:
```
  lb      $4, 0x80000004

  # gas
  lui     a0, 0x8000
    lb      a0, 4(a0)
```

- In case of 64-bit pointers GAS accepts a 64-bit offset and translates
  it to the li/dsll/lb series of commands. LLVM still rejects it.
  Probably this feature has never been implemented in LLVM. This issue
  is for a separate patch.
```
  lb      $4, 0x800000001

  # gas
  li      a0, 0x8000
  dsll    a0, a0, 0x14
  lb      a0, 4(a0)
```

Differential Revision: https://reviews.llvm.org/D45020

llvm-svn: 330983
2018-04-26 19:55:28 +00:00
Sam Clegg
14655b5aa9 [WebAssembly] Write DWARF data into wasm object file
- Writes ".debug_XXX" into corresponding custom sections.
- Writes relocation records into "reloc.debug_XXX" sections.

Patch by Yury Delendik!

Differential Revision: https://reviews.llvm.org/D44184

llvm-svn: 330982
2018-04-26 19:27:28 +00:00
Matt Arsenault
d22a5c6c61 DAG: Fix not legalizing vector fcanonicalizes
If an fcanoncialize was done on a vector type that was legal,

llvm-svn: 330981
2018-04-26 19:21:37 +00:00
Matt Arsenault
2e083290ad AMDGPU: Extend extract_vector_elt fneg combine to fabs
Fixes a regression in a future commit.

llvm-svn: 330980
2018-04-26 19:21:32 +00:00
Matt Arsenault
ed58114a21 AMDGPU: Consolidate SubtargetPredicate definitions
llvm-svn: 330979
2018-04-26 19:21:26 +00:00
Geoff Berry
dfadfc4259 [AArch64] Fix scavenged spill slot base when stack realignment required.
Summary:
Use the FP for scavenged spill slot accesses to prevent corruption of
the callee-save region when the SP is re-aligned.

Based on problem and patch reported by @paulwalker-arm

This is an alternative to solution proposed in D45770

Reviewers: t.p.northover, paulwalker-arm, thegameg, javed.absar

Subscribers: qcolombet, mcrosier, paulwalker-arm, kristof.beyls, rengolin, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46063

llvm-svn: 330976
2018-04-26 18:50:45 +00:00
Mark Searles
b2c3ea9624 [AMDGPU][Waitcnt] As of gfx7, VMEM operations do not increment the export counter and the input registers are available in the next instruction; update the waitcnt pass to take this into account.
Differential Revision: https://reviews.llvm.org/D46067

llvm-svn: 330954
2018-04-26 16:11:19 +00:00
Simon Dardis
36cd99ffe9 [mips] Correct the definitions of some control instructions
Correct the definitions of ei, di, eret, deret, wait, syscall and break.
Also provide microMIPS specific aliases to match the MIPS aliases.

Additionally correct the definition of the wait instruction so that
it is present in the instruction mapping tables.

Reviewers: smaksimovic, abeserminji, atanasyan

Differential Revision: https://reviews.llvm.org/D45939

llvm-svn: 330952
2018-04-26 16:06:34 +00:00
Alex Bradbury
1407041ee9 [RISCV] Implement isLoadFromStackSlot and isStoreToStackSlot
This causes some slight shuffling but no meaningful codegen differences on the 
corpus I used for testing, but it has a larger impact when combined with e.g. 
rematerialisation. Regardless, it makes sense to report as accurate 
target-specific information as possible.

llvm-svn: 330949
2018-04-26 15:34:27 +00:00
Benjamin Kramer
9bbeaa7c56 [NVPTX] Make the legalizer expand shufflevector of <2 x half>
There's no direct instruction for this, but it's trivially implemented
with two movs. Without this the code generator just dies when
encountering a shufflevector.

Differential Revision: https://reviews.llvm.org/D46116

llvm-svn: 330948
2018-04-26 15:26:29 +00:00
Alex Bradbury
0f0487bdb7 [RISCV] Implement isZextFree
This returns true for 8-bit and 16-bit loads, allowing LBU/LHU to be selected
and avoiding unnecessary masks.

llvm-svn: 330943
2018-04-26 14:04:18 +00:00
Matthew Simpson
0ecc1283b2 [TTI, AArch64] Add transpose shuffle kind
This patch adds a new shuffle kind useful for transposing a 2xn matrix. These
transpose shuffle masks read corresponding even- or odd-numbered vector
elements from two n-dimensional source vectors and write each result into
consecutive elements of an n-dimensional destination vector. The transpose
shuffle kind is meant to model the TRN1 and TRN2 AArch64 instructions. As such,
this patch also considers transpose shuffles in the AArch64 implementation of
getShuffleCost.

Differential Revision: https://reviews.llvm.org/D45982

llvm-svn: 330941
2018-04-26 13:48:33 +00:00
Alex Bradbury
57ab217363 [RISCV] Implement isTruncateFree
Adapted from ARM's implementation introduced in r313533 and r314280.

llvm-svn: 330940
2018-04-26 13:37:00 +00:00
Lama Saba
b88c98427f [X86] Fix Update Kill Register in Avoid SFB Pass - Bug 37153
Differential Revision: https://reviews.llvm.org/D45823

Change-Id: Icf6f34f6babc3cb2ff5292fde003472473037a71
llvm-svn: 330939
2018-04-26 13:16:11 +00:00
Alex Bradbury
1d14f19193 [RISCV] Implement isLegalICmpImmediate
I'm unable to construct a representative test case that demonstrates the 
advantage, but it seems sensible to report accurate target-specific 
information regardless.

llvm-svn: 330938
2018-04-26 13:15:17 +00:00
Alex Bradbury
758bf698f4 [RISCV] Implement isLegalAddImmediate
This causes a trivial improvement in the recently added lsr-legaladdimm.ll 
test case.

llvm-svn: 330937
2018-04-26 13:00:37 +00:00
Sander de Smalen
abaf09d654 [AArch64][SVE] Enable DiagnosticPredicates for SVE LD1 instructions.
This patch extends the PredicateMethod of AsmOperands used in SVE's 
LD1 instructions with a DiagnosticPredicate. This makes them 'context
sensitive' to the operand that has been parsed and tells the user to
use the right register (with expected shift/extend), rather than telling 
the immediate is out of range when it actually parsed a register.

Patch [2/2] in a series to improve assembler diagnostics for SVE:
-  Patch [1/2]: https://reviews.llvm.org/D45879
-  Patch [2/2]: https://reviews.llvm.org/D45880

Reviewers: olista01, stoklund, craig.topper, mcrosier, rengolin, echristo, fhahn, SjoerdMeijer, evandro, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D45880

llvm-svn: 330934
2018-04-26 12:54:42 +00:00
Benjamin Kramer
fb47b8e4ca [NVPTX] Deduplicate code. No functionality change.
llvm-svn: 330933
2018-04-26 12:30:16 +00:00
Alex Bradbury
befb90b5be [RISCV] Implement isLegalAddressingMode for RISC-V
This has no impact on codegen for the current RISC-V unit tests or my small 
benchmark set and very minor changes in a few programs in the GCC torture 
suite. Based on this, I haven't been able to produce a representative test 
program that demonstrates a benefit from isLegalAddressingMode. I'm committing 
the patch anyway, on the basis that presenting accurate information to the 
target-independent code is preferable to relying on incorrect generic 
assumptions.

llvm-svn: 330932
2018-04-26 12:13:48 +00:00
Sander de Smalen
9545d884bd [AArch64][SVE] Asm: Support for gather LD1/LDFF1 (scalar + vector) load instructions.
Patch [2/3] in series to add support for SVE's gather load instructions
that use scalar+vector addressing modes:
- Patch [1/3]: https://reviews.llvm.org/D45951
- Patch [2/3]: https://reviews.llvm.org/D46023
- Patch [3/3]: https://reviews.llvm.org/D45958

Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, t.p.northover, echristo, evandro, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D46023

llvm-svn: 330928
2018-04-26 08:19:53 +00:00
Craig Topper
b15fe856ad [X86] Print 'tbyte ptr' instead of 'xword ptr' for f80mem in Intel syntax.
This matches objdump.

llvm-svn: 330922
2018-04-26 05:07:40 +00:00
Craig Topper
d62f374024 [X86] Remove alignment restriction on loading folding of pcmp[ei]str* during isel too.
This is a follow up to the changes in r330896 which enabled folding after isel during peephole and register allocation.

llvm-svn: 330897
2018-04-26 03:53:39 +00:00
Chandler Carruth
ae39a72002 [x86] Allow folding unaligned memory operands into pcmp[ei]str*
instructions.

These have special permission according to the x86 manual to read
unaligned memory, and this folding is done by ICC and GCC as well.

This corrects one of the issues identified in PR37246.

llvm-svn: 330896
2018-04-26 03:17:25 +00:00
Simon Pilgrim
7daf843edf [CostModel][X86] Remove hard coded SDIV/UDIV vector costs
Algorithmically compute the 'x20' SDIV/UDIV vector costs - this is necessary for PR36550 when DIV costs will be driven from the scheduler models.

llvm-svn: 330870
2018-04-25 20:59:16 +00:00
Tom Stellard
dfa11ced56 AMDGPU/R600: Move int_r600_store_stream_output to the public intrinsic file
Summary:
The TableGen'd GlobalISel instruction selector assumes all intrinsics are in
the public Intrinsic:: namespace.

Reviewers: jvesely, nhaehnle

Reviewed By: jvesely, nhaehnle

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45989

llvm-svn: 330866
2018-04-25 20:02:53 +00:00
Mark Searles
c35348c12e [AMDGPU] Waitcnt pass: add debug options
- Add "amdgpu-waitcnt-forcezero" to force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)

- Add debug counters to control force emit of s_waitcnt instrs; debug counters:
si-insert-waitcnts-forceexp: force emit s_waitcnt expcnt(0) instrs
si-insert-waitcnts-forcevm: force emit s_waitcnt lgkmcnt(0) instrs
si-insert-waitcnts-forcelgkm: force emit s_waitcnt vmcnt(0) instrs

- Add some debug statements

Note that a variant of this patch was previously committed/reverted.

Differential Revision: https://reviews.llvm.org/D45888

llvm-svn: 330862
2018-04-25 19:21:26 +00:00
Craig Topper
abf3793f70 [X86] Form MUL_IMM for multiplies with 3/5/9 to encourage LEA formation over load folding.
Previously we only formed MUL_IMM when we split a constant. This blocked load folding on those cases. We should also form MUL_IMM for 3/5/9 to favor LEA over load folding.

Differential Revision: https://reviews.llvm.org/D46040

llvm-svn: 330850
2018-04-25 17:35:03 +00:00
Alex Bradbury
3c59371336 [RISCV] Allow call pseudoinstruction to be used to call a function name that coincides with a register name
Previously `call zero`, `call f0` etc would fail. This leads to compilation 
failures if building programs that define functions with those names and using 
-save-temps.

llvm-svn: 330846
2018-04-25 17:25:29 +00:00
Simon Pilgrim
b8bc04dff2 [CostModel][X86] Recursive call for cost of imul for packed v16i16 constant shift left.
Don't just assume cost = 1.

llvm-svn: 330834
2018-04-25 15:22:03 +00:00
Amara Emerson
d4697c6c9c [AArch64][GlobalISel] Implement selection for the llvm.trap intrinsic.
rdar://38674040

llvm-svn: 330831
2018-04-25 14:43:59 +00:00
Shiva Chen
351fbaf236 [RISCV] Expand function call to "call" pseudoinstruction
To do this:
1. Change GlobalAddress SDNode to TargetGlobalAddress to avoid legalizer
   split the symbol.

2. Change ExternalSymbol SDNode to TargetExternalSymbol to avoid legalizer
   split the symbol.

3. Let PseudoCALL match direct call with target operand TargetGlobalAddress
   and TargetExternalSymbol.

Differential Revision: https://reviews.llvm.org/D44885

llvm-svn: 330827
2018-04-25 14:19:12 +00:00
Shiva Chen
8d327080ee [RISCV] Support "call" pseudoinstruction in the MC layer
To do this:
1. Add PseudoCALLIndirct to match indirect function call.

2. Add PseudoCALL to support parsing and print pseudo `call` in assembly

3. Expand PseudoCALL to the following form with R_RISCV_CALL relocation type
   while encoding:
        auipc ra, func
        jalr ra, ra, 0

If we expand PseudoCALL before emitting assembly, we will see auipc and jalr
pair when compile with -S. It's hard for assembly parser to parsing this
pair and identify it's semantic is function call and then insert R_RISCV_CALL
relocation type. Although we could insert R_RISCV_PCREL_HI20 and
R_RISCV_PCREL_LO12_I relocation types instead of R_RISCV_CALL.
Due to RISCV relocation design, auipc and jalr pair only can relax to jal with
R_RISCV_CALL + R_RISCV_RELAX relocation types.

We expand PseudoCALL as late as encoding(RISCVMCCodeEmitter) instead of before
emitting assembly(RISCVAsmPrinter) because we want to preserve call
pseudoinstruction in assembly code. It's more readable and assembly parser
could identify call assembly and insert R_RISCV_CALL relocation type.

Differential Revision: https://reviews.llvm.org/D45859

llvm-svn: 330826
2018-04-25 14:18:55 +00:00
Simon Dardis
9c37e1ee47 [mips] Teach the delay slot filler to transform 'jal' for microMIPS
ISel is currently picking 'JAL' over 'JAL_MM' for calling a function when
targeting microMIPS. A later patch will correct this behaviour.

This patch extends the mechanism for transforming instructions into their short
delay to recognise 'JAL_MM' for transforming into 'JALS_MM'.

llvm-svn: 330825
2018-04-25 14:12:57 +00:00
Simon Pilgrim
8429d5f513 [X86] Split WriteFMA into XMM, Scalar and YMM/ZMM scheduler classes
This removes all the FMA InstRW overrides.

If we ever get PR36924, then we can remove many of these declarations from models.

llvm-svn: 330820
2018-04-25 13:07:58 +00:00
Alexander Timofeev
a4d045ea56 [AMDGPU] Revert b0efc4fd6 (https://reviews.llvm.org/D40556)
llvm-svn: 330818
2018-04-25 12:32:46 +00:00
Simon Pilgrim
e4f6d0b45f [X86][SKX] Setup WriteFAdd and remove unnecessary InstRW scheduler overrides.
llvm-svn: 330813
2018-04-25 10:51:19 +00:00
Simon Pilgrim
22b5b8f3ee [X86][SNB] Remove unnecessary WriteFBlendLd InstRW scheduler overrides.
llvm-svn: 330812
2018-04-25 10:50:39 +00:00
Simon Dardis
dbed0f6c1f [mips] Fix the definition of sync, synci
Also, fix the disassembly of synci for microMIPS.

Reviewers: abeserminji, smaksimovic, atanasyan

Differential Revision: https://reviews.llvm.org/D45870

llvm-svn: 330810
2018-04-25 10:19:22 +00:00
Sander de Smalen
3ea48ae0b4 [AArch64][SVE] Asm: Add AsmOperand classes for SVE gather/scatter addressing modes.
This patch adds parsing support for 'vector + shift/extend' and
corresponding asm operand classes, needed for implementing SVE's
gather/scatter addressing modes.

The added combinations of vector (ZPR) and Shift/Extend are:

Unscaled:
  ZPR64ExtLSL8:           signed 64-bit offsets  (z0.d)
  ZPR32ExtUXTW8:        unsigned 32-bit offsets  (z0.s, uxtw)
  ZPR32ExtSXTW8:          signed 32-bit offsets  (z0.s, sxtw)

Unpacked and unscaled:
  ZPR64ExtUXTW8:        unsigned 32-bit offsets  (z0.d, uxtw)
  ZPR64ExtSXTW8:          signed 32-bit offsets  (z0.d, sxtw)

Unpacked and scaled:
  ZPR64ExtUXTW<scale>:  unsigned 32-bit offsets  (z0.d, uxtw #<shift>)
  ZPR64ExtSXTW<scale>:    signed 32-bit offsets  (z0.d, sxtw #<shift>)

Scaled:
  ZPR32ExtUXTW<scale>:  unsigned 32-bit offsets  (z0.s, uxtw #<shift>)
  ZPR32ExtSXTW<scale>:    signed 32-bit offsets  (z0.s, sxtw #<shift>)
  ZPR64ExtLSL<scale>:   unsigned 64-bit offsets  (z0.d,  lsl #<shift>)
  ZPR64ExtLSL<scale>:     signed 64-bit offsets  (z0.d,  lsl #<shift>)


Patch [1/3] in series to add support for SVE's gather load instructions
that use scalar+vector addressing modes:
- Patch [1/3]: https://reviews.llvm.org/D45951
- Patch [2/3]: https://reviews.llvm.org/D46023
- Patch [3/3]: https://reviews.llvm.org/D45958

Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, t.p.northover, echristo, evandro, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D45951

llvm-svn: 330805
2018-04-25 09:26:47 +00:00
Jessica Paquette
f7f9e664cd [MachineOutliner] Check for explicit uses of LR/W30 in MI operands
Before, the outliner would grab ADRPs that used LR/W30. This patch fixes
that by checking for explicit uses of those registers before the special-casing
for ADRPs.

This also adds a test that ensures that those sorts of ADRPs won't be outlined.

llvm-svn: 330783
2018-04-24 22:38:15 +00:00
Warren Ristow
d952afa83a [X86] Account for partial stack slot spills (PR30821)
Previously, _any_ store or load instruction was considered to be
operating on a spill if it had a frameindex as an operand, and thus
was fair game for optimisations such as "StackSlotColoring". This
usually works, except on architectures where spills can be partially
restored, for example on X86 where a spilt vector can have a single
component loaded (zeroing the rest of the target register). This can be
mis-interpreted and the zero extension unsoundly eliminated, see
pr30821.

To avoid this, this commit optionally provides the caller to
isLoadFromStackSlot and isStoreToStackSlot with the number of bytes
spilt/loaded by the given instruction. Optimisations can then determine
that a full spill followed by a partial load (or vice versa), for
example, cannot necessarily be commuted.

Patch by Jeremy Morse!

Differential Revision: https://reviews.llvm.org/D44782

llvm-svn: 330778
2018-04-24 22:01:50 +00:00
Tom Stellard
8b4c489d56 AMDGPU: Remove deprecated llvm.AMDGPU.kilp intrinsic
Summary: This is no longer used by mesa since its 18.0.0 release.

Reviewers: nhaehnle

Reviewed By: nhaehnle

Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D45988

llvm-svn: 330775
2018-04-24 21:37:57 +00:00
Tom Stellard
80ef2ac42b AMDGPU/GlobalISel: Fall-back to SelectionDAG for non-void functions
Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45843

llvm-svn: 330774
2018-04-24 21:29:36 +00:00
Tom Stellard
71dcc5d45e AMDGPU/GlobalISel: Add support for amdgpu_ps calling convention
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45837

llvm-svn: 330767
2018-04-24 20:51:28 +00:00
Simon Pilgrim
d8676be5d6 [X86][SKX] Setup WriteFMul and remove unnecessary InstRW scheduler overrides.
llvm-svn: 330760
2018-04-24 19:22:01 +00:00
Simon Pilgrim
c3baeaefd6 [X86] Split off PHMINPOSUW to their own schedule class
This also fixes Jaguar's schedule which was treating it as the WriteVecIMul default. 

llvm-svn: 330756
2018-04-24 18:49:25 +00:00
Stanislav Mekhanoshin
3fff7b2c27 [AMDGPU] Truncate packed inline constant
If a packed inline constant is sign extended it must be truncated
after the shift. I.e. a constant (0xH0000, 0xHBC00), will be represented
as 0xFFFFFFFFBC000000 in the IR because the immediate is sign extended
to 64 bit. After the value shifted right by 16 to use it in a low part
with op_sel_hi it becomes 0xFFFFFFFFBC00 and does not qualify as inline
constant any longer.

Fixed the error and added verification code. Without the fix and with
the verification bug is causing pk_max_f16_literal.ll to fail.

Differential Revision: https://reviews.llvm.org/D45987

llvm-svn: 330752
2018-04-24 18:17:55 +00:00
Simon Pilgrim
cc0db87d22 [XOP] v4i32 IFMA 'VPMACS' instructions should use the WritePMULLD schedule class
llvm-svn: 330751
2018-04-24 18:13:57 +00:00
Simon Pilgrim
8a87ea7a2b [AVX512] VPERMQ/VPERMPD/VPERMIL single op shuffles are not variable shuffles
These variants all take an immediate shuffle mask value and should be scheduled as such.

llvm-svn: 330747
2018-04-24 17:59:54 +00:00
Simon Dardis
2ae002a01c Reland "[mips] Guard traps for microMIPS correctly"
This is part of fixing the instruction predicates for MIPS.

Reviewers: atanasyan, abeserminji

Differential Revision: https://reviews.llvm.org/D44212


This patch relands r327409, hopefully without the problematic part of the
tests that cause FileCheck to assert on the windows expensive checks bot.

llvm-svn: 330741
2018-04-24 17:11:37 +00:00