mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

3830 Commits

Author SHA1 Message Date
Jessica Paquette
8e321c8c5a [GlobalISel] Combine (logic_op (op x...), (op y...)) -> (op (logic_op x, y))
This implements

```
(logic_op (op x...), (op y...)) -> (op (logic_op x, y))
```

when `op` is an extend, a shift, or an and.

This is similar to `DAGCombiner::hoistLogicOpWithSameOpcodeHands`
(with a bunch of missing cases, e.g. G_TRUNC, G_BITCAST, etc.)

This is implemented so it works both pre- and post-legalization.

This also adds a general way to add a series of instructions in a
combine (`applyBuildInstructionSteps`).

Differential Revision: https://reviews.llvm.org/D85050
2020-08-11 10:40:06 -07:00
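As a rough illustration of the pattern this combine targets, here is a minimal LLVM IR analogue (function and value names are hypothetical); the rewrite itself happens on the generic MIR built from code like this:

```llvm
define i64 @and_of_sexts(i32 %x, i32 %y) {
  ; Before the combine (at the gMIR level): two G_SEXTs feed the G_AND.
  %xe = sext i32 %x to i64
  %ye = sext i32 %y to i64
  %r = and i64 %xe, %ye
  ; After: a single narrow G_AND of %x and %y, followed by one G_SEXT.
  ret i64 %r
}
```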
Jay Foad
928c1dd7ef [GlobalISel] Add G_ABS
This is equivalent to the new llvm.abs intrinsic added by D84125 with
is_int_min_poison=0.

Differential Revision: https://reviews.llvm.org/D85718
2020-08-11 16:34:37 +01:00
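For reference, a minimal IR sketch of the intrinsic form that G_ABS corresponds to (function name is hypothetical):

```llvm
declare i32 @llvm.abs.i32(i32, i1)

define i32 @abs32(i32 %x) {
  ; The i1 operand is is_int_min_poison; false matches G_ABS semantics.
  %r = call i32 @llvm.abs.i32(i32 %x, i1 false)
  ret i32 %r
}
```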
Matt Arsenault
c9ab6823ab TableGen/GlobalISel: Hack the operand order for atomic_store
ISD::ATOMIC_STORE arbitrarily has the operands in the opposite order
from regular ISD::STORE, which always introduced an annoying
duplication of patterns to handle both cases. Since in GlobalISel
there's just the one G_STORE, we need to swap the operands to
correctly emit the type check for the pointer operand.

Some work started in 20aafa31569b5157e792daa8860d71dd0df8a53a to
migrate SelectionDAG to use ISD::STORE for atomics, but that work
seems to have stalled. Since this is pretty much the last operation
that matters which isn't supported for AMDGPU, use this compatibility
hack to unblock declaring it functionally complete.

Not sure what's going on with the pending_phis AArch64 test. It seems
it didn't always use atomics, and I'm not sure whether what it was
originally testing still matters.
2020-08-11 10:22:44 -04:00
Kerry McLaughlin
a9073b6100 [SVE][CodeGen] Legalisation of INSERT_VECTOR_ELT for scalable vectors
When the result type of insertelement needs to be split,
SplitVecRes_INSERT_VECTOR_ELT will try to store the vector to a
stack temporary, store the element at the location of the stack
temporary plus the index, and reload the Lo/Hi parts.

This patch does the following to ensure this works for scalable vectors:
 - Sets the StackID with getStackIDForScalableVectors() in CreateStackTemporary
 - Adds an IsScalable flag to getMemBasePlusOffset() and scales the
   offset by VScale when this is true
 - Ensures the immediate is clamped correctly by clampDynamicVectorIndex
   so that we don't try to use an out-of-range index

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D84874
2020-08-11 12:57:28 +01:00
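A minimal sketch of an input that exercises this path, assuming <vscale x 4 x i32> is the widest legal SVE type so the result below must be split (names are hypothetical):

```llvm
define <vscale x 8 x i32> @insert(<vscale x 8 x i32> %vec, i32 %elt, i64 %idx) {
  ; Legalisation stores %vec to a stack temporary, stores %elt at the
  ; (clamped, VScale-scaled) offset for %idx, then reloads the Lo/Hi halves.
  %r = insertelement <vscale x 8 x i32> %vec, i32 %elt, i64 %idx
  ret <vscale x 8 x i32> %r
}
```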
Paul Walker
04aa37ecb2 [SVE] Lower fixed length vector integer subtract operations.
Differential Revision: https://reviews.llvm.org/D85665
2020-08-11 11:32:12 +01:00
Puyan Lotfi
c4f084e133 [MachineOutliner][AArch64] WA for multiple stack fixup cases in MachineOutliner.
In cases where MachineOutliner candidates:

  * are noreturn,
  * have calls with no available LR or free regs, or
  * don't use SP,

we can end up hitting stack fixup code for both the caller and the callee for
a FrameID of MachineOutlinerDefault. This triggers the assert:

  `assert(OF.FrameConstructionID != MachineOutlinerDefault &&
          "Can only fix up stack references once");`

in AArch64InstrInfo.cpp. This assert exists for now because a lot of the
fixup code is not tested to handle fixing up more than once and needs
some better checks and enhancements to avoid potentially generating
illegal code.

I've filed a Bugzilla report to track this until these cases are handled
by the AArch64 MachineOutliner: https://bugs.llvm.org/show_bug.cgi?id=46767

This diff detects cases that will cause these multiple stack fixups and
prunes those candidates from `RepeatedSequenceLocs`.

Differential Revision: https://reviews.llvm.org/D83923
2020-08-10 15:43:30 -04:00
Paul Walker
b50a4a5bcf Fix "CHECK-LABEL: @" typos in llvm/test/CodeGen/AArch64/sve-fixed-length-*.ll 2020-08-10 20:07:45 +01:00
Vitaly Buka
6669d78639 Revert "[StackSafety] Skip ambiguous lifetime analysis"
This reverts commit 0b2616a8045cb776ea1514c3401d0a8577de1060.

Crashes with safe-stack.
2020-08-07 14:02:50 -07:00
Bevin Hansson
7c243aea4b [Intrinsic] Add sshl.sat/ushl.sat, saturated shift intrinsics.
Summary:
This patch adds two intrinsics, llvm.sshl.sat and llvm.ushl.sat,
which perform signed and unsigned saturating left shift,
respectively.

These are useful for implementing the Embedded-C fixed point
support in Clang, originally discussed in
http://lists.llvm.org/pipermail/llvm-dev/2018-August/125433.html
and
http://lists.llvm.org/pipermail/cfe-dev/2018-May/058019.html

Reviewers: leonardchan, craig.topper, bjope, jdoerfert

Subscribers: hiraditya, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83216
2020-08-07 15:09:24 +02:00
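A minimal IR sketch of the new intrinsics (function name is hypothetical):

```llvm
declare i32 @llvm.sshl.sat.i32(i32, i32)
declare i32 @llvm.ushl.sat.i32(i32, i32)

define i32 @fixed_point_shift(i32 %x, i32 %amt) {
  ; Like shl, but the result saturates to the signed (or unsigned) min/max
  ; when the shifted value cannot be represented in the result type.
  %r = call i32 @llvm.sshl.sat.i32(i32 %x, i32 %amt)
  ret i32 %r
}
```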
David Sherwood
396a61103c [SVE][CodeGen] Fix bug with store of unpacked FP scalable vectors
Fixed an incorrect pattern in lib/Target/AArch64/AArch64SVEInstrInfo.td
for storing out <vscale x 2 x f32> unpacked scalable vectors. Added
a couple of tests to

  test/CodeGen/AArch64/sve-st1-addressing-mode-reg-imm.ll

Differential Revision: https://reviews.llvm.org/D85441
2020-08-07 07:19:09 +01:00
QingShan Zhang
f8c2a4c0d9 [Scheduling] Create the missing dependency edges for store cluster
If it is a load cluster, we don't need to create an extra dependency edge (beyond
SUb->reg) from SUb to SUa, as they both already depend on the base register "reg":

     +-------+
+---->  reg  |
|    +---+---+
|        ^
|        |
|        |
|        |
|    +---+---+
|    |  SUa  |  Load 0(reg)
|    +---+---+
|        ^
|        |
|        |
|    +---+---+
+----+  SUb  |  Load 4(reg)
     +-------+

But if it is a store cluster, we need to create the edge as shown below, to avoid
an instruction the store depends on being scheduled in between SUb and SUa.

     +-------+
+---->  reg  |
|    +---+---+
|        ^
|        |         Missing       +-------+
|        | +-------------------->+   y   |
|        | |                     +---+---+
|    +---+-+-+                       ^
|    |  SUa  |  Store x 0(reg)       |
|    +---+---+                       |
|        ^                           |
|        |  +------------------------+
|        |  |
|    +---+--++
+----+  SUb  |  Store y 4(reg)
     +-------+

Reviewed By: evandro, arsenm, rampitec, foad, fhahn

Differential Revision: https://reviews.llvm.org/D72031
2020-08-07 04:58:03 +00:00
Vitaly Buka
3b944733de [StackSafety] Skip ambiguous lifetime analysis
If we can't identify the alloca used in a lifetime marker, we
need to assume the worst-case scenario.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D84630
2020-08-06 19:10:33 -07:00
Matt Arsenault
2f97a6b627 AArch64/GlobalISel: Fix verifier error after selecting returnaddress
This was caching the wrong register to re-use later.
2020-08-06 13:18:05 -04:00
Petar Avramovic
5de548eba7 [GlobalISel][InlineAsm] Fix matching input constraint to physreg
Add the given input and mark it as tied.
This doesn't create an additional copy compared to
matching an input constraint to a virtual register.

Differential Revision: https://reviews.llvm.org/D85122
2020-08-06 14:35:51 +02:00
Paul Walker
abb5aa7429 [SVE] Lower scalable vector mul operations.
This allows us to remove extra patterns from AArch64SVEInstrInfo.td
because we can reuse those required for fixed length vectors.

Differential Revision: https://reviews.llvm.org/D85328
2020-08-06 11:15:35 +01:00
Paul Walker
a80884d5ec [SVE] Implement lowering for fixed length vector multiplication.
NOTE: Also uses SVE code generation for NEON-sized vectors, instead
of expanding i64-based vector multiplications.

Differential Revision: https://reviews.llvm.org/D85327
2020-08-06 11:01:39 +01:00
Paul Walker
3c5c30be96 [SVE] Add lowering for fixed length vector and, or & xor operations.
Since there are no ill effects when performing these operations
with undefined elements, they are lowered to the already supported
unpredicated scalable vector equivalents.

Differential Revision: https://reviews.llvm.org/D85117
2020-08-05 11:28:34 +01:00
Sander de Smalen
8489fbb9a7 [AArch64][SVE] Disable tail calls if callee does not preserve SVE regs.
This fixes an issue triggered by the following code, where emitEpilogue
got confused when trying to restore the SVE registers after the call,
whereas the call to bar() is implemented as a TCReturn:

  int non_sve();
  int sve(svint32_t x) { return non_sve(); }

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D84869
2020-08-05 09:38:54 +01:00
Eli Friedman
83274f02e1 [SelectionDAG][SVE] Support scalable vectors in getConstantFP()
Differential Revision: https://reviews.llvm.org/D85249
2020-08-04 15:32:43 -07:00
Matt Arsenault
ba4d17c159 GlobalISel: Add utility for getting function argument live ins
Get the argument register and ensure there's a copy to the virtual
register. AMDGPU and AArch64 have similarish code to get the livein
value, and I also want to use this in multiple places.

This is a bit more aggressive about setting the register class than
the original function, but that's probably OK.

I think we're missing a few verifier checks for function live ins. I
noticed AArch64's calling convention code is not actually adding
liveins to functions, only the entry block (which apparently might not
matter that much?). There should probably be a verifier check that
entry block live ins are also live into the function. We also might
need a verifier check that the copy to the livein virtual register is
in the entry block.
2020-08-04 16:55:55 -04:00
Eli Friedman
84966ec143 [AArch64][SVE] Widen narrow sdiv/udiv operations.
The SVE instruction set only supports sdiv/udiv for 32-bit and 64-bit
integers.  If we see an 8-bit or 16-bit divide, widen the operands to 32
bits, and narrow the result.

Differential Revision: https://reviews.llvm.org/D85170
2020-08-04 13:22:15 -07:00
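A minimal sketch of the kind of divide this affects (function name is hypothetical); the operands end up sign-extended to 32-bit elements, divided there, and truncated back:

```llvm
define <vscale x 4 x i16> @sdiv_nxv4i16(<vscale x 4 x i16> %a, <vscale x 4 x i16> %b) {
  ; SVE has no 16-bit sdiv, so this is widened to a 32-bit element divide.
  %d = sdiv <vscale x 4 x i16> %a, %b
  ret <vscale x 4 x i16> %d
}
```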
Matt Arsenault
a816bda54e GlobalISel: Handle llvm.localescape
This one is pretty easy and shrinks the list of unhandled
intrinsics. I'm not sure how relevant the insert point is. Using the
insert position of EntryBuilder will place this after
constants. SelectionDAG seems to end up emitting these after argument
copies and before anything else, but I don't think it really
matters. This also ends up emitting these in the opposite order from
SelectionDAG, but I don't think that matters either.

This also needs a fix to stop the later passes dropping this as a dead
instruction. DeadMachineInstructionElim's version of isDead special
cases LOCAL_ESCAPE for some reason, and I'm not sure why it's excluded
from MachineInstr::isLabel (or why isDead doesn't check it).

I also noticed DeadMachineInstructionElim never considers inline asm
as dead, but GlobalISel will drop asm with no constraints.
2020-08-04 15:19:02 -04:00
Cameron McInally
cbe6afb323 [GlobalISel] Remove redundant FNEG tests.
These tests were made redundant by D85139.
2020-08-04 11:32:15 -05:00
Cameron McInally
0f45fb4bbc [GlobalISel] Don't transform FSUB(-0, X) -> FNEG(X) in GlobalISel.
This patch stops unconditionally transforming FSUB(-0, X) into an FNEG(X) while building the MIR.

This corresponds with the SelectionDAGISel change in D84056.

Differential Revision: https://reviews.llvm.org/D85139
2020-08-04 11:27:09 -05:00
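A minimal sketch of the kind of input affected (function name is hypothetical):

```llvm
define float @negate(float %x) {
  ; Previously the IRTranslator turned this directly into G_FNEG; it now
  ; emits G_FSUB and leaves any fneg formation to later combines.
  %r = fsub float -0.0, %x
  ret float %r
}
```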
Sander de Smalen
7465703e27 [AArch64][SVE] Add missing unwind info for SVE registers.
This patch adds a CFI entry for each SVE callee saved register
that needs unwind info at an offset from the CFA. The offset is
a DWARF expression because the offset is partly scalable.

The CFI entries only cover a subset of the SVE callee-saves and
only encode the lower 64 bits, thus implementing the lowest
common denominator ABI. Existing unwinders may support VG but
only restore the lower 64 bits.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D84044
2020-08-04 11:47:06 +01:00
Sander de Smalen
d910125a2c [AArch64][SVE] Fix CFA calculation in presence of SVE objects.
The CFA is calculated as (SP/FP + offset), but when there are
SVE objects on the stack the SP offset is partly scalable and
should instead be expressed as the DWARF expression:

     SP + offset + scalable_offset * VG

where VG is the Vector Granule register, containing the
number of 64-bit 'granules' in a scalable vector.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D84043
2020-08-04 11:47:06 +01:00
Paul Walker
44870bb979 [SVE] Replace remaining _MERGE_OP1 nodes with _PRED variants.
This is the final bit of work to relax the register allocation
requirements when code generating normal LLVM IR, which rarely
cares about the result of inactive lanes. By using _PRED nodes
we can make better use of SVE's reversed instructions.

Also removes a redundant parameter from the min/max tests.

Differential Revision: https://reviews.llvm.org/D85142
2020-08-04 11:19:17 +01:00
Florian Hahn
beaa269335 [AArch64] Consider instruction-level contract FMFs in combiner patterns.
Currently, instruction level fast math flags are not considered when
generating patterns for the machine combiner.

This currently leads to some missed opportunities to generate FMAs in
combination with `#pragma clang fp contract (fast)`.

For example, when building the example below with -O3 for AArch64, no
FMADD is generated. If built with -O2 and the DAGCombiner is used
instead of the MachineCombiner for FMAs, an FMADD is generated.

With this patch, the same code is generated in both cases.

    float madd_contract(float a, float b, float c) {
    #pragma clang fp contract (fast)
      return (a * b) + c;
    }

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D84930
2020-08-04 10:25:16 +01:00
Mitch Phillips
89e3449531 [HWASan] [GlobalISel] Add +tagged-globals backend feature for GlobalISel
GlobalISel is the default ISel for aarch64 at -O0. Prior to D78465, GlobalISel
didn't have support for dealing with address-of-global lowerings, so it fell
back to SelectionDAGISel.

HWASan Globals require special handling, as they contain the pointer tag in the
top 16-bits, and are thus outside the code model. We need to generate a `movk`
in the instruction sequence with a G3 relocation to ensure the bits are
relocated properly. This is implemented in SelectionDAGISel, this patch does
the same for GlobalISel.

GlobalISel and SelectionDAGISel differ in their lowering sequence, so there are
differences in the final instruction sequence, explained in
`tagged-globals.ll`. Both of these implementations are correct, but GlobalISel
is slightly larger code size / slightly slower (by a couple of arithmetic
instructions). I don't see this as a problem for now as GlobalISel is only on
by default at `-O0`.

Reviewed By: aemerson, arsenm

Differential Revision: https://reviews.llvm.org/D82615
2020-08-03 14:28:44 -07:00
Eli Friedman
baa9fabf16 [AArch64] Add missing isel patterns for fcvtzs/u intrinsic on v1f64.
Fixes test-suite compile failure caused by 8dfb5d7.

While I'm in the area, add some more test coverage to related
operations, to make sure we aren't missing any other patterns.
2020-08-03 13:04:59 -07:00
Matt Arsenault
c177fc936f GlobalISel: Handle arbitrary FewerElementsVector for G_IMPLICIT_DEF 2020-08-03 09:14:08 -04:00
Huihui Zhang
864c46a52c [AArch64][SVE] Allow vector of pointers as legal type for masked load/store.
Refer to LangRef http://llvm.org/docs/LangRef.html#llvm-masked-load-intrinsics:
the 'llvm.masked.load/store.*' intrinsics are overloaded intrinsics, which allow the
load/store data to be a vector of any integer, floating-point or pointer data type.

Therefore, allow pointer data type when checking 'isLegalMaskedLoadStore()'.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D85045
2020-07-31 17:30:23 -07:00
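A minimal sketch of a masked load whose data type is a vector of pointers (function name is hypothetical, and the overload mangling shown is illustrative):

```llvm
declare <vscale x 2 x i64*> @llvm.masked.load.nxv2p0i64.p0nxv2p0i64(<vscale x 2 x i64*>*, i32, <vscale x 2 x i1>, <vscale x 2 x i64*>)

define <vscale x 2 x i64*> @load_ptrs(<vscale x 2 x i64*>* %addr, <vscale x 2 x i1> %mask) {
  ; With pointer element types accepted by isLegalMaskedLoadStore(), this can
  ; stay a masked load instead of being scalarised.
  %v = call <vscale x 2 x i64*> @llvm.masked.load.nxv2p0i64.p0nxv2p0i64(<vscale x 2 x i64*>* %addr, i32 8, <vscale x 2 x i1> %mask, <vscale x 2 x i64*> undef)
  ret <vscale x 2 x i64*> %v
}
```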
Aditya Nandakumar
313711b2cf [GISel] Add combiners for G_INTTOPTR and G_PTRTOINT
https://reviews.llvm.org/D84909

Patch adds two new GICombinerRules, one for G_INTTOPTR and one for
G_PTRTOINT. The G_INTTOPTR elides ptr2int(int2ptr(x)) to a copy of x, if
the cast is within the same address space. The G_PTRTOINT elides
int2ptr(ptr2int(x)) to a copy of x. Patch additionally adds new combiner
tests for the AArch64 target to test these new combiner rules.

Patch by mkitzan
2020-07-31 10:13:36 -07:00
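An IR-level analogue of one of the two folds (function name is hypothetical); at the gMIR level the G_PTRTOINT/G_INTTOPTR pair collapses to a copy of %p:

```llvm
define i8* @roundtrip(i8* %p) {
  ; Same address space and size on both casts, so the round trip is a no-op.
  %i = ptrtoint i8* %p to i64
  %q = inttoptr i64 %i to i8*
  ret i8* %q
}
```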
Eli Friedman
244f68004d [LegalizeTypes][SVE] Support widen/split legalization for SPLAT_VECTOR
Just the obvious implementation that rewrites the result type. Also fixes a
warning from EXTRACT_SUBVECTOR legalization that triggers on the test.

Differential Revision: https://reviews.llvm.org/D84706
2020-07-30 16:17:45 -07:00
Amara Emerson
dd34b50099 [AArch64][GlobalISel] Add legalization & selection support for G_INTRINSIC_LRINT.
Differential Revision: https://reviews.llvm.org/D84552
2020-07-30 16:14:56 -07:00
Jon Roelofs
50a1ea2ba8 [SelectionDAG] Fix lowering of vector geps
This fixes an assertion failure that was being triggered in
SelectionDAG::getZeroExtendInReg(), where it was trying to extend the <2xi32>
to i64 (which should have been <2xi64>).

Fixes: rdar://66016901

Differential Revision: https://reviews.llvm.org/D84884
2020-07-30 14:56:53 -06:00
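A minimal sketch of the kind of vector GEP involved (function name is hypothetical; not necessarily the exact reproducer):

```llvm
define <2 x i8*> @vector_gep(i8* %base, <2 x i32> %idx) {
  ; The <2 x i32> index must be extended to <2 x i64>, not to scalar i64,
  ; before being added to the splatted base pointer.
  %p = getelementptr i8, i8* %base, <2 x i32> %idx
  ret <2 x i8*> %p
}
```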
Florian Hahn
28a367c7c3 [AArch64] Add machine-combiner tests with instruction level FMFs. 2020-07-30 11:41:09 +01:00
David Sherwood
cf36aeaf65 [SVE][CodeGen] At -O0 fallback to DAG ISel when translating alloca with scalable types
When building code at -O0 we weren't falling back to DAG ISel correctly
when encountering alloca instructions with scalable vector types. This
is because the alloca has no operands that are scalable. I've fixed this by
adding a check in AArch64ISelLowering::fallBackToDAGISel for alloca
instructions with scalable types.

Differential Revision: https://reviews.llvm.org/D84746
2020-07-30 08:40:53 +01:00
Matt Arsenault
71c6e8a505 GlobalISel: Handle assorted no-op intrinsics
SelectionDAGBuilder just drops these, so do the same.
2020-07-29 21:26:20 -04:00
Matt Arsenault
e2b102c48a GlobalISel: Handle llvm.roundeven
I still think it's highly questionable that we have two intrinsics
with identical behavior that only vary by the name of the libcall used
if it happens to be lowered that way, but try to reduce the feature
delta between SDAG and GlobalISel for recently added intrinsics. I'm
not sure which opcode should be considered the canonical one, but
lower roundeven back to round.
2020-07-29 20:01:12 -04:00
Amara Emerson
d7527ffdb2 [GlobalISel] Add G_INTRINSIC_LRINT and translate from llvm.lrint
Differential Revision: https://reviews.llvm.org/D84551
2020-07-29 11:51:04 -07:00
Amara Emerson
9d49eb2e5c [AArch64][GlobalISel] Selection support for vector DUP[X]lane instructions.
In future, we'd like to use the perfect-shuffle mechanism to deal with these
shuffle permutations. For now, this improves performance by avoiding the
super-expensive const-pool load + tbl instruction.

Differential Revision: https://reviews.llvm.org/D84866
2020-07-29 11:41:37 -07:00
Jessica Paquette
6898e4fc01 [AArch64][GlobalISel] Select XRO addressing mode with wide immediates
Port the wide immediate case from AArch64DAGToDAGISel::SelectAddrModeXRO.

If we have a wide immediate which can't be represented in an add, we can end up
with code like this:

```
mov  x0, imm
add x1, base, x0
ldr  x2, [x1, 0]
```

If we use the [base, xN] addressing mode instead, we can produce this:

```
mov  x0, imm
ldr  x2, [base, x0]
```

This saves 0.4% code size on 7zip at -O3, and gives a geomean code size
improvement of 0.1% on CTMark.

Differential Revision: https://reviews.llvm.org/D84784
2020-07-29 11:02:10 -07:00
David Sherwood
db7d870d70 [SVE][CodeGen] Add simple integer add tests for SVE tuple types
I have added tests to:

  CodeGen/AArch64/sve-intrinsics-int-arith.ll

for doing simple integer add operations on tuple types. Since these
tests introduced new warnings due to incorrect use of
getVectorNumElements() I have also fixed up these warnings in the
same patch. These fixes are:

1. In narrowExtractedVectorBinOp I have changed the code to bail out
early for scalable vector types, since we've not yet hit a case that
proves the optimisations are profitable for scalable vectors.
2. In DAGTypeLegalizer::WidenVecRes_CONCAT_VECTORS I have replaced
calls to getVectorNumElements with getVectorMinNumElements in cases
that work with scalable vectors. For the other cases I have added
asserts that the vector is not scalable because we should not be
using shuffle vectors and build vectors in such cases.

Differential revision: https://reviews.llvm.org/D84016
2020-07-29 13:32:10 +01:00
David Sherwood
6c7452123a [SVE] Add checks for no warnings in CodeGen/AArch64/sve-sext-zext.ll
Previous patches fixed up all the warnings in this test:

  llvm/test/CodeGen/AArch64/sve-sext-zext.ll

and this change simply checks that no new warnings are added in future.

Differential revision: https://reviews.llvm.org/D83205
2020-07-29 13:06:39 +01:00
Matt Arsenault
d26779aca3 GlobalISel: Translate llvm.convert.{to|from}.fp16 intrinsics
I think these were added as a workaround for SelectionDAG lacking half
legalization support in the past. I think they should probably be
removed from the IR, but clang does still have a target control to
emit these instead of the native half fpext/fptrunc.
2020-07-28 11:46:05 -04:00
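For reference, the legacy intrinsic forms next to their native equivalent (function name is hypothetical):

```llvm
declare i16 @llvm.convert.to.fp16.f32(float)
declare float @llvm.convert.from.fp16.f32(i16)

define i16 @to_half(float %f) {
  ; Legacy form; the native equivalent is: fptrunc float %f to half
  %h = call i16 @llvm.convert.to.fp16.f32(float %f)
  ret i16 %h
}
```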
Sander de Smalen
d04f687d3b [AArch64][SVE] Fix epilogue for SVE when the stack is realigned.
While deallocating the stackframe, the offset used to reload the
callee-saved registers was not pointing to the SVE callee-saves,
but rather to the whole SVE area.

   +--------------+
   | GPR callee   |
   |     saves    |
   +--------------+ <- FP
   | SVE callee   |
   |     saves    |
   +--------------+ <- Should restore SVE callee saves from here
   |  SVE Spills  |
   |  and Locals  |
   +--------------+ <- instead of from here.
   |              |
   :              :
   |              |
   +--------------+ <- SP

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D84539
2020-07-28 15:45:53 +01:00
Sander de Smalen
7093403ad8 [AArch64][SVE] Don't align the last SVE callee save.
Instead of aligning the last callee-saved-register slot to the stack
alignment (16 bytes), just align the SVE callee-saved block. This also
simplifies the code that allocates space for the callee-saves.

This change is needed to make sure the offset to which the callee-saved
register is spilled corresponds to the offset used for, e.g., unwind call
frame instructions.

Reviewers: efriedma, paulwalker-arm, david-arm, rengolin

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D84042
2020-07-28 15:45:53 +01:00
Sander de Smalen
275d51c545 [AArch64][SVE] Don't support fixedStack for SVE objects.
Fixed stack objects are preallocated and defined to be allocated before
any of the regular stack objects. These are normally used to model stack
arguments.

The AAPCS does not support passing SVE registers on the stack by value
(only by reference). The current layout also doesn't place them before
all stack objects, but rather before all SVE objects. Removing this
simplifies the code that emits the allocation/deallocation
around callee-saved registers (D84042).

This patch also removes all uses of fixedStack from
framelayout-sve.mir, where this was used purely for testing purposes.

Reviewers: paulwalker-arm, efriedma, rengolin

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D84538
2020-07-28 15:45:53 +01:00
Francesco Petrogalli
79f86bfe26 [llvm][CodeGen] Addressing modes for SVE ldN.
Reviewers: c-rhodes, efriedma, sdesmalen

Subscribers: huihuiz, tschuett, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77251
2020-07-27 22:18:28 +00:00