1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00
Commit Graph

572 Commits

Author SHA1 Message Date
Craig Topper
6a7de7e3ad [X86] Teach fastisel to handle zext/sext i8->i16 and sext i1->i8/i16/i32/i64
Summary:
ZExt and SExt from i8 to i16 aren't implemented in the autogenerated fast isel table because normal isel does a zext/sext to 32-bits and a subreg extract to avoid a partial register write or false dependency on the upper bits of the destination. This means without handling in fast isel we end up triggering a fast isel abort.

We had no custom sign extend handling at all so while I was there I went ahead and implemented sext i1->i8/i16/i32/i64 which was also missing. This generates an i1->i8 sign extend using a mask with 1, then an 8-bit negate, then continues with a sext from i8. A better sequence would be a wider and/negate, but would require more custom code.

Fast isel tests are a mess and I couldn't find a good home for the tests so I created a new one.

The test pr34381.ll had to have fast-isel removed because it was relying on a fast isel abort to hit the bug. The test case still seems valid with fast-isel disabled though some of the instructions changed.

Reviewers: spatel, zvi, igorb, guyblank, RKSimon

Reviewed By: guyblank

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37320

llvm-svn: 312422
2017-09-02 18:53:46 +00:00
Craig Topper
fcc11275cd [X86] Remove some code from fast isel that is no longer needed with i1 being an illegal type.
llvm-svn: 312190
2017-08-30 23:05:54 +00:00
Craig Topper
b1c521d51b [X86] Remove unneed AVX512 check from fast isel.
This is no longer necessary now that i1 is illegal.

llvm-svn: 312146
2017-08-30 18:08:58 +00:00
Craig Topper
faa6f8b578 [AVX512] Remove leftover code for when i1 was a legal type from the fast isel load/store code.
Summary:
I don't think we need this code anymore. It only existed because i1 used to be legal.

There's probably more unneeded code in fast isel still.

Reviewers: guyblank, zvi

Reviewed By: guyblank

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36652

llvm-svn: 310843
2017-08-14 15:28:47 +00:00
Reid Kleckner
c1f69e6129 [X86] Teach fastisel to select calls to dllimport functions
Summary:
Direct calls to dllimport functions are very common Windows. We should
add them to the -O0 fast path.

Reviewers: rafael

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D36197

llvm-svn: 310152
2017-08-05 00:10:43 +00:00
Martin Storsjo
707a6e74b7 [AArch64] Extend CallingConv::X86_64_Win64 to AArch64 as well
Rename the enum value from X86_64_Win64 to plain Win64.

The symbol exposed in the textual IR is changed from 'x86_64_win64cc'
to 'win64cc', but the numeric value is kept, keeping support for
old bitcode.

Differential Revision: https://reviews.llvm.org/D34474

llvm-svn: 308208
2017-07-17 20:05:19 +00:00
Davide Italiano
228039ae12 [X86/FastIsel] Fall-back to SelectionDAG when lowering soft-floats.
FastIsel can't handle them, so we would end up crashing during
register class selection.
Fixes PR26522.

Differential Revision:  https://reviews.llvm.org/D35272

llvm-svn: 307797
2017-07-12 15:26:06 +00:00
Simon Pilgrim
e8d7a3a1f1 [X86][AVX1] Split 256-bit vector non-temporal FastISel loads to keep it non-temporal (PR32744)
Extension to D33728

llvm-svn: 304798
2017-06-06 14:18:39 +00:00
Guy Blank
4b4c372886 [X86][AVX512] Make i1 illegal in the CodeGen
This patch defines the i1 type as illegal in the X86 backend for AVX512.
For DAG operations on <N x i1> types (build vector, extract vector element, ...) i8 is used, and should be truncated/extended.
This should produce better scalar code for i1 types since GPRs will be used instead of mask registers.

Differential Revision: https://reviews.llvm.org/D32273

llvm-svn: 303421
2017-05-19 12:35:15 +00:00
Igor Breger
7faf279c36 [X86] Move getX86ConditionCode() from X86FastISel.cpp to X86InstrInfo.cpp. NFC
Summary:
Move getX86ConditionCode() from X86FastISel.cpp to X86InstrInfo.cpp so it can be used by GloabalIsel instruction selector.
This is a pre-commit for a patch I'm working on to support G_ICMP. NFC.

Reviewers: zvi, guyblank, delena

Reviewed By: guyblank, delena

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33038

llvm-svn: 302767
2017-05-11 06:36:37 +00:00
Serge Pavlov
b8ce9ec478 Add extra operand to CALLSEQ_START to keep frame part set up previously
Using arguments with attribute inalloca creates problems for verification
of machine representation. This attribute instructs the backend that the
argument is prepared in stack prior to  CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However CALLSEQ_END still keeps total frame size, as caller can
be responsible for cleanup of entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame size and the difference is treated by
MachineVerifier as stack error. Currently there is no way to distinguish
this case from actual errors.

This patch adds additional argument to CALLSEQ_START and its
target-specific counterparts to keep size of stack that is set up prior to
the call frame sequence. This argument allows MachineVerifier to calculate
actual frame size associated with frame setup instruction and correctly
process the case of inalloca arguments.

The changes made by the patch are:
- Frame setup instructions get the second mandatory argument. It
  affects all targets that use frame pseudo instructions and touched many
  files although the changes are uniform.
- Access to frame properties are implemented using special instructions
  rather than calls getOperand(N).getImm(). For X86 and ARM such
  replacement was made previously.
- Changes that reflect appearance of additional argument of frame setup
  instruction. These involve proper instruction initialization and
  methods that access instruction arguments.
- MachineVerifier retrieves frame size using method, which reports sum of
  frame parts initialized inside frame instruction pair and outside it.

The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests failed with machine verifier enabled and listed
in PR27481.

Differential Revision: https://reviews.llvm.org/D32394

llvm-svn: 302527
2017-05-09 13:35:13 +00:00
Oren Ben Simhon
1f76347a16 [X86] Support of no_caller_saved_registers attribute
This patch implements the LLVM part for no_caller_saved_registers attribute as appears here: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=5ed3cc7b66af4758f7849ed6f65f4365be8223be.
In order to implement the attribute, we use the dynamic CSR mechanism to remove returned/passed arguments from the function regmask/CSR list.

Differential Revision: https://reviews.llvm.org/D31876

llvm-svn: 302020
2017-05-03 13:07:19 +00:00
Reid Kleckner
0dced4cc25 Use Argument::hasAttribute and AttributeList::ReturnIndex more
This eliminates many extra 'Idx' induction variables in loops over
arguments in CodeGen/ and Target/. It also reduces the number of places
where we assume that ReturnIndex is 0 and that we should add one to
argument numbers to get the corresponding attribute list index.

NFC

llvm-svn: 301666
2017-04-28 18:37:16 +00:00
Krzysztof Parzyszek
ce1e95e40d Move size and alignment information of regclass to TargetRegisterInfo
1. RegisterClass::getSize() is split into two functions:
   - TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const;
   - TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const;
2. RegisterClass::getAlignment() is replaced by:
   - TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const;

This will allow making those values depend on subtarget features in the
future.

Differential Revision: https://reviews.llvm.org/D31783

llvm-svn: 301221
2017-04-24 18:55:33 +00:00
Reid Kleckner
4559e65607 [IR] Make paramHasAttr to use arg indices instead of attr indices
This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern.

Previously we were testing return value attributes with index 0, so I
introduced hasReturnAttr() for that use case.

llvm-svn: 300367
2017-04-14 20:19:02 +00:00
Simon Pilgrim
786bd13a82 [X86][MMX] Add fast-isel support for MMX non-temporal writes
Differential Revision: https://reviews.llvm.org/D31754

llvm-svn: 299852
2017-04-10 16:58:07 +00:00
Craig Topper
d8e999615b [AVX-512] Fix bad comment from r299112. NFC
llvm-svn: 299114
2017-03-30 21:05:33 +00:00
Craig Topper
74c3cc2d1c [AVX-512] Fix another case where fastisel was generating a GR8 to VK1 copy. This time after calls returning i1.
Fixes PR32472.

llvm-svn: 299112
2017-03-30 21:02:52 +00:00
Craig Topper
c62d4ea689 [AVX-512] Punt on fast-isel of truncates to i1 when AVX512 is enabled.
We should be masking the value and emitting a register copy like we do in non-fast isel. Instead we were just updating the value map and emitting nothing.

After r298928 we started seeing cases where we would create a copy from GR8 to GR32 because the source register in a VK1 to GR32 copy was replaced by the GR8 going into a truncate.

This fixes PR32451.

llvm-svn: 298957
2017-03-28 23:20:37 +00:00
Craig Topper
d41951c118 [AVX-512] Fix accidental uses of AH/BH/CH/DH after copies to/from mask registers
We've had several bugs(PR32256, PR32241) recently that resulted from usages of AH/BH/CH/DH either before or after a copy to/from a mask register.

This ultimately occurs because we create COPY_TO_REGCLASS with VK1 and GR8. Then in CopyToFromAsymmetricReg in X86InstrInfo we find a 32-bit super register for the GR8 to emit the KMOV with. But as these tests are demonstrating, its possible for the GR8 register to be a high register and we end up doing an accidental extra or insert from bits 15:8.

I think the best way forward is to stop making copies directly between mask registers and GR8/GR16. Instead I think we should restrict to only copies between mask registers and GR32/GR64 and use EXTRACT_SUBREG/INSERT_SUBREG to handle the conversion from GR32 to GR16/8 or vice versa.

Unfortunately, this complicates fastisel a bit more now to create the subreg extracts where we used to create GR8 copies. We can probably make a helper function to bring down the repitition.

This does result in KMOVD being used for copies when BWI is available because we don't know the original mask register size. This caused a lot of deltas on tests because we have to split the checks for KMOVD vs KMOVW based on BWI.

Differential Revision: https://reviews.llvm.org/D30968

llvm-svn: 298928
2017-03-28 16:35:29 +00:00
Craig Topper
9982fc8657 [AVX-512] Pre-emptively fix more places in fastisel where we might copy a VK1 register into a AH/BH/CH/DH register.
llvm-svn: 297704
2017-03-14 04:18:25 +00:00
Craig Topper
d664044360 [AVX-512] Fix another case where we are copying from a mask register using AH/BH/CH/DH with fastisel.
Fixes PR32256. Still planning to do an audit for other possible cases.

llvm-svn: 297678
2017-03-13 21:58:54 +00:00
Craig Topper
d9b7c494e1 [AVX-512] Fix a bad use of a high GR8 register after copying from a mask register during fast isel. This ends up extracting from bits 15:8 instead of the lower bits of the mask.
I'm pretty sure there are more problems lurking here. But I think this fixes PR32241.

I've added the test case from that bug and added asserts that will fail if we ever try to copy between high registers and mask registers again.

llvm-svn: 297574
2017-03-12 03:37:37 +00:00
Ayman Musa
b9a27f087a [X86] Fix creating vreg def after use.
llvm-svn: 296601
2017-03-01 10:20:48 +00:00
Ayman Musa
b06a1c0393 [X86][AVX] Disable VCVTSS2SD & VCVTSD2SS memory folding and fix the register class of their first input when creating node in fast-isel.
(Quick fix to buildbot failure after rL295940 commit).

llvm-svn: 295970
2017-02-23 13:15:44 +00:00
Craig Topper
94a3ec0053 [X86] Remove scalar logical op alias instructions. Just use COPY_FROM/TO_REGCLASS and the normal packed instructions instead
Summary:
This patch removes the scalar logical operation alias instructions. We can just use reg class copies and use the normal packed instructions instead. This removes the need for putting these instructions in the execution domain fixing tables as was done recently.

I removed the loadf64_128 and loadf32_128 patterns as DAG combine creates a narrower load for (extractelt (loadv4f32)) before we ever get to isel.

I plan to add similar patterns for AVX512DQ in a future commit to allow use of the larger register class when available.

Reviewers: spatel, delena, zvi, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27401

llvm-svn: 288771
2016-12-06 04:58:39 +00:00
Craig Topper
f2a4047172 [X86] Remove unnecessary explicit uses of .SimpleTy just to do an equality comparison. MVT's operator== already takes care of this. NFCI
llvm-svn: 288646
2016-12-05 06:09:55 +00:00
Craig Topper
dcf7fd0dd3 [AVX-512] Teach fast isel to handle 512-bit vector bitcasts.
llvm-svn: 288641
2016-12-05 05:50:51 +00:00
Craig Topper
1b341af8f4 [AVX-512] Teach fast isel to use masked compare and movss for handling scalar cmp and select sequence when AVX-512 is enabled. This matches the behavior of normal isel.
llvm-svn: 288636
2016-12-05 04:51:31 +00:00
Peter Collingbourne
bc87b9fd38 IR: Change the gep_type_iterator API to avoid always exposing the "current" type.
Instead, expose whether the current type is an array or a struct, if an array
what the upper bound is, and if a struct the struct type itself. This is
in preparation for a later change which will make PointerType derive from
Type rather than SequentialType.

Differential Revision: https://reviews.llvm.org/D26594

llvm-svn: 288458
2016-12-02 02:24:42 +00:00
Zvi Rackover
efb78f1751 [X86][FastISel] Assert that we are dealing with arithmetic with overflow intrinsics. NFC
llvm-svn: 286961
2016-11-15 13:50:35 +00:00
Zvi Rackover
c18fb7a8cd [X86][FastISel] Fix lowering of overflow result on AVX512 targets
Summary:
    Fix a case where the overflow value of type i1, which is legal on AVX512, was assigned to a VK1 register class.
    We always want this value to be assigned to a GPR since the overflow return value is lowered to a SETO instruction.

    Fixes pr30981.

    Reviewers: mkuper, igorb, craig.topper, guyblank, qcolombet

    Subscribers: qcolombet, llvm-commits

    Differential Revision: https://reviews.llvm.org/D26620

llvm-svn: 286958
2016-11-15 13:29:23 +00:00
Guy Blank
4de6e44893 [X86][FastISel] Use a COPY from K register to a GPR instead of a K operation
The KORTEST was introduced due to a bug where a TEST instruction used a K register.
but, turns out that the opposite case of KORTEST using a GPR is now happening

The change removes the KORTEST flow and adds a COPY instruction from the K reg to a GPR.

Differential Revision: https://reviews.llvm.org/D24953

llvm-svn: 282580
2016-09-28 11:22:17 +00:00
Craig Topper
b3116ccd3a [AVX-512] Teach fastisel load/store handling to use EVEX encoded instructions for 128/256-bit vectors and scalar single/double.
Still need to fix the register classes to allow the extended range of registers.

llvm-svn: 280682
2016-09-05 23:58:40 +00:00
Craig Topper
4ac30e51b2 [X86] Make some static arrays of opcodes const and shrink to uint16_t. NFC
llvm-svn: 280649
2016-09-05 07:14:21 +00:00
Guy Blank
6e9549a062 [AVX512][FastISel] Do not use K registers in TEST instructions
In some cases, FastIsel was emitting TEST instruction with K reg input, which is illegal.
Changed to using KORTEST when dealing with K regs.

Differential Revision: https://reviews.llvm.org/D23163

llvm-svn: 279393
2016-08-21 08:02:27 +00:00
Justin Bogner
507d362929 Replace a few more "fall through" comments with LLVM_FALLTHROUGH
Follow up to r278902. I had missed "fall through", with a space.

llvm-svn: 278970
2016-08-17 20:30:52 +00:00
Justin Bogner
b5f5b0ef6d Replace "fallthrough" comments with LLVM_FALLTHROUGH
This is a mechanical change of comments in switches like fallthrough,
fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead.

llvm-svn: 278902
2016-08-17 05:10:15 +00:00
Matthias Braun
91722d430e MachineFunction: Return reference for getFrameInfo(); NFC
getFrameInfo() never returns nullptr so we should use a reference
instead of a pointer.

llvm-svn: 277017
2016-07-28 18:40:00 +00:00
Nico Weber
37fadfd6e6 Teach fast isel about the win64 calling convention.
This mostly just works.

Vectorcall rets are still not supported.

The win64_eh test change is because fast isel doesn't use rsi for temporary
computations, so it doesn't need to be pushed. The test case I'm changing was
originally added to test pushes, but by now there are other test cases in that
file exercising that code path.

https://reviews.llvm.org/D22422

llvm-svn: 275607
2016-07-15 20:18:37 +00:00
Nico Weber
2cf597abfa Teach fast isel calls and rets about stdcall.
stdcall is callee-pop like thiscall, so the thiscall changes already did most
of the work for this.  This change only opts stdcall in and adds tests.

llvm-svn: 275414
2016-07-14 13:54:26 +00:00
Nico Weber
d4fcb8d00e Teach fast isel about thiscall (and callee-pop) calls.
http://reviews.llvm.org/D22315

llvm-svn: 275360
2016-07-14 01:52:51 +00:00
Nico Weber
4da55bdb13 Teach FastISel about thiscall (and, hence, about callee-pop).
http://reviews.llvm.org/D22115

llvm-svn: 275135
2016-07-12 01:30:35 +00:00
Elena Demikhovsky
aa7fc35df7 Re-commit of 274613.
The prev commit failed on compilation.
A minor change in one pattern in lib/Target/X86/X86InstrAVX512.td fixes the failure.

llvm-svn: 274626
2016-07-06 14:15:43 +00:00
Elena Demikhovsky
2176251798 Reverted 274613 due to compilation failue.
llvm-svn: 274615
2016-07-06 09:11:49 +00:00
Elena Demikhovsky
a0b86f7044 AVX-512: Optimization for patterns with i1 scalar type
The patch removes redundant kmov instructions (not all, we still have a lot of work here) and redundant "and" instructions after "setcc".
I use "AssertZero" marker between X86ISD::SETCC node and "truncate" to eliminate extra "and $1" instruction.
I also changed zext, aext and trunc patterns in the .td file. It allows to remove extra "kmov" instruictions.

This patch fixes https://llvm.org/bugs/show_bug.cgi?id=28173.

Fast ISEL mode is not supported correctly for AVX-512. ICMP/FCMP scalar instruction should return result in k-reg. It will be fixed in one of the next patches. I redirected handling of "cmp" to the DAG builder mode. (The code looks worse in one specific test case, but without this fix the new patch fails).

Differential revision: http://reviews.llvm.org/D21956

llvm-svn: 274613
2016-07-06 09:01:20 +00:00
Rafael Espindola
5d2ec2042d Delete unused includes. NFC.
llvm-svn: 274225
2016-06-30 12:19:16 +00:00
Duncan P. N. Exon Smith
193410d6d7 CodeGen: Use MachineInstr& in TargetInstrInfo, NFC
This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr.  This is a
general API improvement.

Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other.  Instead I've done everything as a block and just
updated what was necessary.

This is mostly mechanical fixes: adding and removing `*` and `&`
operators.  The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency.  Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy.  I couldn't run tests
for AVR since llc doesn't link with it turned on.

llvm-svn: 274189
2016-06-30 00:01:54 +00:00
David Majnemer
f2b411238d Switch more loops to be range-based
This makes the code a little more concise, no functional change is
intended.

llvm-svn: 273644
2016-06-24 04:05:21 +00:00
Benjamin Kramer
e80783f62f Pass DebugLoc and SDLoc by const ref.
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.

llvm-svn: 272512
2016-06-12 15:39:02 +00:00