1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00
Commit Graph

1235 Commits

Author SHA1 Message Date
Chandler Carruth
29cc44c07b [x86] Fix nasty bug in the x86 backend that is essentially impossible to
hit from IR but creates a minefield for MI passes.

The x86 backend has fairly powerful logic to try and fold loads that
feed register operands to instructions into a memory operand on the
instruction. This is almost always a good thing, but there are specific
relocated loads that are only allowed to appear in specific
instructions. Notably, R_X86_64_GOTTPOFF is only allowed in `movq` and
`addq`. This patch blocks folding of memory operands using this
relocation unless the target is in fact `addq`.

The particular relocation indicates why we simply don't hit this under
normal circumstances. This relocation is only used for TLS, and it gets
used in very specific ways in conjunction with %fs-relative addressing.
The result is that loads using this relocation are essentially never
eligible for folding into an instruction's memory operands. Unless, of
course, you have an MI pass that inserts usage of such a load. I have
exactly such an MI pass and was greeted by truly mysterious miscompiles
where the linker replaced my instruction with a completely garbage byte
sequence. Go team.

This is the only such relocation I'm aware of in x86, but there may be
others that need to be similarly restricted.

Fixes PR36165.

Differential Revision: https://reviews.llvm.org/D42732

llvm-svn: 324546
2018-02-07 23:59:14 +00:00
Craig Topper
fcf8112fe2 [X86] When doing callee save/restore for k-registers make sure we don't use KMOVQ on non-BWI targets
If we are saving/restoring k-registers, the default behavior of getMinimalRegisterClass will find the VK64 class with a spill size of 64 bits. This will cause the KMOVQ opcode to be used for save/restore. If we don't have have BWI instructions we need to constrain the class returned to give us VK16 with a 16-bit spill size. We can do this by passing the either v16i1 or v64i1 into getMinimalRegisterClass.

Also add asserts to make sure BWI is enabled anytime we use KMOVD/KMOVQ. These are what caught this bug.

Fixes PR36256

Differential Revision: https://reviews.llvm.org/D42989

llvm-svn: 324533
2018-02-07 21:41:50 +00:00
Nirav Dave
c405b8636c [X86] Teach DAG unfoldMemoryOperand to reconvert CMPs to tests
Summary:
Copy MI-level cmp->test conversion to SelectionDAG-level memory unfold.
This fixes a regression from upcoming D41293 change.

Reviewers: craig.topper, RKSimon

Reviewed By: craig.topper

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D42808

llvm-svn: 324261
2018-02-05 18:58:58 +00:00
Amaury Sechet
33cce86cf0 [X86] Avoid using high register trick for test instruction
Summary:
It seems it's main effect is to create addition copies when values are inr register that do not support this trick, which increase register pressure and makes the code bigger.

Reviewers: craig.topper, niravd, spatel, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42646

llvm-svn: 323888
2018-01-31 16:48:54 +00:00
Eric Liu
96b3480268 Revert "[X86] Avoid using high register trick for test instruction"
This reverts commit r323690. This causes crash in llc. See the original commit thread for details.

llvm-svn: 323761
2018-01-30 14:18:33 +00:00
Amaury Sechet
90c261493f [X86] Avoid using high register trick for test instruction
Summary:
It seems it's main effect is to create addition copies when values are inr register that do not support this trick, which increase register pressure and makes the code bigger.

The main noteworthy regression I was able to observe was pattern of the type (setcc (trunc (and X, C)), 0) where C is such as it would benefit from the hi register trick. To prevent this, a new pattern is added to materialize such pattern using a 32 bits test. This has the added benefit of working with any constant that is materializable as a 32bits immediate, not just the ones that can leverage the high register trick, as demonstrated by the test case in test-shrink.ll using the constant 2049 .

Reviewers: craig.topper, niravd, spatel, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42646

llvm-svn: 323690
2018-01-29 20:54:33 +00:00
Craig Topper
fae7861fa0 [X86] Name the MMX phaddd instruction with 3 Ds instead of just 2. NFC
llvm-svn: 323403
2018-01-25 04:45:32 +00:00
Craig Topper
cc7c2fbcc1 [X86] Remove 64/128/256 from MMX/SSE/AVX instruction names for overall consistency. NFC
MMX instrutions all start with MMX_ so the 64 isn't needed for disambigutation.
SSE/AVX1 instructions are assumed 128-bit so we don't need to say 128.
AVX2 instructions should use a Y to indicate 256-bits.

llvm-svn: 323402
2018-01-25 04:45:30 +00:00
Craig Topper
34467739ab [X86] Adjust names of PINSRW/PEXTRW intructions between MMX/SSE/AVX/AVX512 for consistency and to maybe enable more regular expression compaction in the scheduler models. NFCI
llvm-svn: 323352
2018-01-24 17:58:51 +00:00
Craig Topper
dba9cc357a [X86] Rename 256-bit VFRCZ instructions to have the Y before the rr/rm to match other instructions. NFC
llvm-svn: 323304
2018-01-24 05:14:39 +00:00
Craig Topper
9c69554132 [X86] Move 'Int_' to the end of the name of the VCOMISS/VUCOMISS and instructions to get them picked up by the scheduler model regexs.
All other intrinsic instructions put the _Int on the end. This make these instructions consistent and gets the prefix instregexs in the scheduler models to pick them up.

llvm-svn: 323261
2018-01-23 21:37:51 +00:00
Craig Topper
407f25d133 [X86] Add missing MOVSX/MOVZX instructions to load folding tables.
I'm not sure there's any way to generate these folding cases especially the movzx ones since even the register form is never emitted by codegen.

I'm just adding them to remove the difference with the autogenerated version of the folding table.

llvm-svn: 323200
2018-01-23 14:09:22 +00:00
Marina Yatsina
a62d1432a7 Fix bug in commit 323096 exposed by test in test-suite-verify-machineinstrs-x86_64h-O3
Change-Id: I0a4b10d0d6c8de606d989c567ec07944ae283a87
llvm-svn: 323126
2018-01-22 15:31:05 +00:00
Marina Yatsina
3cb5f18fb0 Break false dependencies for POPCNT, LZCNT, TZCNT
Add POPCNT, LZCNT, TZCNT to the list of instructions that have false dependency.
Add a test to make sure BreakFalseDeps breaks the dependencies for these instructions.
Update affected tests.

This fixes bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869

This is the final of multiple patches that fix this bugzilla.
Most of the patches are intended at refactoring the existent code.

Reviews of the refactoring done to enable this change:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40331
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333

Differential Revision: https://reviews.llvm.org/D40334

Change-Id: If95cbf1a3f5c7dccff8f1b22ecb397542147303d
llvm-svn: 323096
2018-01-22 10:07:01 +00:00
Marina Yatsina
994e1aad23 Separate ExecutionDepsFix into 4 parts:
1. ReachingDefsAnalysis - Allows to identify for each instruction what is the “closest” reaching def of a certain register. Used by BreakFalseDeps (for clearance calculation) and ExecutionDomainFix (for arbitrating conflicting domains).
2. ExecutionDomainFix - Changes the variant of the instructions in order to minimize domain crossings.
3. BreakFalseDeps - Breaks false dependencies.
4. LoopTraversal - Creatws a traversal order of the basic blocks that is optimal for loops (introduced in revision L293571). Both ExecutionDomainFix and ReachingDefsAnalysis use this to determine the order they will traverse the basic blocks.

This also included the following changes to ExcecutionDepsFix original logic:
1. BreakFalseDeps and ReachingDefsAnalysis logic no longer restricted by a register class.
2. ReachingDefsAnalysis tracks liveness of reg units instead of reg indices into a given reg class.

Additional changes in affected files:
1. X86 and ARM targets now inherit from ExecutionDomainFix instead of ExecutionDepsFix. BreakFalseDeps also was added to the passes they activate.
2. Comments and references to ExecutionDepsFix replaced with ExecutionDomainFix and BreakFalseDeps, as appropriate.

Additional refactoring changes will follow.

This commit is (almost) NFC.
The only functional change is that now BreakFalseDeps will break dependency for all register classes.
Since no additional instructions were added to the list of instructions that have false dependencies, there is no actual change yet.
In a future commit several instructions (and tests) will be added.

This is the first of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40331
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40330

Change-Id: Icaeb75e014eff96a8f721377783f9a3e6c679275
llvm-svn: 323087
2018-01-22 10:05:23 +00:00
Simon Pilgrim
8a7237fd4f Avoid Wparentheses warning.
llvm-svn: 322526
2018-01-15 22:40:06 +00:00
Simon Pilgrim
a8d108687b [X86][MMX] Add support for MMX zero vector creation
As mentioned on PR35869, (and came up recently on D41517) we don't create a MMX zero register via the PXOR but instead perform a spill to stack from a XMM zero register.

This patch adds support for direct MMX zero vector creation and should make it easier to add better constant vector creation in the future as well.

Differential Revision: https://reviews.llvm.org/D41908

llvm-svn: 322525
2018-01-15 22:32:40 +00:00
Simon Pilgrim
9a1d5eeb3e [X86][SSE] Add custom execution domain fixing for BLENDPD/BLENDPS/PBLENDD/PBLENDW (PR34873)
Add support for custom execution domain fixing and implement support for BLENDPD/BLENDPS/PBLENDD/PBLENDW.

Differential Revision: https://reviews.llvm.org/D42042

llvm-svn: 322524
2018-01-15 22:18:45 +00:00
Jessica Paquette
3075bb4222 [MachineOutliner] AArch64: Handle instrs that use SP and will never need fixups
This commit does two things. Firstly, it adds a collection of flags which can
be passed along to the target to encode information about the MBB that an
instruction lives in to the outliner.

Second, it adds some of those flags to the AArch64 outliner in order to add
more stack instructions to the list of legal instructions that are handled
by the outliner. The two flags added check if

- There are calls in the MachineBasicBlock containing the instruction
- The link register is available in the entire block

If the link register is available and there are no calls, then a stack
instruction can always be outlined without fixups, regardless of what it is,
since in this case, the outliner will never modify the stack to create a
call or outlined frame.

The motivation for doing this was checking which instructions are most often
missed by the outliner. Instructions like, say

%sp<def> = ADDXri %sp, 32, 0; flags: FrameDestroy

are very common, but cannot be outlined in the case that the outliner might
modify the stack. This commit allows us to outline instructions like this.
  

llvm-svn: 322048
2018-01-09 00:26:18 +00:00
Craig Topper
df5dbc87a9 [X86] Add VSHUFF32X4 and similar instructions to load folding tables.
llvm-svn: 321978
2018-01-07 23:30:20 +00:00
Craig Topper
0141386f6a [X86] Add the 16 and 8-bit CRC32 instructions to the load folding tables.
llvm-svn: 321958
2018-01-07 06:48:20 +00:00
Craig Topper
769fb85b49 [X86] Correct the load folding flags for xmm fp->mmx conversion instructions.
The instructions that load 64-bits or an xmm register should be TB_NO_REVERSE to avoid the load being widened during unfold. The instructions that load 128-bits need to ensure 128-bit alignment.

llvm-svn: 321956
2018-01-07 06:24:30 +00:00
Craig Topper
fbb7173edc [X86] Add TB_NO_REVERSE to some scalar intrinsic instructions in the load folding table.
llvm-svn: 321955
2018-01-07 06:24:29 +00:00
Craig Topper
3dce50d8f0 [X86] Don't put any EVEX_B instructions in the tablegen generated load folding tables.
EVEX_B means different things for memory and register forms. The instructions should not be considered equivalent.

llvm-svn: 321954
2018-01-07 06:24:28 +00:00
Craig Topper
dfe08ed09d [X86] Add 128 and 256-bit VPOPCNTD/Q instructions to load folding tables.
llvm-svn: 321953
2018-01-07 06:24:27 +00:00
Craig Topper
f03febaf72 [X86] Add some 8 and 16-bit instructions to the load folding tables.
llvm-svn: 321952
2018-01-07 06:24:25 +00:00
Craig Topper
b7cd82f9cb [X86] Add EVEX vcvtph2ps to the load folding tables.
llvm-svn: 321951
2018-01-07 06:24:24 +00:00
Craig Topper
b7cde4edac [X86] Remove cvtps2ph xmm->xmm from store folding tables. Add the evex versions of cvtps2ph to the store folding tables.
The memory form of the xmm->xmm version only writes 64-bits. If we use it in the folding tables and its get used for a stack spill, only half the slot will be written. Then a reload may read all 128-bits which will pull in garbage. But without the spill the upper bits of the register would have been zero. By not folding we would preserve the zeros.

llvm-svn: 321950
2018-01-07 06:24:23 +00:00
Craig Topper
c8b420137b [X86] Add CMP8ri8 to load folding tables.
llvm-svn: 321949
2018-01-07 06:24:21 +00:00
Simon Pilgrim
679cc36c77 Remove superfluous break after a return. NFCI.
llvm-svn: 320941
2017-12-17 11:01:33 +00:00
Matthias Braun
ddd8ed6709 MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

llvm-svn: 320884
2017-12-15 22:22:58 +00:00
Craig Topper
03fd9201c8 [X86] Rename some instructions that start with Int_ to have the _Int at the end.
This matches AVX512 version and is more consistent overall. And improves our scheduler models.

In some cases this adds _Int to instructions that didn't have any Int_ before. It's a side effect of the adjustments made to some of the multiclasses.

llvm-svn: 320325
2017-12-10 19:47:56 +00:00
Craig Topper
b45529f305 [X86] Fix a few instructions that were named Z512 instead of just Z.
This makes things consistent with our normal instruction naming.

llvm-svn: 320316
2017-12-10 17:42:39 +00:00
Craig Topper
c0b0d99a9c [X86] Rename some instructions so that 'b' is added as a suffix instead of replacing an 'r'
llvm-svn: 320290
2017-12-10 09:14:38 +00:00
Craig Topper
0b0595cce7 [X86] Rename the rb form of scalar ADD/SUB/MUL/DIV to include _Int since they can only be selected by intrinsics.
llvm-svn: 320283
2017-12-10 04:07:28 +00:00
Francis Visoiu Mistrih
a2d7c39420 [CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.
Work towards the unification of MIR and debug output by refactoring the
interfaces.

For MachineOperand::print, keep a simple version that can be easily called
from `dump()`, and a more complex one which will be called from both the
MIRPrinter and MachineInstr::print.

Add extra checks inside MachineOperand for detached operands (operands
with getParent() == nullptr).

https://reviews.llvm.org/D40836

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/<def>//g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<def[ ]*,[ ]*dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ]*,[ ]*dead>/implicit-def dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g'

llvm-svn: 320022
2017-12-07 10:40:31 +00:00
Hans Wennborg
038a8b425f Re-commit r319490 "XOR the frame pointer with the stack cookie when protecting the stack"
The patch originally broke Chromium (crbug.com/791714) due to its failing to
specify that the new pseudo instructions clobber EFLAGS. This commit fixes
that.

> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622

llvm-svn: 319824
2017-12-05 20:22:20 +00:00
Hans Wennborg
96b1a36cd4 Revert r319490 "XOR the frame pointer with the stack cookie when protecting the stack"
This broke the Chromium build (crbug.com/791714). Reverting while investigating.

> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622
>
> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319490 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-svn: 319706
2017-12-04 22:21:15 +00:00
Zachary Turner
dec9bd8187 Mark all library options as hidden.
These command line options are not intended for public use, and often
don't even make sense in the context of a particular tool anyway. About
90% of them are already hidden, but when people add new options they
forget to hide them, so if you were to make a brand new tool today, link
against one of LLVM's libraries, and run tool -help you would get a
bunch of junk that doesn't make sense for the tool you're writing.

This patch hides these options. The real solution is to not have
libraries defining command line options, but that's a much larger effort
and not something I'm prepared to take on.

Differential Revision: https://reviews.llvm.org/D40674

llvm-svn: 319505
2017-12-01 00:53:10 +00:00
Reid Kleckner
caef969e5d XOR the frame pointer with the stack cookie when protecting the stack
Summary: This strengthens the guard and matches MSVC.

Reviewers: hans, etienneb

Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits

Differential Revision: https://reviews.llvm.org/D40622

llvm-svn: 319490
2017-11-30 22:41:21 +00:00
Francis Visoiu Mistrih
961f3df27b [CodeGen] Print register names in lowercase in both MIR and debug output
As part of the unification of the debug format and the MIR format,
always print registers as lowercase.

* Only debug printing is affected. It now follows MIR.

Differential Revision: https://reviews.llvm.org/D40417

llvm-svn: 319187
2017-11-28 17:15:09 +00:00
Craig Topper
ccc25a5474 [X86] Don't invert NewCC variable while processing the jcc/setcc/cmovcc instructions in optimizeCompareInstr.
The NewCC variable is calculated outside of the loop that processes jcc/setcc/cmovcc instructions. If we invert it during the loop it can cause an incorrect value to be used by a later iteration. Instead only read it during the loop and use a new variable to store the possibly inverted value.

Fixes PR35399.

llvm-svn: 318934
2017-11-23 19:25:45 +00:00
Coby Tayree
836d1e6a37 [x86][icelake]vpclmulqdq introduction
an icelake promotion of pclmulqdq
Differential Revision: https://reviews.llvm.org/D40101

llvm-svn: 318741
2017-11-21 09:30:33 +00:00
Craig Topper
1561471af3 [X86] Add scalar register class versions of VRNDSCALE instructions and rename the existing versions to _Int.
This is consistent with out normal implementation of scalar instructions.

While there disable load folding for the patterns with IMPLICIT_DEF unless optimizing for size which is also our standard practice.

llvm-svn: 317977
2017-11-11 08:24:15 +00:00
Craig Topper
34c0da4315 [X86] Prevent fast isel from folding loads into the instructions listed in hasPartialRegUpdate.
This patch moves the check for opt size and hasPartialRegUpdate into the lower level implementation of foldMemoryOperandImpl to catch the entry point that fast isel uses.

We're still folding undef register instructions in AVX that we should also probably disable, but that's a problem for another patch.

Unfortunately, this requires reordering a bunch of functions which is why the diff is so large. I can do the function reordering separately if we want.

Differential Revision: https://reviews.llvm.org/D39402

llvm-svn: 317112
2017-11-01 18:10:06 +00:00
Craig Topper
232eac1fb9 [X86] Make AVX512_512_SET0 XMM16-31 lower to 128-bit XOR when AVX512VL is enabled. Use 128-bit VLX instruction when VLX is enabled.
Unfortunately, this weakens our ability to do domain fixing when AVX512DQ is not enabled, but it is consistent with our 256-bit behavior.

Maybe we should add custom handling to domain fixing to allow EVEX integer XOR/AND/OR/ANDN to switch to VEX encoded fp instructions if the high registers aren't being used?

llvm-svn: 316978
2017-10-31 06:01:04 +00:00
Craig Topper
bd4dda833e [X86] Rearrange code in X86InstrInfo.cpp to put all the foldMemoryOperandImpl methods together without partial/undef register handling in the middle. NFC
I have a future patch that wants to make use of the one of the partial functions in one of the earlier memory folding methods and the current ordering prevents that.

llvm-svn: 316883
2017-10-30 04:39:18 +00:00
Simon Pilgrim
58c86ea085 Strip trailing whitespace. NFCI.
llvm-svn: 316277
2017-10-21 20:40:49 +00:00
Simon Pilgrim
b78c13ea43 [X86][SSE] Add extractps/pextrd equivalence to domain tables
Differential Revision: https://reviews.llvm.org/D39135

llvm-svn: 316274
2017-10-21 20:19:48 +00:00
Craig Topper
e9842daa62 [X86] Add AVX512 versions of VCVTPD2PS to load folding tables.
llvm-svn: 315801
2017-10-14 05:55:43 +00:00