1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00
Commit Graph

145475 Commits

Author SHA1 Message Date
Xin Tong
dec22204f0 Fix a bug when unswitching on partial LIV for SwitchInst
Summary: Fix a bug when unswitching on partial LIV for SwitchInst.

Reviewers: hfinkel, efriedma, sanjoy

Reviewed By: sanjoy

Subscribers: david2050, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D29107

llvm-svn: 296363
2017-02-27 18:00:13 +00:00
Rong Xu
f5043842df Fix comments. NFC.
Change "Thin-LTO" to "ThinLTO" in the comments for consistency.

llvm-svn: 296362
2017-02-27 17:59:01 +00:00
Steven Wu
4d961ac155 Fix LLVM module build
Add WasmRelocs/WebAssembly.def to textual include header.

llvm-svn: 296356
2017-02-27 16:56:37 +00:00
Craig Topper
093751b7fa [X86] Use APInt instead of SmallBitVector tracking undef elements from getTargetConstantBitsFromNode and getConstVector.
Summary:
SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here use more than those number of elements and therefore cause a malloc.

APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30392

llvm-svn: 296355
2017-02-27 16:15:32 +00:00
Craig Topper
27f2fe5729 [X86] Use APInt instead of SmallBitVector for tracking Zeroable elements in shuffle lowering
Summary:
SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here use more than those number of elements and therefore cause a malloc.

APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30390

llvm-svn: 296354
2017-02-27 16:15:30 +00:00
Craig Topper
5eb13883a0 [X86] Fix SmallVector sizes in constant pool shuffle decoding to avoid heap allocation
Some of the vectors are under sized to avoid heap allocation. In one case the vector was oversized.

Differential Revision: https://reviews.llvm.org/D30387

llvm-svn: 296353
2017-02-27 16:15:27 +00:00
Craig Topper
f2749ccb21 [X86] Use APInt instead of SmallBitVector for tracking undef elements in constant pool shuffle decoding
Summary:
SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here use more than those number of elements and therefore cause a malloc.

APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt. This will incur a minor increase in stack usage due to APInt storing the bit count separately from the data bits unlike SmallBitVector, but that should be ok.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30386

llvm-svn: 296352
2017-02-27 16:15:25 +00:00
Amaury Sechet
dd0f513601 Remove an empty line in icmp-illegal.ll . NFC
llvm-svn: 296350
2017-02-27 16:09:44 +00:00
Alexey Bataev
5fe52fcb8f [SLP] A test for a fix of PR32038.
llvm-svn: 296349
2017-02-27 16:07:10 +00:00
Artur Pilipenko
fd7893d800 Loop predication expand both sides of the widened condition
This is a fix for a loop predication bug which resulted in malformed IR generation.

Loop invariant side of the widened condition is not guaranteed to be available in the preheader as is, so we need to expand it as well. See added unsigned_loop_0_to_n_hoist_length test for example.

Reviewed By: sanjoy, mkazantsev

Differential Revision: https://reviews.llvm.org/D30099

llvm-svn: 296345
2017-02-27 15:44:49 +00:00
Sjoerd Meijer
1544720770 AArch64InstPrinter: rewrite of printSysAlias
This is a cleanup/rewrite of the printSysAlias function. This was not using the
tablegen instruction descriptions, but was "manually" decoding the
instructions. This has been replaced with calls to lookup_XYZ_ByEncoding
tablegen calls.

This revealed several problems. First, instruction IVAU had the wrong encoding.
This was cancelled out by the parser that incorrectly matched the wrong
encoding. Second, instruction CVAP was missing from the SystemOperands tablegen
descriptions, so this has been added. And third, the required target features
were not captured in the tablegen descriptions, so support for this has also
been added.

Differential Revision: https://reviews.llvm.org/D30329

llvm-svn: 296343
2017-02-27 14:45:34 +00:00
John Brawn
3b54618ff1 [ARM] LSL #0 is an alias of MOV
Currently we handle this correctly in arm, but in thumb we don't which leads to
an unpredictable instruction being emitted for LSL #0 in an IT block and SP not
being permitted in some cases when it should be.

For the thumb2 LSL we can handle this by making LSL #0 an alias of MOV in the
.td file, but for thumb1 we need to handle it in checkTargetMatchPredicate to
get the IT handling right. We also need to adjust the handling of
MOV rd, rn, LSL #0 to avoid generating the 16-bit encoding in an IT block. We
should also adjust it to allow SP in the same way that it is allowed in
MOV rd, rn, but I haven't done that here because it looks like it would take
quite a lot of work to get right.

Additionally correct the selection of the 16-bit shift instructions in
processInstruction, where it was checking if the two registers were equal when
it should have been checking if they were low. It appears that previously this
code was never executed and the 16-bit encoding was selected by default, but
the other changes I've done here have somehow made it start being used.

Differential Revision: https://reviews.llvm.org/D30294

llvm-svn: 296342
2017-02-27 14:40:51 +00:00
Artur Pilipenko
79793a245a [DAGCombine] Fix for a load combine bug with non-zero offset patterns on BE targets
This pattern is essentially a i16 load from p+1 address:

  %p1.i16 = bitcast i8* %p to i16*
  %p2.i8 = getelementptr i8, i8* %p, i64 2
  %v1 = load i16, i16* %p1.i16
  %v2.i8 = load i8, i8* %p2.i8
  %v2 = zext i8 %v2.i8 to i16
  %v1.shl = shl i16 %v1, 8
  %res = or i16 %v1.shl, %v2

Current implementation would identify %v1 load as the first byte load and would mistakenly emit a i16 load from %p1.i16 address. This patch adds a check that the first byte is loaded from a non-zero offset of the first load address. This way this address can be used as the base address for the combined value. Otherwise just give up combining.

llvm-svn: 296336
2017-02-27 13:04:23 +00:00
Artur Pilipenko
f0609ae712 [DAGCombine] NFC. MatchLoadCombine extract MemoryByteOffset lambda helper
This refactoring will simplify the upcoming change to fix the bug in folding patterns with non-zero offsets on BE targets.

llvm-svn: 296332
2017-02-27 11:42:54 +00:00
Artur Pilipenko
82c830ce88 [DAGCombine] NFC. MatchLoadCombine remember the first byte provider, not the load node
This refactoring will simplify the upcoming change to fix a bug in folding patterns with non-zero offsets on BE targets.

llvm-svn: 296331
2017-02-27 11:40:14 +00:00
Sjoerd Meijer
a33acc6b40 AArch64AsmParser: don't try to parse “[1]” for non-vector register operands
There are no instructions that have "[1]" as part of the assembly string;
FMOVXDhighr is out of date. This removes dead code.

Differential Revision: https://reviews.llvm.org/D30165

llvm-svn: 296327
2017-02-27 10:51:11 +00:00
Konstantin Zhuravlyov
22783c58d6 [AMDGPU] Runtime metadata fixes:
- Verify that runtime metadata is actually valid runtime metadata when assembling, otherwise we could accept the following when assembling, but ocl runtime will reject it:
    .amdgpu_runtime_metadata
    { amd.MDVersion: [ 2, 1 ], amd.RandomUnknownKey, amd.IsaInfo: ...
  - Make IsaInfo optional, and always emit it.

Differential Revision: https://reviews.llvm.org/D30349

llvm-svn: 296324
2017-02-27 07:55:17 +00:00
Brian Cain
bcfe1c3171 llvm-mc-fuzzer: add support for assembly
This creates an llvm-mc-disassemble-fuzzer from the existing llvm-mc-fuzzer
and finishing the assemble support in llvm-mc-assemble-fuzzer.

llvm-svn: 296323
2017-02-27 06:22:17 +00:00
Craig Topper
67aa87241c [APInt] Use UINT64_MAX instead of ~integerPart(0). NFC
llvm-svn: 296322
2017-02-27 06:05:33 +00:00
Craig Topper
8f6c9e9564 [X86] Check for less than 0 rather than explicit compare with -1. NFC
llvm-svn: 296321
2017-02-27 06:05:30 +00:00
Amaury Sechet
0f0c173f03 Do full codegen for various tests. NFC
llvm-svn: 296305
2017-02-27 01:15:57 +00:00
Craig Topper
3de998317b [APInt] Use UINT64_MAX instead of ~uint64_t(0ULL). NFC
llvm-svn: 296301
2017-02-26 21:15:18 +00:00
Craig Topper
5fc36545f5 [APInt] Use UINT64_MAX instead of ~0ULL. NFC
llvm-svn: 296300
2017-02-26 19:28:48 +00:00
Craig Topper
0d258134e5 [APInt] Remove unnecessary early out from getLowBitsSet. The same case is handled equally well by the next check.
llvm-svn: 296299
2017-02-26 19:28:45 +00:00
Xin Tong
076cefc8b6 Update comments. NFCI
llvm-svn: 296298
2017-02-26 19:08:44 +00:00
Daniel Jasper
25b14cabc6 Revert "[CGP] Split some critical edges coming out of indirect branches"
This reverts commit r296149 as it leads to crashes when compiling for
PPC.

llvm-svn: 296295
2017-02-26 11:09:12 +00:00
Davide Italiano
88dbb0f3a6 [LoopDeletion] Modernize and simplify a bit. NFCI.
llvm-svn: 296294
2017-02-26 07:08:20 +00:00
Craig Topper
eb222a1f20 [X86] Fix execution domain for cmpss/sd instructions.
llvm-svn: 296293
2017-02-26 06:45:59 +00:00
Craig Topper
07beea400f [AVX-512] Fix execution domain for scalar commutable min/max instructions.
llvm-svn: 296292
2017-02-26 06:45:56 +00:00
Craig Topper
cd2e98d13c [AVX-512] Fix execution domain for vmovhpd/lpd/hps/lps.
llvm-svn: 296291
2017-02-26 06:45:54 +00:00
Craig Topper
72832af4b5 [AVX-512] Fix the execution domain for AVX-512 integer broadcasts.
llvm-svn: 296290
2017-02-26 06:45:51 +00:00
Craig Topper
034f289e99 [AVX-512] Disable the redundant patterns in the VPBROADCASTBr_Alt and VPBROADCASTWr_Alt instructions. NFC
llvm-svn: 296289
2017-02-26 06:45:48 +00:00
Craig Topper
8602c69a33 [AVX-512] Fix execution domain for VPMADD52 instructions.
llvm-svn: 296288
2017-02-26 06:45:45 +00:00
Craig Topper
579c4f1248 [AVX-512] Use update_llc_test_checks.py to regenerate a test.
llvm-svn: 296287
2017-02-26 06:45:43 +00:00
Craig Topper
f9f50758ca [AVX-512] Fix the execution domain for VSCALEF instructions.
llvm-svn: 296286
2017-02-26 06:45:40 +00:00
Craig Topper
2ae557f8c7 [AVX-512] Fix execution domain of scalar VRANGE/REDUCE/GETMANT with sae.
llvm-svn: 296285
2017-02-26 06:45:37 +00:00
Craig Topper
8cee5eba56 [X86] Fix the execution domain for scalar SQRT intrinsic instruction.
llvm-svn: 296284
2017-02-26 06:45:35 +00:00
Craig Topper
631081c886 [X86] Add an additional CHECK prefix to a test. Some of the cases used it, but it wasn't on the FileCheck command lines.
llvm-svn: 296283
2017-02-26 06:45:32 +00:00
Xin Tong
d27a181482 [SCCP] Remove manual folding of terminator instructions.
Summary:
BranchInst, SwitchInst (with non-default case) with Undef as input is not
possible at this point. As we always default-fold terminator to one target in
ResolvedUndefsIn and set the input accordingly.

So we should only have constantint/blockaddress here.

If ConstantFoldTerminator fails, that could mean 2 things.

1. ConstantFoldTerminator is doing something unexpected, i.e. not folding on constantint
or blockaddress and not making blocks that should be dead dead.
2. This is not a terminator on constantint or blockaddress. Its on a constant or
overdefined, then this block should not be dead.

In both cases, we should assert.

Reviewers: davide, efriedma, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30381

llvm-svn: 296281
2017-02-26 02:11:24 +00:00
David L. Jones
980fb8997c [X86] Clean up test/CodeGen/X86/2006-03-02-InstrSchedBug.ll
Summary:
Migrated from grep to FileCheck.
Re-indented code, removed boilerplate comments.
Added 'entry' label at beginning of basic block.

Patch by Jorge Gorbe!

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30320

llvm-svn: 296280
2017-02-26 01:32:35 +00:00
Nirav Dave
e1556d9e43 Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
This reverts commit r296252 until 256-bit operations are more efficiently generated in X86.

llvm-svn: 296279
2017-02-26 01:27:32 +00:00
Eric Christopher
9ee2dba169 vec perm can go down either pipeline on P8.
No observable changes, spotted while looking at the scheduling description.

llvm-svn: 296277
2017-02-26 00:11:58 +00:00
Sanjoy Das
97ea3c33f1 Fix signed-unsigned comparison warning
llvm-svn: 296274
2017-02-25 22:25:48 +00:00
Sanjoy Das
059733f666 [ValueTracking] Don't do an unchecked shift in ComputeNumSignBits
Summary:
Previously we used to return a bogus result, 0, for IR like `ashr %val,
-1`.

I've also added an assert checking that `ComputeNumSignBits` at least
returns 1.  That assert found an already checked in test case where we
were returning a bad result for `ashr %val, -1`.

Fixes PR32045.

Reviewers: spatel, majnemer

Reviewed By: spatel, majnemer

Subscribers: efriedma, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D30311

llvm-svn: 296273
2017-02-25 20:30:45 +00:00
Simon Pilgrim
879a5b0804 [APInt] Add APInt::extractBits() method to extract APInt subrange (reapplied)
The current pattern for extract bits in range is typically:

Mask.lshr(BitOffset).trunc(SubSizeInBits);

Which can be particularly slow for large APInts (MaskSizeInBits > 64) as they require the allocation of memory for the temporary variable.

This is another of the compile time issues identified in PR32037 (see also D30265).

This patch adds the APInt::extractBits() helper method which avoids the temporary memory allocation.

Differential Revision: https://reviews.llvm.org/D30336

llvm-svn: 296272
2017-02-25 20:01:58 +00:00
Craig Topper
730c9e7603 [AVX-512] Fix the execution domain for scalar FMA instructions.
llvm-svn: 296271
2017-02-25 19:36:28 +00:00
Craig Topper
c3fdaa59e0 [AVX-512] Fix the execution domain on some instructions.
llvm-svn: 296270
2017-02-25 19:18:11 +00:00
Craig Topper
d15be1812e [AVX-512] Add an additional test case to show the execution domain for vrqsrtsd is wrong.
llvm-svn: 296269
2017-02-25 19:18:08 +00:00
Craig Topper
c04d5045bf [AVX-512] Use update_llc_test_checks.py to regenerate the avx512er intrinsic test.
llvm-svn: 296268
2017-02-25 19:18:04 +00:00
Nirav Dave
6060bdbe57 reenable accidentally disabled test NFC.
llvm-svn: 296266
2017-02-25 19:11:53 +00:00