1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00
Commit Graph

173920 Commits

Author SHA1 Message Date
Craig Topper
e40654788f [X86] Remove mask parameter from avx512 pmultishiftqb intrinsics. Use select in IR instead.
Fixes PR40259

llvm-svn: 351035
2019-01-14 08:46:45 +00:00
Craig Topper
17da75c559 [X86] Add new test file that was supposed to go with r351028.
llvm-svn: 351034
2019-01-14 08:46:42 +00:00
Craig Topper
5224b99e99 [X86] Update type profile for DBPSADBW to indicate the immediate is an i8 not just any int.
Removes some type checks from X86GenDAGISel.inc

llvm-svn: 351033
2019-01-14 02:59:08 +00:00
Craig Topper
2ced6c1929 [X86] Remove unused intrinsic handlers. NFC
llvm-svn: 351032
2019-01-14 01:56:59 +00:00
Craig Topper
7ac83d1889 [X86] Remove FPCLASS intrinsic handler. Use INTR_TYPE_2OP instead. NFC
llvm-svn: 351031
2019-01-14 01:44:09 +00:00
Craig Topper
bc9d78eb23 [X86] Remove mask parameter from vpshufbitqmb intrinsics. Change result to a vXi1 vector.
The input mask can be represented with an AND in IR.

Fixes PR40258

llvm-svn: 351028
2019-01-14 00:03:50 +00:00
Simon Pilgrim
628a9582a1 [DAGCombiner] If add_sat(x,y) can't overflow -> add(x,y)
NOTE: We need more powerful signed overflow detection in computeOverflowKind
llvm-svn: 351026
2019-01-13 22:08:26 +00:00
Simon Pilgrim
880a2a6f9a Fix unused variable warning. NFCI.
llvm-svn: 351025
2019-01-13 21:53:12 +00:00
Simon Pilgrim
06eec96f3d [DAGCombiner] Some very basic add/sub saturation combines.
Handle combines with zero and constant canonicalization for adds.

llvm-svn: 351024
2019-01-13 21:50:24 +00:00
Simon Pilgrim
6aecaa822f [X86] Add some basic add/sub saturation combine tests.
The actual combines will be added in a future commit.

llvm-svn: 351023
2019-01-13 21:21:46 +00:00
Craig Topper
4ec63bdfbb [LegalizeDAG] Remove 'NeedInvert' code from expansion of BR_CC. Replace with an assert.
I accidentally triggered this code while doing some experiments and it doesn't look lke it could possibly work.

It calls 'getNOT' on a node that should be a CondCode.

I think to do this right we would need to swap the branch target and the fallthrough target. But that's not easy to do. Or we could create an explicit SetCC and feed that into a new BR_CC?

llvm-svn: 351022
2019-01-13 19:33:30 +00:00
Nikita Popov
c3ae4ac612 [X86] Rename overly verbose method; NFC
As suggested on D56636.

llvm-svn: 351021
2019-01-13 16:41:26 +00:00
James Y Knight
c73d346797 Remove TypeBuilder.h, and fix the few locations using it.
This shortcut mechanism for creating types was added 10 years ago, but
has seen almost no uptake since then, neither internally nor in
external projects.

The very small number of characters saved by using it does not seem
worth the mental overhead of an additional type-creation API, so,
delete it.

Differential Revision: https://reviews.llvm.org/D56573

llvm-svn: 351020
2019-01-13 16:09:28 +00:00
Craig Topper
9caf716659 [X86] Add more ISD nodes to handle masked versions of VCVT(T)PD2DQZ128/VCVT(T)PD2UDQZ128 which only produce 2 result elements and zeroes the upper elements.
We can't represent this properly with vselect like we normally do. We also have to update the instruction definition to use a VK2WM mask instead of VK4WM to represent this.

Fixes another case from PR34877

llvm-svn: 351018
2019-01-13 02:59:59 +00:00
Craig Topper
ab219b2ccb [X86] Add X86ISD::VMFPROUND to handle the masked case of VCVTPD2PSZ128 which only produces 2 result elements and zeroes the upper elements.
We can't represent this properly with vselect like we normally do. We also have to update the instruction definition to use a VK2WM mask instead of VK4WM to represent this.

Fixes another case from PR34877.

llvm-svn: 351017
2019-01-13 02:59:57 +00:00
Benjamin Kramer
59aee633f2 Give helper classes/functions local linkage. NFC.
llvm-svn: 351016
2019-01-12 18:36:22 +00:00
Simon Pilgrim
a89e77f814 [X86] More aggressive shuffle mask widening in combineExtractWithShuffle
Use demanded extract index to set most of the shuffle mask to undef, making it easier to widen and peek through.

llvm-svn: 351013
2019-01-12 16:38:56 +00:00
Sanjay Patel
fc97f679c1 [LoopVectorizer] give more advice in remark about failure to vectorize call
Something like this is requested by:
https://bugs.llvm.org/show_bug.cgi?id=40265
...and it seems like a common enough case that we should acknowledge it.

Differential Revision: https://reviews.llvm.org/D56551

llvm-svn: 351010
2019-01-12 15:27:15 +00:00
Stephen Kelly
551238bc76 [Algorithm] Add make_const_ref corresponding to make_const_ptr
Reviewers: aaron.ballman

Subscribers: dexonsmith, kristina, llvm-commits

Differential Revision: https://reviews.llvm.org/D56622

llvm-svn: 351009
2019-01-12 15:23:30 +00:00
Sanjay Patel
5e4d1e9d1f [DAGCombiner] fold insert_subvector of insert_subvector
This pattern:

    t33: v8i32 = insert_subvector undef:v8i32, t35, Constant:i64<0>
  t21: v16i32 = insert_subvector undef:v16i32, t33, Constant:i64<0>

...shows up in PR33758:
https://bugs.llvm.org/show_bug.cgi?id=33758
...although this patch doesn't make any difference to the final result on that yet.

In the affected tests here, it looks like it just makes RA wiggle. But we might 
as well squash this to prevent it interfering with other pattern-matching.

Differential Revision:
https://reviews.llvm.org/D56604

llvm-svn: 351008
2019-01-12 15:12:28 +00:00
George Rimar
a7eabaeadb [llvm-objdump] - Change the output for --all-headers.
This is for https://bugs.llvm.org/show_bug.cgi?id=40008,

it starts printing the file headers when --all-headers is given and
do a minor cosmetic change.

Differential revision: https://reviews.llvm.org/D56588

llvm-svn: 351006
2019-01-12 12:17:24 +00:00
Simon Pilgrim
3875b71ab3 Use getShiftAmountTy for shift amounts.
llvm-svn: 351005
2019-01-12 12:00:43 +00:00
Nico Weber
f1fb3280a5 gn build: Unbreak Windows build
I didn't break all that much during upstreaming, just needs two small fixes:

- fix spelling of MCJITTests.def file
- make libLTO a shared_library to put it in bin/ on Windows where it is in the
  CMake build too

Differential Revision: https://reviews.llvm.org/D56630

llvm-svn: 351004
2019-01-12 11:56:47 +00:00
Nikita Popov
f092a215a6 [X86] Add more usub.sat vector tests; NFC
Add additional vXi32 and vXi64 tests.

llvm-svn: 351003
2019-01-12 11:43:04 +00:00
Simon Atanasyan
27ba728e62 [ORC][MIPS] Fill delay-slot after jr instruction
MIPS `jr` instruction uses a delay-slot. To escape execution of
arbitrary instruction we should either fill the delay-slot by `nop`
instruction or swap `jr` instruction and logically preceding
instruction. This fix implements the second method to generate a bit
more effective code.

llvm-svn: 351001
2019-01-12 11:12:08 +00:00
Simon Atanasyan
f33f561baa [ORC][MIPS] Setup t9 register and call function through this register
MIPS ABI states that every function must be called through jalr $t9. In
other words, a function expect that t9 register points to the beginning
of its code. A function uses this register to calculate offset to the
Global Offset Table and save it to the `gp` register.
```
lui   $gp, %hi(_gp_disp)
addiu $gp, %lo(_gp_disp)
addu  $gp, $gp, $t9
```

If `t9` and as a result `$gp` point to the wrong place the following code
loads incorrect value from GOT and passes control to invalid code.
```
lw    $v0,%call16(foo)($gp)
jalr  $t9
```

OrcMips32 and OrcMips64 writeResolverCode methods pass control to the
resolved address, but do not setup `$t9` before the call. The `t9` holds
value of the beginning of `resolver` code so any attempts to call
routines via GOT failed.

This change fixes the problem. The `OrcLazy/hidden-visibility.ll` test
starts to pass correctly. Before the change it fails on MIPS because the
`exitOnLazyCallThroughFailure` called from the resolver code could not
call libc routine `exit` via GOT.

Differential Revision: http://reviews.llvm.org/D56058

llvm-svn: 351000
2019-01-12 11:12:04 +00:00
Simon Pilgrim
18fb2a5700 [X86] Improve vXi64 ISD::ABS codegen with SSE41+
Make use of vblendvpd to select on the signbit

Differential Revision: https://reviews.llvm.org/D56544

llvm-svn: 350999
2019-01-12 10:28:12 +00:00
Simon Pilgrim
a209a30ce9 [X86][AARCH64] Improve ISD::ABS support
This patch takes some of the code from D49837 to allow us to enable ISD::ABS support for all SSE vector types.

Differential Revision: https://reviews.llvm.org/D56544

llvm-svn: 350998
2019-01-12 09:59:32 +00:00
Nikita Popov
3247e0488e Reapply "[DemandedBits] Use SetVector for Worklist"
DemandedBits currently uses a simple vector for the worklist, which
means that instructions may be inserted multiple times into it.
Especially in combination with the deep lattice, this may cause
instructions too be recomputed very often. To avoid this, switch
to a SetVector.

Reapplying with a smaller number of inline elements in the
SmallSetVector, to avoid running into the SmallDenseMap issue
described in D56455.

Differential Revision: https://reviews.llvm.org/D56362

llvm-svn: 350997
2019-01-12 09:09:15 +00:00
Martin Storsjo
80cf764e04 [llvm-objcopy] [COFF] Remove pointless Reader/Writer base classes. NFC.
These were copied as part of the original design from the ELF
backend, but aren't necessary at the moment.

Differential Revision: https://reviews.llvm.org/D56431

llvm-svn: 350996
2019-01-12 08:30:09 +00:00
Craig Topper
2d7873b599 [X86] Remove X86ISD::SELECT as its no longer used by any of our intrinsic lowering.
llvm-svn: 350995
2019-01-12 08:15:54 +00:00
Craig Topper
1958ee8a43 [X86] Add ISD node for masked version of CVTPS2PH.
The 128-bit input produces 64-bits of output and fills the upper 64-bits with 0. The mask only applies to the lower elements. But we can't represent this with a vselect like we normally do.

This also avoids the need to have a special X86ISD::SELECT when avx512bw isn't enabled since vselect v8i16 isn't legal there.

Fixes another instruction for PR34877.

llvm-svn: 350994
2019-01-12 08:05:12 +00:00
Alex Bradbury
9164425b9b [RISCV] Introduce codegen patterns for RV64M-only instructions
As discussed on llvm-dev
<http://lists.llvm.org/pipermail/llvm-dev/2018-December/128497.html>, we have
to be careful when trying to select the *w RV64M instructions. i32 is not a
legal type for RV64 in the RISC-V backend, so operations have been promoted by
the time they reach instruction selection. Information about whether the
operation was originally a 32-bit operations has been lost, and it's easy to
write incorrect patterns.

Similarly to the variable 32-bit shifts, a DAG combine on ANY_EXTEND will
produce a SIGN_EXTEND if this is likely to result in sdiv/udiv/urem being
selected (and so save instructions to sext/zext the input operands).

Differential Revision: https://reviews.llvm.org/D53230

llvm-svn: 350993
2019-01-12 07:43:06 +00:00
Alex Bradbury
4a042ffefe [RISCV] Add patterns for RV64I SLLW/SRLW/SRAW instructions
This restores support for selecting the SLLW/SRLW/SRAW instructions, which was
removed in rL348067 as the previous patterns made some unsafe assumptions.
Also see the related llvm-dev discussion
<http://lists.llvm.org/pipermail/llvm-dev/2018-December/128497.html>

Ultimately I didn't introduce a custom SelectionDAG node, but instead added a
DAG combine that inserts an AssertZext i5 on the shift amount for an i32
variable-length shift and also added an ANY_EXTEND DAG-combine which will
instead produce a SIGN_EXTEND for an i32 variable-length shift, increasing the
opportunity to safely select SLLW/SRLW/SRAW.

There are obviously different ways of addressing this (a number discussed in
the llvm-dev thread), so I'd welcome further feedback and comments.

Note that there are now some cases in
test/CodeGen/RISCV/rv64i-exhaustive-w-insts.ll where sraw/srlw/sllw is
selected even though sra/srl/sll could be used without any extra instructions.
Given both are semantically equivalent, there doesn't seem a good reason to
prefer one vs the other. Given that would require more logic to still select
sra/srl/sll in those cases, I've left it preferring the *w variants.

Differential Revision: https://reviews.llvm.org/D56264

llvm-svn: 350992
2019-01-12 07:32:31 +00:00
Craig Topper
1fbb0d9ea4 [X86] Remove unnecessary code from getMaskNode.
We no longer need to extend mask scalars before bitcasting them to vXi1. This was only needed for the truncate intrinsics. And was really a bug in our lowering of them.

llvm-svn: 350991
2019-01-12 06:13:44 +00:00
Craig Topper
5785c01760 [X86] When lowering v1i1/v2i1/v4i1/v8i1 load/store with avx512f, but not avx512dq, use v16i1 as the intermediate mask type instead of v8i1.
We still use i8 for the load/store type. So we need to convert to/from i16 to around the mask type.

By doing this we get an i8->i16 extload which we can then pattern match to a KMOVW if the access is aligned.

llvm-svn: 350989
2019-01-12 02:22:10 +00:00
Craig Topper
306ecda0a3 [X86] Change some patterns that select MOVZX16rm8 to instead select MOVZX32rm8 and extract the subregister.
This should be a shorter encoding and is consistent with what we do for zext i8->i16

llvm-svn: 350988
2019-01-12 02:22:06 +00:00
Evandro Menezes
888af9dc55 [ARM] Fix typo
Fix typo in r350952.

llvm-svn: 350986
2019-01-12 01:06:43 +00:00
Craig Topper
45b13032ad [X86] Add ISD nodes for masked truncate so we can properly represent when the output has more elements than the input due to needing to be 128 bits.
We can't properly represent this with a vselect since the upper elements of the result are supposed to be zeroed regardless of the mask.

This also reuses the new nodes even when the result type fits in 128 bits if the input is q/d and the result is w/b since vselect w/b using k-register condition isn't legal without avx512bw. Currently we're doing this even when avx512bw is enabled, but I might change that.

This fixes some of PR34877

llvm-svn: 350985
2019-01-12 00:55:27 +00:00
Peter Collingbourne
2b7752ef46 gn build: Add a stage2 toolchain for Android.
This makes it possible to build llvm-symbolizer for Android, which
is one of the prerequisites for running the sanitizer tests on Android.

Differential Revision: https://reviews.llvm.org/D56577

llvm-svn: 350979
2019-01-11 23:18:51 +00:00
Peter Collingbourne
2308749462 gn build: Create a template for unix toolchains.
Also change the toolchain description to use current_os instead of
host_os so that the template can be used for cross builds, and add
a current_os to the win toolchain to match the unix toolchain.

Differential Revision: https://reviews.llvm.org/D56576

llvm-svn: 350977
2019-01-11 22:57:57 +00:00
Evandro Menezes
949fd90c8b [AArch64] Improve Exynos predicates
Expand the predicate using shifted arithmetic and logic instructions to also
consider the respective not shifted instructions.

llvm-svn: 350976
2019-01-11 22:39:47 +00:00
Peter Collingbourne
1b191b21b0 gn build: Merge r350958.
llvm-svn: 350974
2019-01-11 22:15:53 +00:00
Nikita Popov
8654837372 [ConstantFolding] Fold undef for integer intrinsics
This fixes https://bugs.llvm.org/show_bug.cgi?id=40110.

This implements handling of undef operands for integer intrinsics in
ConstantFolding, in particular for the bitcounting intrinsics (ctpop,
cttz, ctlz), the with.overflow intrinsics, the saturating math
intrinsics and the funnel shift intrinsics.

The undef behavior follows what InstSimplify does for the general cas
e of non-constant operands. For the bitcount intrinsics (where
InstSimplify doesn't do undef handling -- there cannot be a combination
of an undef + non-constant operand) I'm using a 0 result if the intrinsic
is defined for zero and undef otherwise.

Differential Revision: https://reviews.llvm.org/D55950

llvm-svn: 350971
2019-01-11 21:18:00 +00:00
Alexey Bataev
044651774c [SLP]Moved NVPTX test under NVPTX directory, NFC.
llvm-svn: 350969
2019-01-11 20:42:48 +00:00
Alexey Bataev
2803b2ff92 [SLP]Update test checks for the SPL vectorizer, NFC.
llvm-svn: 350967
2019-01-11 20:21:14 +00:00
Nirav Dave
cc2a715462 [X86] Fix incomplete handling of register-assigned variables in parsing.
Teach x86 assembly operand parsing to distinguish between assembler
variable assigned to named registers and those assigned to immediate
values.

Reviewers: rnk, nickdesaulniers, void

Subscribers: hiraditya, jyknight, llvm-commits

Differential Revision: https://reviews.llvm.org/D56287

llvm-svn: 350966
2019-01-11 20:17:36 +00:00
Peter Collingbourne
44ac195a8a gn build: Create a variable for the host toolchain and start using it in the tblgen template.
Differential Revision: https://reviews.llvm.org/D56575

llvm-svn: 350964
2019-01-11 19:53:06 +00:00
Peter Collingbourne
9a277e9cce gn build: s/root_out_dir/root_build_dir/g in llvm/utils/gn/build/write_cmake_config.gni.
This makes the generated files go to the right place when using a non-default toolchain.

Differential Revision: https://reviews.llvm.org/D56427

llvm-svn: 350963
2019-01-11 19:51:49 +00:00
Alex Bradbury
fa2f6e25bc [RISCV][NFC] Add CHECK lines for atomic operations on RV64I
As or RV32I, we include these for completeness. Committing now to make it
easier to review the RV64A patch.

llvm-svn: 350962
2019-01-11 19:46:48 +00:00