mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00

173982 Commits

Author SHA1 Message Date
Nico Weber
6295f7103b gn build: Merge r351627, r351548, r351701
llvm-svn: 351757
2019-01-21 18:56:39 +00:00
Pavel Labath
38ec165096 Fix compilation error with gcc 4.8
This version of gcc seems to have issues with raw literals inside macro
arguments. Change the string to use regular string literals instead.

llvm-svn: 351756
2019-01-21 18:21:03 +00:00
Simon Pilgrim
b99a324a28 [X86][BtVer2] Update latency of mmx horizontal operations
D56777 added a +1cy local forwarding penalty for horizontal operations, but this penalty only affects the sse2/xmm variants; the mmx variants don't suffer the penalty.

Confirmed with @andreadb

llvm-svn: 351755
2019-01-21 18:04:25 +00:00
Sanjay Patel
965b5f410f [AArch64] add more tests for buildvec to shuffle transform; NFC
These are copied from the sibling x86 file. I'm not sure which
of the current outputs (if any) is considered optimal, but
someone more familiar with AArch64 may want to take a look.

llvm-svn: 351754
2019-01-21 17:46:35 +00:00
Sanjay Patel
0962f23352 [DAGCombiner] fix crash when converting build vector to shuffle
The regression test is reduced from the example shown in D56281.
This does raise a question as noted in the test file: do we want
to handle this pattern? I don't have a motivating example for
that on x86 yet, but it seems like we could have that pattern 
there too, so we could avoid the back-and-forth using a shuffle.

llvm-svn: 351753
2019-01-21 17:30:14 +00:00
Andrea Di Biagio
3e3ec46699 [X86][BtVer2] Update the WriteLoad latency.
r327630 introduced new write definitions for float/vector loads.
Before that revision, WriteLoad was used by both integer and float (scalar/vector)
loads, so WriteLoad had to conservatively declare a latency of 5cy. That is
because the load-to-use latency for float/vector loads is 5cy.

Now that we have dedicated writes for float/vector loads, there is no reason why
we should keep the latency of WriteLoad at 5cy. At the moment, WriteLoad is
used by scalar integer loads only; we can assume an optimistic 3cy latency for
them.
This patch changes that latency from 5cy to 3cy, and regenerates the affected
scheduling/mca tests.

Differential Revision: https://reviews.llvm.org/D56922

llvm-svn: 351742
2019-01-21 12:04:10 +00:00
Simon Pilgrim
4aacb0da3a [CostModel][X86] Add XOP icmp cost tests (PR40376)
llvm-svn: 351741
2019-01-21 11:33:52 +00:00
Dmitry Venikov
df9b821340 [llvm-symbolizer] Add -no-demangle as alias for -demangle=false
Summary: Provides -no-demangle as alias for -demangle=false. Motivation: https://bugs.llvm.org/show_bug.cgi?id=40075

Reviewers: jhenderson, ruiu

Reviewed By: jhenderson

Subscribers: erik.pilkington, rupprecht, llvm-commits

Differential Revision: https://reviews.llvm.org/D56773

llvm-svn: 351735
2019-01-21 10:00:57 +00:00
Chandler Carruth
d4f3796eeb Fix typos throughout the license files that somehow I and my reviewers
all missed!

Thanks to Alex Bradbury for pointing this out, and the fact that I never
added the intended `legacy` anchor to the developer policy. Add that
anchor too. With hope, this will cause the links to all resolve
successfully.

llvm-svn: 351731
2019-01-21 09:52:34 +00:00
Craig Topper
d3ab842eb8 [X86] Remove and autoupgrade vpmovqd/vpmovwb intrinsics using trunc+select.
llvm-svn: 351729
2019-01-21 08:16:59 +00:00
Max Kazantsev
589ead7620 [NFC] Make getExpressionSize unsigned short
llvm-svn: 351727
2019-01-21 07:36:55 +00:00
Max Kazantsev
8a48aae360 [NFC] Fix warnings in unit test of r351725
llvm-svn: 351726
2019-01-21 07:27:47 +00:00
Max Kazantsev
f0c38d90c7 [SCEV][NFC] Introduces expression sizes estimation
This patch introduces the field `ExpressionSize` in SCEV. This field is
calculated only once on SCEV creation, and it represents the complexity of
this SCEV from an arithmetical point of view (not from the point of view of the
number of actual different SCEV nodes that are used in the expression). Roughly
speaking, it is the number of operand and operation symbols we see when we print
this SCEV.

A formal definition follows: if SCEV `X` has operands
  `Op1`, `Op2`, ..., `OpN`,
then
  Size(X) = 1 + Size(Op1) + Size(Op2) + ... + Size(OpN).
Size of SCEVConstant and SCEVUnknown is one.
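
The recurrence above can be sketched in a few lines of Python. This is only an illustrative model, not LLVM code: the `Expr` class and the `size` function are hypothetical names, and childless nodes stand in for SCEVConstant and SCEVUnknown. The patch computes the value once at creation time, whereas this sketch recomputes it on every call for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Expr:
    """Hypothetical stand-in for a SCEV node; not an LLVM type."""
    name: str
    operands: List["Expr"] = field(default_factory=list)


def size(expr: Expr) -> int:
    """Size(X) = 1 + Size(Op1) + ... + Size(OpN); a leaf has size one."""
    return 1 + sum(size(op) for op in expr.operands)


# (a + b) * c: two operation symbols and three operand symbols.
a, b, c = Expr("a"), Expr("b"), Expr("c")
product = Expr("mul", [Expr("add", [a, b]), c])
print(size(product))  # 5
```

A shared subexpression is counted every time it appears, which matches the "number of symbols when printed" reading rather than the number of distinct nodes.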

Expression size may be used as a universal way to limit SCEV transformations
for huge SCEVs. Currently, we have a bunch of options that represent various
limits (such as a recursion depth limit) that may not make any sense from the
point of view of an LLVM user who is not familiar with SCEV internals, and all
these different options pursue one goal. A more general rule that may
potentially allow us to get rid of this redundancy in options is "do not make
transformations with SCEVs of huge size". It can apply to all SCEV traversals
and transformations that may need to visit a SCEV node more than once and are
hence prone to combinatorial explosions.

This patch only introduces the SCEV size calculation as NFC; its use will
be introduced in follow-up patches.

Differential Revision: https://reviews.llvm.org/D35989
Reviewed By: reames

llvm-svn: 351725
2019-01-21 06:19:50 +00:00
Kito Cheng
28868a45d5 [RISCV] Add R_RISCV_RELAX relocation to all possible relax candidates.
Summary:
Add R_RISCV_RELAX relocation to all possible relax candidates and
update corresponding testcase.

Reviewers: asb, apazos

Differential Revision: https://reviews.llvm.org/D46677

llvm-svn: 351723
2019-01-21 05:27:09 +00:00
Dylan McKay
e3d6fdcccf [AVR] Insert unconditional branch when inserting MBBs between blocks with fallthrough
This updates the AVR Select8/Select16 expansion code so that, when
inserting the two basic blocks for true and false conditions, any
existing fallthrough on the previous block is preserved.

Prior to this patch, if the block before the Select pseudo fell through
to the subsequent block, two new basic blocks would be inserted at the
prior fallthrough point, changing the fallthrough destination.

The predecessor and successor lists were not updated, causing the
BranchFolding pass at -O1 and above to rearrange basic blocks, which led to
an infinite loop. Not to mention that the unconditional fallthrough to the
true block is incorrect in and of itself.

This patch modifies the Select8/16 expansion so that, if inserting true
and false basic blocks at a fallthrough point, the implicit branch is
preserved by means of an explicit, unconditional branch to the previous
fallthrough destination.
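
The fallthrough problem described above can be illustrated with a toy control-flow model. This is a sketch under stated assumptions, not the AVR backend code: `successor`, `insert_after`, and the block names are hypothetical, a block layout is a plain list, and a block without an entry in `branches` falls through to whatever follows it in the layout.

```python
def successor(layout, branches, name):
    """Where control goes after `name`: its explicit branch, else the next block."""
    if branches.get(name) is not None:
        return branches[name]
    index = layout.index(name)
    return layout[index + 1] if index + 1 < len(layout) else None


def insert_after(layout, branches, name, new_blocks, preserve_fallthrough):
    """Insert new_blocks after `name`, optionally pinning the old fallthrough target."""
    old_target = successor(layout, branches, name)
    new_layout = list(layout)
    index = new_layout.index(name)
    new_layout[index + 1:index + 1] = new_blocks
    new_branches = dict(branches)
    if preserve_fallthrough and branches.get(name) is None and old_target is not None:
        # Make the implicit edge explicit with an unconditional branch.
        new_branches[name] = old_target
    return new_layout, new_branches


layout = ["prev", "next"]
branches = {}  # "prev" has no explicit branch, so it falls through to "next"

broken = insert_after(layout, branches, "prev", ["true_bb", "false_bb"], False)
fixed = insert_after(layout, branches, "prev", ["true_bb", "false_bb"], True)
print(successor(*broken, "prev"))  # true_bb
print(successor(*fixed, "prev"))   # next
```

Without the explicit branch, the inserted block silently becomes the new fallthrough destination; with it, the original edge survives the insertion.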

Thanks to Carl Peto for reporting this bug.

This fixes avr-rust bug https://github.com/avr-rust/rust/issues/123.

llvm-svn: 351721
2019-01-21 04:32:02 +00:00
Dylan McKay
b12b974df1 [AVR] Enable emission of debug information
Prior to this, the code was missing AVR-specific relocation logic in
RelocVisitor.h.

This patch teaches RelocVisitor about R_AVR_16 and R_AVR_32.

Debug information is emitted in the final object file, and understood by
'avr-readelf --debug-dump' from AVR-GCC.

llvm-dwarfdump is yet to understand how to dump AVR DWARF symbols.

llvm-svn: 351720
2019-01-21 04:27:08 +00:00
Dylan McKay
762baddef7 Revert "[AVR] Insert unconditional branch when inserting MBBs between blocks with fallthrough"
This reverts commit r351718.

Carl pointed out that the unit test could be improved.

This patch will be recommitted once the test is made more resilient.

llvm-svn: 351719
2019-01-21 02:46:13 +00:00
Dylan McKay
3d00c7399b [AVR] Insert unconditional branch when inserting MBBs between blocks with fallthrough
This updates the AVR Select8/Select16 expansion code so that, when
inserting the two basic blocks for true and false conditions, any
existing fallthrough on the previous block is preserved.

Prior to this patch, if the block before the Select pseudo fell through
to the subsequent block, two new basic blocks would be inserted at the
prior fallthrough point, changing the fallthrough destination.

The predecessor and successor lists were not updated, causing the
BranchFolding pass at -O1 and above to rearrange basic blocks, which led to
an infinite loop. Not to mention that the unconditional fallthrough to the
true block is incorrect in and of itself.

This patch modifies the Select8/16 expansion so that, if inserting true
and false basic blocks at a fallthrough point, the implicit branch is
preserved by means of an explicit, unconditional branch to the previous
fallthrough destination.

Thanks to Carl Peto for reporting this bug.

This fixes avr-rust bug https://github.com/avr-rust/rust/issues/123.

llvm-svn: 351718
2019-01-21 02:44:09 +00:00
Serge Guelton
cc3b00cb60 Tentative fix for r351701 and gcc 6.2 build on ubuntu
llvm-svn: 351705
2019-01-20 23:06:45 +00:00
Serge Guelton
b46059d8bc Add missing test file
llvm-svn: 351702
2019-01-20 21:24:05 +00:00
Serge Guelton
b20ef5f960 Replace llvm::isPodLike<...> by llvm::is_trivially_copyable<...>
As noted in https://bugs.llvm.org/show_bug.cgi?id=36651, the specialization for
isPodLike<std::pair<...>> did not match the expectation of
std::is_trivially_copyable which makes the memcpy optimization invalid.

This patch renames the llvm::isPodLike trait into llvm::is_trivially_copyable.
Unfortunately std::is_trivially_copyable is not portable across compiler / STL
versions. So a portable version is provided too.

Note that the following specializations were invalid:

    std::pair<T0, T1>
    llvm::Optional<T>

Tests have been added to assert that the former specializations are respected by
the standard usage of llvm::is_trivially_copyable, and that when a decent version
of std::is_trivially_copyable is available, llvm::is_trivially_copyable is
compared to std::is_trivially_copyable.

As of this patch, llvm::Optional is no longer considered trivially copyable,
even if T is. This is to be fixed in a later patch, as it has an impact on a
long-running bug (see r347004).

Note that GCC warns about this UB, but this got silenced by https://reviews.llvm.org/D50296.

Differential Revision: https://reviews.llvm.org/D54472

llvm-svn: 351701
2019-01-20 21:19:56 +00:00
Matt Arsenault
9c6f0e54a5 AMDGPU: Legalize more bitcasts
llvm-svn: 351700
2019-01-20 19:45:18 +00:00
Matt Arsenault
63509e5a64 GlobalISel: Add isPointer legality predicates
llvm-svn: 351699
2019-01-20 19:45:14 +00:00
Matt Arsenault
481e85cce0 AMDGPU/GlobalISel: Really legalize exts from i1
A combine was hiding the fact that these tests were
not actually testing what they should be, although
they were producing the expected end result.

llvm-svn: 351698
2019-01-20 19:28:20 +00:00
Simon Pilgrim
0b80866b7e [X86] Auto upgrade VPCOM/VPCOMU intrinsics to generic integer comparisons
This causes a couple of changes in the upgrade tests, as signed/unsigned eq/ne are equivalent and we constant fold true/false codes. These changes are the same as what we already do for avx512 cmp/ucmp.

Noticed while cleaning up vector integer comparison costs for PR40376.

llvm-svn: 351697
2019-01-20 19:27:40 +00:00
Matt Arsenault
12837ca780 GlobalISel: Implement widenScalar for basic FP ops
llvm-svn: 351696
2019-01-20 19:10:31 +00:00
Matt Arsenault
4a10662415 AMDGPU/GlobalISel: Legalize f32->f16 fptrunc
llvm-svn: 351695
2019-01-20 19:10:26 +00:00
Matt Arsenault
3d37d26708 AMDGPU/GlobalISel: Fix some crashes in g_unmerge_values/g_merge_values
This was crashing in the predicate function assuming the value
is a vector.

Copy more of what AArch64 uses. This probably needs more refinement
later, but I don't exactly understand what it means in some cases,
particularly since any legalization for these seems to be missing.

llvm-svn: 351693
2019-01-20 18:40:36 +00:00
Matt Arsenault
af0f1330ee AMDGPU/GlobalISel: Regbank select for fpext
llvm-svn: 351692
2019-01-20 18:35:41 +00:00
Matt Arsenault
dc75546ec1 AMDGPU/GlobalISel: Cleanup legality for extensions
llvm-svn: 351691
2019-01-20 18:34:24 +00:00
Simon Pilgrim
ee885e4c8d [X86] Auto upgrade old style VPCOM/VPCOMU intrinsics to generic integer comparisons
We were upgrading these to the new style VPCOM/VPCOMU intrinsics (which include the condition code immediate), but we'll be getting rid of those shortly, so convert these to generics first.

This causes a couple of changes in the upgrade tests, as signed/unsigned eq/ne are equivalent and we constant fold true/false codes. These changes are the same as what we already do for avx512 cmp/ucmp.
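
The claim that signed and unsigned eq/ne are equivalent can be checked exhaustively on a small lane width. The sketch below is a plain Python illustration, not LLVM code; `as_signed`, `as_unsigned`, and the 8-bit `WIDTH` are assumptions chosen to keep the search small.

```python
WIDTH = 8  # assumed lane width, small enough for an exhaustive check


def as_unsigned(bits: int) -> int:
    """Interpret a bit pattern as an unsigned integer."""
    return bits & ((1 << WIDTH) - 1)


def as_signed(bits: int) -> int:
    """Interpret a bit pattern as a two's complement signed integer."""
    value = bits & ((1 << WIDTH) - 1)
    return value - (1 << WIDTH) if value & (1 << (WIDTH - 1)) else value


patterns = range(1 << WIDTH)

# Equality depends only on the bit pattern, so both interpretations agree.
eq_agrees = all(
    (as_signed(a) == as_signed(b)) == (as_unsigned(a) == as_unsigned(b))
    for a in patterns for b in patterns
)

# Ordered comparisons do depend on signedness, e.g. 0x80 versus 0x01.
lt_agrees = all(
    (as_signed(a) < as_signed(b)) == (as_unsigned(a) < as_unsigned(b))
    for a in patterns for b in patterns
)

print(eq_agrees, lt_agrees)  # True False
```

Since ne is the negation of eq, the same result holds for it, which is why one generic comparison can serve both the signed and unsigned forms of those two codes.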

Noticed while cleaning up vector integer comparison costs for PR40376.

llvm-svn: 351690
2019-01-20 17:36:22 +00:00
Simon Pilgrim
c52d758f97 [X86] Replace VPCOM/VPCOMU with generic integer comparisons (llvm)
These intrinsics can always be replaced with generic integer comparisons without any regression in codegen, even for -O0/-fast-isel cases.

Noticed while cleaning up vector integer comparison costs for PR40376.

A future commit will remove/autoupgrade the existing VPCOM/VPCOMU llvm intrinsics.

llvm-svn: 351688
2019-01-20 16:40:44 +00:00
Simon Pilgrim
04482873c0 [CostModel][X86] Add explicit vector select costs
Prior to SSE41 (and sometimes on AVX1), vector select has to be performed as a ((X & C)|(Y & ~C)) bit select.
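
The bit select above can be sketched on integer lanes as follows. This is an illustrative Python model, not the cost-model code: `bit_select`, `vector_select`, and the 8-bit lane width are assumptions, and each condition lane is taken to be all ones or all zeros.

```python
def bit_select(x: int, y: int, c: int, width: int = 8) -> int:
    """Take the bits of x where c is set and the bits of y elsewhere."""
    mask = (1 << width) - 1
    return ((x & c) | (y & ~c)) & mask


def vector_select(xs, ys, cs, width=8):
    """Apply bit_select lane by lane."""
    return [bit_select(x, y, c, width) for x, y, c in zip(xs, ys, cs)]


lanes = vector_select([1, 2, 3, 4], [10, 20, 30, 40], [0xFF, 0x00, 0xFF, 0x00])
print(lanes)  # [1, 20, 3, 40]
```

The expression needs two ANDs, one NOT (or an and-not), and one OR per select, which is why it is more expensive than a single blend instruction.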

Exposes a couple of issues with the min/max reduction costs (which only go down to SSE42 for some reason).

The increased pre-SSE41 selection costs also prevent a couple of tests from firing any longer, so I've either tweaked the target or added AVX tests alongside the existing SSE2 tests.

llvm-svn: 351685
2019-01-20 13:55:01 +00:00
Simon Pilgrim
b7bbc260af [CostModel][X86] Add explicit fcmp costs for pre-SSE42 targets
Typical throughputs: cmpss/cmpps = 1cy and cmpsd/cmppd = 2cy before the Core2 era

llvm-svn: 351684
2019-01-20 13:21:43 +00:00
Simon Pilgrim
41ffe33de4 [TTI][X86] Reordered getCmpSelInstrCost cost tables in descending ISA order. NFCI.
Minor tidy-up to make it clearer what's going on before adding additional costs.

llvm-svn: 351683
2019-01-20 12:28:13 +00:00
Simon Pilgrim
c8b9cc6e56 [CostModel][X86] Split icmp/fcmp costs tests and test all comparison codes
llvm-svn: 351682
2019-01-20 12:10:42 +00:00
Simon Pilgrim
12879a221f [CostModel][X86] Add masked load/store/gather/scatter tests for SSE2/SSE42/AVX1 targets
llvm-svn: 351681
2019-01-20 11:23:01 +00:00
Simon Pilgrim
97e4ea576a [CostModel][X86] Add non-constant vselect cost tests
Also add AVX512 costs at the same time

llvm-svn: 351680
2019-01-20 11:19:35 +00:00
Dylan McKay
b664dd279d [AVR] Remove unneeded XFAILs from the Generic CodeGen tests
These have been in place for quite a while now.

Several bugs have since been fixed, and these tests now pass.

llvm-svn: 351679
2019-01-20 11:16:58 +00:00
Dylan McKay
cafb601214 [AVR] Allow AVR to be explicitly set as the default target triple
This extends the CMake cross compilation logic so that AVR can be set as
the default target triple, and thus the generic codegen tests can be
run.

This used to be possible on AVR; the CMake configuration files have
since been changed.

With this patch, 'cmake -DLLVM_DEFAULT_TARGET_TRIPLE=avr-unknown-unknown' can
be passed on the command line, making the `-mcpu` argument redundant to
'llc' and friends.

llvm-svn: 351678
2019-01-20 11:12:39 +00:00
Dylan McKay
4c1ab02ba2 [AVR] Replace two references to ARM's 't2_so_imm' type comments
These were originally introduced in a copy-paste committed in r351526.

The references to 't2_so_imm' have been updated to 'imm_com8' so the
comments are now accurate.

Thanks to Eli Friedman for noticing this.

llvm-svn: 351674
2019-01-20 03:45:29 +00:00
Dylan McKay
f2b2682fd1 [AVR] Fix codegen bug in 16-bit loads
Prior to this patch, the AVR::LDWRdPtr instruction was always lowered to
instructions of this pattern:

    ld  $GPR8, [PTR:XYZ]+
    ld  $GPR8, [PTR]+1

This has a problem; the [PTR] is incremented in-place once, but never
decremented.

Future uses of the same pointer will use the now clobbered value,
leading to the pointer being incorrect by an offset of one.

This patch modifies the expansion code of the LDWRdPtr pseudo
instruction so that the pointer variable is not silently clobbered in
future uses in the same live range.
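
The off-by-one effect can be reproduced with a toy register and memory model. This is a simplified sketch, not the actual LDWRdPtr expansion: `load16_clobbering` and `load16_preserving` are hypothetical names, and reading the high byte at a +1 displacement is only one possible way to leave the pointer untouched.

```python
def load16_clobbering(memory, regs, ptr):
    """Model of the old pattern: the first load post-increments the pointer."""
    lo = memory[regs[ptr]]
    regs[ptr] += 1  # side effect that is never undone
    hi = memory[regs[ptr]]
    return lo | (hi << 8)


def load16_preserving(memory, regs, ptr):
    """Assumed alternative: read the high byte at a +1 displacement."""
    base = regs[ptr]
    return memory[base] | (memory[base + 1] << 8)


memory = [0x34, 0x12, 0x78, 0x56]

regs = {"X": 0}
first = load16_clobbering(memory, regs, "X")
second = load16_clobbering(memory, regs, "X")  # reuses the clobbered pointer
print(hex(first), hex(second), regs["X"])  # 0x1234 0x7812 2

regs = {"X": 0}
first = load16_preserving(memory, regs, "X")
second = load16_preserving(memory, regs, "X")
print(hex(first), hex(second), regs["X"])  # 0x1234 0x1234 0
```

The first load returns the right value in both versions; the bug only shows up when the same pointer is used again.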

Bug first reported by Keshav Kini.

Patch by Kaushik Phatak.

llvm-svn: 351673
2019-01-20 03:41:08 +00:00
Dylan McKay
3d8e01aa8e Revert "[AVR] Fix codegen bug in 16-bit loads"
This reverts commit r351544.

In that commit, I had mistakenly credited the issue submitter as
the patch author instead of Kaushik Phatak.

The patch will be recommitted immediately with the correct attribution.

llvm-svn: 351672
2019-01-20 03:41:00 +00:00
Vedant Kumar
1e2c4114d8 [ConstantMerge] Factor out check for un-mergeable globals, NFC
llvm-svn: 351671
2019-01-20 02:44:43 +00:00
Eric Fiselier
fbb246b9e8 make XFAIL, REQUIRES, and UNSUPPORTED support multi-line expressions
llvm-svn: 351668
2019-01-20 00:51:02 +00:00
Craig Topper
59062585b0 [X86] Add masked MCVTSI2P/MCVTUI2P ISD opcodes to model the cvtqq2ps cvtuqq2ps nodes that produce less than 128-bits of results.
These nodes zero the upper half of the result and can't be represented with a vselect.

llvm-svn: 351666
2019-01-19 21:26:20 +00:00
Martin Storsjo
62e6ab67bb [llvm-objcopy] [COFF] Implement --only-section
Differential Revision: https://reviews.llvm.org/D56873

llvm-svn: 351663
2019-01-19 19:42:54 +00:00
Martin Storsjo
2f371fd739 [llvm-objcopy] [COFF] Implement --only-keep-debug
Differential Revision: https://reviews.llvm.org/D56840

llvm-svn: 351662
2019-01-19 19:42:48 +00:00
Martin Storsjo
9c93f41cc7 [llvm-objcopy] [COFF] Implement --strip-debug
Also remove sections similarly for --strip-all, --discard-all,
--strip-unneeded.

Differential Revision: https://reviews.llvm.org/D56839

llvm-svn: 351661
2019-01-19 19:42:41 +00:00
Martin Storsjo
fc7767149a [llvm-objcopy] [COFF] Add support for removing sections
Differential Revision: https://reviews.llvm.org/D56683

llvm-svn: 351660
2019-01-19 19:42:35 +00:00