1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00
Commit Graph

192198 Commits

Author SHA1 Message Date
Tony
f119c347fa [AMDGPU] Update AMDGPUUsage with DWARF proposal
Summary:
- Add AMDGPU DWARF proposal.
- Add references for gfx10 ISA and SemVer.

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, aprantl, dstuttard, tpr, jfb, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70523
2020-02-19 15:30:53 -05:00
Sanjay Patel
2dc6c3489f [x86] add test for uint->fp with unsafe-fp-math (PR43609); NFC 2020-02-19 15:18:52 -05:00
Krzysztof Parzyszek
c28d8cf19b [Hexagon] Change HVX vector predicate types from v512/1024i1 to v64/128i1
This commit removes the artificial types <512 x i1> and <1024 x i1>
from HVX intrinsics, and makes v512i1 and v1024i1 no longer legal on
Hexagon.

It may cause existing bitcode files to become invalid.

* Converting between vector predicates and vector registers must be
  done explicitly via vandvrt/vandqrt instructions (their intrinsics),
  i.e. (for 64-byte mode):
    %Q = call <64 x i1> @llvm.hexagon.V6.vandvrt(<16 x i32> %V, i32 -1)
    %V = call <16 x i32> @llvm.hexagon.V6.vandqrt(<64 x i1> %Q, i32 -1)

  The conversion intrinsics are:
    declare  <64 x i1> @llvm.hexagon.V6.vandvrt(<16 x i32>, i32)
    declare <128 x i1> @llvm.hexagon.V6.vandvrt.128B(<32 x i32>, i32)
    declare <16 x i32> @llvm.hexagon.V6.vandqrt(<64 x i1>, i32)
    declare <32 x i32> @llvm.hexagon.V6.vandqrt.128B(<128 x i1>, i32)
  They are all pure.

* Vector predicate values cannot be loaded/stored directly. This directly
  reflects the architecture restriction. Loading and storing or vector
  predicates must be done indirectly via vector registers and explicit
  conversions via vandvrt/vandqrt instructions.
2020-02-19 14:14:56 -06:00
Nikita Popov
453ec55b92 Reapply [IRBuilder] Always respect inserter/folder
Some IRBuilder methods that were originally defined on
IRBuilderBase do not respect custom IRBuilder inserters/folders,
because those were not accessible prior to D73835. Fix this by
making use of existing (and now accessible) IRBuilder methods,
which will handle inserters/folders correctly.

There are some changes in OpenMP and Instrumentation tests, where
bitcasts now get constant folded. I've also highlighted one
InstCombine test which now finishes in two rather than three
iterations, thanks to new instructions being inserted into the
worklist.

Differential Revision: https://reviews.llvm.org/D74787
2020-02-19 20:51:38 +01:00
Bill Wendling
c708922cd0 Include static prof data when collecting loop BBs
Summary:
If the programmer adds static profile data to a branch---i.e. uses
"__builtin_expect()" or similar---then we should honor it. Otherwise,
"__builtin_expect()" is ignored in crucial situations. So we trust that
the programmer knows what they're doing until proven wrong.

Subscribers: hiraditya, JDevlieghere, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74809
2020-02-19 11:33:48 -08:00
Simon Pilgrim
d794c53322 [AMDGPU] Regenerate immediate constant tests 2020-02-19 18:58:44 +00:00
Simon Pilgrim
c00ec168b5 [UpdateTestChecks] Add support for '.' in ir function names
Will let us regenerate from amdgpu float constant tests
2020-02-19 18:58:44 +00:00
Louis Dionne
879ac1565d [CMake] Only detect the linker once in AddLLVM.cmake
Summary:
Otherwise, the build output contains a bunch of "Linker detection: <xxx>"
lines that are really redundant. We also make redundant calls to the
linker, although that is a smaller concern.

Reviewers: smeenai

Subscribers: mgorny, fedor.sergeev, jkorous, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68648
2020-02-19 13:53:38 -05:00
Bardia Mahjour
d2570e78cd [DDG] Data Dependence Graph - Graph Simplification
Summary:
This is the last functional patch affecting the representation of DDG.
Here we try to simplify the DDG to reduce the number of nodes and edges by
iteratively merging pairs of nodes that satisfy the following conditions,
until no such pair can be identified. A pair of nodes consisting of a and b
can be merged if:

    1. the only edge from a is a def-use edge to b and
    2. the only edge to b is a def-use edge from a and
    3. there is no cyclic edge from b to a and
    4. all instructions in a and b belong to the same basic block and
    5. both a and b are simple (single or multi instruction) nodes.

These criteria allow us to fold many uninteresting def-use edges that
commonly exist in the graph while avoiding the risk of introducing
dependencies that didn't exist before.

Authored By: bmahjour

Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

Reviewed By: Meinersbur

Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72350
2020-02-19 13:41:51 -05:00
Florian Hahn
05986e5408 Revert "[PatternMatch] Match XOR variant of unsigned-add overflow check."
This reverts commit e01a3d49c224d6f8a7afc01205b05b9deaa07afa.
and commit a6a585b8030b6e8d4c50c71f54a6addb21995fe0.

This causes a failure on GreenDragon:
http://lab.llvm.org:8080/green/view/LLDB/job/lldb-cmake/9597
2020-02-19 19:37:08 +01:00
Tyker
797dfd61ae [AssumeBundle] Add documentation for the operand bundles of an llvm.assume
Summary:
Operand bundles on an llvm.assume allows representing
assumptions that an attribute holds for a certain value at a certain position.
Operand bundles enable assumptions that are either hard or impossible to
represent as a boolean argument of an llvm.assume.

Reviewers: jdoerfert, fhahn, nlopes, reames, regehr, efriedma

Reviewed By: jdoerfert

Subscribers: lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74209
2020-02-19 18:53:15 +01:00
Jonas Paulsson
9dcacebd4c [ValueTracking] Improve isKnownNonNaN() to recognize zero splats.
isKnownNonNaN() could not recognize a zero splat because that is a
ConstantAggregateZero which is-a ConstantData but not a ConstantDataVector.

Patch makes a ConstantAggregateZero return true.

Review: Thomas Lively

Differential Revision: https://reviews.llvm.org/D74263
2020-02-19 09:35:36 -08:00
Nico Weber
b49202c11d [gn build] use \bfoo\b instead of \<foo\> in sync script
\<foo\> is more correct, but since we use shell=True on Windows,
the < and > get interpreted as redirection operators.

Rather than adding cmd escaping, just use \bfoo\b, which is Good
Enough Often Enough.
2020-02-19 12:32:02 -05:00
LLVM GN Syncbot
1580d3f4b6 [gn build] Port a54d81f5979 2020-02-19 17:28:29 +00:00
Nico Weber
83f9342471 [gn build] Set up include_dirs for a54d81f597 (first checker in a subdir) 2020-02-19 12:24:01 -05:00
Craig Topper
d3629d756d [X86] Add DCI.isBeforeLegalize() check to the v64i1 constant splitting code in combineStore.
We only need to split after type legalization. If we're before
we can just use a wide store and type legalization will split it.

Add a v128i1 test to exercise it post type legalization.
2020-02-19 09:18:16 -08:00
Stanislav Mekhanoshin
62c0ed15d8 [AMDGPU] Fix assumption about LaneBitmask content
Yet another assumption about an actual LaneBitmask content
is fixed.

Differential Revision: https://reviews.llvm.org/D74805
2020-02-19 09:07:11 -08:00
Nikita Popov
ad15d536fe Revert "[IRBuilder] Always respect inserter/folder"
This reverts commit f12fb2d99b8dd0dbef1c79f1d401200150f2d0bd.

I missed some changes in instrumentation test cases.
2020-02-19 17:51:55 +01:00
Nikita Popov
b55118d592 [InstCombine] Fix removal from deferred instructions
Make sure we don't skip the Deferred.remove() call if the
instruction is not in the worklist. Both of those are separate.

We don't have any cases where deferred instructions get removed
right now, but may cause problems in the future.
2020-02-19 17:48:28 +01:00
Nikita Popov
e70b7afc11 [IRBuilder] Always respect inserter/folder
Some IRBuilder methods that were originally defined on
IRBuilderBase do not respect custom IRBuilder inserters/folders,
because those were not accessible prior to D73835. Fix this by
making use of existing (and now accessible) IRBuilder methods,
which will handle inserters/folders correctly.

There are some changes in OpenMP tests, where bitcasts now get
constant folded. I've also highlighted one InstCombine test which
now finishes in two rather than three iterations, thanks to new
instructions being inserted into the worklist.

Differential Revision: https://reviews.llvm.org/D74787
2020-02-19 17:44:43 +01:00
Simon Pilgrim
2a62d779bd [SystemZ] Regenerate risbg tests. NFCI.
Pre-commit for some upcoming SimplifyDemandedBits bitrotate handling.
2020-02-19 16:39:28 +00:00
Mikhail Maltsev
bf9a57c53c [ARM,MVE] Fix predicate types of some intrinsics
Summary:
Some predicated MVE intrinsics return a vector with element size
different from the input vector element size. In this case the
predicate must type correspond to the output vector type.

The following intrinsics use the incorrect predicate type:
* llvm.arm.mve.mull.int.predicated
* llvm.arm.mve.mull.poly.predicated
* llvm.arm.mve.vshll.imm.predicated

This patch fixes the issue.

Reviewers: simon_tatham, dmgreen, ostannard, MarkMurrayARM

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74838
2020-02-19 16:24:54 +00:00
Cameron McInally
65ad000134 [AArch64][SVE] Add initial backend support for FP splat_vector
Differential Revision: https://reviews.llvm.org/D74632
2020-02-19 10:19:11 -06:00
Nico Weber
d3b8bade8b [gn build] revert e8e078c8bf7987
Now that I've updated ancient goma clients on the bots, this should
work. (Internal goma bug: b/139410332, fixed months ago.)
2020-02-19 11:11:25 -05:00
Jay Foad
92cac194bd [AMDGPU][ConstantFolding] Fold llvm.amdgcn.fmul.legacy intrinsic
Reviewers: arsenm, rampitec, nhaehnle

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74835
2020-02-19 16:01:30 +00:00
Stefan Pintilie
6d0e277aa7 [Hexagon][NFC] Rename VK_Hexagon_PCREL to VK_PCREL
On PowerPC we will soon need to use pcrel to indicate PC Relative addressing.
Renamed the Hexagon specific variant kind to a non target specific VK so that
it can be used on both Hexagon and PowerPC.

Differential Revision: https://reviews.llvm.org/D74788
2020-02-19 09:52:58 -06:00
Krzysztof Parzyszek
f9d00e5265 Add <128 x i1> as an intrinsic type 2020-02-19 09:38:13 -06:00
Florian Hahn
0c4f1f800d [CGP] Adjust CodeGen tests after e01a3d49c22 2020-02-19 16:05:00 +01:00
Florian Hahn
aaa658ee7c [PatternMatch] Match XOR variant of unsigned-add overflow check.
Instcombine folds (a + b <u a) to (a ^ -1 <u b) and that does not match
the expected pattern in CodeGenPerpare via UAddWithOverflow.

This causes a regression over Clang 7 on both X86 and AArch64:
https://gcc.godbolt.org/z/juhXYV

This patch extends UAddWithOverflow to also catch the XOR case, if the
XOR is only used in the ICMP. This covers just a single case, but I'd
like to make sure I am not missing anything before tackling the other
cases.

Reviewers: nikic, RKSimon, lebedev.ri, spatel

Reviewed By: nikic, lebedev.ri

Differential Revision: https://reviews.llvm.org/D74228
2020-02-19 15:25:18 +01:00
Matt Arsenault
9a28f067b2 AMDGPU/GlobalISel: Select MUBUF path for global atomic cmpxchg
I'm not sure why this isn't a pattern, but the DAG manually selects
this.
2020-02-19 06:19:22 -08:00
Pierre-vh
1c6d7d0286 [AArch64][ASMParser] Refuse equal source/destination for LDRAA/LDRAB
Differential Revision: https://reviews.llvm.org/D74822
2020-02-19 14:15:17 +00:00
Jay Foad
5aafdf1461 [TableGen] Diagnose undefined fields when generating searchable tables
Summary:
Previously TableGen would crash trying to print the undefined value as
an integer.

Change-Id: I3900071ceaa07c26acafb33bc49966d7d7a02828

Reviewers: nhaehnle

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74210
2020-02-19 14:03:48 +00:00
Miloš Stojanović
e54a43362b Recommit: "[llvm-exegesis] Improve error reporting in Assembler.cpp"
Summary: Commit 63bb9fee525f8f29fd9c2174fa7f15573c3d1fd7 was reverted in
7603bfb4b0a6a90137d47f0182a490fe54bf7ca3 because it broke builds that treat
warnings as errors.
This commit updates the calls to `assembleToStream()` in tests to check that
the return value is valid.

Original commit message:

Followup to D74084.
Replace the use of `report_fatal_error()` with returning the error to
`llvm-exegesis.cpp` and handling it there.

Differential Revision: https://reviews.llvm.org/D74325
2020-02-19 14:40:28 +01:00
Pavel Labath
24a3c9da1b ErrorTest: Break up "ErrorMatchers" test
This test was getting a bit long. Before adding more checks, group the
existing checks according to the matcher used, and break it up into
smaller tests.
2020-02-19 14:19:48 +01:00
Sam Parker
85e30822d2 [ARM][LowOverheadLoops] Check loop liveouts
Check that no Q-regs are live out of the loop, unless the instruction
within the loop is predicated on the vctp.

Differential Revision: https://reviews.llvm.org/D72713
2020-02-19 12:59:01 +00:00
David Green
9b8bbb6024 [ARM] VMLAVA reduction patterns
Similar to VADDV and VADDLV that have been added recently, this adds
lowering and patterns for VMLAV, VMLAVA, VMLALV and VMLALVA. They
perform the same roles as the add's, just folding a mul into the same
instruction (and so taking two inputs). As such, they need to be lowered
in the same way as the types are often not legal.

Differential Revision: https://reviews.llvm.org/D74390
2020-02-19 12:39:58 +00:00
Georgii Rymar
e522f14a05 [yaml2obj] - Change the order of implicitly created sections.
.dynsym and .dynstr are allocatable and therefore normally are placed
before non-allocatable .strtab, .shstrtab, .symtab sections.
But we are placing them after currently what creates a mix of
alloc/non-alloc sections and does not look normal.

Differential revision: https://reviews.llvm.org/D74756
2020-02-19 15:09:19 +03:00
Simon Pilgrim
ff008d47fc [AMDGPU] performCvtF32UByteNCombine - add SHL and SimplifyMultipleUseDemandedBits support
This is part of the work to remove SelectionDAG::GetDemandedBits and just use SimplifyMultipleUseDemandedBits.

Recent experiments raised some v_cvt_f32_ubyte*_e32 regressions, so I've added some additional abilities to performCvtF32UByteNCombine to help unpack byte data more aggressively.

We still don't remove all OR(SHL,SRL) patterns as some of the regenerated nodes don't get combined again, but we are getting closer.

Differential Revision: https://reviews.llvm.org/D74786
2020-02-19 11:45:57 +00:00
David Green
0faa13a991 [ARM] MVE VADDLV lowering
Following on from the extra VADDV lowering, this extends things to
handle VADDLV which allows summing values into a pair of i32 registers,
together treated as a i64. This needs to be done in DAGCombine too as
the types are otherwise illegal, which is a fairly simple addition on
top of the existing code.

There is also a VADDLVA instruction handled here, that adds the incoming
values from the two general purpose registers. As opposed to the
non-long version where we could just add patterns for add(x, VADDV), the
long version needs to handle this early before the i64 has being split
into too many pieces.

Differential Revision: https://reviews.llvm.org/D74224
2020-02-19 11:07:20 +00:00
Petar Avramovic
1e307610e1 [MIPS GlobalISel] Legalize non-power-of-2 and unaligned load and store
Custom legalize non-power-of-2 and unaligned load and store for MIPS32r5
and older, custom legalize non-power-of-2 load and store for MIPS32r6.

Don't attempt to combine non power of 2 loads or unaligned loads when
subtarget doesn't support them (MIPS32r5 and older).

Differential Revision: https://reviews.llvm.org/D74625
2020-02-19 12:02:27 +01:00
Petar Avramovic
7933e40c35 [MIPS GlobalISel] Select 4 byte unaligned load and store
Improve legality checks for load and store, 4 byte scalar
load and store are now legal for all subtargets.
During regbank selection 4 byte unaligned loads and stores
for MIPS32r5 and older get mapped to gprb.
Select 4 byte unaligned loads and stores for MIPS32r5.
Fix tests that unintentionally had unaligned load or store.

Differential Revision: https://reviews.llvm.org/D74624
2020-02-19 11:57:06 +01:00
Florian Hahn
476c0607dd [TargetLower] Update shouldFormOverflowOp check if math is used.
On some targets, like SPARC, forming overflow ops is only profitable if
the math result is used: https://godbolt.org/z/DxSmdB
This patch adds a new MathUsed parameter to allow the targets
to make the decision and defaults to only allowing it
if the math result is used. That is the conservative choice.

This patch also updates AArch64ISelLowering, X86ISelLowering,
ARMISelLowering.h, SystemZISelLowering.h to allow forming overflow
ops if the math result is not used. On those targets using the
overflow intrinsic for the overflow check only generates better code.

Reviewers: nikic, RKSimon, lebedev.ri, spatel

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D74722
2020-02-19 11:28:33 +01:00
Kerry McLaughlin
0625b3476c [AArch64][SVE] Add SVE2 intrinsics for polynomial arithmetic
Summary:
Implements the following intrinsics:
 - @llvm.aarch64.sve.eorbt
 - @llvm.aarch64.sve.eortb
 - @llvm.aarch64.sve.pmullb.pair
 - @llvm.aarch64.sve.pmullt.pair

Reviewers: sdesmalen, c-rhodes, dancgr, cameron.mcinally, efriedma, rengolin

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74769
2020-02-19 10:12:50 +00:00
Djordje Todorovic
fdc5995043 Reland "[DebugInfo] Enable the debug entry values feature by default"
Differential Revision: https://reviews.llvm.org/D73534
2020-02-19 11:12:26 +01:00
David Green
4f25d92f10 [ARM] Extra MVE VADDV reduction patterns
We already make use of the VADDV vector reduction instruction for cases
where the input and the output start out at the same type. The MVE
instruction however will sum into an i32, so if we are summing a v16i8
into an i32, we can still use the same instructions. In terms of IR,
this looks like a sext of a legal type (v16i8) into a very illegal type
(v16i32) and a vecreduce.add of that into the result. This means we have
to catch the pattern early in a DAG combine, producing a target VADDVs/u
node, where the signedness is now important.

This is the first part, handling VADDV and VADDVA. There are also
VADDVL/VADDVLA instructions, which are interesting because they sum into
a 64bit value. And VMLAV and VMLALV, which are interesting because they
also do a multiply of two values. It may look a little odd in places as
a result.

On it's own this will probably not do very much, as the vectorizer will
not produce this IR yet.

Differential Revision: https://reviews.llvm.org/D74218
2020-02-19 09:45:35 +00:00
Florian Hahn
004b543f47 [DebugInfo] Pass linux triple to tests requiring ELF.
The tests added in D74425/commit a71feda24ea092ec14474216532b3ce9883b81ab
fail with an assertion on macOS, as they seem to require ELF support.

Passing a linux triple ensures the object files are using ELF.

This fixes some GreenDragon failures.
2020-02-19 10:41:40 +01:00
Petar Avramovic
178d3e3189 [MIPS GlobalISel] RegBankSelect G_MERGE_VALUES and G_UNMERGE_VALUES
Consider large operands in G_MERGE_VALUES and G_UNMERGE_VALUES as
Ambiguous during regbank selection.
Introducing new InstType AmbiguousWithMergeOrUnmerge which will
allow us to recognize whether to narrow scalar or use s64:fprb.

This change exposed a bug when reusing data from TypeInfoForMF.
Thus when Instr is about to get destroyed (using narrow scalar)
clear its data in TypeInfoForMF. Internal data is saved based on
Instr's address, and it will no longer be valid.
Add detailed asserts for InstType and operand size.

Generate generic instructions instead of MIPS target instructions
during argument lowering and custom legalizer.
Select G_UNMERGE_VALUES and G_MERGE_VALUES when proper banks are
selected: {s32:gprb, s32:gprb, s64:fprb} for G_UNMERGE_VALUES and
{s64:fprb, s32:gprb, s32:gprb} for G_MERGE_VALUES.
Update tests. One improvement is when floating point argument in
gpr(or two gprs) gets passed to another function through gpr
unnecessary fpr-to-gpr moves are no longer generated.

Differential Revision: https://reviews.llvm.org/D74623
2020-02-19 10:09:52 +01:00
Florian Hahn
9008c8943b [CGP] Precommit tests for D74228. 2020-02-19 09:24:06 +01:00
Craig Topper
96e31c495a [X86] Remove vXi1 select optimization from LowerSELECT. Move it to DAG combine. 2020-02-19 00:00:55 -08:00
Craig Topper
7ffe043309 [X86] Handle splats in LowerBUILD_VECTORvXi1 by directly emitting scalar selects instead of deferring that to LowerSELECT.
LoweSELECT will detect the constant inputs and convert to scalar
selects, but we can do it directly here.

I might remove some of the code from LowerSELECT and move it to
DAG combine so doing this explicitly will make us less dependent
on it happening in lowering.
2020-02-18 22:39:30 -08:00