1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00
Commit Graph

199635 Commits

Author SHA1 Message Date
Nemanja Ivanovic
2e394d9166 [PowerPC] Do not RAUW combined nodes in VECTOR_SHUFFLE legalization
When legalizing shuffles, we make an attempt to combine it into
a PPC specific canonical form that avoids a need for a swap. If the
combine is successful, we RAUW the node and the custom legalization
replaces the now dead node instead of the one it should replace.
Remove that erroneous call to RAUW.
2020-07-06 22:09:28 -05:00
LLVM GN Syncbot
cd295c72b1 [gn build] Port 939d8309dbd 2020-07-07 02:20:39 +00:00
Valentin Clement
dbcd284c77 [openmp] Move isAllowedClauseForDirective to tablegen + add clause version to OMP.td
Summary:
Generate the isAllowedClauseForDirective function from tablegen. This patch introduce
the VersionedClause in the tablegen file so that clause can be encapsulated in this class to
specify a range of validity on a directive.

VersionedClause has default minVersion, maxVersion so it can be used without them or
minVersion.

Reviewers: jdoerfert, jdenny

Reviewed By: jdenny

Subscribers: yaxunl, hiraditya, guansong, jfb, sstefan1, aaron.ballman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82982
2020-07-06 22:20:06 -04:00
Xiang1 Zhang
535fe28ef4 [X86-64] Support Intel AMX Intrinsic
INTEL ADVANCED MATRIX EXTENSIONS (AMX).
AMX is a new programming paradigm, it has a set of 2-dimensional registers
(TILES) representing sub-arrays from a larger 2-dimensional memory image and
operate on TILES.

These intrinsics use direct TMM register number as its params.

Spec can be found in Chapter 3 here https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D83111
2020-07-07 10:13:40 +08:00
Biplob Mishra
ff0e91295b [PowerPC] Implement Vector Splat Immediate Builtins in Clang
Implements builtins for the following prototypes:
  vector signed int vec_splati (const signed int);
  vector float vec_splati (const float);
  vector double vec_splatid (const float);
  vector signed int vec_splati_ins (vector signed int, const unsigned int,
                                    const signed int);
  vector unsigned int vec_splati_ins (vector unsigned int, const unsigned int,
                                      const unsigned int);
  vector float vec_splati_ins (vector float, const unsigned int, const float);

Differential Revision: https://reviews.llvm.org/D82520
2020-07-06 20:29:33 -05:00
Amy Kwan
77b45c8014 [PowerPC][Power10] Exploit the xxsplti32dx instruction when lowering VECTOR_SHUFFLE.
This patch aims to exploit the xxsplti32dx XT, IX, IMM32 instruction when lowering VECTOR_SHUFFLEs.
We implement lowerToXXSPLTI32DX when lowering vector shuffles to check if:
- Element size is 4 bytes
- The RHS is a constant vector (and constant splat of 4-bytes)
- The shuffle mask is a suitable mask for the XXSPLTI32DX instruction where it is one of the 32 masks:
<0, 4-7, 2, 4-7>
<4-7, 1, 4-7, 3>

Differential Revision: https://reviews.llvm.org/D83245
2020-07-06 20:28:38 -05:00
David Blaikie
ffd524a549 [ModuloSchedule] Devirtualize PeelingModuloScheduleExpander::expand as it's not needed
The use case is out of tree code deriving from this class - but without
a need to use the base class polymorphically, so skip the virtualization
and virtual dtor.

Post-commit review from 50ac7ce94f34c5f43b02185ae0c33e150e78b044
2020-07-06 18:05:32 -07:00
Jordan Rupprecht
4e156763e2 Revert "[LV] Enable the LoopVectorizer to create pointer inductions"
This reverts commit a8fe12065ec8137e55a6a8b35dd5355477c2ac16.

It causes a crash when building gzip. Will post the detailed reduced test case to D81267.
2020-07-06 17:50:38 -07:00
LLVM GN Syncbot
5d22c8ecf0 [gn build] Port 05f2b5ccfc5 2020-07-07 00:37:49 +00:00
LLVM GN Syncbot
16f6904f84 [gn build] Port 2020-07-07 00:37:49 +00:00
Nico Weber
ae4d21baa4 fix typos to cycle bots 2020-07-06 20:37:11 -04:00
Wolfgang Pieb
4a082e6416 Correct 3 spelling errors in headers and doc strings. 2020-07-06 17:27:51 -07:00
Sanjay Patel
8380959612 [DAGCombiner] reassociate reciprocal sqrt expression to eliminate FP division
X / (fabs(A) * sqrt(Z)) --> X / sqrt(A*A*Z) --> X * rsqrt(A*A*Z)

In the motivating case from PR46406:
https://bugs.llvm.org/show_bug.cgi?id=46406
...this is restoring the sequence that was originally in the source code.
We extracted a term from within the sqrt because we do not know in
instcombine whether a target will expand a sqrt call.
Note: we could say that the transform in IR should be restricted, but
that would not solve the problem if the source was originally in the
pattern shown here.

This is a gray area for fast-math-flag requirements. I think we should at
least check fast-math-flags on the fdiv and fmul because I view this
transform as 2 pieces: reassociate the fmul operands and form reciprocal
from the fdiv (as with the existing transform). We could argue that the
sqrt also needs FMF, but that was not required before, so we should change
that in a follow-up patch if that seems better.

We don't currently have a way to check that the target will produce a sqrt
or recip estimate without actually creating nodes (the APIs are SDValue
getSqrtEstimate() and SDValue getRecipEstimate()), so we clean up
speculatively created nodes if we are not able to create an estimate.
The x86 test with doubles verifies that we are not changing a test with
no estimate sequence.

Differential Revision: https://reviews.llvm.org/D82716
2020-07-06 19:12:21 -04:00
Eric Christopher
38ba3bf0b6 Temporarily Revert "[llvm-install-name-tool] Merge install-name options" as it breaks the objcopy build.
This reverts commit c143900a0851b2c7b7d52e4825c7f073b3474cf6.
2020-07-06 15:40:14 -07:00
Yuanfang Chen
cc889d9127 [NFC] change getLimitedCodeGenPipelineReason to static function 2020-07-06 15:39:27 -07:00
Roman Lebedev
abd4ea9318 [NFCI][llvm-reduce] ReduceOperandBundles: actually put Module forward-declaration back into llvm namespace 2020-07-07 01:32:26 +03:00
Roman Lebedev
40f6d0064d [llvm-reduce] Reducing call operand bundles
Summary:
This would have been marginally useful to me during/for rG7ea46aee3670981827c04df89b2c3a1cbdc7561b.

With ongoing migration to representing assumes via operand bundles on the assume, this will be gradually more useful.

Reviewers: nickdesaulniers, diegotf, dblaikie, george.burgess.iv, jdoerfert, Tyker

Reviewed By: nickdesaulniers

Subscribers: hiraditya, mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83177
2020-07-07 01:16:37 +03:00
Roman Lebedev
f52f143f6e [NFCI][IR] Introduce CallBase::Create() wrapper
Summary:
It is reasonably common to want to clone some call with different bundles.
Let's actually provide an interface to do that.

Reviewers: chandlerc, jdoerfert, dblaikie, nickdesaulniers

Reviewed By: nickdesaulniers

Subscribers: llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83248
2020-07-07 01:16:36 +03:00
Sameer Arora
c3954a3cf0 [llvm-install-name-tool] Merge install-name options
This diff merges all options for llvm-install-name-tool under a single
function processLoadCommands. Also adds another test case for -add_rpath
option.

Test plan: make check-all

Reviewed by: jhenderson, alexshap, smeenai, Ktwu

Differential Revision: https://reviews.llvm.org/D82812
2020-07-06 15:15:20 -07:00
Roman Lebedev
522d679e59 [Scalarizer] Centralize instruction DCE
As reported in https://reviews.llvm.org/D83101#2133062
the new visitInsertElementInst()/visitExtractElementInst() functionality
is causing miscompiles (previously-crashing test added)

It is due to the fact how the infra of Scalarizer is dealing with DCE,
it was not updated or was it ready for such scalar value forwarding.
It always assumed that the moment we "scalarized" something,
it can go away, and did so with prejudice.

But that is no longer safe/okay to do.

Instead, let's prevent it from ever shooting itself into foot,
and let's just accumulate the instructions-to-be-deleted
in a vector, and collectively cleanup (those that are *actually* dead)
them all at the end.

All existing tests are not reporting any new garbage leftovers,
but maybe it's test coverage issue.
2020-07-07 01:12:51 +03:00
Eric Christopher
52fe422b68 Fix [-Werror,-Wsign-compare] in dominator unit test. 2020-07-06 14:50:13 -07:00
Bruno Ricci
127992925a [Support][NFC] Fix Wdocumentation warning in ADT/Bitfields.h
\tparam is used for template parameters instead of \param.
2020-07-06 22:41:40 +01:00
Stanislav Mekhanoshin
95821df464 [AMDGPU] Tweak getTypeLegalizationCost()
Even though wide vectors are legal they still cost more as we
will have to eventually split them. Not all operations can
be uniformly done on vector types.

Conservatively add the cost of splitting at least to 8 dwords,
which is our widest possible load.

We are more or less lying to cost mode with this change but
this can prevent vectorizer from creation of wide vectors which
results in RA problems for us.

Differential Revision: https://reviews.llvm.org/D83078
2020-07-06 14:07:48 -07:00
Matt Arsenault
463547bcbf AMDGPU/GlobalISel: Add types to special inputs
When passing special ABI inputs, we have no existing context for the
type to use.
2020-07-06 17:00:55 -04:00
Arlo Siemsen
07457fd3ce Add option LLVM_NM to allow specifying the location of the llvm-nm tool
The new option works like the existing LLVM_TABLEGEN, and
LLVM_CONFIG_PATH options.  Instead of building llvm-nm, the build uses
the executable defined by LLVM_NM.

This is useful for cross-compilation scenarios where the host cannot run
the cross-compiled tool, and recursing into another cmake build is not
an option (due to required DEFINE's, for example).

Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D83022
2020-07-06 13:27:56 -07:00
Nicolai Hähnle
c554e8d06e DomTree: add private create{Child,Node} helpers
Summary:
Aside from unifying the code a bit, this change smooths the
transition to use of future "opaque generic block references"
in the type-erased dominator tree base class.

Change-Id: If924b092cc8561c4b6a7450fe79bc96df0e12472

Reviewers: arsenm, RKSimon, mehdi_amini, courbet

Subscribers: wdng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83086
2020-07-06 21:58:11 +02:00
Nicolai Hähnle
9c5e97b839 DomTree: Remove getRoots() accessor
Summary:
Avoid exposing details about how roots are stored. This enables subsequent
type-erasure changes.

v5:
- cleanup a unit test by using EXPECT_EQ instead of EXPECT_TRUE

Change-Id: I532b774cc71f2224e543bc7d79131d97f63f093d

Reviewers: arsenm, RKSimon, mehdi_amini, courbet

Subscribers: jvesely, wdng, hiraditya, kuhar, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83085
2020-07-06 21:58:11 +02:00
Nicolai Hähnle
007aded190 DomTree: Remove the releaseMemory() method
Summary:
It is fully redundant with reset().

Change-Id: I25850b9f08eace757cf03cbb8780e970aca7f51a

Reviewers: arsenm, RKSimon, mehdi_amini, courbet

Subscribers: wdng, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D83084
2020-07-06 21:58:11 +02:00
Nicolai Hähnle
d4eb4d6675 DomTree: Remove getChildren() accessor
Summary:
Avoid exposing details about how children are stored. This will enable
subsequent type-erasure changes.

New methods are introduced to cover common access patterns.

Change-Id: Idb5f4b1b9c84e4cc71ddb39bb52a388682f5674f

Reviewers: arsenm, RKSimon, mehdi_amini, courbet

Subscribers: qcolombet, sdardis, wdng, hiraditya, jrtc27, zzheng, atanasyan, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83083
2020-07-06 21:58:11 +02:00
Wouter van Oortmerssen
850662dafb [WebAssembly] Added 64-bit memory.grow/size/copy/fill
This covers both the existing memory functions as well as the new bulk memory proposal.
Added new test files since changes where also required in the inputs.

Also removes unused init/drop intrinsics rather than trying to make them work for 64-bit.

Differential Revision: https://reviews.llvm.org/D82821
2020-07-06 12:49:50 -07:00
Wouter van Oortmerssen
2d660a98ce [WebAssembly] 64-bit memory limits 2020-07-06 12:40:45 -07:00
Kazushi (Jam) Marukawa
475cbdd9be [VE] Support symbol with offset in assembly
Summary:
Change MCExpr to support Aurora VE's modifiers.  Change asmparser to use
existing MCExpr parser (parseExpression) to parse an expression contining
symbols with modifiers and offsets.  Also add several regression tests
of MC layer.

Reviewers: simoll, k-ishizaka

Reviewed By: simoll

Subscribers: hiraditya, llvm-commits

Tags: #llvm, #ve

Differential Revision: https://reviews.llvm.org/D83170
2020-07-07 04:16:51 +09:00
Kazushi (Jam) Marukawa
41dc1d54a5 [VE] Change to use isa
Summary: Change to use isa instead of dyn_cast to avoid a warning.

Reviewers: simoll, k-ishizaka

Reviewed By: simoll

Subscribers: hiraditya, llvm-commits

Tags: #llvm, #ve

Differential Revision: https://reviews.llvm.org/D83200
2020-07-07 03:48:49 +09:00
Matt Arsenault
6ca9a98429 AMDGPU: Don't ignore carry out user when expanding add_co_pseudo
This was resulting in a missing vreg def in the use select
instruction.

The output of the pseudo doesn't make sense, since it really shouldn't
have the vreg output in the first place, and instead an implicit scc
def to match the real scalar behavior.

We could have easier to understand tests if we selected scalar
versions of the [us]{add|sub}.with.overflow intrinsics.

This does still end up producing vector code in the end, since it gets
moved later.
2020-07-06 14:28:01 -04:00
Shuhong Liu
5279dece82 [AIX] Add system-aix to lit config file
Summary: This is a complementary patch to D82100 since the aix builbot is still running the unsupported test shtest-format-argv0. Add system-aix to the sub llvm-lit config.

Reviewers: daltenty, hubert.reinterpretcast

Reviewed By: hubert.reinterpretcast

Subscribers: delcypher, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82905
2020-07-06 12:54:12 -04:00
Luís Marques
0a0548042f [RISCV] Fold ADDIs into load/stores with nonzero offsets
We can often fold an ADDI into the offset of load/store instructions:

   (load (addi base, off1), off2) -> (load base, off1+off2)
   (store val, (addi base, off1), off2) -> (store val, base, off1+off2)

This is possible when the off1+off2 continues to fit the 12-bit immediate.
We remove the previous restriction where we would never fold the ADDIs if
the load/stores had nonzero offsets. We now do the fold the the resulting
constant still fits a 12-bit immediate, or if off1 is a variable's address
and we know based on that variable's alignment that off1+offs2 won't overflow.

Differential Revision: https://reviews.llvm.org/D79690
2020-07-06 17:32:57 +01:00
jasonliu
3b7308f12c [XCOFF][AIX] Give symbol an internal name when desired symbol name contains invalid character(s)
Summary:

When a desired symbol name contains invalid character that the
system assembler could not process, we need to emit .rename
directive in assembly path in order for that desired symbol name
to appear in the symbol table.

Reviewed By: hubert.reinterpretcast, DiggerLin, daltenty, Xiangling_L

Differential Revision: https://reviews.llvm.org/D82481
2020-07-06 15:49:15 +00:00
Oliver Stannard
0a88afaed7 [Support] Fix formatted_raw_ostream for UTF-8
* The getLine and getColumn functions need to update the position, or
  they will return stale data for buffered streams. This fixes a bug in
  the clang -analyzer-checker-option-help option, which was not wrapping
  the help text correctly when stdout is not a TTY.
* If the stream contains multi-byte UTF-8 sequences, then the whole
  sequence needs to be considered to be a single character. This has the
  edge case that the buffer might fill up and be flushed part way
  through a character.
* If the stream contains East Asian wide characters, these will be
  rendered twice as wide as other characters, so we need to increase the
  column count to match.

This doesn't attempt to handle everything unicode can do (combining
characters, right-to-left markers, ...), but hopefully covers most
things likely to be common in messages and source code we might want to
print.

Differential revision: https://reviews.llvm.org/D76291
2020-07-06 16:18:15 +01:00
Roman Lebedev
c8b6177910 Reland "[ScalarEvolution] createSCEV(): recognize udiv/urem disguised as an sdiv/srem"
This reverts commit d3e3f36ff1151f565730977ac4f663a2ccee48ae,
which reverter the original commit 2c16100e6f72075564ea1f67fa5a82c269dafcd3,
but with polly tests now actually passing.
2020-07-06 18:00:22 +03:00
David Green
5c3a471846 [ARM] MVE FP16 cost adjustments
This adjusts the MVE fp16 cost model, similar to how we already do for
integer casts. It uses the base cost of 1 per cvt for most fp extend /
truncates, but adjusts it for loads and stores where we know that a
extending load has been used to get the load into the correct lane, and
only an MVE VCVTB is then needed.

Differential Revision: https://reviews.llvm.org/D81813
2020-07-06 15:57:51 +01:00
Mikhail Goncharov
c0ff9444d8 Revert "[ScalarEvolution] createSCEV(): recognize udiv/urem disguised as an sdiv/srem"
Summary:
This reverts commit 2c16100e6f72075564ea1f67fa5a82c269dafcd3.

ninja check-polly fails:
  Polly :: Isl/CodeGen/MemAccess/generate-all.ll
  Polly :: ScopInfo/multidim_srem.ll

Reviewers: kadircet, bollu

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83230
2020-07-06 16:41:59 +02:00
Florian Hahn
2341e47245 [LV] Pass dbgs() to verifyFunction call.
This is done in other places of the pass already and improves the output
on verification failure.
2020-07-06 15:09:20 +01:00
Sanjay Patel
9a3f0992c0 [x86] add tests for vector select with non-splat bit-test condition; NFC
Goes with D83181.
2020-07-06 09:50:47 -04:00
David Green
0c1fba645a [ARM] Adjust default fp extend and trunc costs
This adds some default costs for fp extends and truncates, generally
costing them as 1 per lane. If the type is not legal then the cost will
include a call to an __aeabi_ function.

Some NEON code is also adjusted to make sure it applies to the expected
types, now that fp16 is a more common thing.

Differential Revision: https://reviews.llvm.org/D82458
2020-07-06 14:23:17 +01:00
Matt Arsenault
44220c5748 GlobalISel: Move finalizeLowering call later
This matches the DAG behavior where this is called after the loop
checking for calls. The AMDGPU implementation depends on knowing if
there are calls in the function or not, so move this later.

Another problem is finalizeLowering is actually called twice; I was
seeing weird inconsistencies since the first call would produce
unexpected results and the second run would correct them in some
contexts. Since this requires disabling the verifier, and it's useful
to serialize the MIR immediately after selection, FinalizeISel should
probably not be a real pass.
2020-07-06 09:19:40 -04:00
Matt Arsenault
3bb38d40fc AMDGPU/GlobalISel: Don't emit code for unused kernel arguments 2020-07-06 09:04:06 -04:00
Matt Arsenault
e17fdb6b7f AMDGPU/GlobalISel: Fix hardcoded register number checks in test 2020-07-06 09:01:59 -04:00
Matt Arsenault
9a4e2cafde AMDGPU: Fix fixed ABI SGPR arguments
The default constructor wasn't setting isSet o the ArgDescriptor, so
while these had the value set, they were treated as missing. This only
ended up mattering in the indirect call case (and for regular calls in
GlobalISel, which current doesn't have a way to support the variable
ABI).
2020-07-06 09:01:18 -04:00
Matt Arsenault
25a6449fdc AMDGPU/GlobalISel: Add some missing return tests 2020-07-06 09:01:18 -04:00
Simon Pilgrim
7f40cfe2c5 [X86][XOP] Add XOP target vselect-pcmp tests
Noticed in the D83181 that XOP can probably do a lot more than other targets due to its vector shifts and vpcmov instructions
2020-07-06 13:58:26 +01:00