1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00
Commit Graph

200974 Commits

Author SHA1 Message Date
Johannes Doerfert
dd2810924a [OpenMP] Allow traits for the OpenMP context selector isa
It was unclear what `isa` was supposed to mean so we did not provide any
traits for this context selector. With this patch we will allow *any*
string or identifier. We use the target attribute and target info to
determine if the trait matches. In other words, we will check if the
provided value is a target feature that is available (at the call site).

Fixes PR46338

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D83281
2020-07-29 10:22:27 -05:00
Simon Wallis
8010bdcf94 [MachineCopyPropagation] BackwardPropagatableCopy: add check for hasOverlappingMultipleDef
In MachineCopyPropagation::BackwardPropagatableCopy(),
a check is added for multiple destination registers.

The copy propagation is avoided if the copied destination register
is the same register as another destination on the same instruction.

A new test is added.  This used to fail on ARM like this:
error: unpredictable instruction, RdHi and RdLo must be different
        umull   r9, r9, lr, r0

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D82638
2020-07-29 16:21:01 +01:00
Xing GUO
1f86714980 [DWARFYAML] Make the field names consistent with the DWARF spec. NFC.
This patch replaces 'AddrSize'/'SegSize' with
'AddressSize'/'SegmentSelectorSize'. NFC.
2020-07-29 23:10:08 +08:00
Sanjay Patel
af9b68aef4 [ConstantFolding] fold integer min/max intrinsics
If both operands are undef, return undef.
If one operand is undef, clamp to limit constant.
2020-07-29 11:01:13 -04:00
Sanjay Patel
b62b20180b [ConstantFolding] add tests for integer min/max intrinsics; NFC 2020-07-29 11:01:13 -04:00
Simon Pilgrim
e0d56f3aef [CostModel][X86] Add SSE costs for SMAX/SMIN/UMAX/UMIN intrinsics 2020-07-29 15:55:43 +01:00
Juneyoung Lee
3c2cfe5a9c [InstSimplify] add tests for expandCommutativeBinOp; NFC 2020-07-29 23:21:39 +09:00
Florian Hahn
63c884fbb4 [SCEVExpander] Add option to preserve LCSSA directly.
This patch teaches SCEVExpander to directly preserve LCSSA.

As it is currently, SCEV does not look through PHI nodes in loops,
as it might break LCSSA form. Once SCEVExpander can preserve
LCSSA form, it should be safe for SCEV to look through PHIs.

To preserve LCSSA form, this patch uses formLCSSAForInstructions
on operands of newly created instructions, if the definition is inside
a different loop than the new instruction.

The final value we return from expandCodeFor may also need LCSSA
phis, depending on the insert point. As no user for it exists there yet,
create a temporary instruction at the insert point, which can be passed
to formLCSSAForInstructions. This temporary instruction is removed
after LCSSA construction.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D71538
2020-07-29 15:07:37 +01:00
Sanjay Patel
8d20a499b4 [ConstantFolding] update test checks FP min/max intrinsics
There's a slight difference in functionality with the new CHECK lines:
before, we allowed either -0.0 or 0.0 for maxnum/minnum. That matches
the definition, but we should always get a deterministic result from
constant folding within the compiler, so now we assert that we got
the single expected result in all cases.
2020-07-29 09:43:33 -04:00
Simon Pilgrim
99ab22ce47 [CostModel][X86] Add SSE costs for ABS intrinsics 2020-07-29 14:33:59 +01:00
Victor Campos
9c2a0c2f38 [Driver][ARM] Disable unsupported features when nofp arch extension is used
A list of target features is disabled when there is no hardware
floating-point support. This is the case when one of the following
options is passed to clang:

 - -mfloat-abi=soft
 - -mfpu=none

This option list is missing, however, the extension "+nofp" that can be
specified in -march flags, such as "-march=armv8-a+nofp".

This patch also disables unsupported target features when nofp is passed
to -march.

Differential Revision: https://reviews.llvm.org/D82948
2020-07-29 14:13:22 +01:00
Simon Pilgrim
44836ba0cf [TTI] Move abs/smax/smin/umax/umin cost expansion to ICA getIntrinsicInstrCost variant
This will simplify target overrides, and matches what we do for most integer intrinsic costs.
2020-07-29 13:44:38 +01:00
David Green
acc205118e [ARM] Tune getCastInstrCost for extending masked loads and truncating masked stores
This patch uses the feature added in D79162 to fix the cost of a
sext/zext of a masked load, or a trunc for a masked store.
Previously, those were considered cheap or even free, but it's
not the case as we cannot split the load in the same way we would for
normal loads.

This updates the costs to better reflect reality, and adds a test for it
in test/Analysis/CostModel/ARM/cast.ll.

It also adds a vectorizer test that showcases the improvement: in some
cases, the vectorizer will now choose a smaller VF when
tail-predication is enabled, which results in better codegen. (Because
if it were to use a higher VF in those cases, the code we see above
would be generated, and the vmovs would block tail-predication later in
the process, resulting in very poor codegen overall)

Original Patch by Pierre van Houtryve

Differential Revision: https://reviews.llvm.org/D79163
2020-07-29 13:41:34 +01:00
David Green
49873f2449 [Analysis] TTI: Add CastContextHint for getCastInstrCost
Currently, getCastInstrCost has limited information about the cast it's
rating, often just the opcode and types.  Sometimes there is a context
instruction as well, but it isn't trustworthy: for instance, when the
vectorizer is rating a plan, it calls getCastInstrCost with the old
instructions when, in fact, it's trying to evaluate the cost of the
instruction post-vectorization.  Thus, the current system can get the
cost of certain casts incorrect as the correct cost can vary greatly
based on the context in which it's used.

For example, if the vectorizer queries getCastInstrCost to evaluate the
cost of a sext(load) with tail predication enabled, getCastInstrCost
will think it's free most of the time, but it's not always free. On ARM
MVE, a VLD2 group cannot be extended like a normal VLDR can. Similar
situations can come up with how masked loads can be extended when being
split.

To fix that, this path adds a new parameter to getCastInstrCost to give
it a hint about the context of the cast. It adds a CastContextHint enum
which contains the type of the load/store being created by the
vectorizer - one for each of the types it can produce.

Original patch by Pierre van Houtryve

Differential Revision: https://reviews.llvm.org/D79162
2020-07-29 13:32:53 +01:00
David Sherwood
db7d870d70 [SVE][CodeGen] Add simple integer add tests for SVE tuple types
I have added tests to:

  CodeGen/AArch64/sve-intrinsics-int-arith.ll

for doing simple integer add operations on tuple types. Since these
tests introduced new warnings due to incorrect use of
getVectorNumElements() I have also fixed up these warnings in the
same patch. These fixes are:

1. In narrowExtractedVectorBinOp I have changed the code to bail out
early for scalable vector types, since we've not yet hit a case that
proves the optimisations are profitable for scalable vectors.
2. In DAGTypeLegalizer::WidenVecRes_CONCAT_VECTORS I have replaced
calls to getVectorNumElements with getVectorMinNumElements in cases
that work with scalable vectors. For the other cases I have added
asserts that the vector is not scalable because we should not be
using shuffle vectors and build vectors in such cases.

Differential revision: https://reviews.llvm.org/D84016
2020-07-29 13:32:10 +01:00
Sjoerd Meijer
8b22bd7d24 [ARM] Optimize immediate selection
Optimize some specific immediates selection by materializing them with sub/mvn
instructions as opposed to loading them from the constant pool.

Patch by Ben Shi, powerman1st@163.com.

Differential Revision: https://reviews.llvm.org/D83745
2020-07-29 13:29:17 +01:00
Matt Arsenault
e11e91bb11 AMDGPU/GlobalISel: Refactor special argument management 2020-07-29 08:27:31 -04:00
Matt Arsenault
72b7484d1d AMDGPU: Make saturating add/sub legal for DAG path 2020-07-29 08:27:31 -04:00
Matt Arsenault
2a62cae903 AMDGPU/GlobalISel: Select llvm.amdgcn.global.atomic.csub
Remove the custom node boilerplate. Not sure why this tried to handle
the LDS atomic stuff.
2020-07-29 08:27:31 -04:00
David Sherwood
6c7452123a [SVE] Add checks for no warnings in CodeGen/AArch64/sve-sext-zext.ll
Previous patches fixed up all the warnings in this test:

  llvm/test/CodeGen/AArch64/sve-sext-zext.ll

and this change simply checks that no new warnings are added in future.

Differential revision: https://reviews.llvm.org/D83205
2020-07-29 13:06:39 +01:00
David Sherwood
3a4cd9337d [CodeGen] Remove calls to getVectorNumElements in DAGTypeLegalizer::SplitVecOp_EXTRACT_SUBVECTOR
In DAGTypeLegalizer::SplitVecOp_EXTRACT_SUBVECTOR I have replaced
calls to getVectorNumElements with getVectorMinNumElements, since
this code path works for both fixed and scalable vector types. For
scalable vectors the index will be multiplied by VSCALE.

Fixes warnings in this test:

  sve-sext-zext.ll

Differential revision: https://reviews.llvm.org/D83198
2020-07-29 13:05:39 +01:00
Florian Hahn
d2a571702a [NewGVN] Require asserts for crashing tests.
Without asserts, it might take a long time for the tests to crash.
Only run them with assert builds.
2020-07-29 12:41:05 +01:00
Yevgeny Rouban
a2a9ffd73d [LoopSimplifyCFG] Delete landing pads in dead exit blocks
In addition to removing phi nodes this patch removes any
landing pad that the dead exit block might have. Without
this fix Verifier complains about a new switch instruction
jumps to a block with a landing pad.

Differential Revision: https://reviews.llvm.org/D84320
2020-07-29 18:36:51 +07:00
Pushpinder Singh
62153b67c4 [CMAKE] Fix 'clean' target not working
cmake was still considering the empty value of ${fake_version_inc}
even if it was not defined.

Reviewed By: vsapsai

Differential Revision: https://reviews.llvm.org/D82847
2020-07-29 07:34:24 -04:00
Simon Pilgrim
58535386bb [TTI] Add default cost expansion for abs/smax/smin/umax/umin intrinsics 2020-07-29 12:13:06 +01:00
Georgii Rymar
21d06fcc5e [llvm-readobj] - Move out the common code from printRelocations() methods.
This introduces the printRelocationsHelper() which now contains the common
code used by both GNU and LLVM output styles.

Differential revision: https://reviews.llvm.org/D83935
2020-07-29 13:52:02 +03:00
Simon Pilgrim
21418c85a1 [X86][SSE] getV4X86ShuffleImm8 - canonicalize broadcast masks
If the mask input to getV4X86ShuffleImm8 only refers to a single source element (+ undefs) then canonicalize to a full broadcast.

getV4X86ShuffleImm8 defaults to inline values for undefs, which can be useful for shuffle widening/narrowing but does leave SimplifyDemanded* calls thinking the shuffle depends on unnecessary elements.

I'm still investigating what we should do more generally to avoid these undemanded elements, but broadcast cases was a simpler win.
2020-07-29 11:32:44 +01:00
Xing GUO
2028670e5f [DWARFYAML][test] Make the check lines stricter. NFC.
This patch makes the check lines stricter.
2020-07-29 17:31:38 +08:00
Xing GUO
8f65f802cb [DWARFYAML] Replace uint*_t with yaml::Hex* in the 'debug_aranges' entry.
Normally, we use yaml::Hex* to describe the length, offsets,
address/segment size. NFC.
2020-07-29 16:43:21 +08:00
Juneyoung Lee
73676829de [InstCombine] Add tests for select(freeze(undef)); NFC 2020-07-29 15:27:09 +09:00
Azharuddin Mohammed
986289ad08 [ThinLTO] [test] cache.ll: Prevent Spotlight indexing of the output dir
The test output files whose atime is altered in the test were getting
accessed by Spotlight indexing on macOS, causing them to get an updated
atime and leading to the test not behaving as expected.

Reviewed By: jhenderson, steven_wu

Differential Revision: https://reviews.llvm.org/D84700
2020-07-28 21:21:58 -07:00
Ikhlas Ajbar
bba33c567e [Hexagon] Correct the order of operands when lowering funnel shift-left
This patch corrects the order of operands in the pattern that lowers fshl
in Hexagon.
2020-07-28 21:22:41 -05:00
Chuanqi Xu
6f59cc588e [NFC] Edit the comment in User::replaceUsesOfWith 2020-07-29 10:02:04 +08:00
Xing GUO
b2f650ab28 [llvm-readelf][test] Improve wording in the comments. NFC.
This patch addresses comments in D84640 (https://reviews.llvm.org/D84640#2178475).
2020-07-29 09:59:28 +08:00
Stefanos Baziotis
9255f8a1f0 [ADT][BitVector][NFC] Merge find_first_in() / find_first_unset_in()
We can implement find_first_unset_in() in the same function
if every BitWord we use is first flipped.

Differential Revision: https://reviews.llvm.org/D84717
2020-07-29 04:51:22 +03:00
Kang Zhang
6613a2ce88 [PowerPC] Add Def CR1 for MTFSFI_rec and MTFSF_rec 2020-07-29 01:47:23 +00:00
Matt Arsenault
f7d5ce97f6 AMDGPU: Optimize copies to exec with other insts after exec def
It's possible to have terminator instructions after a write to exec,
so skip over them to find it.
2020-07-28 21:34:50 -04:00
Matt Arsenault
e7d2d74837 AMDGPU/GlobalISel: Fix selecting llvm.amdgcn.s.getreg
This introduces the same bug llvm.amdgcn.s.setreg has where if the
user specified an immediate outside of the valid 16-bit range, it will
select into a verifier error.
2020-07-28 21:34:50 -04:00
Thomas Lively
e2b0ae5192 [WebAssembly] Remove intrinsics for SIMD widening ops
Instead, pattern match extends of extract_subvectors to generate
widening operations. Since extract_subvector is not a legal node, this
is implemented via a custom combine that recognizes extract_subvector
nodes before they are legalized. The combine produces custom ISD nodes
that are later pattern matched directly, just like the intrinsic was.

Also removes the clang builtins for these operations since the
instructions can now be generated from portable code sequences.

Differential Revision: https://reviews.llvm.org/D84556
2020-07-28 18:25:55 -07:00
Craig Topper
c95d9699fd [X86] Add FeatureCMPXCHG8B and FeatureSlowUAMem16 to 'lakemont' in X86.td
We already had CMPXCH8B feature on this CPU for the frontend so
this doesn't have much effect.

The FeatureSlowUAMem16 only matters if someone compiles with
-march=lakemont -msse which doesn't make sense, but is consistent
with all our pre-sse4.2 CPUs. Maybe the feature flag should be
FeatureFastUAMem16 and set on the newer CPUs instead.
2020-07-28 18:24:46 -07:00
Matt Arsenault
36ab9584ef AMDGPU: Don't assert in canInsertSelect
Currently GlobalISel doesn't force all VGPR phi operands to VGPRs, so
this hit a case where it was queried with a VGPR and SGPR. This could
arguably be a verifier error, but it's currently not.
2020-07-28 21:01:06 -04:00
Valentin Clement
a18d313f65 [openmp][openacc][NFC] Add wrapper for records in DirectiveEmitter
Add wrapper classes to to access record's fields. This makes it easier to
pass record information to the diverse functions for code generation.

Reviewed By: jdenny

Differential Revision: https://reviews.llvm.org/D84612
2020-07-28 20:47:40 -04:00
Thomas Lively
adbadea361 [WebAssembly] Implement truncating vector stores
Rather than expanding truncating stores so that vectors are stored one
lane at a time, lower them to a sequence of instructions using
narrowing operations instead, when possible. Since the narrowing
operations have saturating semantics, but truncating stores require
truncation, mask the stored value to manually truncate it before
narrowing. Also, since narrowing is a binary operation, pass in the
original vector as the unused second argument.

Differential Revision: https://reviews.llvm.org/D84377
2020-07-28 17:46:45 -07:00
Matt Arsenault
3c21944921 AMDGPU: Don't assume call targets are registers
GlobalISel let through a call to null, which would then fold into the
source operand like any other inline immediate. The SelectionDAG
lowering deletes calls to null and undef as a workaround from before
calls were supported. We should probably drop the special handling
case in the DAG lowering now, since the middle end optimizers delete
null calls anyway.
2020-07-28 20:46:06 -04:00
Matt Arsenault
f9805ce38d AMDGPU: Handle a few missing cases in getAddrModeArguments 2020-07-28 20:22:38 -04:00
Matt Arsenault
9df19f7114 AMDGPU: Don't assume there is only one terminator copy
This would stop on the first in reverse order, failing the verifier if
there were more earlier in the block.
2020-07-28 20:22:38 -04:00
Matt Arsenault
a1d4d5badf AMDGPU: Fix verifier error on spilling partially defined SGPRs
This needs an implicit def of the super-register in case one of the
lanes isn't defined, similar to copyPhysReg (or the not-VGPR spill
case below). This showed up in GlobalISel testing since it currently
doesn't fold out many undef instructions.
2020-07-28 20:01:57 -04:00
Matt Arsenault
aaf6327167 AMDGPU: Serialize MFI spill fields
These should probably be inferred from the function on parse, but the
target specific infrastructure currently does not give you a way to do
this. SILowerSGPRSpills early exits without this reporting spills,
which makes it difficult to write a MIR test for.
2020-07-28 20:01:57 -04:00
Joel E. Denny
12c5f043b8 [FileCheck] Report captured variables
Report captured variables in input dumps and traces.  For example:

```
$ cat check
CHECK: hello [[WHAT:[a-z]+]]
CHECK: goodbye [[WHAT]]

$ FileCheck -dump-input=always -vv check < input |& tail -8
<<<<<<
           1: hello world
check:1'0     ^~~~~~~~~~~
check:1'1           ^~~~~ captured var "WHAT"
           2: goodbye world
check:2'0     ^~~~~~~~~~~~~
check:2'1                   with "WHAT" equal to "world"
>>>>>>

$ FileCheck -dump-input=never -vv check < input
check2:1:8: remark: CHECK: expected string found in input
CHECK: hello [[WHAT:[a-z]+]]
       ^
<stdin>:1:1: note: found here
hello world
^~~~~~~~~~~
<stdin>:1:7: note: captured var "WHAT"
hello world
      ^~~~~
check2:2:8: remark: CHECK: expected string found in input
CHECK: goodbye [[WHAT]]
       ^
<stdin>:2:1: note: found here
goodbye world
^~~~~~~~~~~~~
<stdin>:2:1: note: with "WHAT" equal to "world"
goodbye world
^
```

Reviewed By: thopre

Differential Revision: https://reviews.llvm.org/D83651
2020-07-28 19:15:18 -04:00
Joel E. Denny
7663136ac6 [FileCheck] Extend -dump-input with substitutions
Substitutions are already reported in the diagnostics appearing before
the input dump in the case of failed directives, and they're reported
in traces (produced by `-vv -dump-input=never`) in the case of
successful directives.  However, those reports are not always
convenient to view while investigating the input dump, so this patch
adds the substitution report to the input dump too.  For example:

```
$ cat check
CHECK: hello [[WHAT:[a-z]+]]
CHECK: [[VERB]] [[WHAT]]

$ FileCheck -vv -DVERB=goodbye check < input |& tail -8
<<<<<<
           1: hello world
check:1       ^~~~~~~~~~~
           2: goodbye word
check:2'0     X~~~~~~~~~~~ error: no match found
check:2'1                  with "VERB" equal to "goodbye"
check:2'2                  with "WHAT" equal to "world"
>>>>>>
```

Without this patch, the location reported for a substitution for a
directive match is the directive's full match range.  This location is
misleading as it implies the substitution itself matches that range.
This patch changes the reported location to just the match range start
to suggest the substitution is known at the start of the match.  (As
in the above example, input dumps don't mark any range for
substitutions.  The location info in that case simply identifies the
right line for the annotation.)

Reviewed By: mehdi_amini, thopre

Differential Revision: https://reviews.llvm.org/D83650
2020-07-28 19:15:18 -04:00