1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00
Commit Graph

212804 Commits

Author SHA1 Message Date
Krzysztof Parzyszek
37342b5ed3 [ObjectYAML] Handle Hexagon V68 2021-03-17 21:43:35 -05:00
Krzysztof Parzyszek
36b88f31f1 [Hexagon] Improve stack address base reuse for HVX spills
The offset in HVX loads/stores is only 4 bits long, so often an
extra register is needed to hold the address. Minimize the number
of such registers by "standardizing" the base addresses and reusing
preexisting base registers when replacing frame indices.
2021-03-17 21:22:56 -05:00
Krzysztof Parzyszek
433decb61c [Hexagon] Add more patterns for HVX loads and stores
In particular, add patterns for loads/stores to the stack
(with a frame index as address).
2021-03-17 21:01:52 -05:00
Chen Zheng
ee92fab59b [NFC] make XCOFF dwarf dump test run only on PowerPC target. 2021-03-17 21:59:47 -04:00
Chen Zheng
45a2a59fa5 [XCOFF][llvm-dwarfdump] llvm-dwarfdump support for XCOFF
Author: hubert.reinterpretcast, shchenz

Reviewed By: jasonliu, echristo

Differential Revision: https://reviews.llvm.org/D97186
2021-03-17 21:21:51 -04:00
Amara Emerson
779b9f5e51 [GlobalISel] Don't DCE LIFETIME_START/LIFETIME_END markers.
These are pseudos without any users, so DCE was killing them in the combiner.

Marking them as having side effects doesn't seem quite right since they don't.

Gives a nice 0.3% geomean size win on CTMark -Os.

Differential Revision: https://reviews.llvm.org/D98811
2021-03-17 18:02:08 -07:00
Carl Ritson
8298ce5ba4 [AMDGPU] Avoid unnecessary graph visits during WQM marking
Avoid revisiting nodes with the same set of defined lanes by
using a unified visited set which integrates lanes into the key.
This retains the intent of the original code by still revisiting
a subgraph if a different set of lanes is defined and hence
marking might progress differently.

Note: default size of the visited set has been confirmed to
cover >99% of invocations in large array of test shaders.

Reviewed By: piotr

Differential Revision: https://reviews.llvm.org/D98772
2021-03-18 10:00:41 +09:00
Joel E. Denny
37c0143419 [FileCheck] Fix redundant diagnostics due to numeric errors
Fixed substitution printing not to produce an empty diagnostic for
errors handled elsewhere.

Reviewed By: thopre

Differential Revision: https://reviews.llvm.org/D98088
2021-03-17 19:25:41 -04:00
Joel E. Denny
20f8c79d86 [FileCheck] Fix numeric error propagation
A more general name might be match-time error propagation.  That is,
it's conceivable we'll one day have non-numeric errors that require
the handling fixed by this patch.

Without this patch, FileCheck behaves as follows:

```
$ cat check
CHECK-NOT: [[#0x8000000000000000+0x8000000000000000]]

$ FileCheck -vv -dump-input=never check < input
check:1:54: remark: implicit EOF: expected string found in input
CHECK-NOT: [[#0x8000000000000000+0x8000000000000000]]
                                                     ^
<stdin>:2:1: note: found here

^
check:1:15: error: unable to substitute variable or numeric expression: overflow error
CHECK-NOT: [[#0x8000000000000000+0x8000000000000000]]
              ^
$ echo $?
0
```

Notice that the exit status is 0 even though there's an error.
Moreover, FileCheck doesn't print the error diagnostic unless both
`-dump-input=never` and `-vv` are specified.

The same problem occurs when `CHECK-NOT` does have a match but a
capture fails due to overflow: exit status is 0, and no diagnostic is
printed unless both `-dump-input=never` and `-vv` are specified.  The
usefulness of capturing from `CHECK-NOT` is questionable, but this
case should certainly produce an error.

With this patch, FileCheck always includes the error diagnostic and
has non-zero exit status for the above examples.  It's conceivable
that this change will cause some existing tests to fail, but my
assumption is that they should fail.  Moreover, with nearly every
project enabled, this patch didn't produce additional `check-all`
failures for me.

This patch also extends input dumps to include such numeric error
diagnostics for both expected and excluded patterns.

As noted in fixmes in some of the tests added by this patch, this
patch worsens an existing issue with redundant diagnostics.  I'll fix
that bug in a subsequent patch.

Reviewed By: thopre, jhenderson

Differential Revision: https://reviews.llvm.org/D98086
2021-03-17 19:25:41 -04:00
Mike Rice
1bc9df98e9 [OPENMP51]Initial support for the use clause.
Added basic parsing/sema/serialization support for the 'use' clause.

Differential Revision: https://reviews.llvm.org/D98815
2021-03-17 15:46:14 -07:00
Arthur Eubanks
406d3b34cf Revert "[NewPM] Verify LoopAnalysisResults after a loop pass"
This reverts commit 6db3ab2903f42712f44000afb5aa467efbd25f35.

Causing too large of compile time regression.
2021-03-17 15:22:52 -07:00
Amara Emerson
ef2c540c6d [AArch64][GlobalISel] Fall back if disabling neon/fp in the translator.
The previous technique relied on early-exiting the legalizer predicate
initialization, leaving an empty rule table. That causes a fallback
for most instructions, but some have legacy rules defined like G_ZEXT
which can try continue, but then crash.

We should fall back earlier, in the translator, to avoid this issue.

Differential Revision: https://reviews.llvm.org/D98730
2021-03-17 15:08:08 -07:00
Steven Wu
0d2a8c0614 [Object][MachO] Handle end iterator in getSymbolType()
Fix a bug in MachOObjectFile::getSymbolType() that it is not checking if
the iterator is end() before deference the iterator. Instead, return
`Other` type, which aligns with the behavior of `llvm-nm`.

rdar://75291638

Reviewed By: davide, ab

Differential Revision: https://reviews.llvm.org/D98739
2021-03-17 15:06:45 -07:00
David Green
6698320324 [ARM] Add VREV MVE shuffle costs
This uses the shuffle mask cost from D98206 to give a better cost of MVE
VREV instructions. This helps especially in VectorCombine where the cost
of shuffles is used to reorder bitcasts, which this helps keep the phase
ordering test for fp16 reductions producing optimal code. The isVREVMask
has been moved to a header file to allow it to be used across target
transform and isel lowering.

Differential Revision: https://reviews.llvm.org/D98210
2021-03-17 21:21:43 +00:00
Arthur Eubanks
57267d06c3 [NewPM] Verify LoopAnalysisResults after a loop pass
All loop passes should preserve all analyses in LoopAnalysisResults. Add
checks for those.

Note that due to PR44815, we don't check LAR's ScalarEvolution.
Apparently calling SE.verify() can change its results.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D98805
2021-03-17 13:37:22 -07:00
Ricky Taylor
7965564c8e [M68k] Forward declare getMCInstrBeads in one place
At the moment `getMCInstrBeads` is forward-declared in a few places,
bring this together into a single header file.

This was done as part of the disassembler work, since the disassembler
would otherwise add one more forward declaration.

Differential Revision: https://reviews.llvm.org/D98533
2021-03-17 13:31:27 -07:00
Ricky Taylor
762c21ab09 [M68k] Use fixed asm string for MxPseudo instructions
This is required because empty strings are not allowed when generating
the assembly parser tables.

Differential Revision: https://reviews.llvm.org/D98532
2021-03-17 13:31:27 -07:00
Pavel Iliin
9eb4ffff25 [NFC][AArch64] Add codegen tests for various csinc-cmp sequences. 2021-03-17 20:17:40 +00:00
Simon Pilgrim
54bae34f81 [X86][SSE] Add SSE2/SSE42 test coverage to urem combine tests
Noticed when reviewing D88785
2021-03-17 19:58:03 +00:00
Nico Weber
418733b2b4 [lld-link] emit an error when writing a PDB > 4 GiB
Maybe there's a way to make them work, but until I've investigated
if tools can consume large PDBs, erroring out is better than slowly
and silently consuming all available ram due to internal invariants
being violated.

(Patch to make writing larger files work at
https://bugs.chromium.org/p/chromium/issues/detail?id=1179085#c25
but I haven't had time to check if windbg & co can consume these
large PDBs. llvm-pdbutil can't, but we can fix that one at least :) )

Differential Revision: https://reviews.llvm.org/D98788
2021-03-17 15:15:08 -04:00
Philip Reames
2b6a185756 [LCSSA] Extract a utility for deciding if a new use requires a new lcssa phi [NFC]
(Triggered by a review comment on D98728, but otherwise unrelated.)
2021-03-17 12:14:01 -07:00
Craig Topper
be8cccd26d [RISCV] Use getTargetExtractSubreg and getTargetInsertSubreg to simplify some code. NFCI 2021-03-17 12:10:19 -07:00
Philip Reames
b2328e9cf3 [LICM] Fix a crash when sinking instructions w/token operands
It is not legal to form a phi node with token type. The generic LCSSA construction code handles this correctly - by not forming LCSSA for such cases - but the adhoc fixup implementation in LICM did not.

This was noticed in the context of PR49607, but can be demonstrated on ToT with the tweaked test case. This is not specific to gc.relocate btw, it also applies to usage of the preallocated family of intrinsics as well.

Differential Revision: https://reviews.llvm.org/D98728
2021-03-17 11:18:46 -07:00
Zakk Chen
03a2d56fa7 [RISCV] Update RVV shift intrinsic tests to use XLEN bit as shift amount.
Fix the unexpected of using op1's element type as shift amount type.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98501
2021-03-17 10:47:49 -07:00
David Green
f5b24f17f1 [TTI] Add a Mask to getShuffleCost
This adds an Mask ArrayRef to getShuffleCost, so that if an exact mask
can be provided a more accurate cost can be provided by the backend.
For example VREV costs could be returned by the ARM backend. This should
be an NFC until then, laying the groundwork for that to be added.

Differential Revision: https://reviews.llvm.org/D98206
2021-03-17 17:46:26 +00:00
Craig Topper
2c438f2a59 [RISCV] Support masked load/store for fixed vectors.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98561
2021-03-17 10:26:15 -07:00
Stephen Tozer
fea97b90a1 Reapply "[DebugInfo] Handle multiple variable location operands in IR"
Fixed section of code that iterated through a SmallDenseMap and added
instructions in each iteration, causing non-deterministic code; replaced
SmallDenseMap with MapVector to prevent non-determinism.

This reverts commit 01ac6d1587e8613ba4278786e8341f8b492ac941.
2021-03-17 16:45:25 +00:00
Mike Rice
7fc0b10919 [OPENMP51]Initial support for the interop directive.
Added basic parsing/sema/serialization support for interop directive.
Support for the 'init' clause.

Differential Revision: https://reviews.llvm.org/D98558
2021-03-17 09:42:07 -07:00
Bardia Mahjour
487229f50e [CGSCC] Print CG node itself instead of its address
Fix the debug output from cgscc
2021-03-17 12:36:55 -04:00
Eric Astor
1deec5ad2e [ms] [llvm-ml] Allow the /Zs parameter as a synonym for -filetype=null
For ml.exe, /Zs implies a syntax check with no output files.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D90061
2021-03-17 12:18:43 -04:00
LemonBoy
f587aff82a [LoopVectorize] Refine hasIrregularType predicate
The `hasIrregularType` predicate checks whether an array of N values of type Ty is "bitcast-compatible" with a <N x Ty> vector.
The previous check returned invalid results in some cases where there's some padding between the array elements: eg. a 4-element array of u7 values is considered as compatible with <4 x u7>, even though the vector is only loading/storing 28 bits instead of 32.

The problem causes LLVM to generate incorrect code for some targets: for AArch64 the vector loads/stores are lowered in terms of ubfx/bfi, effectively losing the top (N * padding bits).

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D97465
2021-03-17 17:03:47 +01:00
David Green
1f8f3be18e [ARM] Use lrdsb for more thumb1 loads.
Given a sextload i16, we can usually generate "ldrsh [rn. rm]". If we
don't naturally have a rn, rm addressing mode, we can either generate
"ldrh [rn, #0]; sxth" or "mov rm, #0; ldrsh [rn. rm]".

We currently generate the first, always creating a sxth. They are both
the same number of instructions, but if we generate the second then the
mov #0 will likely be CSE'd or pulled out of a loop, etc.

This adjusts the ISel patterns to do that, creating a mov instead of a
sxth.

Differential Revision: https://reviews.llvm.org/D98693
2021-03-17 15:29:02 +00:00
Simon Pilgrim
b026ee6b22 [DAG] TargetLowering::isBinOp() - add ISD::SSUBSAT/USUBSAT
Add to the generic non-commutative binop list.
2021-03-17 14:51:00 +00:00
Paul Robinson
94332ca104 [RGT] RPCUtilsTest, replace un-executed EXPECT with unreachable
Unreachable code should be self-documented as unreachable.

Found by the Rotten Green Tests project.

Differential Revision: https://reviews.llvm.org/D98518
2021-03-17 07:37:21 -07:00
Alexey Lapshin
da19cde461 [llvm-objcopy][NFC] Move ownership keeping code into restoreStatOnFile().
The D93881 added functionality which preserve ownership for output file
if llvm-objcopy is called under root. That code was added into the place
where output file is created. The llvm-objcopy already has a function which
sets/restores rights/permissions for the output file.
That is the restoreStatOnFile() function. This patch moves code
(preserving ownershipping) into the restoreStatOnFile() function.

Differential Revision: https://reviews.llvm.org/D98511
2021-03-17 17:27:00 +03:00
Timotej Kapus
b99ed0163e [OCaml] Handle nullptr in Llvm.global_initializer
LLVMGetInitializer returns nullptr in case there is no initializer.
There is not much that can be done with nullptr in OCaml, not even
test if it is null. Also, there does not seem to be a C or OCaml API
to test if there is an initializer. So this diff changes
Llvm.global_initializer to return an option.

Reviewed By: whitequark

Differential Revision: https://reviews.llvm.org/D65195
2021-03-17 13:39:35 +00:00
Hans Wennborg
d0e43622c0 Revert "[DebugInfo] Handle multiple variable location operands in IR"
This caused non-deterministic compiler output; see comment on the
code review.

> This patch updates the various IR passes to correctly handle dbg.values with a
> DIArgList location. This patch does not actually allow DIArgLists to be produced
> by salvageDebugInfo, and it does not affect any pass after codegen-prepare.
> Other than that, it should cover every IR pass.
>
> Most of the changes simply extend code that operated on a single debug value to
> operate on the list of debug values in the style of any_of, all_of, for_each,
> etc. Instances of setOperand(0, ...) have been replaced with with
> replaceVariableLocationOp, which takes the value that is being replaced as an
> additional argument. In places where this value isn't readily available, we have
> to track the old value through to the point where it gets replaced.
>
> Differential Revision: https://reviews.llvm.org/D88232

This reverts commit df69c69427dea7f5b3b3a4d4564bc77b0926ec88.
2021-03-17 13:36:48 +01:00
Jason Hu
4d21b9cc78 [NFC][OCaml] Fix documentation for verify_function and const_of_int64
Documentation of verify_function is incorrect and that of
const_of_int64 is incomplete.

Reviewed By: whitequark

Differential Revision: https://reviews.llvm.org/D77884
2021-03-17 12:09:28 +00:00
Simon Pilgrim
2d3ee0e485 Revert rG3b635253ddd0106c88051cff3540d8eb90bee22f "[AMDGPU] Regenerate wave32.ll test checks"
Breaks on some buildbots.
2021-03-17 11:47:09 +00:00
David Zarzycki
b6009ddce9 [lit] Harmonize test timing data between Unix and Windows
The "path" recorded for timing purposes is only used as a key into a dictionary. It is never used as an actual path to a filesystem API, therefore we should use '/' as the canonical separator so that Unix and Windows machines can share timing data. This also ensures that the lit testing works across platforms.

Reviewed By: jhenderson, jmorse

Differential Revision: https://reviews.llvm.org/D98767
2021-03-17 07:42:40 -04:00
Bradley Smith
85ceade375 [AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE
Previously NEON used a target specific intrinsic for frintn, given that
the FROUNDEVEN ISD node now exists, move over to that instead and add
codegen support for that node for both NEON and fixed length SVE.

Differential Revision: https://reviews.llvm.org/D98487
2021-03-17 11:41:22 +00:00
Simon Pilgrim
773dd67e1c [AMDGPU] Regenerate wave32.ll test checks
This is to help simplify the diff on an upcoming patch
2021-03-17 11:27:11 +00:00
David Green
b0820d90be [LV] Account for the cost of predication of scalarized load/store
This adds the cost of an i1 extract and a branch to the cost in
getMemInstScalarizationCost when the instruction is predicated. These
predicated loads/store would generate blocks of something like:

    %c1 = extractelement <4 x i1> %C, i32 1
    br i1 %c1, label %if, label %else
  if:
    %sa = extractelement <4 x i32> %a, i32 1
    %sb = getelementptr inbounds float, float* %pg, i32 %sa
    %sv = extractelement <4 x float> %x, i32 1
    store float %sa, float* %sb, align 4
  else:

So this increases the cost by the extract and branch. This is probably
still too low in many cases due to the cost of all that branching, but
there is already an existing hack increasing the cost using
useEmulatedMaskMemRefHack. It will increase the cost of a memop if it is
a load or there are more than one store. This patch improves the cost
for when there is only a single store, and hopefully at some point in
the future the hack can be removed.

Differential Revision: https://reviews.llvm.org/D98243
2021-03-17 10:57:50 +00:00
Bu Le
b2b1c4104c [SLP] Fix the trunc instruction insertion problem
Current SLP pass has this piece of code that inserts a trunc instruction
after the vectorized instruction. In the case that the vectorized instruction
is a phi node and not the last phi node in the BB, the trunc instruction
will be inserted between two phi nodes, which will trigger verify problem
in debug version or unpredictable error in another pass.
This patch changes the algorithm to 'if the last vectorized instruction
is a phi, insert it after the last phi node in current BB' to fix this problem.
2021-03-17 13:51:08 +03:00
Fraser Cormack
f84a9cd429 [RISCV] Optimize "dominant element" BUILD_VECTORs
This patch adds an optimization path for BUILD_VECTOR nodes where the
majority of the elements are identical. These can be splatted, with the
remaining elements patched up with INSERT_VECTOR_ELTs. The threshold can
be tweaked as required - it is currently conservative. Undef elements
are disregarded when judging the dominance of a particular element. This
allows them to be covered by the splat value.

In addition, vectors of 2 elements are always optimized to a splat (for
the upper element) and an insert at element zero.

This optimization is disabled when optimizing for size.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98700
2021-03-17 10:09:04 +00:00
Jay Foad
9d2225fc35 [AMDGPU] Split dot2-insts feature
Split out some of the instructions predicated on the dot2-insts target
feature into a new dot7-insts, in preparation for subtargets that have
some but not all of these instructions. NFCI.

Differential Revision: https://reviews.llvm.org/D98717
2021-03-17 09:42:21 +00:00
Jay Foad
fdda5b21a5 [TableGen] Fix excessive compile time issue in FixedLenDecoderEmitter
This patch reduces the time taken for clang to compile the generated
disassembler for an out-of-tree target with InsnType bigger than 64 bits
from 4m30s to 48s.

D67686 did a similar thing for CodeEmitterGen.

The idea is to tweak the API of the APInt-like InsnType class so that
we don't need so many temporary InsnTypes. This takes advantage of the
rule stated in D52100 that currently "no string of bits extracted
from the encoding may exceeed 64-bits", so we can use uint64_t for some
temporaries.

D52100 goes on to say that "fields are still permitted to exceed 64-bits
so long as they aren't one contiguous string of bits". This patch breaks
that by always using a "uint64_t tmp" in the generated decodeToMCInst,
but it should be easy to fix in FilterChooser::emitBinaryParser by
choosing to use a different type of tmp based on the known total field
width.

Differential Revision: https://reviews.llvm.org/D98046
2021-03-17 09:28:50 +00:00
Bu Le
315ebb4028 [SLP][Test] Precommit test for D98423 2021-03-17 12:11:50 +03:00
edwin-wang
e5bb9a33f8 [NFC] [XCOFF] Update PowerPC readobj test case with expression
This patch is to replace the fixed value with expression.
Keep .file section as fixed values as it might be changed. The
remaining sections will hardly be modified. So the Index values
are sequential. By using expression, we can avoid the fixed value
changes in coming patches.

This is a follow-up of patch D97117.

Reviewed By: hubert.reinterpretcast, shchenz

Differential Revision: https://reviews.llvm.org/D98620
2021-03-17 16:02:50 +08:00
Fangrui Song
dfd32b8a1c [MC] Delete unused MCOperand::{create,is,get}FPImm 2021-03-17 00:30:38 -07:00