1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

136310 Commits

Author SHA1 Message Date
vpykhtin
e99867bef0 [AMDGPU] Don't combine DPP if DPP register is used more than once per instruction
Reviewers: arsenm, rampitec, foad

Reviewed By: rampitec, foad

Subscribers: wuzish, kzhuravl, nemanjai, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kbarton, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82551
2020-07-03 15:08:26 +03:00
Luke Geeson
920620bd6d [ARM] Add Cortex-A77 Support for Clang and LLVM
This patch upstreams support for the Arm-v8 Cortex-A77
processor for AArch64 and ARM.

In detail:
- Adding cortex-a77 as a cpu option for aarch64 and arm targets in clang
- Cortex-A77 CPU name and ProcessorModel in llvm

details of the CPU can be found here:
https://www.arm.com/products/silicon-ip-cpu/cortex-a/cortex-a77

and a similar submission to GCC can be found here:
e0664b7a63

The following people contributed to this patch:
- Luke Geeson
- Mikhail Maltsev

Reviewers: t.p.northover, dmgreen, ostannard, SjoerdMeijer

Reviewed By: dmgreen

Subscribers: dmgreen, kristof.beyls, hiraditya, danielkiss, cfe-commits,
llvm-commits, miyuki

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D82887
2020-07-03 13:00:54 +01:00
James Henderson
6f98470a43 [DebugInfo] Use Cursor to detect errors in debug line prologue parser
Previously, the debug line parser would keep attempting to read data
even if it had run out of data to read. This meant errors in parsing
would often end up being reported as something else, such as an unknown
version or malformed directory/filename table. This patch fixes the
issues by using the Cursor API to capture errors.

Reviewed by: labath

Differential Revision: https://reviews.llvm.org/D83043
2020-07-03 11:52:06 +01:00
Xing GUO
120aef86f5 [DWARFYAML][debug_gnu_*] Add the missing context IsGNUStyle. NFC.
This patch helps add the missing context `IsGNUStyle`. Before this patch, yaml2obj cannot parse the YAML description of 'debug_gnu_pubnames' and 'debug_gnu_pubtypes' correctly due to the missing context.

In other words, if we have

```
DWARF:
  debug_gnu_pubtypes:
    Length:
      TotalLength: 0x1234
    Version:    2
    UnitOffset: 0x1234
    UnitSize:   0x4321
    Entries:
      - DieOffset:  0x12345678
        Name:       abc
        Descriptor: 0x00      ## Descriptor can never be mapped into Entry.Descriptor
```

yaml2obj will complain that "error: unknown key 'Descriptor'".

This patch helps resolve this problem.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D82435
2020-07-03 18:12:58 +08:00
Simon Pilgrim
2593b6907b Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. 2020-07-03 10:54:28 +01:00
Simon Pilgrim
8257854bda [InstCombine] Add sext(ashr(shl(trunc(x),c),c)) folding support for vectors
Replacing m_ConstantInt with m_Constant permits folding of vectors as well as scalars.

Differential Revision: https://reviews.llvm.org/D83058
2020-07-03 10:04:37 +01:00
Guillaume Chatelet
9e782e3923 [Alignment][NFC] Use 5 bits to store Instructions Alignment
As per [MaxAlignmentExponent]{b7338fb1a6/llvm/include/llvm/IR/Value.h (L688)} alignment is not allowed to be more than 2^29.
Encoded as Log2, this means that storing alignment uses 5 bits.
This patch makes sure all instructions store their alignment in a consistent way, encoded as Log2 and using 5 bits.

Differential Revision: https://reviews.llvm.org/D83119
2020-07-03 08:54:27 +00:00
Guillaume Chatelet
5c1ab6ec74 [Alignment][NFC] Use proper getter to retrieve alignment from ConstantInt and ConstantSDNode
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D83082
2020-07-03 08:06:43 +00:00
serge-sans-paille
81d0d1ecfb Fix stack-clash probing for large static alloca
Differential Revision: https://reviews.llvm.org/D82867
2020-07-03 09:22:03 +02:00
Guillaume Chatelet
ffe142881c [NFC] Use ADT/Bitfields in Instructions
This is an example patch for D81580.

Differential Revision: https://reviews.llvm.org/D81662
2020-07-03 07:20:22 +00:00
Craig Topper
83c3d30676 [X86] Remove MODRM_SPLITREGM from the disassembler tables.
This offers a very minor table size reduction due to only being
used for one AMX opcode.
2020-07-03 00:16:20 -07:00
Sam Parker
1a1d08e1d0 [CostModel] Fix cast crash
Don't presume instruction operands while matching reductions.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46430

Differential Revision: https://reviews.llvm.org/D82453
2020-07-03 07:53:45 +01:00
Kai Luo
8382e3ec50 [PowerPC] Implement probing for dynamic stack allocation
This patch is part of supporting `-fstack-clash-protection`. Mainly do
such things compared to existing `lowerDynamicAlloc`

- Added a new pseudo instruction PPC::PREPARE_PROBED_ALLOC to get
  actual frame pointer and final stack pointer.
- Synthesize a loop to probe by blocks.
- Use DYNAREAOFFSET to get MaxCallFrameSize which is calculated in
  prologepilog.

Differential Revision: https://reviews.llvm.org/D81358
2020-07-03 05:36:40 +00:00
Craig Topper
58b13955d1 [X86] Add back support for matching VPTERNLOG from back to back logic ops.
I think this mostly looks ok. The only weird thing I noticed was
a couple rotate vXi8 tests picked up an extra logic op where we have

(and (or (and), (andn)), X). Previously we matched the (or (and), (andn))
to vpternlog, but now we match the (and (or), X) and leave the and/andn
unmatched.
2020-07-02 22:11:52 -07:00
Carl Ritson
0962ad81e4 [AMDGPU] Insert PS early exit at end of control flow
Exit early if the exec mask is zero at the end of control flow.
Mark the ends of control flow during control flow lowering and
convert these to exits during the insert skips pass.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D82737
2020-07-03 14:04:34 +09:00
Kai Luo
e5fd5e8a10 [PowerPC][NFC] Prevent unused error when assertion is disabled. 2020-07-03 04:23:19 +00:00
Carl Ritson
5eef7b47fb Revert "[AMDGPU] Insert PS early exit at end of control flow"
This reverts commit 2bfcacf0ad362956277a1c2c9ba00ddc453a42ce.

There appears to be an issue to analysis preservation.
2020-07-03 13:03:33 +09:00
Kai Luo
5e4157d196 [PowerPC][NFC] Refactor lowerDynamicAlloc
When performing dynamic stack allocation, calculation of frame pointer
and actual negsize can be separated. This patch refactors
`lowerDynamicAlloc` in preparation of supporting
`-fstack-clash-protection` which also has to calculate actual frame
pointer and negsize.

Differential Revision: https://reviews.llvm.org/D81354
2020-07-03 03:33:24 +00:00
Carl Ritson
f5d888ce7f [AMDGPU] Insert PS early exit at end of control flow
Exit early if the exec mask is zero at the end of control flow.
Mark the ends of control flow during control flow lowering and
convert these to exits during the insert skips pass.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D82737
2020-07-03 12:26:28 +09:00
Carl Ritson
d40c701079 [AMDGPU] Unify early PS termination blocks
Generate a single early exit block out-of-line and branch to this
if all lanes are killed. This avoids branching if lanes are active.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D82641
2020-07-03 09:58:05 +09:00
Craig Topper
d005b641e0 [X86] Teach lower512BitShuffle to try bitmask and bitblend before splitting v32i16/v64i8 on av512f only targets.
We consider v32i16/v64i8 to be legal types on avx512f, but we
don't have most operations until avx512bw. But we can use
and/or/xor operations. So try those before splitting.

This is especially helpful since we turn some ands with constant
masks into shuffles in early DAG combines. So we should make sure
we recover those back to AND.
2020-07-02 15:35:48 -07:00
Biplob Mishra
d6caac3195 [PowerPC] Implement Vector Blend Builtins in LLVM/Clang
Implements vec_blendv()

Differential Revision: https://reviews.llvm.org/D82774
2020-07-02 16:52:52 -05:00
Sanjay Patel
6e12757f99 [SelectionDAG] don't split branch on logic-of-vector-compares
SelectionDAGBuilder converts logic-of-compares into multiple branches based
on a boolean TLI setting in isJumpExpensive(). But that probably never
considered the pattern of extracted bools from a vector compare - it seems
unlikely that we would want to turn vector logic into control-flow.

The motivating x86 reduction case is shown in PR44565:
https://bugs.llvm.org/show_bug.cgi?id=44565
...and that test shows the expected improvement from using pmovmsk codegen.

For AArch64, I modified the test to include an extra op because the simpler
test gets transformed by a codegen invocation of SimplifyCFG.

Differential Revision: https://reviews.llvm.org/D82602
2020-07-02 17:05:24 -04:00
Amy Kwan
48d2183b14 [PowerPC]Add Vector Insert Instruction Definitions and MC Test
Adds td definitions and asm/disasm tests for the following instructions:

  VINSBVLX
  VINSBVRX
  VINSHVLX
  VINSHVRX
  VINSWVLX
  VINSWVRX
  VINSBLX
  VINSBRX
  VINSHLX
  VINSHRX
  VINSWLX
  VINSWRX
  VINSDLX
  VINSDRX
  VINSW
  VINSD

Differential Revision: https://reviews.llvm.org/D83052
2020-07-02 15:49:16 -05:00
Craig Topper
7667ce2824 [X86] Add vpternlog to the broadcast unfolding table. 2020-07-02 13:43:44 -07:00
Craig Topper
bf33bea430 [X86] Modify the conditions for when we stop making v16i8/v32i8 rotate Custom based on having avx512 features.
The comments here indicate that we prefer to promote the shifts
instead of allowing rotate to be pattern matched. But we weren't
taking into account whether 512-bit registers are enabled or
whethever we have vpsllvw/vpsrlvw instructions.

splatvar_rotate_v32i8 is a slight regrssion, but the other cases
are neutral or improved.
2020-07-02 13:07:51 -07:00
Biplob Mishra
283a0554d5 [PowerPC]Implement Vector Permute Extended Builtin
Implements vector permute builtin: vec_permx()

Differential Revision: https://reviews.llvm.org/D82869
2020-07-02 14:53:18 -05:00
Arthur Eubanks
c38ff9ab39 [NewPM][LSR] Rename strength-reduce -> loop-reduce
The legacy pass was called "loop-reduce".

This lowers the number of check-llvm failures under NPM by 83.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D82925
2020-07-02 11:15:29 -07:00
sstefan1
cf13fc2dbd [OpenMPOpt][Fix] Remove double initialization of omp::types. 2020-07-02 19:51:54 +02:00
Nemanja Ivanovic
84b83b6cc1 [PowerPC] Remove undefs from splat input when changing shuffle mask
As of 1fed131660b2c5d3ea7007e273a7a5da80699445, we have code that
changes shuffle masks so that we can put the shuffle in a canonical
form that can be matched to a single instruction. However, it
does not properly account for undef elements in the BUILD_VECTOR
that is the RHS splat so we can end up with undefs where they
shouldn't be. This patch converts the splat input with undefs to
one without.
2020-07-02 12:26:56 -05:00
Sander de Smalen
3254840625 [AArch64][SVE] NFC: Rename isOrig -> isReverseInstr
This is a non-functional to clarify some of the terminology in the
AArch64SVEInstrInfo/SVEInstrFormats.td files around the tables
for mapping an instruction to it's reverse instruction counter part,
and vice versa. e.g. DIV -> DIVR and DIVR -> DIV.

Reviewers: paulwalker-arm, cameron.mcinally, rengolin, efriedma

Reviewed By: paulwalker-arm, efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82979
2020-07-02 17:01:15 +01:00
Simon Pilgrim
d6a5eff9f1 [InstCombine] Add (vXi1 trunc(lshr(x,c))) -> icmp_eq(and(x,c')) support for non-uniform vectors
As noted on PR46531, we were only performing this transform on uniform vectors as we were using the m_APInt pattern matcher to extract the shift amount.

Differential Revision: https://reviews.llvm.org/D83035
2020-07-02 16:56:33 +01:00
Ryan Santhiraraja
b3878897c6 Preserve GlobalsAA analysis result in LowerConstantIntrinsics
LowerConstantIntrinsics fails to preserve the analysis result of
GlobalsAA. Not preserving the analysis might affect benchmark
performance. This change fixes this issue.

Patch by Ryan Santhiraraja <rsanthir@quicinc.com>

Reviewers: fpetrogalli, joerg, fhahn

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D82342
2020-07-02 15:40:41 +01:00
Dmitry Preobrazhensky
9877ca8a82 [AMDGPU][CODEGEN] Added support of new inline assembler constraints
Added support for constraints 'I', 'J', 'B', 'C', 'DA', 'DB'.

See https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints.

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D81651
2020-07-02 17:20:15 +03:00
Jon Roelofs
76a7da04bb Fix missing build dependencies on omp_gen
Differential Revision: https://reviews.llvm.org/D83003
2020-07-02 07:55:20 -06:00
Nathan James
5b06ff50e1 [ASTMatchers] Enhanced support for matchers taking Regex arguments
Added new Macros `AST(_POLYMORPHIC)_MATCHER_REGEX(_OVERLOAD)` that define a matchers that take a regular expression string and optionally regular expression flags. This lets users match against nodes while ignoring the case without having to manually use `[Aa]` or `[A-Fa-f]` in their regex. The other point this addresses is in the current state, matchers that use regular expressions have to compile them for each node they try to match on, Now the regular expression is compiled once when you define the matcher and used for every node that it tries to match against. If there is an error while compiling the regular expression an error will be logged to stderr showing the bad regex string and the reason it couldn't be compiled. The old behaviour of this was down to the Matcher implementation and some would assert, whereas others just would never match. Support for this has been added to the documentation script as well. Support for this has been added to dynamic matchers ensuring functionality is the same between the 2 use cases.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D82706
2020-07-02 14:52:25 +01:00
Nathan James
888ea87b16 call ::pthread_detach on llvm_execute_on_thread_impl
Fixes all TSAN bugs in clangd

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D83039
2020-07-02 14:41:05 +01:00
Sander de Smalen
67ab949978 [AArch64][SVE] Put zeroing pseudos and patterns under flag.
This patch puts the _ZERO pseudos and corresponding patterns
under the predicate 'UseExperimentalZeroingPseudos', so that they
can be enabled/disabled through compile flags.

This is done because the zeroing pseudos use MOVPRFX to do merging of
the inactive lanes, but it depends on the uarch whether this operation
is actually merged with the destructive operation. If not, it may be
more profitable to use a SELECT and to give the compiler the freedom to
schedule these instructions as normal, rather than keeping them bundled
together. Additionally, this feature is not yet fully implemented and
there are still known bugs (see D80410) that need to be resolved before
the 'experimental' can be dropped from the name.

Reviewers: paulwalker-arm, cameron.mcinally, efriedma

Reviewed By: paulwalker-arm

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82780
2020-07-02 14:24:33 +01:00
David Green
cee41879a1 [BasicAA] Fix recursive phi MustAlias calculations
With the option -basic-aa-recphi we can detect recursive phis that loop
through constant geps, which allows us to detect more no-alias case for
pointer IV's. If the other phi operand and the other alias value are
MustAlias though, we cannot presume that every element in the loop is
also MustAlias. We need to instead be conservative and return MayAlias.

Differential Revision: https://reviews.llvm.org/D82987
2020-07-02 14:01:38 +01:00
Guillaume Chatelet
132b11f5e0 [Alignment][NFC] Transition and simplify calls to DL::getABITypeAlignment
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82977
2020-07-02 11:28:02 +00:00
Guillaume Chatelet
1a720eec07 [Alignment][NFC] VectorLayout now uses Align internally
By rewritting `ScalarizerVisitor::getVectorLayout` in such a way it returns `VectorLayout` (or `None`) it becomes obvious that `VectorLayout::VecAlign` cannot be `0`.

This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82981
2020-07-02 11:25:55 +00:00
Kerry McLaughlin
07d934b054 [AArch64][SVE] Add reg+imm addressing mode for unpredicated stores
Reviewers: sdesmalen, efriedma, david-arm

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82985
2020-07-02 12:00:01 +01:00
Anna Welker
18bf72a2b7 [LV] Enable the LoopVectorizer to create pointer inductions
This patch enables the LoopVectorizer to build a phi of pointer
type and provide the vector loads and stores with vector type
getelementptrs built from the pointer induction variable, which
produces much less instructions than the previous approach of
creating scalar getelementpointers and glue them together to a
vector.

Differential Revision: https://reviews.llvm.org/D81267
2020-07-02 11:39:28 +01:00
Roman Lebedev
bac1e3b4d7 [ScalarEvolution] createSCEV(): recognize udiv/urem disguised as an sdiv/srem
Summary:
While InstCombine trivially converts that `srem` into a `urem`,
it might happen later than wanted, in particular i'd like
for that to happen on  https://godbolt.org/z/bwuEmJ test case
early in pipeline, before first instcombine run, just before `-mem2reg`.

SCEV should recognize this case natively.

Reviewers: mkazantsev, efriedma, nikic, reames

Reviewed By: efriedma

Subscribers: clementval, hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82721
2020-07-02 13:22:12 +03:00
Ben Dunbobbin
d762921b07 [Support][Windows] Prevent 2s delay when renaming a file that does not exist
Differential Revision: https://reviews.llvm.org/D82542
2020-07-02 10:41:17 +01:00
Nuno Lopes
fcb721b853 DSE: fix builtin function recognition to take decl into account 2020-07-02 10:28:47 +01:00
Sander de Smalen
58e4673769 [CodeGen][SVE] Don't drop scalable flag in DAGCombiner::visitEXTRACT_SUBVECTOR
There was a rogue 'assert' in AArch64ISelLowering for the tuple.get intrinsics,
that shouldn't really have been there (I suspect this was a remnant from when
we expected the wider vector always to have come from a vector CONCAT).

When I tried to create a more minimal reproducer, I found a bug in
DAGCombiner where it drops the scalable flag when trying to fold:

      extract_subv (bitcast X), Index --> bitcast (extract_subv X, Index')

This patch fixes both issues.

Reviewers: david-arm, efriedma, spatel

Reviewed By: efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82910
2020-07-02 10:16:43 +01:00
Sander de Smalen
b6c6989aee [AArch64][SVE] Add unpred load/store patterns for bf16 types
Reviewers: kmclaughlin, c-rhodes, efriedma

Reviewed By: efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82909
2020-07-02 10:01:24 +01:00
Nicholas Guy
f4899fdead [ARM] Rearrange SizeReduction when using -Oz
Move the Thumb2SizeReduce pass to before IfConversion when optimising
for minimal code size.

Running the Thumb2SizeReduction pass before IfConversionallows T1
instructions to propagate to the final output, rather than the
ifConverter modifying T2 instructions and preventing them from being
reduced later.

This change does introduce a regression regarding execution time, so
it's only applied when optimising for size.

Running the LLVM Test Suite with this change produces a geomean
difference of -0.1% for the size..text metric.

Differential Revision: https://reviews.llvm.org/D82439
2020-07-02 09:19:38 +01:00
David Sherwood
f46a2e5105 [CodeGen] Fix warnings in getCopyToPartsVector
Whilst trying to assemble the following test:

  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_set2.c

I discovered we were hitting some warnings about possible invalid
calls to getVectorNumElements() in getCopyToPartsVector(). I've
tried to fix these by using ElementCount types where possible and
I've made the assumption that we don't support using a fixed width
vector to copy parts of a scalable vector, and vice versa. Looking
at how the copy is implemented I think that's the right thing for
now.

Differential Revision: https://reviews.llvm.org/D82744
2020-07-02 09:08:20 +01:00