mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00
Commit Graph

212575 Commits

Author SHA1 Message Date
Hubert Tong
9e4ece6cd0 Revert "[AsmParser][SystemZ][z/OS] Introducing HLASM Comment Syntax"
This reverts commit bcdd40f802a5dfd7b3ac11304e6099bfcdd25b1e.
See https://reviews.llvm.org/D98543.
2021-03-12 14:48:00 -05:00
Craig Topper
7bf3c46e09 [RISCV] Add test cases for failure to optimize select_cc with X < 1 or X > -1. NFC
We can use BGE with X0 to implement these, but we currently put
1 or -1 into a register.
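
For example (illustrative): (X > -1) is equivalent to (X >= 0),
which is just "bge a0, x0, <label>", and (X < 1) is (X <= 0),
i.e. "bge x0, a0, <label>", with no constant materialized into
a register.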
2021-03-12 11:19:04 -08:00
Roman Lebedev
678779eec7 [SCEV] Improve modelling for (null) pointer constants
This is a continuation of D89456.

As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, I believe I will need this in the future
to further fix `SCEVAddExpr` operation type handling.

This removes the special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and adds constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEVs,
but gracefully become zero integers when asked.
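
A minimal sketch of the shape of that fold (illustrative, not the
exact upstream code; `Op` and `IntTy` are assumed names for the
pointer operand and the requested integer type):

  // Inside getPtrToIntExpr(): a null pointer constant converts
  // to the zero integer of the requested type.
  if (auto *U = dyn_cast<SCEVUnknown>(Op))
    if (isa<ConstantPointerNull>(U->getValue()))
      return getZero(IntTy);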

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D98147
2021-03-12 22:11:58 +03:00
Craig Topper
19e0e79da1 [RISCV] Add support for scalable vector masked load/store.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98460
2021-03-12 10:32:33 -08:00
Zahira Ammarguellat
6707ee0435 [NFC] Fix "unused parameter" error revealed in the Linux self-build. 2021-03-12 10:26:40 -08:00
Stanislav Mekhanoshin
5aafd8a5a0 [AMDGPU] Fix -amdgpu-inline-arg-alloca-cost
Before D94153 this threshold was in pre-scaled units.
After D94153 the inlining threshold multiplier is no longer
applied to this portion of the threshold. Restore the
threshold by applying the multiplier.

Differential Revision: https://reviews.llvm.org/D98362
2021-03-12 10:19:50 -08:00
Thomas Preud'homme
0911193c17 [FileCheck] Add support for hex alternate form in FileCheck
Add a printf-style alternate form flag that prefixes hex numbers with
0x when present. This works both on an empty numeric expression (e.g. a
variable definition from the input) and when matching a numeric
expression. The syntax is as follows:

[[#%#<precision specifier><format specifier>, ...]]

where <precision specifier> and <format specifier> are optional, and ...
can be a variable definition or a numeric expression, either of which
may be empty.
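
For example (illustrative, not taken from the patch), the directive

  ; CHECK: Offset: [[#%#x,OFFSET:]]

matches an input line such as "Offset: 0x10" and defines OFFSET
from the matched value.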

This feature was requested in https://reviews.llvm.org/D81144#2075532
for llvm/test/MC/ELF/gen-dwarf64.s

Reviewed By: jdenny

Differential Revision: https://reviews.llvm.org/D97845
2021-03-12 18:14:17 +00:00
serge-sans-paille
8a1cc679e3 [NFC] Use llvm::raw_string_ostream instead of std::stringstream
That's more efficient, and we don't lose any valuable features by doing so.
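
A minimal sketch of the pattern (illustrative):

  #include "llvm/Support/raw_ostream.h"
  #include <string>

  std::string formatValue(int V) {
    std::string Buf;
    llvm::raw_string_ostream OS(Buf); // writes straight into Buf
    OS << "value = " << V;
    return OS.str();                  // flushes and returns Buf
  }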
2021-03-12 18:43:59 +01:00
Nico Weber
d1f2a244f8 [gn build] (manually) port bcdd40f802a5 2021-03-12 12:15:52 -05:00
Anirudh Prasad
6e29318dd0 [AsmParser][SystemZ][z/OS] Introducing HLASM Comment Syntax
- This patch adds support for the ordinary HLASM comment syntax in asm
  statements (Reference - Chapter 7, Comment Statements, Ordinary Comment
  Statements)
- In brief, an ordinary comment, if used, must begin with the "*"
  character
- To achieve this, this patch makes use of the CommentString attribute
  provided in the base MCAsmInfo class
- In the SystemZMCAsmInfo class, the CommentString attribute was set to
  "*" based on the assembler dialect
- Furthermore, a new attribute, RestrictCommentString, is provided to only
  treat a string as a comment if it appears at the start of the asm
  statement. Example: "jo *-4" is valid in HLASM (jump back 4 bytes from
  current point - similar to jo -4 in gnu asm) and we don't want "*-4" to
  be treated as a comment.
- RFC for HLASM Parser support implementation: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147686.html

Reviewed By: scott.linder, Kai

Differential Revision: https://reviews.llvm.org/D97703
2021-03-12 11:56:11 -05:00
Nico Weber
a8d425f07a [lit] rewrap a few lines to 80 columns
No behavior change.
2021-03-12 11:55:00 -05:00
Simon Pilgrim
1833b432fd [X86][AVX] Insert zeros byte elements into 256/512-bit vectors using shuffle/and
Avoid extracting/inserting subvectors which makes it more difficult for shuffle combining to merge them together.
2021-03-12 15:16:36 +00:00
Simon Pilgrim
ddbea9a58f [X86] Provide lighter weight getTargetShuffleMask wrapper. NFCI.
Most callers to getTargetShuffleMask don't use the IsUnary flag.
2021-03-12 15:16:35 +00:00
Nico Weber
81b44f54a3 Revert "[IndirectCallPromotion] Don't strip ".__uniq." suffix when it strips"
This reverts commit 90dfbeef5982ecfb5523b48f3c301bea033e5c3d.
Causes PR49554. Also see comments on https://reviews.llvm.org/D98389
2021-03-12 10:03:58 -05:00
Simonas Kazlauskas
fd51ea9877 Test cases for rem-seteq fold with illegal types
This also briefly tests a larger set of architectures than the more
exhaustive functionality tests for AArch64 and x86.

As requested in D88785

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D98339
2021-03-12 16:28:04 +02:00
Matt Arsenault
a70739a774 GlobalISel: Fix marking byval arguments as immutable
byval arguments need to be assumed writable. Only implicitly stack
passed arguments which aren't addressable in the IR can be assumed
immutable.

Mips is still broken since for some reason it's doing its own thing
with the ValueHandlers (and x86 doesn't actually handle byval
arguments now, although some of the code is there).
2021-03-12 09:01:53 -05:00
Matt Arsenault
fb34cc83ce GlobalISel: Partially fix handling of byval arguments
This was essentially ignoring byval arguments and treating them as
pointer arguments which needed to be loaded from. This should copy the frame
index value to the virtual register, not insert a load from the frame
index into the pointer value.
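
An illustrative sketch of the difference (assumed shape, not the
exact upstream code; `MIRBuilder`, `PtrTy`, `FI`, `ArgReg` and `MMO`
are stand-ins):

  // Before: wrongly loads through the frame index to get the pointer.
  auto FIdx = MIRBuilder.buildFrameIndex(PtrTy, FI);
  MIRBuilder.buildLoad(ArgReg, FIdx, *MMO);

  // After: the frame index itself is the byval pointer value.
  MIRBuilder.buildCopy(ArgReg, MIRBuilder.buildFrameIndex(PtrTy, FI));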

For AMDGPU, this was producing a load from the byval pointer argument,
to a pointer used for the byval arguments. I do not understand how
AArch64 managed to work before since it appears to be similarly
broken.

We could also change the ValueHandler API to avoid the extra copy from
the frame index, since currently it returns a new register.

I believe there is still an issue with outgoing byval arguments. These
should have a copy inserted in case the callee decided to overwrite
the memory.
2021-03-12 09:01:53 -05:00
Matt Arsenault
36d460b873 AArch64/GlobalISel: Don't use common prefix in test
Unlike update_llc_test_checks, update_mir_test_checks isn't actually
smart enough to common functions with identical output.
2021-03-12 09:01:52 -05:00
Matt Arsenault
98a883c4c2 AMDGPU/GlobalISel: Cleanup call lowering sequence
Now that handleAssignments is handling all of the argument splitting,
we don't have to move the insert point around.
2021-03-12 09:01:52 -05:00
serge-sans-paille
1967caff19 [NFC] Use StringRef instead of const char* for AsmPrinter
This avoids calling strlen to repeatedly compute some string size.
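
A minimal sketch of why this helps (illustrative):

  #include "llvm/ADT/StringRef.h"

  // StringRef stores the length alongside the data, so size() is
  // O(1); a const char* would need a strlen() walk on every query.
  size_t directiveWidth(llvm::StringRef D) { return D.size(); }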
2021-03-12 14:39:25 +01:00
Florian Hahn
83a87f9a9e [LV] Fix name in CHECK pattern after fb3ca7076 2021-03-12 13:31:48 +00:00
Florian Hahn
8d07d8b5df [LV] Account for IV recipes being uniform in VPTransformState::get().
This patch fixes a crash when trying to get a scalar value using
VPTransformState::get() for uniform induction values or truncated
induction values. IVs and truncated IVs can be uniform and the updated
code accounts for that, fixing the crash.

This should fix
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=31981
2021-03-12 13:29:06 +00:00
Sanjay Patel
95bd4a26fb [SimplifyCFG] avoid sinking insts within an infinite-loop
The test is reduced from a C source example in:
https://llvm.org/PR49541

It's possible that the test could be reduced further or
the predicate generalized further, but it seems to require
a few ingredients (including the "late" SimplifyCFG options
on the RUN line) to fall into the infinite-loop trap.
2021-03-12 08:04:57 -05:00
Stefan Gränitz
5d515df178 [Orc] Fix race condition in DebugObjectManagerPlugin
During finalization the debug object is registered with the target. Materialization must wait for this process to finish. Otherwise we might start running code before the debugger has finished processing the corresponding debug info.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D98417
2021-03-12 14:04:09 +01:00
Hans Wennborg
0f6eabfeda Revert "[InstrProfiling] Don't generate __llvm_profile_runtime_user"
This broke the check-profile tests on Mac, see comment on the code
review.

> This is no longer needed, we can add __llvm_profile_runtime directly
> to llvm.compiler.used or llvm.used to achieve the same effect.
>
> Differential Revision: https://reviews.llvm.org/D98325

This reverts commit c7712087cbb505d324e1149fa224f607c91a8c6a.

Also reverting the dependent follow-up commit:

Revert "[InstrProfiling] Generate runtime hook for ELF platforms"

> When using -fprofile-list to selectively apply instrumentation only
> to certain files or functions, we may end up with a binary that doesn't
> have any counters in the case where no files were selected. However,
> because on Linux and Fuchsia, we pass -u__llvm_profile_runtime, the
> runtime would still be pulled in and incur some non-trivial overhead,
> especially in the case when the continuous or runtime counter relocation
> mode is being used. A better way would be to pull in the profile runtime
> only when needed by declaring the __llvm_profile_runtime symbol in the
> translation unit only when needed.
>
> This approach was already used prior to 9a041a75221ca, but we changed it
> to always generate the __llvm_profile_runtime due to a TAPI limitation.
> Since TAPI is only used on Mach-O platforms, we could use the early
> emission of __llvm_profile_runtime there, and on other platforms we
> could change back to the earlier approach where the symbol is generated
> later only when needed. We can stop passing -u__llvm_profile_runtime to
> the linker on Linux and Fuchsia since the generated undefined symbol in
> each translation unit that needed it serves the same purpose.
>
> Differential Revision: https://reviews.llvm.org/D98061

This reverts commit 87fd09b25f8892e07b7ba11525baa9c3ec3e5d3f.
2021-03-12 13:53:46 +01:00
Simon Pilgrim
6caa86e179 [PPC] Fix UBSAN warning about out of range shift. NFCI. 2021-03-12 12:03:48 +00:00
Simon Pilgrim
e5295af6ae [PPC] Fix static analyzer / UBSAN warnings about out of range shifts. NFCI. 2021-03-12 10:34:35 +00:00
Serguei Katkov
a0cc2dd0ab Revert "Mark gc.relocate and gc.result as readnone"
As readnone functions they become movable, and LICM can hoist them
out of a loop. As a result, in LCSSA form a phi node of type token
is created. Nothing is prepared for a GCRelocate whose first operand
is a phi node rather than a token.

The GVN tests were also updated; it seems they do not do what is
expected. A test for LICM is also added.

This reverts commit f352463ade6e49c3b0275f296d9190d828b7630b.
2021-03-12 16:59:17 +07:00
Fraser Cormack
fa5bc9bba6 [RISCV] Optimize INSERT_VECTOR_ELT sequences
This patch optimizes the codegen for INSERT_VECTOR_ELT in various ways.
Primarily, it removes the use of vslidedown during lowering, and the
vector element is inserted entirely using vslideup with a custom VL and
slide index.

Additionally, lowering of i64-element vectors on RV32 has been optimized
in several ways. When the 64-bit value to insert is the same as the
sign-extension of the lower 32-bits, the codegen can follow the regular
path. When this is not possible, a new sequence of two i32 vslide1up
instructions is used to get the vector element into a vector. This
sequence was suggested by @craig.topper. From there, the value is slid
into the final position for more consistent lowering across RV32 and
RV64.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98250
2021-03-12 09:13:38 +00:00
Fraser Cormack
8e6f056201 [RISCV] Fix up stale VECREDUCE comments. NFC.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98399
2021-03-12 08:49:46 +00:00
Bjorn Pettersson
bcace0dff8 [InstSimplify] Simplify smul.fix and smul.fix.sat
Add simplification of smul.fix and smul.fix.sat according to
  X * 0 -> 0
  X * undef -> 0
  X * (1 << scale) -> X

This includes the commuted patterns and splatted vectors.
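
A worked sketch of the underlying fixed-point semantics (illustrative;
overflow, rounding and saturation are ignored):

  // smul.fix with scale s computes (X * Y) >> s, so multiplying by
  // (1 << s) shifts left then right by s and returns X unchanged.
  long long smul_fix(long long X, long long Y, unsigned s) {
    return (X * Y) >> s;
  }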

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D98299
2021-03-12 09:09:58 +01:00
Bjorn Pettersson
5d0c366826 [ConstantFold] Handle undef/poison when constant folding smul_fix/smul_fix_sat
Do constant folding according to
  poison * C -> poison
  C * poison -> poison
  undef * C -> 0
  C * undef -> 0
for smul_fix and smul_fix_sat intrinsics (for any scale).

Reviewed By: nikic, aqjune, nagisa

Differential Revision: https://reviews.llvm.org/D98410
2021-03-12 09:09:58 +01:00
Johannes Doerfert
ce96365607 Revert "[OpenMP] Do not propagate match extensions to nested contexts"
Two tests failed for some reason; need to investigate:
  https://lab.llvm.org/buildbot/#/builders/109/builds/10399

This reverts commit ad9e98b8efa0138559eb640023695dab54967a8d.
2021-03-11 23:48:36 -06:00
Johannes Doerfert
0058a98cec Revert "[OpenMP] Introduce the disable_selector_propagation variant selector trait"
Need to revert ad9e98b8efa0138559eb640023695dab54967a8d which this
commit depends on.

This reverts commit f771ef7b5f0ed260d00931cd50e6fe462edbacaf.
2021-03-11 23:48:35 -06:00
Johannes Doerfert
24773fbc70 [Attributor] Derive willreturn based on mustprogress
Since D86233 we have `mustprogress` which, in combination with
`readonly`, implies `willreturn`. The idea is that every side-effect
has to be modeled as a "write". Consequently, `readonly` means there
is no side-effect, and `mustprogress` guarantees that we cannot "loop"
forever without side-effect.
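
A minimal sketch of the inference (illustrative, not the actual
Attributor code):

  // readonly: no side-effects; mustprogress: no side-effect-free
  // infinite loops. Together they imply the function must return.
  if (F.hasFnAttribute(Attribute::MustProgress) && F.onlyReadsMemory())
    F.addFnAttr(Attribute::WillReturn);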

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D94125
2021-03-11 23:31:44 -06:00
Johannes Doerfert
0f695356f4 [Attributor][NFC] Update tests after D94741
The update_test_checks script can now check for global symbols and is able
to handle them properly when they differ across prefixes, e.g.,
attribute #0 might be different in different runs.

This patch simply updates all the Attributor tests with the new script.

Reviewed By: sstefan1

Differential Revision: https://reviews.llvm.org/D97906
2021-03-11 23:31:39 -06:00
Johannes Doerfert
e3c1fc67b1 [OpenMP] Introduce the disable_selector_propagation variant selector trait
Nested `omp [begin|end] declare variant` inherit the selectors from
surrounding `omp (begin|end) declare variant` constructs. To stop such
propagation the user can add the `disable_selector_propagation` to the
`extension` set in the `implementation` selector.
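
An illustrative example (the function name is made up):

  #pragma omp begin declare variant match(device = {kind(gpu)})
  #pragma omp begin declare variant match( \
      implementation = {extension(disable_selector_propagation)})
  /* This nested context does not inherit device = {kind(gpu)}
     from the surrounding construct. */
  void foo(void) {}
  #pragma omp end declare variant
  #pragma omp end declare variant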

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D95765
2021-03-11 23:31:25 -06:00
Johannes Doerfert
274a713a36 [OpenMP] Do not propagate match extensions to nested contexts
If we have a nested declare variant context, it doesn't make sense to
inherit the match extension from the parent. Instead, just skip it.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95764
2021-03-11 23:31:21 -06:00
Johannes Doerfert
6e9991eb47 [Utils] Check for more global information in update_test_checks
This allows checking for various globals (metadata/attributes/...) and
also resolves problems with globals (metadata/attributes/...) being
reused across different prefixes.

Reviewed By: sstefan1

Differential Revision: https://reviews.llvm.org/D94741
2021-03-11 23:31:16 -06:00
Johannes Doerfert
6c444365cb [FIX] Allow non-constant assume operand bundle operands.
Fixes PR49545

Reviewed By: zequanwu, fhahn, lebedev.ri

Differential Revision: https://reviews.llvm.org/D98444
2021-03-11 23:31:09 -06:00
Esme-Yi
018bbe154e [Debug-Info] Add names for the debug line prologue.
Summary: This is a minor patch to add names for the debug line prologue, as a follow-up to D95998.

Reviewed By: dblaikie, ikudrin, shchenz

Differential Revision: https://reviews.llvm.org/D98383
2021-03-12 04:45:08 +00:00
Craig Topper
07d94ffb28 [RISCV] Return false from isShuffleMaskLegal except for splats.
We don't support any other shuffles currently.

This changes the bswap/bitreverse tests that check for this in
their expansion code. Previously we expanded a byte swapping
shuffle through memory. Now we're scalarizing and doing bit
operations on scalars to swap bytes.

In the future we can probably use vrgather.vx to do a byte swap
shuffle.
2021-03-11 20:02:49 -08:00
Yonghong Song
dae54112ca BPF: provide better error message for unsupported atomic operations
Currently, the BPF backend does not support all variants of
  atomic_load_{add,and,or,xor}, atomic_swap and atomic_cmp_swap
For example, it only supports 32-bit (with alu32 mode) and 64-bit
operations for atomic_load_{and,or,xor}, atomic_swap and
atomic_cmp_swap. For historical reasons, atomic_load_add has
always been supported in both 32-bit and 64-bit forms.

If a user uses an unsupported atomic operation, SelectionDAG
codegen currently cannot find a matching BPF pattern and issues
a fatal error. This is not user friendly, as the user may
mistakenly think it is a compiler bug.

This patch adds a Custom rule for unsupported atomic operations
and emits a better error message from the ReplaceNodeResults()
callback. The following is an example of the output:

  $ cat t.c
  short sync(short *p) {
    return  __sync_val_compare_and_swap (p, 2, 3);
  }
  $ clang -target bpf -O2 -g -c t.c
  t.c:2:11: error: Unsupported atomic operations, please use 64 bit version
    return  __sync_val_compare_and_swap (p, 2, 3);
            ^
  fatal error: error in backend: Cannot select: t19: i64,ch =
    AtomicCmpSwap<(load store seq_cst seq_cst 2 on %ir.p)> t0, t2,
    Constant:i64<2>, Constant:i64<3>, t.c:2:11
    t2: i64,ch = CopyFromReg t0, Register:i64 %0
      t1: i64 = Register %0
    t11: i64 = Constant<2>
    t10: i64 = Constant<3>
  In function: sync
  PLEASE submit a bug report ...

The fatal error still happens, since we do not actually do proper
lowering for these unsupported atomic operations, but the user now
gets a much better error message first.

Differential Revision: https://reviews.llvm.org/D98471
2021-03-11 19:19:00 -08:00
Carl Ritson
b3653dfbd2 [AMDGPU] Do not annotate an else branch if there is a kill
Since llvm.amdgcn.kill is lowered to a terminator, it can cause
else-branch annotations to end up in the wrong block. To avoid
this, do not annotate conditionals as else branches where there
is a kill.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D97427
2021-03-12 11:52:08 +09:00
Mircea Trofin
87d29d8dee Revert "[NPM][CGSCC] FunctionAnalysisManagerCGSCCProxy: do not clear immutable function passes"
This reverts commit 5eaeb0fa67e57391f5584a3f67fdb131e93afda6.

It appears there are analyses that assume clearing; for example:
https://lab.llvm.org/buildbot#builders/36/builds/5964
2021-03-11 18:31:19 -08:00
Mircea Trofin
c85bdf8369 [NPM][CGSCC] FunctionAnalysisManagerCGSCCProxy: do not clear immutable function passes
Check with the analysis result by calling invalidate instead of clear on
the analysis manager.

Differential Revision: https://reviews.llvm.org/D98440
2021-03-11 18:15:28 -08:00
David Blaikie
d97ffcdf36 Move (llvm-original-di-preservation) test example output into the Inputs directory (since it's an input to the test execution)
The "Inputs" subdirectory is used for all files read by the test, not
only those used as input to the execution - so even though this file is
used as a golden reference for the output of the test, it's still an
input to the test execution (it is read in the process of executing the
test).
2021-03-11 17:36:33 -08:00
Maksim Panchenko
3442ee8292 [RuntimeDyld] Speedup resolution of relocations to external symbols
From what I can tell, the loop inside applyExternalSymbolRelocations()
used to call getSymbolAddress(). After the JITSymbolResolver interface
redesign, the functionality has changed, and the loop should no longer
trigger repopulation of ExternalSymbolRelocations. If that's the case,
there is no need to update the loop iterator manually, and
ExternalSymbolRelocations can be cleared at once. This way, when there
are many external symbols in the program, the function runs much faster.
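
An illustrative sketch of the resulting shape (`getResolvedAddr` is a
stand-in for the real address lookup):

  // Resolve every pending external relocation, then drop the whole
  // map at once instead of erasing entries while iterating.
  for (auto &Entry : ExternalSymbolRelocations)
    resolveRelocationList(Entry.second, getResolvedAddr(Entry.first()));
  ExternalSymbolRelocations.clear();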

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D97531
2021-03-11 16:58:49 -08:00
Carl Ritson
df6c503796 [AMDGPU] Restrict image_msaa_load to MSAA dimension types
This instruction is only valid on 2D MSAA and 2D MSAA Array
surfaces.  Remove intrinsic support for other dimension types,
and block assembly for unsupported dimensions.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D98397
2021-03-12 09:47:24 +09:00
Ruiling Song
27bf846d0e [AMDGPU] Don't check hasStackObjects() when reserving VGPR
We have amdgpu_gfx functions that have high register pressure. If
we do not reserve a VGPR for SGPR spills, we fall into the path
that spills SGPRs to memory, which not only has correctness issues
but also performs very badly.

I don't know why the check for hasStackObjects() is there; in our
case, we don't have stack objects at the time of finalizeLowering().
So just remove the check, so that we always reserve a VGPR for
possible SGPR spills in non-entry functions.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D98345
2021-03-12 08:11:14 +08:00