1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00
Commit Graph

210321 Commits

Author SHA1 Message Date
Kazu Hirata
923c60906b [llvm-objdump] Use append_range (NFC) 2021-01-26 20:00:19 -08:00
Kazu Hirata
8f3f48c295 [MemorySSA] Use ListSeparator (NFC) 2021-01-26 20:00:18 -08:00
Kazu Hirata
d493c03566 [AMDGPU] Forward-declare TargetRegisterClass (NFC)
AMDGPUInstructionSelector.h needs TargetRegisterClass but relies on a
forward declaration of TargetRegisterClass in InstructionSelector.h.
This patch adds a forward declaration right in
AMDGPUInstructionSelector.h.

While we are at it, this patch removes the one in
InstructionSelector.h, where it is unnecessary.
2021-01-26 20:00:16 -08:00
Craig Topper
69039e7730 [TableGen] Add isContradictoryImpl implementation to CheckCondCodeMatcher and CheckChild2CondCodeMatcher.
This enables better pattern factoring in the RISCV ISel table.
2021-01-26 19:44:57 -08:00
Tom Stellard
1a410d196a Bump the trunk major version to 13
and clear the release notes.
2021-01-26 19:37:55 -08:00
LLVM GN Syncbot
8cadd79a8e [gn build] Port bb9eb1982980 2021-01-27 01:23:23 +00:00
Craig Topper
87d87d0dfc [RISCV] Add rv64 run lines to rv32 MC layer tests for B extension
Remove common instructions from rv64 tests since they are now
covered by the rv64 run lines in the rv32 tests.

Add rv32-only* tests for a few cases that aren't common between
r32 and rv64.

Addresses review feedback from D95150.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D95272
2021-01-26 17:20:05 -08:00
Petr Hosek
3cfd18b793 Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 17:13:34 -08:00
Nico Weber
696b1678eb [gn build] fix get.py change 2021-01-26 19:20:23 -05:00
Nico Weber
5eefc4a62f [gn build] restore build command removed in 9595a7ff55b6 for platforms without prebuilts 2021-01-26 19:19:31 -05:00
Nico Weber
1e7cca27a4 llvm-lib: Pull error printing code out of two functions
Slightly changes the output in error code, but no behavior change in
normal use. This is for preparation for using these two functions
elsewhere.
2021-01-26 19:13:30 -05:00
Fangrui Song
9459868630 [llc] Add reportError helper and canonicalize error messages 2021-01-26 15:33:37 -08:00
Duncan P. N. Exon Smith
795af9c7de Frontend: Simplify handling of non-seeking streams in CompilerInstance, NFC
Add a new `raw_pwrite_ostream` variant, `buffer_unique_ostream`, which
is like `buffer_ostream` but with unique ownership of the stream it's
wrapping. Use this in CompilerInstance to simplify the ownership of
non-seeking output streams, avoiding logic sprawled around to deal with
them specially.

This also simplifies future work to encapsulate output files in a
different class.

Differential Revision: https://reviews.llvm.org/D93260
2021-01-26 15:20:43 -08:00
Jessica Paquette
ed1a930649 [GlobalISel] Implement computeKnownBits for G_SEXT_INREG
Just use the existing `Known.sextInReg` implementation.

- Update KnownBitsTest.cpp.
- Update combine-redundant-and.mir for a more concrete example.

Differential Revision: https://reviews.llvm.org/D95484
2021-01-26 15:01:38 -08:00
Adrian Prantl
73fb46a3c5 Salvage debug info for function arguments in coro-split funclets.
This patch improves the availability for variables stored in the
coroutine frame by emitting an alloca to hold the pointer to the frame
object and rewriting dbg.declare intrinsics to point inside the frame
object using salvaged DIExpressions. Finally, a new alloca is created
in the funclet to hold the FramePtr pointer to ensure that it is
available throughout the entire function at -O0.

This path also effectively reverts D90772. The testcase updates
highlight nicely how every removed CHECK for a dbg.value is preceded
by a new CHECK for a dbg.declare.

Thanks to JunMa, Yifeng, and Bruno for their thoughtful reviews!

Differential Revision: https://reviews.llvm.org/D93497

rdar://71866936
2021-01-26 15:01:26 -08:00
Zhuojia Shen
822d9e26e1 [ARM] Fix STRT/STRHT/STRBT input/output operands.
STRT, STRHT, and STRBT are store instructions and their source register
$Rt should be treated as an input operand instead of an output operand.
This should fix things (e.g., liveness tracking in LivePhysRegs) if
these instructions were used in CodeGen.

Differential Revision: https://reviews.llvm.org/D95074
2021-01-26 14:00:58 -08:00
Bjorn Pettersson
3aa7f0e352 [NewPM] Add ExtraVectorizerPasses support
As it looks like NewPM generally is using SimpleLoopUnswitch
instead of LoopUnswitch, this patch also use SimpleLoopUnswitch
in the ExtraVectorizerPasses sequence (compared with LegacyPM
which use the LoopUnswitch pass).

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D95457
2021-01-26 22:59:10 +01:00
Valery N Dmitriev
6413cc64a9 [InstCombine] Preserve FMF for powi simplifications.
Differential Revision: https://reviews.llvm.org/D95455
2021-01-26 13:26:06 -08:00
Valery N Dmitriev
6cc92601ad [NFC] Show instcombine powi simplifications drop FMF
Differential Revision: https://reviews.llvm.org/D95454
2021-01-26 13:26:06 -08:00
Craig Topper
60dd07a304 [X86] In shrinkAndImmediate, place the new constant into the topological sort.
Revert the change to use APInt::isSignedIntN from
5ff5cf8e057782e3e648ecf5ccf1d9990b53ee90.

Its clear that the games we were playing to avoid the topological
sort aren't working. So just fix it once and for all.

Fixes PR48888.
2021-01-26 13:18:04 -08:00
Julian Lettner
f11e7efb47 [NFC][lit] Cleanup code using string interpolation
LLVM now requires Python 3.6, so we can use string interpolation to make
code more readable.
2021-01-26 13:04:31 -08:00
Amara Emerson
1eb88d40b9 [GlobalISel][IRTranslator] Ignore the llvm.experimental.noalias.scope.decl intrinsic.
These don't generate any code.
2021-01-26 13:04:11 -08:00
LLVM GN Syncbot
80d5e143a6 [gn build] Port 1e634f3952aa 2021-01-26 20:48:31 +00:00
Fangrui Song
3aa07124e5 [llvm-elfabi] Fix test after D95140 2021-01-26 12:45:45 -08:00
Haowei Wu
4cba78c0ca [llvm-elfabi] Support ELF file that lacks .gnu.hash section
Before this change, when reading ELF file, elfabi determines number of
entries in .dynsym by reading the .gnu.hash section. This change makes
elfabi read section headers directly first. This change allows elfabi
works on ELF files which do not have .gnu.hash sections.

Differential Revision: https://reviews.llvm.org/D93362
2021-01-26 12:31:52 -08:00
Fangrui Song
71834ae8ab Add -fbinutils-version= to gate ELF features on the specified binutils version
There are two use cases.

Assembler
We have accrued some code gated on MCAsmInfo::useIntegratedAssembler().  Some
features are supported by latest GNU as, but we have to use
MCAsmInfo::useIntegratedAs() because the newer versions have not been widely
adopted (e.g. SHF_LINK_ORDER 'o' and 'unique' linkage in 2.35, --compress-debug-sections= in 2.26).

Linker
We want to use features supported only by LLD or very new GNU ld, or don't want
to work around older GNU ld. We currently can't represent that "we don't care
about old GNU ld".  You can find such workarounds in a few other places, e.g.
Mips/MipsAsmprinter.cpp PowerPC/PPCTOCRegDeps.cpp X86/X86MCInstrLower.cpp
AArch64 TLS workaround for R_AARCH64_TLSLD_MOVW_DTPREL_* (PR ld/18276),
R_AARCH64_TLSLE_LDST8_TPREL_LO12 (https://bugs.llvm.org/show_bug.cgi?id=36727 https://sourceware.org/bugzilla/show_bug.cgi?id=22969)

Mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER components (supported by LLD in D84001;
GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=16833 may take a while before available).
This feature allows to garbage collect some unused sections (e.g. fragmented .gcc_except_table).

This patch adds `-fbinutils-version=` to clang and `-binutils-version` to llc.
It changes one codegen place in SHF_MERGE to demonstrate its usage.
`-fbinutils-version=2.35` means the produced object file does not care about GNU
ld<2.35 compatibility. When `-fno-integrated-as` is specified, the produced
assembly can be consumed by GNU as>=2.35, but older versions may not work.

`-fbinutils-version=none` means that we can use all ELF features, regardless of
GNU as/ld support.

Both clang and llc need `parseBinutilsVersion`. Such command line parsing is
usually implemented in `llvm/lib/CodeGen/CommandFlags.cpp` (LLVMCodeGen),
however, ClangCodeGen does not depend on LLVMCodeGen. So I add
`parseBinutilsVersion` to `llvm/lib/Target/TargetMachine.cpp` (LLVMTarget).

Differential Revision: https://reviews.llvm.org/D85474
2021-01-26 12:28:23 -08:00
Petr Hosek
ef4906bd9c Revert "Support for instrumenting only selected files or functions"
This reverts commit 4edf35f11a9e20bd5df3cb47283715f0ff38b751 because
the test fails on Windows bots.
2021-01-26 12:25:28 -08:00
Austin Kerbow
d1f23a1772 [AMDGPU] Update subtarget features for new target ID support
Support for XNACK and SRAMECC is not static on some GPUs. We must be able
to differentiate between different scenarios for these dynamic subtarget
features.

The possible settings are:

- Unsupported: The GPU has no support for XNACK/SRAMECC.
- Any: Preference is unspecified. Use conservative settings that can run anywhere.
- Off: Request support for XNACK/SRAMECC Off
- On: Request support for XNACK/SRAMECC On

GCNSubtarget will track the four options based on the following criteria. If
the subtarget does not support XNACK/SRAMECC we say the setting is
"Unsupported". If no subtarget features for XNACK/SRAMECC are requested we
must support "Any" mode. If the subtarget features XNACK/SRAMECC exist in the
feature string when initializing the subtarget, the settings are "On/Off".

The defaults are updated to be conservatively correct, meaning if no setting
for XNACK or SRAMECC is explicitly requested, defaults will be used which
generate code that can be run anywhere. This corresponds to the "Any" setting.

Differential Revision: https://reviews.llvm.org/D85882
2021-01-26 11:25:51 -08:00
LLVM GN Syncbot
364bd7857e [gn build] Port 4edf35f11a9e 2021-01-26 19:12:09 +00:00
Petr Hosek
a8839c989d Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 11:11:39 -08:00
Adhemerval Zanella
7ab636be94 [ARM] [ELF] Fix ARMMaterializeGV for Indirect calls
Recent shouldAssumeDSOLocal changes (introduced by 961f31d8ad14c66)
do not take in consideration the relocation model anymore.  The ARM
fast-isel pass uses the function return to set whether a global symbol
is loaded indirectly or not, and without the expected information
llvm now generates an extra load for following code:

```
$ cat test.ll
@__asan_option_detect_stack_use_after_return = external global i32
define dso_local i32 @main(i32 %argc, i8** %argv) #0 {
entry:
  %0 = load i32, i32* @__asan_option_detect_stack_use_after_return,
align 4
  %1 = icmp ne i32 %0, 0
  br i1 %1, label %2, label %3

2:
  ret i32 0

3:
  ret i32 1
}

attributes #0 = { noinline optnone }

$ lcc test.ll -o -
[...]
main:
        .fnstart
[...]
        movw    r0, :lower16:__asan_option_detect_stack_use_after_return
        movt    r0, :upper16:__asan_option_detect_stack_use_after_return
        ldr     r0, [r0]
        ldr     r0, [r0]
        cmp     r0, #0
[...]
```

And without 'optnone' it produces:
```
[...]
main:
        .fnstart
[...]
        movw    r0, :lower16:__asan_option_detect_stack_use_after_return
        movt    r0, :upper16:__asan_option_detect_stack_use_after_return
        ldr     r0, [r0]
        clz     r0, r0
        lsr     r0, r0, #5
        bx      lr

[...]
```

This triggered a lot of invalid memory access in sanitizers for
arm-linux-gnueabihf.  I checked this patch both a stage1 built with
gcc and a stage2 bootstrap and it fixes all the Linux sanitizers
issues.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D95379
2021-01-26 15:57:55 -03:00
Craig Topper
4599117eed [RISCV] Have customLegalizeToWOp truncate to the original type instead of i32 now that we use it for i8/i16 as well.
239cfbccb0509da1a08d9e746706013b732e646b add support for legalizing
i8/i16 UDIV/UREM/SDIV to use *W instructions. So we need to truncate
to i8/i16 if we're legalizing one of those.
2021-01-26 10:50:03 -08:00
Julian Lettner
0aac51e35d Reland "[lit] Use os.cpu_count() to cleanup TODO"
The initial problem with the remaining bot config was resolved.

We can now use Python3.  Let's use `os.cpu_count()` to cleanup this
helper.

Differential Revision: https://reviews.llvm.org/D94734
2021-01-26 10:19:26 -08:00
Matt Arsenault
6be06cd224 AMDGPU: Fix redundant FP spilling/assert in some functions
If a function has stack objects, and a call, we require an FP. If we
did not initially have any stack objects, and only introduced them
during PrologEpilogInserter for CSR VGPR spills, SILowerSGPRSpills
would end up spilling the FP register as if it were a normal
register. This would result in an assert in a debug build, or
redundant handling of the FP register in a release build.

Try to predict that we will have an FP later, although this is ugly.
2021-01-26 13:01:45 -05:00
Matt Arsenault
df7c58a46d AMDGPU: Add assertion to determineCalleeSaves
Make sure this isn't getting called multiple times. I was surprised we
were modifying the function here, which I think is a bit questionable.
2021-01-26 13:01:45 -05:00
Sanjay Patel
da6d2a1054 [LoopVectorize] add test for fmin/fmax FMF propagation; NFC
The existing test has less FMF than we might expect if
our FMF was fixed (on all FP values), so this additional
test is intended to check propagation in a more "normal"
example.
2021-01-26 11:22:51 -05:00
Sanjay Patel
549e4519b0 [LoopUtils] do not initialize Cmp predicate unnecessarily; NFC
The switch must set the predicate correctly; anything else
should lead to unreachable/assert.

I'm trying to fix FMF propagation here and the callers,
so this is a preliminary cleanup.
2021-01-26 11:22:51 -05:00
Simon Pilgrim
95a58e9335 [AMDGPU] HSAMD::fromString - replace std::string arg with StringRef. NFCI.
Removes an unnecessary chain of StringRef -> std::string -> StringRef conversions
2021-01-26 16:09:39 +00:00
Simon Pilgrim
345ddf4259 [AMDGPU] Fix null-dereference static analysis warnings. NFCI.
Avoid repeated calls to isZeroValue() and check for a null pointer before dereferencing a dyn_cast<>.
2021-01-26 15:43:59 +00:00
Matt Arsenault
0db9cfc2ab AMDGPU: Clear IsSSA property in SIFormMemoryClauses
Fixes verifier error when writing MIR testcases
2021-01-26 10:40:41 -05:00
Florian Hahn
acb2114d07 [LoopUnswitch] Avoid partially unswitching too aggressively.
This patch adds additional checks to avoid partial unswitching
in cases where it won't be profitable, e.g. because the path directly
exits the loop anyways.
2021-01-26 15:18:41 +00:00
Florian Hahn
edc697c1b2 [LoopUnswitch] Add some additional tests.
Add a few additional tests where partial unswitching is not really
profitable and should be avoided.
2021-01-26 15:12:45 +00:00
Sander de Smalen
466208abea [CostModel] Handle CTLZ and CCTZ in getTypeBasedIntrinsicInstrCost
Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D95355
2021-01-26 14:37:51 +00:00
Sebastian Neubauer
3bc57e4094 [AMDGPU] Add IntrWillReturn to three intrinsics
None of these can terminate a wave or lane.
With these, all intrinsic are IntrWillReturn except those that change
exec or can terminate the wave.

Not marking intrinsics as WillReturn may prevent optimizations in the
future: https://lists.llvm.org/pipermail/llvm-dev/2021-January/148047.html

Differential Revision: https://reviews.llvm.org/D95436
2021-01-26 15:33:15 +01:00
Mirko Brkusanin
c22bb38a3d [AMDGPU] Fix use of HasModifiers in VopProfile
HasModifiers should be true if at least one modifier is used.
This should make the use of this field bit more consistent.

Differential Revision: https://reviews.llvm.org/D94795
2021-01-26 15:21:11 +01:00
Florian Hahn
752c4c588e [Passes] Run peeling as part of simple/full loop unrolling.
Loop peeling removes conditions from loop bodies that become invariant
after a small number of iterations. When triggered, this leads to fewer
compares and possibly PHIs in loop bodies, enabling further
optimizations. The current cost-model of loop peeling should be quite
conservative/safe, i.e. only peel if a condition in the loop becomes
known after peeling.

For example, see PR47671, where loop peeling enables vectorization by
removing a PHI the vectorizer does not understand. Granted, the
loop-vectorizer could also be taught about constant PHIs, but loop
peeling is likely to enable other optimizations as well.

This has an impact on quite a few benchmarks from
MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto, for example

    Same hash: 186 (filtered out)
    Remaining: 51
    Metric: loop-vectorize.LoopsVectorized

    Program                                        base   patch  diff
     test-suite...ve-susan/automotive-susan.test     8.00   9.00 12.5%
     test-suite...nal/skidmarks10/skidmarks.test    35.00  31.00 -11.4%
     test-suite...lications/sqlite3/sqlite3.test    41.00  43.00  4.9%
     test-suite...s/ASC_Sequoia/AMGmk/AMGmk.test    25.00  26.00  4.0%
     test-suite...006/450.soplex/450.soplex.test    88.00  89.00  1.1%
     test-suite...TimberWolfMC/timberwolfmc.test   120.00 119.00 -0.8%
     test-suite.../CINT2006/403.gcc/403.gcc.test   215.00 216.00  0.5%
     test-suite...006/447.dealII/447.dealII.test   957.00 958.00  0.1%
     test-suite...ternal/HMMER/hmmcalibrate.test    75.00  75.00  0.0%

    Same hash: 186 (filtered out)
    Remaining: 51
    Metric: loop-vectorize.LoopsAnalyzed

    Program                                        base    patch   diff
     test-suite...ks/Prolangs-C/agrep/agrep.test   440.00  434.00  -1.4%
     test-suite...nal/skidmarks10/skidmarks.test   312.00  308.00  -1.3%
     test-suite...marks/7zip/7zip-benchmark.test   6399.00 6323.00 -1.2%
     test-suite...lications/minisat/minisat.test   134.00  135.00   0.7%
     test-suite...rks/FreeBench/pifft/pifft.test   295.00  297.00   0.7%
     test-suite...TimberWolfMC/timberwolfmc.test   1879.00 1869.00 -0.5%
     test-suite...pplications/treecc/treecc.test   689.00  691.00   0.3%
     test-suite...T2000/300.twolf/300.twolf.test   1593.00 1597.00  0.3%
     test-suite.../Benchmarks/Bullet/bullet.test   1394.00 1392.00 -0.1%
     test-suite...ications/JM/ldecod/ldecod.test   1431.00 1429.00 -0.1%
     test-suite...6/464.h264ref/464.h264ref.test   2229.00 2230.00  0.0%
     test-suite...lications/sqlite3/sqlite3.test   2590.00 2589.00 -0.0%
     test-suite...ications/JM/lencod/lencod.test   2732.00 2733.00  0.0%
     test-suite...006/453.povray/453.povray.test   3395.00 3394.00 -0.0%

Note the -11% regression in number of loops vectorized for skidmarks. I
suspect this corresponds to the fact that those loops are gone now (see
the reduction in number of loops analyzed by LV).

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D88471
2021-01-26 13:52:30 +00:00
Lang Hames
4a5753f9a7 [ORC] Attempt to auto-claim responsibility for weak defs in ObjectLinkingLayer.
Compilers may insert new definitions during compilation, E.g. EH personality
function pointers, or named constant pool entries. This commit causes
ObjectLinkingLayer to attempt to claim responsibility for all weak definitions
in objects as they're linked. This is always safe (first claimant for each
symbol is granted responsibility, subsequent claims are rejected without error)
and prevents compiler-injected symbols from being dead-stripped (which they
will be if they remain unclaimed by anyone).

This change was motivated by errors seen by an out-of-tree client while testing
eh-frame support in JITLink ELF/x86-64: IR containing exceptions didn't define
DW.ref.__gxx_personality_v0 (since it's added by CodeGen), and this caused
DW.ref.__gxx_personality_v0 to be dead-stripped leading to linker failures.

No test case yet: We won't have a way to test in-tree until we enable JITLink
for lli on Linux.
2021-01-27 00:09:40 +11:00
Lang Hames
8bc0e661cd [ORC] Fix debug logging message. 2021-01-26 23:52:44 +11:00
Lang Hames
59f2f1fbee [JITLink][ELF/x86-64] When building PLT stub, use -4 offset for PCRel32.
This is required for ELF where PCRel32 doesn't implicitly subtract 4.

No test case yet: I haven't figured out a good way to test stub
generation -- this may required extensions to jitlink-check.
2021-01-26 23:51:58 +11:00
Dmitry Preobrazhensky
c6bc9da0dd [AMDGPU][MC] Refactored exp tgt handling
Summary:
- Separated tgt encoding from parsing;
- Separated tgt decoding from printing;
- Improved errors handling;
- Disabled leading zeroes in index. The following code is no longer accepted: exp pos00 v3, v2, v1, v0

Reviewers: arsenm, rampitec, foad

Differential Revision: https://reviews.llvm.org/D95216
2021-01-26 14:54:15 +03:00