1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00
Commit Graph

216192 Commits

Author SHA1 Message Date
Nikita Popov
8df1db8535 [LoopUnroll] Add test for unrollable non-latch multi-exit (NFC)
This test case requires unrolling against a non-latch exit in
a multiple-exit loop with exiting latch. It's not covered by
exiting heuristics or the extension in D102635.
2021-05-23 10:51:45 +02:00
David Green
77bb6ccfb3 [ARM] Add extra debug messages for gather/scatter lowering. NFC 2021-05-23 08:52:13 +01:00
Martin Storsjö
a30052855f [Windows] Use TerminateProcess to exit without running destructors
If exiting using _Exit or ExitProcess, DLLs are still unloaded
cleanly before exiting, running destructors and other cleanup in those
DLLs. When the caller expects to exit without cleanup, running
destructors in some loaded DLLs (which can be either libLLVM.dll or
e.g. libc++.dll) can cause deadlocks occasionally.

This is an alternative to D102684.

Differential Revision: https://reviews.llvm.org/D102944
2021-05-22 23:41:40 +03:00
Martin Storsjö
68aa000dc9 [MinGW] Mark a number of library functions unavailable for mingw targets
These functions were marked unavailable for MSVC targets before,
within an "T.isOSWindows() && !T.isOSCygMing()" block, but these ones
are unavailable on MinGW targets too.

This avoids generating calls to stpcpy for MinGW targets, which has
been happening since 6dbf0cfcf789365493f70ae69df8a7a59be41c75 (in
some cases).

This fixes https://github.com/mstorsjo/llvm-mingw/issues/201.

Differential Revision: https://reviews.llvm.org/D102946
2021-05-22 23:40:19 +03:00
Simon Pilgrim
f6dc684c0c [CostModel][X86] Align v4i64 MUL costs on AVX1 targets with worst case
Based on worst case of sandybridge (vs btver2 + bdver2) llvm-mca analysis - which is a lot less than what we were predicting (I think based off total uop count).
2021-05-22 20:07:55 +01:00
Nikita Popov
66053da05c [IR] Optimize no-op removal from AttributeList (NFC)
When removing an AttrBuilder from an index of an AttributeList,
directly return the original list if no attributes were actually
removed.
2021-05-22 19:03:27 +02:00
Nikita Popov
632878de15 [IR] Optimize no-op removal from AttributeSet (NFC)
When removing an AttrBuilder from an AttributeSet, first check
whether there is any overlap. If nothing is being removed, we can
directly return the original set.
2021-05-22 18:55:25 +02:00
Simon Pilgrim
0bdfd66523 [CostModel][X86] Pull out X86/X64 scalar int arithmetric costs from SSE tables. NFCI.
These aren't dependent on any SSE level (and don't tend to get quicker either).
2021-05-22 16:13:49 +01:00
Lang Hames
da54af9961 [ORC] Add more synchronization to TestLookupWithUnthreadedMaterialization.
Don't run tasks until their corresponding thread has been added to the running
threads vector. This is an extention to fda4300da82, which doesn't seem to have
been enough to fix the synchronization issues on its own.
2021-05-22 07:59:24 -07:00
Lang Hames
635b337d6d [JITLink] Move some Block bitfields into Addressable to improve packing.
Keeping these bitfields from Block to Addressable allows them to be packed with
the bitfields at the end of Addressable, reducing the size of Block by eight
bytes.
2021-05-22 07:59:24 -07:00
Yaxun (Sam) Liu
80152597c5 [HIP] support ThinLTO
Add options -[no-]offload-lto and -foffload-lto=[thin,full] for controlling
LTO for offload compilation. Allow LTO for AMDGPU target.

AMDGPU target does not support codegen of object files containing
call of external functions, therefore the LLVM module passed to
AMDGPU backend needs to contain definitions of all the callees.
An LLVM option is added to allow function importer to import
functions with noinline attribute.

HIP toolchain passes proper LLVM options to lld to make sure
function importer imports definitions of all the callees.

Reviewed by: Teresa Johnson, Artem Belevich

Differential Revision: https://reviews.llvm.org/D99683
2021-05-22 10:48:34 -04:00
Nikita Popov
8c07a78a96 Reapply [InstCombine] Fold multiuse shr eq zero
This was reverted due to performance regressions in ARM benchmarks,
which have since been addressed by D101196 (SCEV analysis improvement)
and D101778 (CGP reverse transform).

-----

The single-use case is handled implicity by converting the icmp
into a mask check first. When comparing with zero in particular,
we don't need the one-use restriction, as we only produce a single
icmp.

https://alive2.llvm.org/ce/z/MSixcm
https://alive2.llvm.org/ce/z/GwpG0M
2021-05-22 14:46:50 +02:00
David Green
aca1951bd7 [ARM] Clean up some tests, removing dead instructions. NFC 2021-05-22 13:38:00 +01:00
Simon Pilgrim
99610346a2 [CostModel][X86] vXi8 MUL is always promoted to vXi16 2021-05-22 11:56:49 +01:00
Florian Hahn
2e9363c644 [Matrix] Bail out early if there are no matrix intrinsics.
If there are no matrix intrinsics in a function, we can directly bail
out, as there's nothing left to do.

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D102931
2021-05-22 11:37:25 +01:00
Simon Pilgrim
4e6ce44315 [CostModel][X86] Add test coverage for sub-64bit vXi8 multiplication costs
These can be cheaply promoted to a single v8i16 vector for multiplication
2021-05-22 11:33:36 +01:00
Simon Pilgrim
1482151467 [CostModel][X86] Improve v8i32 MUL costs on AVX1 targets to account for slower btver2
BTVER2 has a 2 cycle throughput for v4i32 multiplies (same as SSE41 targets), which is only partially hidden by the subvector extracts/insert when splitting v8i32.
2021-05-22 11:13:07 +01:00
Tomasz Miąsko
f7d35d6801 [Demangle][Rust] Parse function signatures
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102581
2021-05-22 11:49:08 +02:00
Tomasz Miąsko
c8ca4bcf8b [Demangle][Rust] Parse references
Reviewed By: dblaikie

Part of https://reviews.llvm.org/D102580
2021-05-22 11:49:08 +02:00
Tomasz Miąsko
563ebde96d [Demangle][Rust] Parse raw pointers
Reviewed By: dblaikie

Part of https://reviews.llvm.org/D102580
2021-05-22 11:49:08 +02:00
Nikita Popov
76bc4c8b9d [CVP] Add test for PR50399 (NFC) 2021-05-22 11:21:34 +02:00
Roman Lebedev
325fca859f Reland [X86] X86TTIImpl::getInterleavedMemoryOpCostAVX2(): use getMemoryOpCost()
Now that getMemoryOpCost() correctly handles all the vector variants,
we should no longer hand-roll our own version of it, but use it directly.

The AVX512 variant probably needs a similar change,
but there it is less obvious.

This was initially landed in 69ed93a4355123a45c1d7216aea7cd53d07a361b,
but was reverted in 6b95fd199d96e3ba5c28a23b17b74203522bdaa8
because the patch it depends on was reverted.
2021-05-22 11:47:08 +03:00
Roman Lebedev
3d2511b099 Reland [X86][CostModel] X86TTIImpl::getMemoryOpCost(): rewrite vector handling again
Instead of handling power-of-two sized vector chunks,
try handling the large vector in a stream mode,
decreasing the operational vector size
once it no longer works for the elements left to process.

Notably, this improves costs for overaligned loads - loading padding is fine.
This more directly tracks when we need to insert/extract the YMM/XMM subvector,
some costs fluctuate because of that.

This was initially landed in c02476f3158f2908ef0a6f628210b5380bd33695,
but reverted in 5fddc3312bad7e62493f1605385fad5e589e6450,
because the code made some very optimistic assumptions about invariants
that didn't hold in practice.

Reviewed By: RKSimon, ABataev

Differential Revision: https://reviews.llvm.org/D100684
2021-05-22 11:46:32 +03:00
LemonBoy
ca1236f15d [SelectionDAG] Fix argument copy elision with irregular types
D29668 enabled to avoid a useless copy of the argument value into an alloca if the caller places it in memory (as it often happens on x86) by directly forwarding the pointer to it. This optimization is illegal if the type contains padding bytes: if a truncating store into the alloca is replaced the upper bits are filled with garbage and produce code misbehaving at runtime.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D102153
2021-05-22 09:43:37 +02:00
Serge Pavlov
54bb9505e3 [ConstantFolding] Use APFloat for constant folding. NFC
Replace use of host floating types with operations on APFloat when it is
possible. Use of APFloat makes analysis more convenient and facilitates
constant folding in the case of non-default FP environment.

Differential Revision: https://reviews.llvm.org/D102672
2021-05-22 13:00:20 +07:00
Lang Hames
f800869258 [ORC] Check for underflow on SymbolStringPtr ref-counts. 2021-05-21 21:11:54 -07:00
Lang Hames
365986f1c8 [ORC] Fix debugging output: printDescription should not have a newline. 2021-05-21 21:11:54 -07:00
Lang Hames
52bc19513e [ORC] Fix race condtition in CoreAPIsTest.
This test has been failing intermittently on some builders, probably due to a
race on the WorkThreads vector. This patch should fix that.
2021-05-21 21:11:54 -07:00
Fangrui Song
4f8014aa66 [UpdateTestChecks] Default --x86_scrub_rip to False
True is a bad default: the useful symbol names and `@GOTPCREL` are scrubbed.

Change the default and add global variable tests to x86-basic.ll
(renamed from x86_function_name.ll since we now also test variables).
I updated some tests to show the differences.

Updated LCPI regex to include Darwin style `LCPI_[0-9]+_[0-9]+` (no
leading dot).

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D102588
2021-05-21 19:26:15 -07:00
Lang Hames
7aa19629c2 [ORC][C-bindings] Replace LLVMOrcJITTargetMachineBuilderDisposeTargetTriple.
The implementation and intent behind freeing the triple string here is the same
as LLVMGetDefaultTargetTriple (and any other owned c string returned from the C
API), so we should use LLVMDisposeMessage for to free the string for
consistency.

Patch by Mats Larsen -- thanks Mats!

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D102957
2021-05-21 17:38:06 -07:00
Arthur Eubanks
1514e1d07b Revert "[NewPM] Only invalidate modified functions' analyses in CGSCC passes"
This reverts commit d14d84af2f5ebb8ae2188ce6884a29a586dc0a40.

Causes unacceptable memory regressions.
2021-05-21 16:38:03 -07:00
Arthur Eubanks
6347acc246 Revert "[NPM] Do not run function simplification pipeline unnecessarily"
This reverts commit 97ab068034161fb35e5c9a7b293bf1e569cf077b.

Depends on D100917, which is to be reverted.
2021-05-21 16:38:02 -07:00
Eli Friedman
627b91d7fb [NewPM] Mark BitcodeWriter as required.
The textual IR writer has an equivalent marking.  It looks like this got
missed in e6ea877.
2021-05-21 16:14:09 -07:00
Vitaly Buka
ea3abf1cd5 [lit] Print full googletest commad line
Similar to regular output of LIT tests:
c162f086ba/llvm/utils/lit/lit/TestRunner.py (L1569)

Differential Revision: https://reviews.llvm.org/D102899
2021-05-21 16:11:51 -07:00
Nick Desaulniers
753fcd644e [IR] make stack-protector-guard-* flags into module attrs
D88631 added initial support for:

- -mstack-protector-guard=
- -mstack-protector-guard-reg=
- -mstack-protector-guard-offset=

flags, and D100919 extended these to AArch64. Unfortunately, these flags
aren't retained for LTO. Make them module attributes rather than
TargetOptions.

Link: https://github.com/ClangBuiltLinux/linux/issues/1378

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D102742
2021-05-21 15:53:30 -07:00
Saleem Abdulrasool
073c495f90 RISCV: add a few deprecated aliases for CSRs
This adds the {s,u,m}badaddr CSR aliases as well as the sptbr alias.
These are for compatibility with binutils.  Furthermore, these are used
by the RISC-V Proxy Kernel and are required to enable building the Proxy
Kernel with the LLVM IAS.

The aliases here are deprecated.  These are being introduced in order to
provide a compatibility story for the existing GNU toolchain, which
still supports the deprecated spelling in the assembler.  However, in
order to encourage the migration of existing coding, we provide warnings
indicating that the aliased CSRs are deprecated and should be replaced.

Differential Revision: https://reviews.llvm.org/D101919
Reviewed By: Craig Topper
2021-05-21 13:52:58 -07:00
Arthur Eubanks
586b51e638 [Verifier] Move some atomicrmw/cmpxchg checks to instruction creation
These checks already exist as asserts when creating the corresponding
instruction. Anybody creating these instructions already need to take
care to not break these checks.

Move the checks for success/failure ordering in cmpxchg from the
verifier to the LLParser and BitcodeReader plus an assert.

Add some tests for cmpxchg ordering. The .bc files are created from the
.ll files with an llvm-as with these checks disabled.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102803
2021-05-21 13:41:17 -07:00
David Goldblatt
d146e16557 [InstSimplify] add tests for rem-of-mul; NFC
These are baseline tests for D102864
2021-05-21 15:46:39 -04:00
Vitaly Buka
b8fefdf1e4 [NFC][lit] Add missing UNRESOLVED test
D102899 will change it behavour.
2021-05-21 11:34:00 -07:00
Vitaly Buka
23de4624fe [NFC][lit] Add skipped test into upstream format
Missing from D102694
2021-05-21 11:34:00 -07:00
Vitaly Buka
97ffd237a8 [nfc][lit] Relax spacing check 2021-05-21 11:34:00 -07:00
LLVM GN Syncbot
aa8c759a27 [gn build] Port 9db55b314b5b 2021-05-21 18:10:35 +00:00
Florian Hahn
bb3027b0bb [Matrix] Remove unused matrix-propagate-shape option.
The option was used during the initial bringup, but it does not add any
value at this point. Remove it.

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D102930
2021-05-21 19:01:54 +01:00
Philip Reames
8e442b20eb precommit tests for D102934 and D102928 2021-05-21 10:58:48 -07:00
Lang Hames
dfb6a9ab5f [ORC] Use GTEST_SKIP in ORC C-API unit test.
Now that gtest has been updated to 1.10 which supports GTEST_SKIP, we can use
that over return;

Patch by Mats Larsen. Thanks Mats!

Reviewed By: lhames, ikudrin

Differential Revision: https://reviews.llvm.org/D102710
2021-05-21 10:15:05 -07:00
Simon Pilgrim
b944c68941 [CostModel][X86] Improve f64/v2f64/v4f64 FMUL costs on AVX1 targets to account for slower btver2
BTVER2 has a weaker f64 multiplier that other AVX1-era targets, so we need to bump the worst case cost slightly - llvm-mca reports the new vectorization in simplebb is beneficial on btver2, bdver2 and sandybridge AVX1 targets
2021-05-21 18:12:13 +01:00
Tony Tye
39f986d2c4 [NFC][AMDGPU] Add documentation for AMD Instinct MI100 accelerator
Add link to documentation for "AMD Instinct MI100 Instruction Set
Architecture" to AMDGPUUsage.rst.

Reviewed By: kzhuravl, rampitec, dp

Differential Revision: https://reviews.llvm.org/D102859
2021-05-21 16:51:13 +00:00
maekawatoshiki
dc79ae34bd Revert "[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass"
This reverts commit cea7a3fe3d1fc91a00cb54cee3ac6f361343417e.
To investigate sanitizer-x86_64-linux-fast failure.
2021-05-22 01:40:43 +09:00
Benjamin Kramer
fc9b0de2d9 [X86] Inline variable to avoid unused warning in Release builds. NFCI. 2021-05-21 18:28:46 +02:00
Simon Pilgrim
271ac585d4 [CostModel][X86] Improve fneg costs
These are always lowered as xor ops, so are always cheap
2021-05-21 17:23:45 +01:00