1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-21 18:22:53 +01:00
Commit Graph

219447 Commits

Author SHA1 Message Date
xddxd
9b52b6c39a Clean up and improve Zen CPU detection 2022-09-26 20:09:05 +03:00
xddxd
bc7b2ed172 Quick hack for Zen 4, more Zen 3 detection 2022-09-26 18:35:50 +03:00
Yahfz
e8f2aa0ae4 add raptorlake target 2022-09-01 10:03:53 +03:00
sguo35
5521155be5 Fix register clobbering on aarch64 GHC when mixing tail/non-tail calls
By default LLVM doesn't save any regs for GHC on arm64.
This means we'll clobber LR on arm64 if we make non-tail calls (e.g. L2 syscall)
So we should save LR on non-tail calls, and not assume we won't make
non-tail calls.
2022-05-28 23:38:31 +03:00
Malcolm Jestadt
c725f494c9 X86: Avoid converting EVEX to VEX when disp8 would be beneficial
Saves around 2% code size
2022-05-27 15:29:35 +03:00
sguo35
eb7a5e5301 Fix tail call guarantee setting for GHC on arm64 backend 2022-05-05 07:35:52 +03:00
nastys
509d31ad89 Force ELF on macOS 2021-12-31 21:26:34 +03:00
Nekotekina
1c0ca194dc Relax volatile stores (always return)
This commit partially reverts 2211738092
That commit caused performance regression in RPCS3
2021-12-27 01:19:26 +03:00
Nekotekina
a670c459ea Enable LLVM_USE_PERF 2021-12-24 20:17:33 +03:00
Nekotekina
318b8fe374 X86: fixup matchPMADDWD_3 2021-11-18 12:01:17 +03:00
Nekotekina
1cc7bdd501 X86: improve (V)PMADDWD detection (2)
Implement "full" pattern.
2021-11-16 13:50:49 +03:00
Nekotekina
610c27aa1c X86: disable AVX512 truncate with saturation instructions
These are not very useful in RPCS3.
However, pessimizations occur.
2021-11-16 13:45:26 +03:00
Ani
747fc0efaa lib/Support: Detect AVX-512 on Alderlake CPUs 2021-11-05 10:02:38 +03:00
Ani
18f153b33b lib/Support: Add Tigerlake and Alderlake detection 2021-11-02 16:39:40 +00:00
Ani
6382230f0e test: Remove test files
The test directory severely inflates the size of the AUR clone, and we're not even using the tests
2021-11-02 16:34:32 +00:00
Nekotekina
05332b78c7 TypeSize.h: add integral constructor to ElementCount 2021-11-02 17:42:39 +03:00
Nekotekina
902970b075 IRBuilder.h: remove deprecations 2021-11-02 17:42:39 +03:00
Nekotekina
c9fceef173 X86: fixup (V)PMADDWD detection
Fix some bugs (missing checks).
Add constant support.
2021-11-02 17:42:39 +03:00
Nekotekina
f7d625e31a X86: improve (V)PMADDWD detection
In function combineMulToPMADDWD, if 17 bit are sign bits,
not just zero bits, the optimization can be applied sometimes.
For now, detect and replace SRA pairs with SRL.
2021-11-02 17:42:39 +03:00
Nekotekina
c36b21c023 X86: modify PreserveAll CC to save full AVX-512 state 2021-11-02 17:42:39 +03:00
Nekotekina
2e65b3a418 Fix possible compilation issue with gcc-11 2021-11-02 17:42:39 +03:00
Nekotekina
548daf04b5 X86: avoid vector-scalar shifts if splat amount is directly a vector ADD/SUB/AND op.
Prefer vector-vector shifts if available (AVX2+).
Improves code generated for rotate and funnel shifts.
Otherwise it would generate a shuffle + slower vector-scalar shift.
2021-11-02 17:42:39 +03:00
Nekotekina
d5bc359dfd X86: add patterns for X86ISD::VSHLV and X86ISD::VSRLV
Replace VSELECT instruction which zeroes their result on exceeding legal SHL/SRL shift amount.
2021-11-02 17:42:39 +03:00
Nekotekina
bed700114e X86: add pattern for X86ISD::VSRAV
Detect clamping ashr shift amount to max legal value
2021-11-02 17:42:39 +03:00
Nekotekina
2ffa82223f X86: expand detectAVGPattern()
Allow all integer widths in the pattern, allow ashr
Handle signed and mixed cases, allowing to replace truncation
2021-11-02 17:42:39 +03:00
Nekotekina
5ff8f4151c X86: optimize VSELECT for v16i8 with shl + sign bit test 2021-11-02 17:42:39 +03:00
Nekotekina
4743d020ce X86: LowerShift: new algorithm for vector-vector shifts
Emit pair of shifts of double size if possible
2021-11-02 17:42:39 +03:00
Nekotekina
d5b5885c23 Disable GDBRegistrationListener
It makes emitting object extremely slow.
GDB doesn't work properly with it anyway.
GDB also often crashes because it cannot read the format.
2021-11-02 17:42:39 +03:00
Nekotekina
6516f565ed MCJIT: don't finalize modules on symbol lookup (workaround)
This is extremely slow yet unnecessary with manual finalization.
In LLVM 6 this wasn't a problem.
2021-11-02 17:42:39 +03:00
Nekotekina
d18817ded9 X86: Fix/workaround Small Code Model for JIT
Force RIP-relative jump tables and global values
These things were causing crashes due to use of absolute addressing
2021-11-02 17:42:39 +03:00
Ivan
8edd96e498 Set up CI with Azure Pipelines 2021-11-02 17:42:39 +03:00
guopeilin
39c406a58f [AArch64][GlobalISel] Use ZExtValue for zext(xor) when invert tb(n)z
Currently, we use SExtValue to decide whether to invert tbz or tbnz.
However, for the case zext (xor x, c), we should use ZExt rather
than SExt otherwise we will generate totally opposite branches.

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D108755

(cherry picked from commit 5f48c144c58f6d23e850a1978a6fe05887103b17)
2021-09-21 09:15:10 -07:00
James Henderson
be1489a181 [debuginfo-test][cross-project-tests] Release note for new project name
Add a release note for the renaming of the debuginfo-test to
cross-project-tests, performed in commit
1364750dadbb56032ef73b4d0d8cbc88a51392da and follow-ons.

Reviewed by: sylvestre.ledru

Differential Revision: https://reviews.llvm.org/D110134
2021-09-21 09:48:35 +01:00
Simon Pilgrim
aed4e7449f [X86] combineX86ShuffleChain - ensure we only peek through bitcasts to vectors (PR51858)
When searching for hidden identity shuffles (added at rG41146bfe82aecc79961c3de898cda02998172e4b), only peek through bitcasts to the source operand if it is a vector type as well.

(cherry picked from commit dcba99418438ec1d624ad207674234bd2e9e3394)
2021-09-20 11:22:27 -07:00
Jon Chesterfield
47e63cb60e [openmp] Apply code change from D109500
(cherry picked from commit 71052ea1e3c63b7209731fdc1726d10640d97480)
2021-09-13 20:57:17 -07:00
Florian Hahn
fca643847d [VPlan] Fix crash caused by not updating all users properly.
Users of VPValues are managed in a vector, so we need to be more
careful when iterating over users while updating them. For now, just
copy them.

Fixes 51798.

(cherry picked from commit 368af7558e55039e4e93c3eed68ce00da86e5e35)
2021-09-13 20:56:27 -07:00
Tom Stellard
7b93a88a1e Revert "[AArch64][GlobalISel] Legalize bswap <2 x i16>"
This reverts commit 5cd63e9ec2a385de2682949c0bbe928afaf35c91.

https://bugs.llvm.org/show_bug.cgi?id=51707
2021-09-10 21:09:59 -07:00
Nikita Popov
b6614b2353 Revert [MC][ELF] Emit separate unique sections for different flags
Commit Message from @MaskRay:

Rust has a fragile embed-bitcode implementation
(bddb59cf07/compiler/rustc_codegen_llvm/src/back/write.rs (L970-L1017))
which relied on the previous LLVM MC behavior.  Rust's LLVM fork
has carried a revert. This commit made the similar revert to help
distributions since they would otherwise probably carry a similar patch
(as they ship rust linked against system LLVM).

Fixes https://bugs.llvm.org/show_bug.cgi?id=51207.

Differential Revision: https://reviews.llvm.org/D107216
2021-09-10 16:55:29 -07:00
Elliot Saba
a967752a95 [X86] Don't clobber EBX in stackprobes
On X86, the stackprobe emission code chooses the `R11D` register, which
is illegal on i686.  This ends up wrapping around to `EBX`, which does
not get properly callee-saved within the stack probing prologue,
clobbering the register for the callers.

We fix this by explicitly using `EAX` as the stack probe register.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D109203

(cherry picked from commit ae8507b0df738205a6b9e3795ad34672b7499381)
2021-09-10 09:30:52 -07:00
Nikita Popov
7d56483207 [IR] Handle constant expressions in containsUndefinedElement()
If the constant is a constant expression, then getAggregateElement()
will return null. Guard against this before calling HasFn().

(cherry picked from commit af382b93831ae6a58bce8bc075458cfd056e3976)
2021-09-10 09:04:21 -07:00
Roman Lebedev
bedf13b1a7 [SimplifyCFG] performBranchToCommonDestFolding(): require block-closed SSA form for bonus instructions (PR51125)
I can't seem to wrap my head around the proper fix here,
we should be fine without this requirement, iff we can form this form,
but the naive attempt (https://reviews.llvm.org/D106317) has failed.
So just to unblock the release, put up a restriction.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51125

(cherry picked from commit 909cba969981032c5740774ca84a34b7f76b909b)
2021-09-10 09:02:26 -07:00
Fraser Cormack
5e5115bbd0 [MemCpyOpt] Fix a variety of scalable-type crashes
This patch fixes a variety of crashes resulting from the `MemCpyOptPass`
casting `TypeSize` to a constant integer, whether implicitly or
explicitly.

Since the `MemsetRanges` requires a constant size to work, all but one
of the fixes in this patch simply involve skipping the various
optimizations for scalable types as cleanly as possible.

The optimization of `byval` parameters, however, has been updated to
work on scalable types in theory. In practice, this optimization is only
valid when the length of the `memcpy` is known to be larger than the
scalable type size, which is currently never the case. This could
perhaps be done in the future using the `vscale_range` attribute.

Some implicit casts have been left as they were, under the knowledge
they are only called on aggregate types. These should never be
scalably-sized.

Reviewed By: nikic, tra

Differential Revision: https://reviews.llvm.org/D109329

(cherry-picked from commit 7fb66d4)
2021-09-09 16:21:27 -07:00
Bradley Smith
28d769100d Workaround incorrect types when lowering fixed length gather/scatter
When lowering a fixed length gather/scatter the index type is assumed to
be the same as the memory type, this is incorrect in cases where the
extension of the index has been folded into the addressing mode.

For now add a temporary workaround to fix the codegen faults caused by
this by preventing the removal of this extension. At a later date the
lowering for SVE gather/scatters will be redesigned to improve the way
addressing modes are handled.

As a short term side effect of this change, the addressing modes
generated for fixed length gather/scatters will not be optimal.

Differential Revision: https://reviews.llvm.org/D109145

(cherry picked from commit 14e1a4a6eef2fb95ec852c9ddfc597f80bba3226)
2021-09-09 09:05:58 -07:00
Bjorn Pettersson
b37f5f2114 Inform pass manager when child loops are deleted
As part of the nontrivial unswitching we could end up removing child
loops. This patch add a notification to the pass manager when
that happens (using the markLoopAsDeleted callback).

Without this there could be stale LoopAccessAnalysis results cached
in the analysis manager. Those analysis results are cached based on
a Loop* as key. Since the BumpPtrAllocator used to allocate
Loop objects could be resetted between different runs of for
example the loop-distribute pass (running on different functions),
a new Loop object could be created using the same Loop pointer.
And then when requiring the LoopAccessAnalysis for the loop we
got the stale (corrupt) result from the destroyed loop.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D109257

(fixes PR51754)
(cherry-picked from commit 0f0344dd1e3b53387bb396070916e67f4c426da6)
2021-09-09 09:04:59 -07:00
serge-sans-paille
601c2dd9dd Fine grain control over some symbol visibility
Setting -fvisibility=hidden when compiling Target libs has the advantage of
not being intrusive on the codebase, but it also sets the visibility of all
functions within header-only component like ADT. In the end, we end up with
some symbols with hidden visibility within llvm dylib (through the target libs),
and some with external visibility (through other libs). This paves the way for
subtle bugs like https://reviews.llvm.org/D101972

This patch explicitly set the visibility of some classes to `default` so that
`llvm::Any` related symbols keep a `default` visibility. Indeed a template
function with `default` visibility parametrized by a type with `hidden`
visibility is granted `hidden` visibility, and we don't want this for the
uniqueness of `llvm::Any::TypeId`.

Differential Revision: https://reviews.llvm.org/D108943
2021-09-08 21:06:19 -07:00
Cullen Rhodes
5f6ef6fbfd [AArch64][SME] Fix imm bug in mov vector to tile aliases
Also fixes a warning mentioned in D109359.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D109363

(cherry picked from commit 89786c2b992c3cb4c4a230542d2af34ec2915a08)
2021-09-08 20:47:08 -07:00
Chen Zheng
921995afd5 Revert "[HardwareLoops] Change order of SCEV expression construction for InitLoopCount."
This causes https://bugs.llvm.org/show_bug.cgi?id=51714 and
is not a right patch according to comments in D91724

This reverts commit 42eaf4fe0adef3344adfd9fbccd49f325cb549ef.

(cherry picked from commit 34badc409cc452575c538c4b6449546adc38f121)
2021-09-08 20:46:17 -07:00
Jonas Paulsson
d21237cb11 [SelectionDAGBuilder] Bugfix in visitInlineAsm()
In case of a virtual register tied to a phys-def, the register class needs to
be computed. Make sure that this works generally also with fast regalloc by
using TLI.getRegClassFor() whenever possible, and make only the case of
'Untyped' use getMinimalPhysRegClass().

Fixes https://bugs.llvm.org/show_bug.cgi?id=51699.

Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D109291

(cherry picked from commit 118997d8e931dcb4c6e972611a7e4febcc33a061)
2021-09-08 14:03:50 -07:00
Maksim Panchenko
64f4a52b52 [llvm-objdump] Fix 'llvm-objdump -dr' for executables with relocations
Print relocations interleaved with disassembled instructions for
executables with relocatable sections, e.g. those built with "-Wl,-q".

Differential Revision: https://reviews.llvm.org/D109016

(cherry picked from commit 6300e4ac5806c9255c68c6fada37b2ce70efc524)
2021-09-08 08:45:39 -07:00
Hans Wennborg
09c230ecec Add llvm-ml to LLVM_TOOLCHAIN_TOOLS (PR50536)
so that it gets installed in LLVM_INSTALL_TOOLCHAIN_ONLY builds,
such as used by the Windows installer.

Differential revision: https://reviews.llvm.org/D109358

(cherry picked from commit c364dcbf1fd81c6291e935564fce2d9ebb97a3d0)
2021-09-08 08:44:45 -07:00