1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00
Commit Graph

210647 Commits

Author SHA1 Message Date
Florian Hahn
a84d68c5a7 [ConstraintElimination] Add test with pointer bitcast. 2021-02-02 17:36:05 +00:00
Fangrui Song
b110101bd6 [MC] Support SHF_GNU_RETAIN as section flag 'R'
On Linux target triples, GNU as sets EI_OSABI to ELFOSABI_GNU when SHF_GNU_RETAIN is used。
On `*-*-freebsd`, it usually sets EI_OSABI to ELFOSABI_FREEBSD.

GNU ld respects SHF_GNU_RETAIN only for ELFOSABI_FREEBSD/ELFOSABI_GNU.
https://sourceware.org/bugzilla/show_bug.cgi?id=27282

MC doesn't set ELFOSABI_GNU for SHF_GNU_RETAIN/STB_GNU_UNIQUE/STT_GNU_IFUNC.
MC assembled object files do not have special semantics in GNU ld.

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D95730
2021-02-02 09:34:09 -08:00
Fangrui Song
4be3ad3853 [yaml2obj/obj2yaml/llvm-readobj] Support SHF_GNU_RETAIN
In binutils, the flag is defined for ELFOSABI_GNU and ELFOSABI_FREEBSD.
It can be used to mark a section as a GC root.

In practice, the flag has generic semantics and can be applied to many
EI_OSABI values, so we consider it generic.

Differential Revision: https://reviews.llvm.org/D95728
2021-02-02 09:19:53 -08:00
Sanjay Patel
a9c32b94c7 [ExpandReductions] add test for fmin with FMF; NFC 2021-02-02 12:17:08 -05:00
Jeroen Dobbelaere
378758ee45 [InlineFunction] Only update noalias scopes once for an instruction.
Inlining sometimes maps different instructions to be inlined onto the same instruction.

We must ensure to only remap the noalias scopes once. Otherwise the scope might disappear (at best).
This patch ensures that we only replace scopes for which the mapping is known.

This approach is preferred over tracking which instructions we already handled in a SmallPtrSet,
as that one will need more memory.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D95862
2021-02-02 17:57:10 +01:00
David Green
e7a4a9e940 [ARM] Correct some tablegen operand types. NFC 2021-02-02 16:55:31 +00:00
Florian Hahn
4314d26857 [ConstraintElimination] Add nicer way to dump constraints (NFC).
Use ConstraintSystem::dump(Names) to display the result of decomposing a
condition.
2021-02-02 16:36:45 +00:00
David Green
91ae3d7f3f [ARM] Mark MVE_VMOV_to_lane_32 as isInsertSubregLike
This allows the peephole optimizer to know that a MVE_VMOV_to_lane_32 is
the same as an insert subreg, allowing it to optimize some redundant
lane moves.

Differential Revision: https://reviews.llvm.org/D95433
2021-02-02 16:35:47 +00:00
Sebastian Neubauer
e79a30d836 [AMDGPU] Remove unused tmp register
The temporary register is only used to compute the frame pointer.
The frame pointer is overwritten and not used in between, so we
can reuse the frame pointer for the computation, saving one register.

Differential Revision: https://reviews.llvm.org/D95865
2021-02-02 17:17:54 +01:00
Sebastian Neubauer
1d5127c810 [AMDGPU] Save fp/bp after csr saves
Saving callee-save registers happens in whole wave mode. Exec is saved
to a free register, which can be reused to save the frame pointer.
Therefore, saving the fp needs to happen after saving csrs.

Differential Revision: https://reviews.llvm.org/D95861
2021-02-02 17:17:54 +01:00
Wenlei He
eddb3ffdfb [CSSPGO] Factor out common part for CSSPGO inline and AFDO inline
Refactoring SampleProfileLoader::inlineHotFunctions to use helpers from CSSPGO inlining and reduce similar code in the inlining loop, plus minor cleanup for AFDO path.

This is resubmit of D95024, with build break and overtighten assertion fixed.

Test Plan:
2021-02-02 07:55:08 -08:00
Stefan Pintilie
020aa3c460 [PowerPC] Materialize 34 bit constants with pli on Power 10.
NOTE: This patch was originally written by Anil Mahmud. His code has been
rebased but otherwise left mostly unchanged.

A new instructon on Power 10 allows for the materialization of 34 bit
immediate values. This patch allows the compiler to take advantage of
the new instruction in this situation.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D92879
2021-02-02 09:49:22 -06:00
David Green
bee1ac1e53 [ARM] Add MVE insert-of-extract pattern
A v4i32 insert of an extract can become a simple lane move, as opposed
to round-tripping via a GPR. This adds a patterns that turns an v4i32
insert-extract pair into a EXTRACT_SUBREG/INSERT_SUBREG, with the
required COPY_TO_REGCLASS. These get better optimized into a simple lane
move by the rest of the backend.

Differential Revision: https://reviews.llvm.org/D95428
2021-02-02 15:15:04 +00:00
Roman Lebedev
0136cec13c [InstCombine] Host inversion out of ashr's value operand (PR48995)
This is a yet another hint that we will eventually need InstCombineInverter,
which would consistently sink inversions, but but for that we'll need
to consistently hoist inversions where possible, so let's do that here.

Example of a proof: https://alive2.llvm.org/ce/z/78SbDq

See https://bugs.llvm.org/show_bug.cgi?id=48995
2021-02-02 17:56:43 +03:00
Roman Lebedev
27b7c783ff [NFC][InstCombine] Add tests for (~x) a>> y --> ~(x a>> y) fold (PR48995)
See https://bugs.llvm.org/show_bug.cgi?id=48995
2021-02-02 17:56:31 +03:00
Tom Weaver
99a2c6afdf Revert "[InstrProfiling] Use !associated metadata for counters, data and values"
This reverts commit df3e39f60b356ca9dbfc11e96e5fdda30afa7acb.

introduced failing test instrprof-gc-sections.c
causing build bot to fail:
http://lab.llvm.org:8011/#/builders/53/builds/1184
2021-02-02 14:19:31 +00:00
David Green
b8cb37b9c8 [ARM] Extra shuffle tests. NFC 2021-02-02 14:16:42 +00:00
David Green
dc98bdfb7d [ARM] Select VINS from vector inserts
This patch adds tablegen patterns for pairs of i16/f16 insert/extracts.
If we are inserting into two adjacent vector lanes (0 and 1 for
example), we can use either a vmov;vins or vmovx;vins to insert the pair
together, avoiding a round-trip from GRP registers. This is quite a
large patterns with a number of EXTRACT_SUBREG/INSERT_SUBREG/
COPY_TO_REGCLASS nodes, but hopefully as most of those become copies all
that will be cleaned up by further optimizations.

The VINS pattern was also adjusted to allow it to represent that it is
inserting into the top half of an existing register.

Differential Revision: https://reviews.llvm.org/D95381
2021-02-02 13:50:02 +00:00
Simon Pilgrim
347d983c7a [X86][SSE] LowerINSERT_VECTOR_ELT - pull out repeated EltSizeInBits calls. NFCI. 2021-02-02 13:45:18 +00:00
Sander de Smalen
45b8ee9aa4 NFC: Migrate SpeculateAroundPHIs to work on InstructionCost
This patch migrates cost values and arithmetic to work on InstructionCost.
When the interfaces to TargetTransformInfo are changed, any InstructionCost
state will propagate naturally.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: ctetreau

Differential Revision: https://reviews.llvm.org/D95353
2021-02-02 13:32:45 +00:00
Sander de Smalen
53e3448991 NFC: Migrate SimpleLoopUnswitch to work on InstructionCost
This patch migrates cost values and arithmetic to work on InstructionCost.
When the interfaces to TargetTransformInfo are changed, any InstructionCost
state will propagate naturally.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D95352
2021-02-02 13:32:44 +00:00
Dmitry Preobrazhensky
e22eed53cb [AMDGPU][MC] Corrected parsing of optional modifiers
Fixed bugs in parsing of "no*" modifiers and improved errors handling.
See https://bugs.llvm.org/show_bug.cgi?id=41282.

Differential Revision: https://reviews.llvm.org/D95675
2021-02-02 14:52:29 +03:00
Simon Pilgrim
8caaa7e0d2 [X86][AVX512] Support variable-index vector insertion on AVX512 targets (PR47924)
With predicate masks, AVX512 can efficiently perform variable-index vector insertion with 2 broadcasts + 1 comparison, avoiding a lot of aliased memory traffic.

Differential Revision: https://reviews.llvm.org/D95779
2021-02-02 11:41:18 +00:00
Andrew Ng
dd96c8fb71 [X86] Fix disassembly of x86-64 GDTLS code sequence
For x86-64 the REX.w prefix takes precedence over any other size
override (i.e. 0x66). Therefore, for x86-64 when REX.w is present set
'hasOpSize' to false to ensure that any size override is ignored.

Fixes PR48901.

Differential Revision: https://reviews.llvm.org/D95682
2021-02-02 11:35:00 +00:00
Simon Pilgrim
0e7f51461d [X86][AVX] Add missing VEX_WIG tags from VPACKUSDW/VPHSUBD/VPCMPISTRI/VPCMPISTRM/VPCMPESTRI/VPCMPESTRM
Fixes PR48877

Differential Revision: https://reviews.llvm.org/D95801
2021-02-02 11:25:44 +00:00
David Green
505ef9e320 [ARM] Remove DLS lr, lr
A DLS lr, lr instruction only moves lr to itself. It need not be emitted
on it's own to save a instruction in the loop preheader.

Differential Revision: https://reviews.llvm.org/D78916
2021-02-02 11:09:31 +00:00
Adrian Kuegel
306c653648 Revert "[CSSPGO] Factor out common part for CSSPGO inline and AFDO inline"
This reverts commit 9a03058d6322edb8abc803ba3e436cc62647d979.
2021-02-02 11:51:04 +01:00
Adrian Kuegel
930ba0d06e Revert "Fix build break from D95024"
This reverts commit 09cd849fdef2b2d3de2d0b0a5c512100957e0ef6.
2021-02-02 11:51:04 +01:00
David Green
2ad4eade18 [ARM] Regenerate LowOverheadLoops mir tests. NFC 2021-02-02 10:28:58 +00:00
David Sherwood
81a8f9f3aa [SVE][LoopVectorize] Add masked load/store and gather/scatter support for SVE
This patch updates IRBuilder::CreateMaskedGather/Scatter to work
with ScalableVectorType and adds isLegalMaskedGather/Scatter functions
to AArch64TargetTransformInfo. In addition I've fixed up
isLegalMaskedLoad/Store to return true for supported scalar types,
since this is what the vectorizer asks for.

In LoopVectorize.cpp I've changed
LoopVectorizationCostModel::getInterleaveGroupCost to return an invalid
cost for scalable vectors, since currently this relies upon using shuffle
vector for reversing vectors. In addition, in
LoopVectorizationCostModel::setCostBasedWideningDecision I have assumed
that the cost of scalarising memory ops is infinitely expensive.

I have added some simple masked load/store and gather/scatter tests,
including cases where we use gathers and scatters for conditional invariant
loads and stores.

Differential Revision: https://reviews.llvm.org/D95350
2021-02-02 09:52:39 +00:00
Benjamin Kramer
96428d51bc Fold one-use variable into assert. NFCI.
Avoids a warning in Release builds.
2021-02-02 10:50:48 +01:00
Sebastian Neubauer
8175a7d5b7 [AMDGPU] Mark epilog restores as frame-destroy
I guess instructions were marked as frame-setup by accident, they are
restores as part of the epilog.

Differential Revision: https://reviews.llvm.org/D95783
2021-02-02 10:24:37 +01:00
Sebastian Neubauer
32c7642ef3 [AMDGPU] Clarify calling conv about inactive lanes
So far, it was not specified what happens with the VGPRs of inactive
lanes when functions are called. This patch explicitely mentions that
the VGPR values of inactive lanes need to be preserved for all
registers.

This describes the current behavior, as only active lanes of registers
are saved to scratch. Also, as the multi-lane nature of VGPRs is not
properly modeled, we cannot determine the live VGPRs from inactive lanes
at calls. So we cannot save them, even if we intended to do so.

Differential Revision: https://reviews.llvm.org/D95610
2021-02-02 10:15:09 +01:00
Wenlei He
41fcc00b36 Fix build break from D95024 2021-02-02 01:01:12 -08:00
Wenlei He
60e5015150 [CSSPGO] Factor out common part for CSSPGO inline and AFDO inline
Refactoring SampleProfileLoader::inlineHotFunctions to use helpers from CSSPGO inlining and reduce similar code in the inlining loop, plus minor cleanup for AFDO path.

Test Plan:

Differential Revision: https://reviews.llvm.org/D95024
2021-02-02 00:34:06 -08:00
Thomas Symalla
af31f24b1c Fixed includes.
Differential Revision: https://reviews.llvm.org/D93708
2021-02-02 09:14:54 +01:00
Thomas Symalla
298055ddcd Fixed includes. 2021-02-02 09:14:54 +01:00
Thomas Symalla
a7e61e92bb Reverted whitespace changes.
Differential Revision: https://reviews.llvm.org/D90968
2021-02-02 09:14:54 +01:00
Thomas Symalla
b7a94c0dc2 Added missing includes. 2021-02-02 09:14:54 +01:00
Thomas Symalla
ba95ff2c1c Renamed med3 opcode, removed superfluous copy. 2021-02-02 09:14:54 +01:00
Thomas Symalla
087ff79f1a Removed the generic virtual register creations. Reworked the tests. 2021-02-02 09:14:54 +01:00
Thomas Symalla
f548399826 Implemented a MED3_S32 GIR opcode. 2021-02-02 09:14:53 +01:00
Thomas Symalla
ee664c032b Added and used new target pseudo for v_cvt_pk_i16_i32, changes due to code review. 2021-02-02 09:14:53 +01:00
Thomas Symalla
43278a1cb3 Formatting changes 2021-02-02 09:14:53 +01:00
Thomas Symalla
46f1f49a56 Formatting changes. 2021-02-02 09:14:53 +01:00
Thomas Symalla
8aacc5adcd Updating formatting changes. 2021-02-02 09:14:53 +01:00
Thomas Symalla
8955159f2d Resolve formatting changes. 2021-02-02 09:14:53 +01:00
Thomas Symalla
38676d07e0 Code changes yielded from review. 2021-02-02 09:14:53 +01:00
Thomas Symalla
4c17035470 Fixed tests. 2021-02-02 09:14:53 +01:00
Thomas Symalla
ea43201600 Move step to PreLegalizer 2021-02-02 09:14:53 +01:00