1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00
Commit Graph

200851 Commits

Author SHA1 Message Date
Roman Lebedev
942c801f82 [Reduce] Argument reduction: do properly handle invoke insts (PR46819)
replaceFunctionCalls() is very non-exhaustive, it only handles
CallInst's. Which means, by the time we drop old function,
there may still be uses of it lurking around.
Let's instead whack-a-mole them by all by replacing with undef.

I'm not sure this is the best handling, especially for calls, but IMO
poorly reduced input is much better than crashing reduction tool.
A (previously-crashing!) test added.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46819
2020-07-26 01:29:00 +03:00
Roman Lebedev
c62f97f67d [Reduce] Basic block reduction: do properly handle invoke insts (PR46818)
Terminator may have returned value, so we need to replace uses,
and in general handle invoke as a branch inst.

I'm not sure this is the best handling, but IMO poorly reduced
input is much better than crashing reduction tool.
A (previously-crashing!) test added.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46818
2020-07-26 01:28:59 +03:00
Lang Hames
ef607c6171 [ORC] Rename TargetProcessControl DynamicLibraryHandle and loadLibrary.
The new names, DylibHandle and loadDylib, are more concise and make
clear that these utilities are for loading dynamic libraries, not static
ones.
2020-07-25 15:21:43 -07:00
Lang Hames
8dd38c568c [ORC] Don't require PageSize or Triple during TargetProcessControl construction
Subclasses will commonly gather that information from a remote during
construction, in which case they won't have meaningful values to pass to
TargetProcessControl's constructor.
2020-07-25 15:21:43 -07:00
Philip Reames
dfcbe0aef2 [Statepoints] Support lowering gc relocations to virtual registers
(Disabled under flag for the moment)

This is part of a larger project wherein we are finally integrating lowering of gc live operands with the register allocator.  Today, we force spill all operands in SelectionDAG.  The code to do so is distinctly non-optimal.  The approach this patch is working towards is to instead lower the relocations directly into the MI form, and let the register allocator pick which ones get spilled and which stack slots they get spilled to.  In terms of performance, the later part is actually more important as it avoids redundant shuffling of values between stack slots.

This particular change adds ISEL support to produce the variadic def STATEPOINT form required by the above.  In particular, the first N are lowered to variadic tied def/use pairs.  So new statepoint looks like this:
reloc1,reloc2,... = STATEPOINT ..., base1, derived1<tied-def0>, base2, derived2<tied-def1>, ...

N is limited by the maximal number of tied registers machine instruction can have (15 at the moment).

The current patch is restricted to handling relocations within a single basic block.  Cross block relocations (e.g. invokes) are handled via the legacy mechanism.  This restriction will be relaxed in future patches.

Patch By: dantrushin
Differential Revision: https://reviews.llvm.org/D81648
2020-07-25 14:26:05 -07:00
Craig Topper
a376bad472 [X86] Add llvm.roundeven test cases. Add f80 tests cases for constrained intrinsics that lower to libcalls. NFC 2020-07-25 13:29:47 -07:00
Craig Topper
12c5aa4515 [X86] Fix intrinsic names in strict fp80 tests to use f80 in their names instead of x86_fp80.
The type is called x86_fp80, but when it is printed in the intrinsic
name it should be f80. The parser doesn't seem to care that the
name was wrong.
2020-07-25 13:12:49 -07:00
LLVM GN Syncbot
91051ca099 [gn build] Port 136c8f50e96 2020-07-25 18:51:58 +00:00
Roman Lebedev
f9cbb7144f [Reduce] Try turning function definitions into declarations first, NFCI-ish
ReduceFunctions could do it, but it also replaces *all* calls with undef,
so if any of undef replacements makes reduction uninteresting,
it won't work.

ReduceBasicBlocks also could do it, but well, it may take many guesses
for all the blocks of a function to happen to be out-of-chunk,
which is not a very efficient way to go about it.

So let's just do this first.
2020-07-25 21:43:36 +03:00
Florian Hahn
2f84723455 [X86] Remove stress-scheduledagrrlist.ll.
This test seems to take quite a long time with EXPENSIVE_CHECKS.

Remove it.
2020-07-25 15:45:24 +01:00
Nikita Popov
23a372d0f2 [LVI] Don't require operand number for range (NFC)
Pass the Value* instead of the operand number, rename I to CxtI.
This makes the function a bit more generally useful.
2020-07-25 16:33:45 +02:00
Matt Arsenault
4251a6ded6 AMDGPU/GlobalISel: Don't assert on G_INSERT > 128-bits
Just fallback for now. Really tablegen needs to generate all of the
subregister index handling we need.
2020-07-25 10:05:44 -04:00
Nikita Popov
004e042624 [SCCP] Add assume non null test (NFC) 2020-07-25 16:02:15 +02:00
Nikita Popov
86610c55e0 [SCCP] Restore the change reporting as well
Reapply 5db5b4bc4394ca247c9eb665e03b851848aa2fbf.
2020-07-25 15:11:30 +02:00
Nikita Popov
74624d8649 Reapply [SCCP] Directly remove non-feasible edges
Reapply with DTU update moved after CFG update, which is a
requirement of the API.

-----

Non-feasible control-flow edges are currently removed by replacing
the branch condition with a constant and then calling
ConstantFoldTerminator. This happens in a rather roundabout manner,
by inspecting the users (effectively: predecessors) of unreachable
blocks, and further complicated by the need to explicitly materialize
the condition for "forced" edges. I would like to extend SCCP to
discard switch conditions that are non-feasible based on range
information, but this is incompatible with the current approach
(as there is no single constant we could use.)

Instead, this patch explicitly removes non-feasible edges. It
currently only needs to handle the case where there is a single
feasible edge. The llvm_unreachable() branch will need to be
implemented for the aforementioned switch improvement.

Differential Revision: https://reviews.llvm.org/D84264
2020-07-25 14:52:35 +02:00
Simon Pilgrim
b71064b07e SimplifyLibCalls - remove unnecessary header and forward declaration. NFC.
We include TargetLibraryInfo.h so don't need to forward declare it, and we don't need to include TargetLibraryInfo.h in SimplifyLibCalls.cpp as well.
2020-07-25 12:58:39 +01:00
Simon Pilgrim
85fa3b1299 [X86][SSE] combineX86ShufflesRecursively - move all Root node asserts to the same location. NFCI.
Minor tidyup for some upcoming shuffle combine improvements.
2020-07-25 12:48:14 +01:00
Simon Pilgrim
a3ff65cfbb SymbolRemappingReader.h - pass Twine by reference not value. NFCI. 2020-07-25 12:48:14 +01:00
Florian Hahn
c1dc5ad58a [IPSCCP] Drop argmemonly after replacing pointer argument.
This patch updates IPSCCP to drop argmemonly and
inaccessiblemem_or_argmemonly if it replaces a pointer argument.

Fixes PR46717.

Reviewers: efriedma, davide, nikic, jdoerfert

Reviewed By: efriedma, jdoerfert

Differential Revision: https://reviews.llvm.org/D84432
2020-07-25 11:52:14 +01:00
Nathan James
edd124eebe Fix C2975 error under MSVC
Apparantly a constexpr value isn't a compile time constant under certain versions of MSVC.
2020-07-25 11:03:59 +01:00
Simon Pilgrim
5a6feccd46 [X86][SSE] getFauxShuffle - ignore undemanded sources for PACKSS/PACKUS faux shuffles
If we don't care about an entire LHS/RHS of the PACK op, then can just treat it the same as undef (we don't care if it saturates) and is safe to treat as a shuffle.

This can happen if we attempt to decode as a faux shuffle before SimplifyDemandedVectorElts has been called on the PACK which should replace the source with UNDEF entirely.
2020-07-25 10:51:14 +01:00
Nathan James
8b5de07c6a [ADT] Add a range-based version of std::move
Adds a range-based version of `std::move`, the version that moves a range, not the one that creates r-value references.

Reviewed By: dblaikie, gamesh411

Differential Revision: https://reviews.llvm.org/D83902
2020-07-25 10:37:34 +01:00
Jessica Paquette
0266a8b395 [AArch64][GlobalISel] Look through constants when selection stores of 0
Very minor code size improvements (hits 8 times in Bullet at -O3), but still
something.

Also very minor NFC change to make sure we only search for a 0 constant when
selecting a store. Before, we'd do this for loads as well.

Differential Revision: https://reviews.llvm.org/D84573
2020-07-24 22:46:14 -07:00
Amy Kwan
2f09b60413 [PowerPC] Exploit the High Order Vector Multiply Instructions on Power10
This patch aims to exploit the following vector multiply high instructions on Power10.
vmulhsw VRT, VRA, VRB
vmulhsd VRT, VRA, VRB
vmulhuw VRT, VRA, VRB
vmulhud VRT, VRA, VRB

Differential Revision: https://reviews.llvm.org/D82584
2020-07-24 20:57:57 -05:00
Rong Xu
f48156accc [PGO] Fix incorrect function entry count
Function entry count might be zero after the profile counts reset and
before reentry to the function.

Zero profile entry count is very bad as the profile count from BFI will
be wrong.

A simple fix is to set the profile entry count to 1 if there are
non-zero profile counts in this function.

Differential Revision: https://reviews.llvm.org/D84378
2020-07-24 17:39:55 -07:00
Rong Xu
873a89587d [PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction
Skip profile count promotion if any of the ExitBlocks contains a ret
instruction. This is to prevent dumping of incomplete profile -- if the
the loop is a long running loop and dump is called in the middle
of the loop, the result profile is incomplete.

ExitBlocks containing a ret instruction is an indication of a long running
loop -- early exit to error handling code.

Differential Revision: https://reviews.llvm.org/D84379
2020-07-24 17:38:31 -07:00
Rong Xu
ab6b85942c Revert "[PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction"
This reverts commit 6fdc6f6c7d34af60c45c405f448370a684ef6f2a.
2020-07-24 17:35:44 -07:00
Rong Xu
84b04d6437 Revert "[PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction"
This reverts commit 867ef4472d8e57384c929e4f06b74d1ac8883a99.
2020-07-24 17:33:49 -07:00
Rong Xu
ce1bcf4d0a [PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction
Forgot including the tests in the commit 6fdc6f6c7d34af60c4.
2020-07-24 17:23:33 -07:00
Amy Kwan
dedffec69e [PowerPC] Implement Truncate and Store VSX Vector Builtins
This patch implements the `vec_xst_trunc` function in altivec.h in  order to
utilize the Store VSX Vector Rightmost [byte | half | word | doubleword] Indexed
instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82467
2020-07-24 19:22:39 -05:00
Jessica Paquette
b6d2a38769 [AArch64][GlobalISel] Use wzr/xzr for 16 and 32 bit stores of zero
We weren't performing this optimization on 16 and 32 bit stores. SDAG happily
does this though.

e.g. https://godbolt.org/z/cWocKr

This saves about 0.2% in code size on CTMark at -O3.

Differential Revision: https://reviews.llvm.org/D84568
2020-07-24 17:15:20 -07:00
Rong Xu
27560ce987 [PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction
Skip profile count promotion if any of the ExitBlocks contains a ret
instruction. This is to prevent dumping of incomplete profile -- if the
the loop is a long running loop and dump is called in the middle
of the loop, the result profile is incomplete.

ExitBlocks containing a ret instruction is an indication of a long running
loop -- early exit to error handling code.

Differential Revision: https://reviews.llvm.org/D84379
2020-07-24 17:13:58 -07:00
Matt Arsenault
16ca0e0369 GlobalISel: Define mulfix/divfix opcodes
The full expansion involves the funnel shifts, which depend on another
patch to expand those.
2020-07-24 20:02:20 -04:00
Amara Emerson
8f9bed84e6 [AArch64][GlobalISel] Promote G_UITOFP vector operands to same elt size as result.
Fixes legalization failures.
2020-07-24 17:00:50 -07:00
Alina Sbirlea
0651a23551 Reapply "[DomTree] Replace ChildrenGetter with GraphTraits over GraphDiff."
This is the part of the patch that's moving the Updates to a CFGDiff
object. Splitting off from the clean-up work merging the two branches when BUI is null.

Differential Revision: https://reviews.llvm.org/D77341
2020-07-24 14:10:50 -07:00
Matt Arsenault
26921b3bb5 AMDGPU: Skip other terminators before inserting s_cbranch_exec[n]z
PHIElimination/createPHISourceCopy inserts non-branch terminators
after the control flow pseudo if a successor phi reads a register
defined by the control flow pseudo. If this happens, we need to split
the expansion of the control flow pseudo to ensure all the branches
are after all of the other mask management instructions.

GlobalISel hit this in testscases that happened to be tail
duplicated. The original testcase still does not work, since the same
problem appears to be present in a later pass.
2020-07-24 16:51:59 -04:00
Eli Friedman
d2a7f965f0 [AArch64][SVE] Add "fast" fcmp operations.
dacf8d3 added support for most fcmp operations, but there are some extra
variations I hadn't considered: SelectionDAG supports float comparisons
that are neither ordered nor unordered. Add support for the missing
operations.

Differential Revision: https://reviews.llvm.org/D84460
2020-07-24 13:22:41 -07:00
Johannes Doerfert
b8680170b1 [SROA] Teach promote to register about droppable instructions
This is the second of two patches to address PR46753. We basically allow
SROA to promote allocas that are used in doppable instructions, for
now that means `llvm.assume`. The (transitive) uses are replaced by
`undef` in the droppable instructions.

See also D83976.

Reviewed By: Tyker

Differential Revision: https://reviews.llvm.org/D83978
2020-07-24 15:15:39 -05:00
Johannes Doerfert
ac3ceab3a2 [Mem2Reg] Teach promote to register about droppable instructions
This is the first of two patches to address PR46753. We basically allow
mem2reg to promote allocas that are used in doppable instructions, for
now that means `llvm.assume`. The uses of the alloca (or a bitcast or
zero offset GEP from there) are replaced by `undef` in the droppable
instructions.

Reviewed By: Tyker

Differential Revision: https://reviews.llvm.org/D83976
2020-07-24 15:15:38 -05:00
Johannes Doerfert
e393443e6f [SROA][Mem2Reg] Do not crash on alloca + addrspacecast
SROA knows that it can look through addrspacecast but
PromoteMemoryToRegister did not handle them. This caused an assertion
error for the test case, exposed while running
`Transforms/PhaseOrdering/inlining-alignment-assumptions.ll` with D83978
applied.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D84085
2020-07-24 15:15:38 -05:00
Gui Andrade
519182b1ba [MSAN] Allow inserting array checks
Flattens arrays by ORing together all their elements.

Differential Revision: https://reviews.llvm.org/D84446
2020-07-24 20:12:58 +00:00
Nemanja Ivanovic
fe8cc25a97 [PowerPC] Fix computation of offset for load-and-splat for permuted loads
Unfortunately this is another regression from my canonicalization patch
(1fed131660b2). The patch contained two implicit assumptions:
1. That we would have a permuted load only if we are loading a partial vector
2. That a partial vector load would necessarily be as wide as the splat

However, assumption 2 is not correct since it is possible to do a wider
load and only splat a half of it. This patch corrects this assumption by
simply checking if the load is permuted and adjusting the offset if it is.
2020-07-24 15:38:46 -04:00
Martin Storsjö
1d7926a981 [MC] [COFF] Make sure that weak external symbols are undefined symbols
For comdats (e.g. caused by -ffunction-sections), Section is already
set here; make sure it's null, for the weak external symbol to be undefined.

This fixes PR46779.

Differential Revision: https://reviews.llvm.org/D84507
2020-07-24 22:15:08 +03:00
Martin Storsjö
e8a94ed276 [llvm-lib] Support adding short import library objects with llvm-lib
This fixes PR 42837.

Differential Revision: https://reviews.llvm.org/D84465
2020-07-24 22:15:08 +03:00
Arthur Eubanks
b8cfb14c48 Rename scoped-noalias -> scoped-noalias-aa
Summary: To match NewPM name. Also the new name is clearer and more consistent.

Subscribers: jvesely, nhaehnle, hiraditya, asbirlea, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84542
2020-07-24 12:14:27 -07:00
Valentin Clement
d92fc8cf98 [openmp] Clean up OMPKinds.def remove OMP_DIRECTIVE
This patch removes the OMP_DIRECTIVE definition from OMPKinds.def since they
are now defined in OMP.td and OMP_DIRECTIVE is not used anymore in the code.

Reviewed By: jdenny

Differential Revision: https://reviews.llvm.org/D84329
2020-07-24 15:06:54 -04:00
madhur13490
2e4d158121 [AMDGPU] Fix incorrect arch assert while setting up FlatScratchInit
Reviewers: arsenm, foad, rampitec, scott.linder

Reviewed By: arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84391
2020-07-24 18:19:04 +00:00
Craig Topper
a1cd135fcb [X86] Move the implicit enabling of sse2 for 64-bit mode from X86Subtarget::initSubtargetFeatures to X86_MC::ParseX86Triple.
ParseX86Triple already checks for 64-bit mode and produces a
static string. We can just add +sse2 to the end of that static
string. This avoids a potential reallocation when appending it
to the std::string at runtime.

This is a slight change to the behavior of tools that only use
MC layer which weren't implicitly enabling sse2 before, but will
now. I don't think we check for sse2 explicitly in any MC layer
components so this shouldn't matter in practice. And if it did
matter the new behavior is more correct.
2020-07-24 11:14:20 -07:00
Francesco Petrogalli
5f0a55fcfc [llvm][sve] Reg + Imm addressing mode for ld1ro.
Reviewers: kmclaughlin, efriedma, sdesmalen

Subscribers: tschuett, hiraditya, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83357
2020-07-24 17:48:47 +00:00
Craig Topper
ab68f0ac7c [X86] Use X86_MC::ParseX86Triple to add mode features to feature string in X86Subtarget::initSubtargetFeatures.
Remove mode flags from constructor and remove calls to
ToggleFeature for the mode bits.

By adding them to the feature string we handle initializing the
mode member variables in X86Subtarget and the feature bits in
MCSubtargetInfo in one shot.
2020-07-24 10:48:22 -07:00