1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 13:11:39 +01:00

213567 Commits

Author SHA1 Message Date
Tony
99c3eaf30c [NFC][AMDGPU] Add product names for gfx908 and gfx10 processors
Reviewed By: msearles

Differential Revision: https://reviews.llvm.org/D99781
2021-04-02 00:58:11 +00:00
Philip Reames
715b8fbf7c [indvars[ Fix pr49802 by checking for SCEVCouldNotCompute
The code is assuming that having an exact exit count for the loop implies that exit counts for every exit are known.  This used to be true, but when we added handling for dead exits we broke this invariant.  The new invariant is that an exact loop count implies that any exits non trivially dead have exit counts.

We could have fixed this by either a) explicitly checking for a dead exit, or b) just testing for SCEVCouldNotCompute.  I chose the second as it was simpler.

(Debugging this took longer than it should have since I'd mistyped the original assert and it wasn't checking what it was meant to...)

p.s. Sorry for the lack of test case.  Getting things into a state to actually hit this is difficult and fragile.  The original repro involves loop-deletion leaving SCEV in a slightly inprecise state which lets us bypass other transforms in IndVarSimplify on the way to this one.  All of my attempts to separate it into a standalone test failed.
2021-04-01 17:53:44 -07:00
Daniel Rodríguez Troitiño
da257e75a8 [TextAPI] Add support for arm64_32
Add a new architecture definition for arm64_32. The change should allow
the new architecture arm64_32 to be recognized in several pieces of
code, TextAPI parsing one of them. llvm-lipo will also recognize the
architecture and will allow lipoing files with this architecture without
failing.

Includes a small test that the architecture is recognized by llvm-nm.

Reviewed By: cishida

Differential Revision: https://reviews.llvm.org/D99673
2021-04-01 17:19:12 -07:00
Craig Topper
7b094dfc34 [RISCV] Add nxvXi64 test cases to the RV32 Zvamo intrinsic test files. NFC 2021-04-01 17:08:20 -07:00
Thomas Preud'homme
1242cc0617 [MIPS, test] Fix use of undef FileCheck var
LLVM test CodeGen/Mips/sr1.ll tries to check for the absence of a
sequence of instructions with several CHECK-NOT with one of those
directives using a variable defined in another. However CHECK-NOT are
checked independently so that is using a variable defined in a pattern
that should not occur in the input.

This commit removes the definition and uses of variable to check each
line independently, making the check stronger than the current one.

Reviewed By: dsanders

Differential Revision: https://reviews.llvm.org/D99776
2021-04-02 00:59:49 +01:00
Daniel Sanders
027750fc6f Revert "[globalisel][unittests] Rename setUp() to avoid potential mix up with SetUp() from gtest"
Forgot to apply commit message changes from phabricator

This reverts commit 3a016e31ecef7eeb876b540c928a25a7c5d2e07a.
2021-04-01 16:47:43 -07:00
Daniel Sanders
90f5d96d6c [globalisel][unittests] Rename setUp() to avoid potential mix up with SetUp() from gtest
Also, make it structurally required so it can't be forgotten and re-introduce
the bug that led to the rotten green tests.

Differential Revision: https://reviews.llvm.org/D99692
2021-04-01 16:42:07 -07:00
cchen
4feb461822 [OpenMP51] Accept primary as proc bind affinity policy in Clang
Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D99622
2021-04-01 18:07:12 -05:00
Tom Stellard
26663577f0 llvm-shlib: Create object libraries for each component and link against them
This makes it possible to build libLLVM.so without first creating a
static library for each component.  In the case where only libLLVM.so is
built (i.e. ninja LLVM) this eliminates 150 linker jobs.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D95727
2021-04-01 14:58:44 -07:00
Philip Reames
678c8737df [funcattrs] Respect nofree attribute on callsites (not just callee) 2021-04-01 14:45:49 -07:00
Craig Topper
a069e39329 [RISCV] Add isel patterns to handle vrsub intrinsic with 2 vector operands.
This occurs when we type legalize an i64 scalar input on RV32. We
need to manually splat, which requires a vector input. Rather
than special case this in lowering just pattern match it.
2021-04-01 14:10:21 -07:00
Philip Reames
8864c3c53e [tests] Add tests for forthcoming funcattrs nosync inference improvement
These are basically all the attributor tests for the same attribute with some minor cleanup for readability and autogened.
2021-04-01 13:59:01 -07:00
David Green
b105fade98 [ARM] Allow v6m runtime loop unrolling
This removes the restriction that only Thumb2 targets enable runtime
loop unrolling, allowing it for Thumb1 only cores as well. The existing
T2 heuristics are used (for the time being) to control when and how
unrolling is performed.

Differential Revision: https://reviews.llvm.org/D99588
2021-04-01 21:21:40 +01:00
Craig Topper
18b312e6d9 [RISCV] Use softPromoteHalf legalization for fp16 without Zfh rather than PromoteFloat.
The default legalization strategy is PromoteFloat which keeps
half in single precision format through multiple floating point
operations. Conversion to/from float is done at loads, stores,
bitcasts, and other places that care about the exact size being 16
bits.

This patches switches to the alternative method softPromoteHalf.
This aims to keep the type in 16-bit format between every operation.
So we promote to float and immediately round for any arithmetic
operation. This should be closer to the IR semantics since we
are rounding after each operation and not accumulating extra
precision across multiple operations. X86 is the only other
target that enables this today. See https://reviews.llvm.org/D73749

I had to update getRegisterTypeForCallingConv to force f16 to
use f32 when the F extension is enabled. This way we can still
pass it in the lower bits of an FPR for ilp32f and lp64f ABIs.
The softPromoteHalf would otherwise always give i16 as the
argument type.

Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D99148
2021-04-01 12:41:57 -07:00
Philip Reames
fc759eff12 Update a test missed in 6ef4505 2021-04-01 12:17:01 -07:00
Philip Reames
4f645e3128 [Attributor] Cleanup detection of non-relaxed atomics in nosync inference
The code was checking for cases which are disallowed by the verifier.  Delete dead code and adjust style.
2021-04-01 12:01:29 -07:00
Philip Reames
6c0cb88df1 [Attributor] Cleanup intrinsic handling in nosync inference [mostly NFC]
Mostly stylistic adjustment, but the old code didn't handle the memcpy.inline intrinsic.  By using the matcher class, we now do.
2021-04-01 11:49:59 -07:00
Philip Reames
db585a3ff9 [funcattrs] Infer nosync from readnone and non-convergent
This implements the most basic possible nosync inference. The choice of inference rule is taken from the comments in attributor and the discussion on the review of the change which introduced the nosync attribute (0626367202c).

This is deliberately minimal. As noted in code comments, I do plan to add a more robust inference which actually scans the function IR directly, but a) I need to do some refactoring of the attributor code to use common interfaces, and b) I wanted to get something in. I also wanted to minimize the "interesting" analysis discussion since that's time intensive.

Context: This combines with existing nofree attribute inference to help prove dereferenceability in the ongoing deref-at-point semantics work.

Differential Revision: https://reviews.llvm.org/D99749
2021-04-01 11:37:34 -07:00
Philip Reames
5615539472 Infer dereferenceability from malloc and friends
Hookup TLI when inferring object size from allocation calls. This allows the analysis to prove dereferenceability for known allocation functions (such as malloc/new/etc) in addition to those marked explicitly with the allocsize attribute.

This is a follow up to 0129cd5 now that the bug fixed by e2c6621e6 is resolved.

As noted in the test, this relies on being able to prove that there is no free between allocation and context (e.g. hoist location). At the moment, this is handled conservatively. I'm working strengthening out ability to reason about no-free regions separately.

Differential Revision: https://reviews.llvm.org/D99737
2021-04-01 11:33:35 -07:00
Martin Storsjö
2ca36bbf67 [ARM] Remove an unused parameter in ARMWinCOFFObjectWriter. NFC.
This writer only ever operates on 32 bit arm code.

Differential Revision: https://reviews.llvm.org/D99575
2021-04-01 21:25:41 +03:00
Philip Reames
a931c3545b Extract isVolatile helper on Instruction [NFCI]
We have this logic duplicated in several cases, none of which were exhaustive.  Consolidate it in one place.

I don't believe this actually impacts behavior of the callers.  I think they all filter their inputs such that their partial implementations were correct.  If not, this might be fixing a cornercase bug.
2021-04-01 11:24:02 -07:00
Alexey Bataev
f1fcc29f01 [SLP]Test for min/max reductions bug, NFC. 2021-04-01 10:57:57 -07:00
Nick Desaulniers
af4c4d525c [MC][ARM] add .w suffixes for RSB/RSBS T1
See also:
F5.1.167 RSB, RSBS (register) T1 shift or rotate by value variant
of the Arm ARM.

Link: https://github.com/ClangBuiltLinux/linux/issues/1309

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D99542
2021-04-01 10:45:37 -07:00
Philip Reames
1695a86b7f Mark unordered memset/memmove/memcpy as nosync
Mostly a means to remove a bit of code from attributor in advance of implementing a FuncAttr inference for nosync.
2021-04-01 10:38:54 -07:00
Craig Topper
7aed91bb62 [RISCV] Fix handling of nxvXi64 vmsgt(u).vx intrinsics on RV32.
We need to splat the scalar separately and use .vv, but there is
no vmsgt(u).vv. So add isel patterns to select vmslt(u).vv with
swapped operands.

We also need to get VT to use for the splat from an operand rather
than the result since the result VT is nxvXi1.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D99704
2021-04-01 10:38:05 -07:00
Nick Desaulniers
b2447d108f [MC][ARM] add .w suffixes for ORN/ORNS T1
See also:
F5.1.128 ORN, ORNS (register) T1 shift or rotate by value variant
of the Arm ARM.

Link: https://github.com/ClangBuiltLinux/linux/issues/1309

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D99538
2021-04-01 10:27:09 -07:00
LLVM GN Syncbot
7469d16788 [gn build] Port fdc4f19e2f80 2021-04-01 17:18:32 +00:00
Craig Topper
499a26b0c7 [RISCV] Add custom type legalization to form MULHSU when possible.
There's no target independent ISD opcode for MULHSU, so custom
legalize 2*XLen multiplies ourselves. We have to be a little
careful to prefer MULHU or MULHSU.

I thought about doing this in isel by pattern matching the
(add (mul X, (srai Y, XLen-1)), (mulhu X, Y)) pattern. I decided
against this because the add might become part of a chain of adds.
I don't trust DAG combine not to reassociate with other adds making
it difficult to find both pieces again.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D99479
2021-04-01 10:15:55 -07:00
Craig Topper
c9063c684b [RISCV] Add MULHU and MULHS tests with a constant operand. 2021-04-01 10:15:55 -07:00
Jay Foad
de49b38254 [AMDGPU] Remove SIAddIMGInit pass which is now unused
Differential Revision: https://reviews.llvm.org/D99748
2021-04-01 18:13:17 +01:00
Jay Foad
24e3495f60 [AMDGPU][GlobalISel] Add IMG init in selectImageIntrinsic
Doing this during instruction selection avoids the cost of running
SIAddIMGInit which is yet another pass over the MIR.

Differential Revision: https://reviews.llvm.org/D99670
2021-04-01 18:13:17 +01:00
Jay Foad
950b5ea73f [AMDGPU][SDag] Add IMG init in AdjustInstrPostInstrSelection
Doing this in a post-isel hook avoids the cost of running SIAddIMGInit
which is yet another pass over the MIR.

Differential Revision: https://reviews.llvm.org/D99747
2021-04-01 18:13:17 +01:00
Simon Pilgrim
8ffb5c2aaf [PPC] Regenerate PR27078 test checks 2021-04-01 18:11:46 +01:00
Samuel
180f0bc045 [llvm-reduce] Move tests to tools folder
Move tests for llvm-reduce to tools folder

Reviewed By: fhahn, lebedev.ri

Differential Revision: https://reviews.llvm.org/D99632
2021-04-01 10:04:10 -07:00
Craig Topper
77f3656ce1 [RISCV] Improve 64-bit integer materialization for some cases.
This adds a new integer materialization strategy mainly targeted
at 64-bit constants like 0xffffffff where there are 32 or more trailing
ones with leading zeros. We can materialize these by using an addi -1
and srli to restore the leading zeros. This matches what gcc does.

I haven't limited to just these cases though. The implementation
here takes the constant, shifts out all the leading zeros and
shifts ones into the LSBs, creates the new sequence, adds an srli,
and checks if this is shorter than our original strategy.

I've separated the recursive portion into a standalone function
so I could append the new strategy outside of the recursion. Since
external users are no longer using the recursive function, I've
cleaned up the external interface to return the sequence instead of
taking a vector by reference.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D98821
2021-04-01 09:12:52 -07:00
Philip Reames
bbe4781aa8 [tests] Cover the most basic cases of nosync inference 2021-04-01 09:09:22 -07:00
Sanjay Patel
571933ce0a [LoopVectorize] auto-generate complete checks; NFC
We can't see how much overhead/redundancy is being
created with the partial checks.

To make it smaller and easier to read, I reduced the
vectorization factor because that does not add new
information - it just duplicates things.
2021-04-01 11:55:41 -04:00
Jay Foad
7e564344e2 [AMDGPU] Small cleanup to constructRetValue and its caller. NFC. 2021-04-01 16:36:16 +01:00
Philip Reames
0cedfde64d [deref-at-point] restrict inference of dereferenceability based on allocsize attribute
Support deriving dereferenceability facts from allocation sites with known object sizes while correctly accounting for any possibly frees between allocation and use site. (At the moment, we're conservative and only allowing it in functions where we know we can't free.)

This is part of the work on deref-at-point semantics. I'm making the change unconditional as the miscompile in this case is way too easy to trip by accident, and the optimization was only recently added (by me).

There will be a follow up patch wiring through TLI since that should now be doable without introducing widespread miscompiles.

Differential Revision: https://reviews.llvm.org/D95815
2021-04-01 08:34:40 -07:00
Mircea Trofin
9b55a6ca1b [regalloc] Ensure Query::collectInterferringVregs is called before interval iteration
The main part of the patch is the change in RegAllocGreedy.cpp: Q.collectInterferringVregs()
needs to be called before iterating the interfering live ranges.

The rest of the patch offers support that is the case: instead of  clearing the query's
InterferingVRegs field, we invalidate it. The clearing happens when the live reg matrix
is invalidated (existing triggering mechanism).

Without the change in RegAllocGreedy.cpp, the compiler ices.

This patch should make it more easily discoverable by developers that
collectInterferringVregs needs to be called before iterating.

I will follow up with a subsequent patch to improve the usability and maintainability of Query.

Differential Revision: https://reviews.llvm.org/D98232
2021-04-01 08:33:28 -07:00
Anirudh Prasad
388899404f [AsmParser][SystemZ][z/OS] Add in support to accept "#" as part of an Identifier token
- This patch adds in support to accept the "#" character as part of an Identifier.
- This support is needed especially for the HLASM dialect since "#" is treated as part of the valid "Alphabet" range
- The way this is done is by making use of the previous precedent set by the `AllowAtInIdentifier` field in `MCAsmLexer.h`. A new field called `AllowHashInIdentifier` is introduced.
- The static function `IsIdentifierChar` is also updated to accept the `#` character if the `AllowHashInIdentifier` field is set to true.
Note: The field introduced in `MCAsmLexer.h` could very well be moved to `MCAsmInfo.h`. I'm not opposed to it. I decided to put it in `MCAsmLexer` since there seems to be some sort of precedent already with `AllowAtInIdentifier`.

Reviewed By: abhina.sreeskantharajan, nickdesaulniers, MaskRay

Differential Revision: https://reviews.llvm.org/D99277
2021-04-01 11:24:43 -04:00
Bradley Smith
70e309f48b [AArch64][SVE] Improve codegen for select nodes with fixed types
Additionally, move the existing fixed vselect tests to *-vselect.ll.

Differential Revision: https://reviews.llvm.org/D99418
2021-04-01 15:54:37 +01:00
Bradley Smith
a1035d0f62 [AArch64][SVE] SVE functions should use the SVE calling convention for fast calls
When an SVE function calls another SVE function using the C calling
convention we use the more efficient SVE VectorCall PCS.  However,
for the Fast calling convention we're incorrectly falling back to
the generic AArch64 PCS.

This patch adds the same "can use SVE vector calling convention"
detection used by CallingConv::C to CallingConv::Fast.

Co-authored-by: Paul Walker <paul.walker@arm.com>

Differential Revision: https://reviews.llvm.org/D99657
2021-04-01 15:52:08 +01:00
Brendon Cahoon
b39d75fbb0 [AMDGPU] Enable output modifiers for double precision instructions
Update SIFoldOperands pass to recognize v_add_f64 and v_mul_f64
instructions for folding output modifiers.

Differential Revision: https://reviews.llvm.org/D99505
2021-04-01 10:08:17 -04:00
Alexey Bataev
604d8eb038 [SLP]Improve and fix getVectorElementSize.
1. Need to cleanup InstrElementSize map for each new tree, otherwise might
use sizes from the previous run of the vectorization attempt.
2. No need to include into analysis the instructions from the different basic
   blocks to save compile time.

Differential Revision: https://reviews.llvm.org/D99677
2021-04-01 06:51:26 -07:00
Simon Pilgrim
cf39147d1e [DAG] MergeInnerShuffle with BinOps - sometimes accept undef mask elements
If the inner shuffle already contains undef elements, then accept them in the merged shuffle as well.

This helps some X86 HADD/SUB patterns where slow targets were ending up with HADD/SUB because the (un)merged shuffles were stuck either side of the ADD/SUB - meaning we ended up with a total cost much higher than the "2*shuffle+add" that a slow target usually expands a HADD/SUB to.
2021-04-01 14:33:00 +01:00
Alexey Bataev
4a573da3f8 [SLP]Remove else after return, NFC.` 2021-04-01 05:33:01 -07:00
Dmitry Preobrazhensky
985d51d2a5 [AMDGPU][MC][GFX10][GFX90A] Corrected _e32/_e64 suffices
Fixed bugs https://bugs.llvm.org//show_bug.cgi?id=49643, https://bugs.llvm.org//show_bug.cgi?id=49644, https://bugs.llvm.org//show_bug.cgi?id=49645.

Differential Revision: https://reviews.llvm.org/D99413
2021-04-01 14:21:00 +03:00
Simon Pilgrim
834f9c5710 [X86][SSE] Fold HOP(HOP(X,X),HOP(Y,Y)) -> HOP(PERMUTE(HOP(X,Y)),PERMUTE(HOP(X,Y))
For slow-hop targets, attempt to merge HADD/SUB pairs used in chains.
2021-04-01 11:54:10 +01:00
Simon Pilgrim
3a797eccba [X86][SSE] Enable (F)HADD/SUB handling to SimplifyMultipleUseDemandedVectorElts
Attempt to bypass unused horiz-op operands.

This is very similar to the PACKSS/PACKUS handling - we should try to merge these.
2021-04-01 11:54:09 +01:00