1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

199222 Commits

Author SHA1 Message Date
LLVM GN Syncbot
9c025517f3 [gn build] Port fe0a555aa3c 2020-06-29 16:19:53 +00:00
serge-sans-paille
4bd6fd3873 Correctly report Changed status in FoldBranchToCommonDest
It's possible for the first loop trip(s) to set the `Changed` Status, and to a
later one to early exit, in which case `Changed` must be return.

Differential Revision: https://reviews.llvm.org/D82753
2020-06-29 18:13:42 +02:00
Francesco Petrogalli
b0f83ed2ae [sve][acle] Implement some of the C intrinsics for brain float.
Summary:
The following intrinsics have been extended to support brain float types:

svbfloat16_t svclasta[_bf16](svbool_t pg, svbfloat16_t fallback, svbfloat16_t data)
bfloat16_t svclasta[_n_bf16](svbool_t pg, bfloat16_t fallback, svbfloat16_t data)
bfloat16_t svlasta[_bf16](svbool_t pg, svbfloat16_t op)

svbfloat16_t svclastb[_bf16](svbool_t pg, svbfloat16_t fallback, svbfloat16_t data)
bfloat16_t svclastb[_n_bf16](svbool_t pg, bfloat16_t fallback, svbfloat16_t data)
bfloat16_t svlastb[_bf16](svbool_t pg, svbfloat16_t op)

svbfloat16_t svdup[_n]_bf16(bfloat16_t op)
svbfloat16_t svdup[_n]_bf16_m(svbfloat16_t inactive, svbool_t pg, bfloat16_t op)
svbfloat16_t svdup[_n]_bf16_x(svbool_t pg, bfloat16_t op)
svbfloat16_t svdup[_n]_bf16_z(svbool_t pg, bfloat16_t op)

svbfloat16_t svdupq[_n]_bf16(bfloat16_t x0, bfloat16_t x1, bfloat16_t x2, bfloat16_t x3, bfloat16_t x4, bfloat16_t x5, bfloat16_t x6, bfloat16_t x7)
svbfloat16_t svdupq_lane[_bf16](svbfloat16_t data, uint64_t index)

svbfloat16_t svinsr[_n_bf16](svbfloat16_t op1, bfloat16_t op2)

Reviewers: sdesmalen, kmclaughlin, c-rhodes, ctetreau, efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D82345
2020-06-29 16:09:08 +00:00
Christudasan Devadasan
2c3749142b [AMDGPU] Moving SI_RETURN_TO_EPILOG handling out of SIInsertSkips.
For now, moving it to SIPreEmitPeephole.
Should find a right place to have this code.

Reviewed By: nhaehnle

Differential revision: https://reviews.llvm.org/D77544
2020-06-29 20:41:53 +05:30
David Green
6b973734b0 [ARM] Better reductions
MVE has native reductions for integer add and min/max. The others need
to be expanded to a series of extract's and scalar operators to reduce
the vector into a single scalar. The default codegen for that expands
the reduction into a series of in-order operations.

This modifies that to something more suitable for MVE. The basic idea is
to use vector operations until there are 4 remaining items then switch
to pairwise operations. For example a v8f16 fadd reduction would become:
Y = VREV X
Z = ADD(X, Y)
z0 = Z[0] + Z[1]
z1 = Z[2] + Z[3]
return z0 + z1

The awkwardness (there is always some) comes in from something like a
v4f16, which is first legalized by adding identity values to the extra
lanes of the reduction, and which can then not be optimized away through
the vrev; fadd combo, the inserts remain. I've made sure they custom
lower so that we can produce the pairwise additions before the extra
values are added.

Differential Revision: https://reviews.llvm.org/D81397
2020-06-29 16:04:13 +01:00
Simon Pilgrim
4bf9737876 Fix MSVC truncation of constant value warning. 2020-06-29 16:03:00 +01:00
Bjorn Pettersson
79f2047cca [llvm-objcopy] Fix "unused-function" warning in NDEBUG builds
Fixup of commit b925ca37a8f28851 to allow building with
-Werror -Wunused-function.
2020-06-29 16:59:15 +02:00
Simon Pilgrim
9d2475cbc2 [X86][SSE] MatchVectorAllZeroTest - handle OR vector reductions (REAPPLIED)
This patch extends MatchVectorAllZeroTest to handle OR vector reduction patterns where the result is compared against zero.

Reapplied with a fix for a chromium regression due to a missing isNullConstant() check in combineSetCC: https://bugs.chromium.org/p/chromium/issues/detail?id=1097758

Fixes PR45378

Differential Revision: https://reviews.llvm.org/D81547
2020-06-29 15:50:44 +01:00
Nemanja Ivanovic
5e8f6beb8e [PowerPC] Don't combine SCALAR_TO_VECTOR without VSX
Most of the patterns for PPCISD::SCALAR_TO_VECTOR_PERMUTED require
VSX. So don't emit them if the subtarget doesn't have VSX.
This resolves the issue reported on
https://reviews.llvm.org/rG1fed131660b2c5d3ea7007e273a7a5da80699445
2020-06-29 09:48:57 -05:00
Matt Arsenault
89d92f55da Inliner: Add missing test for alignment assume with byval
No tests were stressing the behavior for hasPassPointeeByValueAttr.
2020-06-29 10:39:58 -04:00
Sanjay Patel
8ded996357 [VectorCombine] try to form vector compare and binop to eliminate scalar ops
binop i1 (cmp Pred (ext X, Index0), C0), (cmp Pred (ext X, Index1), C1)
-->
vcmp = cmp Pred X, VecC
ext (binop vNi1 vcmp, (shuffle vcmp, Index1)), Index0

This is a larger pattern than the existing extractelement folds because we can't
reasonably vectorize the sub-patterns with constants based on cost model calcs
(it doesn't usually make sense to replace a single extracted scalar op with
constant operand with a vector op).

I salvaged as much of the existing logic as I could, but there might be better
ways to share and reduce code.

The motivating case from PR43745:
https://bugs.llvm.org/show_bug.cgi?id=43745
...is the special case of a 2-way reduction. We tried to get SLP to handle that
particular pattern in D59710, but that caused crashing and regressions.
This patch is more general, but hopefully safer.

The v2f64 test with SSE2 surprised me - the cost model accounting looks like this:
OldCost = 0 (free extract of f64 at index 0) + 1 (extract of f64 at index 1) + 2 (scalar fcmps) + 1 (and of bools) = 4
NewCost = 2 (vector fcmp) + 1 (shuffle) + 1 (vector 'and') + 1 (extract of bool) = 5

Differential Revision: https://reviews.llvm.org/D82474
2020-06-29 10:38:52 -04:00
Matt Arsenault
3f160e71eb AMDGPU: Use IsSSA property check instead of asserting on isSSA
Also fix an SSA violation in a test the MIRParser/verifier fails to
catch. It's illegal to define a subregister in SSA. For the purpose of
the test, it just needs to define the super-register to use the
subregister in the use operand.
2020-06-29 10:05:23 -04:00
Sanjay Patel
68105631e7 [VectorCombine] refactor - make helper function for extract to shuffle logic; NFC
Preliminary for D82474
2020-06-29 09:55:34 -04:00
LLVM GN Syncbot
3f729506e5 [gn build] Port 2cb0644f90b 2020-06-29 13:37:16 +00:00
Luís Marques
3e3d432461 [RISCV] Split the pseudo instruction splitting pass
Extracts the atomic pseudo-instructions' splitting from `riscv-expand-pseudo`
/ `RISCVExpandPseudo` into its own pass, `riscv-expand-atomic-pseudo` /
`RISCVExpandAtomicPseudo`. This allows for the expansion of atomic operations
to continue to happen late (the new pass is added in `addPreEmitPass2`, so
those expansions continue to happen in the same place), while the remaining
pseudo-instructions can now be expanded earlier and benefit from more
optimization passes. The nonatomics pass is now added in `addPreSched2`.

Differential Revision: https://reviews.llvm.org/D79635
2020-06-29 14:35:57 +01:00
Guillaume Chatelet
e127592414 [NFC] Fix typos 2020-06-29 13:00:37 +00:00
LLVM GN Syncbot
ca27ac2690 [gn build] Port b56b467a9a8 2020-06-29 12:53:46 +00:00
Guillaume Chatelet
9871a892f8 [ADT] Add Bitfield utilities
Context:
--------
There are places in LLVM where we need to pack typed fields into opaque values.
For instance, the `XXXInst` classes in `llvm/include/llvm/IR/Instructions.h` that extract informations from `Value::SubclassData` via `getSubclassDataFromInstruction()`.
The bit twiddling is done manually: this impairs readability and prevent consistent handling of out of range values (e.g. 435b458ad0/llvm/include/llvm/IR/Instructions.h (L564))
More importantly, the bit pattern is scattered throughout the implementation making it hard to pack additionnal fields or check for overlapping bits.

Design decisions:
-----------------
The Bitfield structs are to be declared together so it is clear which bits are used or not.
The code is designed with simplicity in mind, hence a few limitations:
 - Storage is limited to a single integer,
 - Enum values have to be `unsigned`,
 - Storage type has to be `unsigned`,
 - There are no automatic detection of overlapping fields (packed bitfield declaration should help though),
 - The interface is C like so `storage` needs to be passed in everytime (code is simpler and lifetime considerations more obvious)

RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-June/142196.html

Differential Revision: https://reviews.llvm.org/D81580
2020-06-29 12:48:44 +00:00
Sebastian Neubauer
1d4b51e294 Add intrinsic helper function
It simplifies getting generic argument types from intrinsics.

Differential Revision: https://reviews.llvm.org/D81084
2020-06-29 14:47:46 +02:00
Sander de Smalen
3a95e5caa2 [AArch64][SVE] NFCI: Choose consistent naming for predicated SDAG nodes
This patch proposes a naming convention for operations that take
a general predicate (and are thus predicated) that specifies
what happens to the false lanes.

Currently the _PRED suffix is used, which doesn't really say much other
than that it takes a predicate. In some instances this means it has
merging predication and in other cases it means zeroing-predication.

This patch also changes the order of operands to
AArch64ISD::DUP_MERGE_PASSTHRU, to pass the predicate as the first
operand, which is in line with all other predicates nodes. It takes the
passthru value as an explicit passthru value, which is always passed as
the last operand.

Reviewers: paulwalker-arm, cameron.mcinally, eli.friedman, dancgr, efriedma

Reviewed By: paulwalker-arm

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81850
2020-06-29 13:37:30 +01:00
Guillaume Chatelet
2348bf3e1e [NFC] Introduce a helper in BasicTTIImpl.h to cast to T
This patch makes access to `this` as a `T` uniform accross the file.

Differential Revision: https://reviews.llvm.org/D82648
2020-06-29 12:16:25 +00:00
John Brawn
8ab2d7892e [Driver] When forcing a crash print the bug report message
Commit a945037e8fd0c30e250a62211469eea6765a36ae moved the printing of the
"PLEASE submit a bug report" message to the crash handler, but that means we
don't print it when forcing a crash using FORCE_CLANG_DIAGNOSTICS_CRASH. Fix
this by adding a function to get the bug report message and printing it when
forcing a crash.

Differential Revision: https://reviews.llvm.org/D81672
2020-06-29 13:13:12 +01:00
Guillaume Chatelet
077a3432f3 [Alignment][NFC] Migrate AMDGPU backend to Align
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82743
2020-06-29 11:56:06 +00:00
Guillaume Chatelet
90d9339006 [Alignment][NFC] migrate DataLayout::getPreferredAlignment
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82752
2020-06-29 11:24:36 +00:00
Simon Pilgrim
ad27455969 [X86] Add vector support to targetShrinkDemandedConstant for OR/XOR opcodes
If a constant is only allsignbits in the demanded/active bits, then sign extend it to an allsignbits bool pattern for OR/XOR ops.

This also requires SimplifyDemandedBits XOR handling to be modified to call ShrinkDemandedConstant on any (non-NOT) XOR pattern to account for non-splat cases.

Next step towards fixing PR45808 - with this patch we now get a <-1,-1,0,0> v4i64 constant instead of <1,1,0,0>.

Differential Revision: https://reviews.llvm.org/D82257
2020-06-29 12:19:05 +01:00
Cullen Rhodes
f038a9beac [AArch64][SVE] Add bfloat16 support to svext intrinsic
Reviewers: sdesmalen, kmclaughlin, efriedma, david-arm, fpetrogalli

Reviewed By: sdesmalen, fpetrogalli

Differential Revision: https://reviews.llvm.org/D82391
2020-06-29 11:08:38 +00:00
Kerry McLaughlin
df0704f42c [AArch64][SVE] Bail out of performPostLD1Combine for scalable types
Summary:
performPostLD1Combine will introduce either a LD1LANEpost
or LD1DUPpost node, which will cause selection failure if the
return type is a scalable vector.

Reviewers: sdesmalen, c-rhodes, efriedma

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82670
2020-06-29 11:59:53 +01:00
LLVM GN Syncbot
dd71d5b223 [gn build] Port 8e5a56865f2 2020-06-29 10:55:06 +00:00
Simon Pilgrim
22c6c3140c [TargetLowering] Add DemandedElts arg to ShrinkDemandedConstant
Pre-commit for D82257, this adds a DemandedElts arg to ShrinkDemandedConstant/targetShrinkDemandedConstant which will allow future patches to (optionally) add vector support.
2020-06-29 11:46:58 +01:00
Georgy Komarov
87e6a20dbb [llvm-objcopy] Emit error if removing symtab referenced by group section
SHT_GROUP sections contain a reference to a symbol indicating their
"signature" symbol. The symbol table containing this symbol is referred
to by the group section's sh_link field. If llvm-objcopy is instructed
to remove the symbol table, it will emit an error.

This fixes https://bugs.llvm.org/show_bug.cgi?id=46153.

Reviewed By: jhenderson, Higuoxing

Differential Revision: https://reviews.llvm.org/D82274
2020-06-29 10:42:03 +01:00
Guillaume Chatelet
d423ca71e1 Fix invalid alignment in DAGCombiner::isLegalNarrowLdSt
`ShAmt / 8` can be a non power of two, this can lead to an invalid alignment.
context: https://reviews.llvm.org/D41350#inline-749165

Differential Revision: https://reviews.llvm.org/D82565
2020-06-29 09:22:15 +00:00
LLVM GN Syncbot
63e4803d4d [gn build] Port 8f9ca561a2b 2020-06-29 08:10:01 +00:00
Xing GUO
81f234a2c6 [ObjectYAML][DWARF] Collect diagnostic message when YAMLParser fails.
Before this patch, the diagnostic message is printed to `errs()` directly, which makes it difficult to use `FailedWithMessage()` in unit testing.
In this patch, we add a custom error handler for YAMLParser, which helps collect diagnostic messages and make it easy to use `FailedWithMessage()` to check error messages.

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D82630
2020-06-29 16:13:53 +08:00
Sergey Dmitriev
e668706cd2 [NFC] CallGraph related cleanup
Summary: Tidy up some CallGraph-related code in preparation for D82572.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82686
2020-06-28 15:27:39 -07:00
Nikita Popov
881c1da5d8 [SimplifyCFG] Make test more robust (NFC)
Avoid changing this test if blocks get merged.
2020-06-28 20:51:03 +02:00
Nikita Popov
8a71a5f127 [SimplifyCFG] Regenerate test checks (NFC) 2020-06-28 20:51:02 +02:00
Roman Lebedev
30e818852b [NFC][ScalarEvolution] Add a test showing SCEV failure to recognize 'urem'
While InstCombine trivially converts that `srem` into a `urem`,
it might happen later than wanted. SCEV should recognize this natively.
2020-06-28 20:35:02 +03:00
Xun Li
10e1d5734d [Coroutines] Optimize the lifespan of temporary co_await object
Summary:
If we ever assign co_await to a temporary variable, such as foo(co_await expr),
we generate AST that looks like this: MaterializedTemporaryExpr(CoawaitExpr(...)).
MaterializedTemporaryExpr would emit an intrinsics that marks the lifetime start of the
temporary storage. However such temporary storage will not be used until co_await is ready
to write the result. Marking the lifetime start way too early causes extra storage to be
put in the coroutine frame instead of the stack.
As you can see from https://godbolt.org/z/zVx_eB, the frame generated for get_big_object2 is 12K, which contains a big_object object unnecessarily.
After this patch, the frame size for get_big_object2 is now only 8K. There are still room for improvements, in particular, GCC has a 4K frame for this function. But that's a separate problem and not addressed in this patch.

The basic idea of this patch is during CoroSplit, look for every local variable in the coroutine created through AllocaInst, identify all the lifetime start/end markers and the use of the variables, and sink the lifetime.start maker to the places as close to the first-ever use as possible.

Reviewers: lewissbaker, modocache, junparser

Reviewed By: junparser

Subscribers: hiraditya, llvm-commits, rsmith, ChuanqiXu, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D82314
2020-06-28 10:18:15 -07:00
Sanjay Patel
6cd9edb9b6 [VectorCombine] add test for scalable vectors; NFC 2020-06-28 12:44:44 -04:00
Sanjay Patel
c102819993 Revert "[VectorCombine] add test for scalable vectors; NFC"
This reverts commit 700ec6b848c02ca3de9751d63a7a5a26671c3fe9.
An extra test diff snuck here.
2020-06-28 12:43:11 -04:00
Sanjay Patel
adcd418d36 [VectorCombine] add test for scalable vectors; NFC 2020-06-28 12:42:00 -04:00
Sanjay Patel
ff8699a103 [x86] add tests for rsqrt opportunities; NFC 2020-06-28 12:42:00 -04:00
Esme-Yi
7c2f3bcb24 [NFC][PowerPC] Add run lines to test DivRemPairsPass. 2020-06-28 16:26:05 +00:00
Nikita Popov
8f6fd95e02 [InstCombine] Add tests for assume implication (NFC) 2020-06-28 16:18:44 +02:00
Nikita Popov
e01bc5e88a [LVI] Refactor value from icmp cond handling (NFC)
Rewrite this in a way that is more amenable to extension.
2020-06-28 15:04:02 +02:00
Nikita Popov
4ff55cbf49 [CVP] Add tests for icmp or and/or edge conds (NFC) 2020-06-28 14:54:55 +02:00
Simon Pilgrim
eefd5fecb7 [X86] combineScalarToVector - handle (v2i64 scalar_to_vector(aextload)) as well as (v2i64 scalar_to_vector(aext))
We already fold (v2i64 scalar_to_vector(aext)) -> (v2i64 bitcast(v4i32 scalar_to_vector(x))), this adds support for similar aextload cases and also handles v2f64 cases that wrap the i64 extension behind bitcasts.

Fixes the remaining issue with PR39016
2020-06-28 13:00:32 +01:00
madhur13490
56cd330ee7 Revert accidentally landed patch citing o build errors
Summary: This reverts commit c73966c2f79290e4eefe6e481f7bc94dd6ca4437.

Reviewers:

Subscribers:
2020-06-28 11:52:33 +00:00
madhur13490
e648e95206 Improve stack object printing. NFC.
Reviewers: madhur13490

Reviewed By: madhur13490

Subscribers: qcolombet, arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82712
2020-06-28 11:43:33 +00:00
dfukalov
cfda63f7ad SpeculativeExecution: fix incorrect debug info move
Summary:
Debug info related instructions got zero cost so hoisted unconditionally

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46267

Reviewers: arsenm, nhaehnle, chandlerc, aprantl

Reviewed By: aprantl

Subscribers: ormris, uabelho, wdng, aprantl, hiraditya, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D81730
2020-06-28 14:35:00 +03:00