mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00

205759 Commits

Author SHA1 Message Date
Bing1 Yu
450c5e89bb [CostModel][X86] Teach TTI to calculate the cost of a chain of vector inserts/extracts more precisely and correctly
For each 128-bit lane, if at least one index is demanded, not all indices
are demanded, and the lane is not the first 128-bit lane of the
legalized vector, then the lane needs an extracti128.
For each 128-bit lane, if at least one index is demanded, the lane
needs an inserti128.

The following cases illustrate the rule.
Assume we insert several elements into a v8i32 vector on AVX2:
Case#1: inserting into index 1 needs vpinsrd + inserti128.
Case#2: inserting into index 5 needs extracti128 + vpinsrd + inserti128.
Case#3: inserting into indices 4, 5, 6, 7 needs 4*vpinsrd + inserti128.
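
The per-lane rule and the three cases above can be sketched as a small counting function. This is only an illustration of the rule as the commit message states it, not the code added to the X86 TTI; the name `lane_ops` and its parameters are hypothetical.

```python
def lane_ops(demanded, num_elts=8, elt_bits=32):
    """Count the operations implied by the per-lane rule.

    demanded: iterable of element indices being inserted.
    Hypothetical helper; defaults describe a v8i32 vector.
    """
    demanded = set(demanded)
    per_lane = 128 // elt_bits           # elements per 128-bit lane
    num_lanes = num_elts // per_lane
    extracts = 0
    inserts = 0
    for lane in range(num_lanes):
        first = lane * per_lane
        hits = sum(1 for i in demanded if first <= i < first + per_lane)
        if hits == 0:
            continue                     # untouched lane costs nothing
        inserts += 1                     # any demanded index needs an inserti128
        if hits < per_lane and lane != 0:
            extracts += 1                # partially demanded, non-first lane
    return {
        "extracti128": extracts,
        "element_inserts": len(demanded),
        "inserti128": inserts,
    }
```

For example, `lane_ops({5})` counts one extracti128, one element insert and one inserti128, matching Case#2.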

Reviewed By: pengfei, RKSimon

Differential Revision: https://reviews.llvm.org/D89767
2020-10-27 11:21:13 +08:00
Arthur Eubanks
b58f72c901 [AlwaysInliner] Pass callee AAResults to InlineFunction()
Test copied from noalias-calls.ll with small changes.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89609
2020-10-26 20:10:09 -07:00
Arthur Eubanks
e86f39b76e [PlaceSafepoints] Pin tests to legacy PM
This pass isn't used in tree and can be ported to the NPM later on if desired.

Differential Revision: https://reviews.llvm.org/D90189
2020-10-26 20:07:37 -07:00
Arthur Eubanks
7e6a52443b Port -objc-arc-expand to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90182
2020-10-26 20:05:10 -07:00
Arthur Eubanks
6085a8c54a Port -objc-arc-apelim to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90181
2020-10-26 20:01:46 -07:00
Chen Zheng
5e80159219 [LSR] fix typo in comments and rename a newly added hook. 2020-10-26 22:29:22 -04:00
Duncan P. N. Exon Smith
6c9c96836e IR: Simplify two loops walking ConstantDataSequential, NFC
Follow-up to b2b7cf39d596b1528cd64015575b3f5d1461c011.

Differential Revision: https://reviews.llvm.org/D90198
2020-10-26 21:55:48 -04:00
Craig Topper
ded2243113 Update email addresses in CODE_OWNERS. 2020-10-26 18:51:04 -07:00
Carl Ritson
d7606fe865 [AMDGPU] Move WQM Pass after MI Scheduler
Exec mask manipulation inserted by SIWholeQuadMode acts as a barrier to
instruction scheduling.  Move the entire pass after the machine
instruction scheduler and make changes so the pass is correct for
non-SSA operation.  These changes should leave the pass still
usable pre-scheduler, although tests have been updated to reflect
post-scheduler results.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D88081
2020-10-27 10:25:53 +09:00
TaWeiTu
6b4f0b5bdf [NPM] Port -slsr to NPM
`-separate-const-offset-from-gep` has not yet been ported, so some tests are not updated.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D90149
2020-10-27 09:21:40 +08:00
Duncan P. N. Exon Smith
8696d90356 IR: Add a comment at missing std::make_unique calls from b2b7cf39d596b1528cd64015575b3f5d1461c011, NFC 2020-10-26 21:18:34 -04:00
Gaurav Jain
165fcf69d0 [NFC] Use [MC]Register in RegAllocPBQP & RegisterCoalescer
Differential Revision: https://reviews.llvm.org/D90008
2020-10-26 17:13:32 -07:00
Amy Kwan
b378d94cb6 [PowerPC] Implement Set Boolean Condition Instructions
This patch implements the set boolean condition instructions introduced in
POWER10.

The set boolean condition instructions (set[n]bc[r]) are used in
the following situations:
- sign/zero/any extending i1 to an i32 or i64,
- reg+reg, reg+imm or floating point comparisons being sign/zero extended to i32 or i64,
- spilling CR bits (using the setnbc instruction)
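
As a rough model of why these instructions fit i1 extension: zero extension wants 0 or 1, sign extension wants 0 or -1. The sketch below assumes the Power ISA 3.1 reading that the `n` variants produce -1 instead of 1 and the `r` variants test the reversed bit; the function name and parameters are hypothetical and this is not the patch's lowering code.

```python
def set_bool_cond(cr_bit, negative=False, reverse=False):
    """Model set[n]bc[r] on a single condition-register bit.

    negative=True models the 'n' variants (result is -1 instead of 1).
    reverse=True models the 'r' variants (result is set when the bit is clear).
    Semantics are an assumption based on the Power ISA 3.1 description.
    """
    taken = (not cr_bit) if reverse else bool(cr_bit)
    if not taken:
        return 0
    return -1 if negative else 1


# Zero-extending an i1 corresponds to setbc, sign-extending to setnbc.
zext_true = set_bool_cond(1)                  # setbc
sext_true = set_bool_cond(1, negative=True)   # setnbc
```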

Differential Revision: https://reviews.llvm.org/D87705
2020-10-26 18:42:51 -05:00
Adrian Prantl
37ccbdeab3 [DebugInfo] Expose Fortran array debug info attributes through DIBuilder.
Support for a few debug info attributes specifically for Fortran
arrays has been added to LLVM recently, but there's no way to take
advantage of them through DIBuilder. This patch extends
DIBuilder::createArrayType to enable setting those attributes.

Patch by Chih-Ping Chen!

Differential Revision: https://reviews.llvm.org/D89817
2020-10-26 16:23:36 -07:00
Rahman Lavaee
682b6cd259 Explicitly check for entry basic block, rather than relying on MachineBasicBlock::pred_empty.
Sometimes in unoptimized code, we have dangling unreachable basic blocks with no predecessors. Basic block sections should be emitted for those as well. Without this patch, the included test fails with a fatal error in `AsmPrinter::emitBasicBlockEnd`.

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D89423
2020-10-26 16:15:56 -07:00
Stanislav Mekhanoshin
c9969d6a73 Fixed release build after D89170 2020-10-26 16:00:57 -07:00
Vedant Kumar
54baa09ec4 [cmake] Add LLVM_UBSAN_FLAGS, to allow overriding UBSan flags
Allow overriding the default set of flags used to enable UBSan when
building llvm.

This can be used to test new checks or opt out of certain checks.

Differential Revision: https://reviews.llvm.org/D89439
2020-10-26 15:48:19 -07:00
Duncan P. N. Exon Smith
9ad16f9a7c IR: Clarify ownership of ConstantDataSequentials, NFC
Change `ConstantDataSequential::Next` to a
`unique_ptr<ConstantDataSequential>` and update `CDSConstants` to a
`StringMap<unique_ptr<ConstantDataSequential>>`, making the ownership
more obvious.

Differential Revision: https://reviews.llvm.org/D90083
2020-10-26 18:47:25 -04:00
Amy Huang
4219deaaa7 [CodeView] Emit static data members as S_CONSTANTs.
We used to only emit static const data members in CodeView as
S_CONSTANTS when they were used; this patch makes it so they are always emitted.

I changed CodeViewDebug.cpp to find the static const members from the
class debug info instead of creating DIGlobalVariables in the IR
whenever a static const data member is used.

Bug: https://bugs.llvm.org/show_bug.cgi?id=47580

Differential Revision: https://reviews.llvm.org/D89072
2020-10-26 15:30:35 -07:00
Quentin Colombet
fe6ee1f2e3 [TargetRegisterInfo] Fix a couple of typos in the comments
Spotted by Nicolas Guillemot <nguillemot@apple.com>.

Thanks Nicolas!

NFC
2020-10-26 15:19:38 -07:00
Puyan Lotfi
030ed89c59 [NFC] Fixing comment heading for MachineStableHash.h.
Wrong filename and description.
2020-10-26 18:07:26 -04:00
Stanislav Mekhanoshin
3016803f3e [AMDGPU] Use flat scratch instructions where available
The support is disabled by default. So far there is instruction
selection, spilling, and frame elimination. It also changes SP
from unswizzled to swizzled as used by flat scratch instructions,
so it cannot be mixed with MUBUF stack access.

At the very least missing:

- GlobalISel;
- Some optimizations in frame elimination in between vector
  and scalar ALU;
- It should eventually allow always materializing the frame index
  as an SGPR, but that is not implemented and frame elimination
  cannot handle it yet;
- Unaligned and/or multidword flat scratch shall work, but it
  is legalized now for MUBUF;
- Operand folding cannot optimize FI like with MUBUF yet;
- It will need scaling the value of the SP/FP in the DWARF
  expression to recover the unswizzled scratch address;

Differential Revision: https://reviews.llvm.org/D89170
2020-10-26 14:40:42 -07:00
Sriraman Tallam
9f253ca524 Prepend "__uniq" to symbol names hash with -funique-internal-linkage-names.
Prepend the module name hash with a fixed string ".__uniq." which helps tools
that consume sampled profiles and attribute them to functions understand
that the symbol belongs to a unique internal linkage function.

Symbols with suffixes can result from various optimizations in the compiler:
function multiversioning, function splitting, parameter constant propagation,
and unique internal linkage names.

External tools like sampled profile aggregators combine profiles from multiple
runs of a binary. They use various heuristics with symbols that have suffixes
to try and attribute the profile to the right function instance. For instance
multi-versioned symbols like foo.avx, foo.sse4.2, etc even though different
should be attributed to the same source function if a single function is
versioned, using attribute target_clones (supported in GCC but yet to land in
LLVM). Similarly, functions that are split (split part having a .cold suffix)
could have profiles for both the original and split symbols but would be
aggregated and attributed to the original function that was split.

Unique internal linkage functions however have different source instances and
the aggregator must not put them together but attribute it to the appropriate
function instance. To be sure that we are dealing with a symbol of a unique
internal linkage function, we would like to prepend the hash with a known
string ".__uniq." which these tools can check to understand the suffix type.
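
A profile aggregator could use the marker roughly as follows. This is a minimal sketch under stated assumptions: the function `aggregation_key`, the example symbol names, and the assumption that the base name itself contains no dots are all hypothetical, and real tools use their own heuristics.

```python
UNIQ_MARKER = ".__uniq."


def aggregation_key(symbol):
    """Return the name under which a profile for `symbol` would be aggregated.

    Hypothetical helper: assumes the base function name contains no '.'.
    """
    if UNIQ_MARKER in symbol:
        # Unique internal linkage: keep the hash so distinct source
        # instances stay separate, but drop later suffixes such as ".cold".
        base, rest = symbol.split(UNIQ_MARKER, 1)
        module_hash = rest.split(".", 1)[0]
        return base + UNIQ_MARKER + module_hash
    # Other suffixes (multiversioning, splitting) fold into the base name.
    return symbol.split(".", 1)[0]
```

With this rule, `foo.avx` and `foo.cold` both aggregate under `foo`, while `bar.__uniq.1234` and `bar.__uniq.5678` remain distinct.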

Differential Revision: https://reviews.llvm.org/D89617
2020-10-26 14:24:28 -07:00
Duncan P. N. Exon Smith
b79d3e2664 Avoid unnecessary uses of MDNode::getTemporary, NFC
This is a long-delayed follow-up to
5e5b85098dbeaea2cfa5d01695b5d2982634d7dd.

`TempMDNode` includes a bunch of machinery for RAUW, and should only be
used when necessary. RAUW wasn't being used in any of these cases... it
was just a placeholder for a self-reference.

Where the real node was using `MDNode::getDistinct`, just replace the
temporary argument with `nullptr`.

Where the real node was using `MDNode::get`, the `replaceOperandWith`
call was "promoting" the node to a distinct one implicitly due to
self-reference detection in `MDNode::handleChangedOperand`. The
`TempMDNode` was serving a purpose by delaying uniquing, but it's way
simpler to just call `MDNode::getDistinct` in the first place.

Note that using a self-reference at all in these places is a hold-over
from before `distinct` metadata existed. It was an old trick to create
distinct nodes. It would be intrusive to change, including bitcode
upgrades, etc., and it's harmless so I'm not sure there's much value in
removing it from existing schemas. After this commit it still has a tiny
memory cost (in the extra metadata operand) but no more overhead in
construction.

Differential Revision: https://reviews.llvm.org/D90079
2020-10-26 17:03:25 -04:00
Sanjay Patel
52d6694ea4 [InstCombine] add folds for icmp+ctpop
https://alive2.llvm.org/ce/z/XjFPQJ

  define void @src(i64 %value) {
    %t0 = call i64 @llvm.ctpop.i64(i64 %value)
    %gt = icmp ugt i64 %t0, 63
    %lt = icmp ult i64 %t0, 64
    call void @use(i1 %gt, i1 %lt)
    ret void
  }

  define void @tgt(i64 %value) {
    %eq = icmp eq i64 %value, -1
    %ne = icmp ne i64 %value, -1
    call void @use(i1 %eq, i1 %ne)
    ret void
  }

  declare i64 @llvm.ctpop.i64(i64) #1
  declare void @use(i1, i1)
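
The fold above can be sanity-checked by brute force at a small bit width. The sketch below is only an illustration of the equivalence (the Alive2 link is the actual proof); the helper names are hypothetical.

```python
def ctpop(value):
    """Population count of a non-negative integer."""
    return bin(value).count("1")


def fold_holds(bits):
    """Check, for every `bits`-wide value, that
    ctpop(x) u> bits-1  <=>  x == all-ones, and
    ctpop(x) u< bits    <=>  x != all-ones."""
    all_ones = (1 << bits) - 1
    for value in range(1 << bits):
        count = ctpop(value)
        if (count > bits - 1) != (value == all_ones):
            return False
        if (count < bits) != (value != all_ones):
            return False
    return True
```

The i64 case in the commit is the same statement with `bits = 64`, where -1 is the all-ones value.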
2020-10-26 16:48:56 -04:00
Sanjay Patel
f92ef4754f [InstCombine] add tests for ctpop at bitwidth limit; NFC 2020-10-26 16:48:56 -04:00
Sanjay Patel
f7a844a775 [InstCombine] reduce code duplication in icmp intrinsic folds; NFC 2020-10-26 16:48:56 -04:00
David Blaikie
be7bad8bb6 llvm-reduce: Test reduction for D88684 ( ee6e25e4391a6d3ac0a3c89615474e512f44cda6 ) 2020-10-26 13:16:00 -07:00
Nick Desaulniers
054ebcc03d [BitCode] decode nossp fn attr
I missed this in https://reviews.llvm.org/D87956.

Reviewed By: void

Differential Revision: https://reviews.llvm.org/D90177
2020-10-26 13:06:54 -07:00
Stanislav Mekhanoshin
b2232cc171 Fix SROA with a PHI merging values from the same block
This fixes bug 47945. It is legal to have a PHI with values
from the same block, but the values must stay the same. In this
case it is illegal to merge different values.
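
The constraint described above can be stated as a small check: a PHI may list the same predecessor more than once, but every entry for that predecessor must carry the same value. The sketch below is a hypothetical illustration of that invariant, not the SROA fix itself.

```python
def phi_incoming_is_consistent(incoming):
    """incoming: list of (predecessor_block, value) pairs of one PHI.

    Returns True when repeated predecessor blocks always carry the
    same incoming value. Hypothetical helper for illustration only.
    """
    seen = {}
    for block, value in incoming:
        # setdefault records the first value seen for this block.
        if seen.setdefault(block, value) != value:
            return False
    return True
```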

Differential Revision: https://reviews.llvm.org/D89978
2020-10-26 12:58:27 -07:00
Aaron Puchert
ec700eefc9 Add release tarballs for libclc
Fixes PR47917.

Reviewed By: tstellar

Differential Revision: https://reviews.llvm.org/D90100
2020-10-26 20:33:24 +01:00
Evgeny Leviant
da9fe51e32 [ARM][SchedModels] Move IsLDMBaseRegInListPred to ARMSchedule.td. NFC
This predicate is not specific to cortex-a57 and can be used in other processor
models as well.
2020-10-26 22:31:41 +03:00
Stanislav Mekhanoshin
3dc15fa04e [AMDGPU] Fix VC warning about signed/unsigned comparison. NFC.
This is the warning reported in https://reviews.llvm.org/D89599
2020-10-26 11:55:57 -07:00
Florian Hahn
1256d9b6ac [AArch64] Extend tests for insertelement improvements.
Extends the tests added in a562dc82a8d9488d35ff535302716141bc6feaa3 to
cover more vector variants.
2020-10-26 17:57:12 +00:00
Joe Ellis
50136e3679 [SVE] Fix TypeSize warning in llvm::getGEPInductionOperand
We do not need to use the implicit cast here. We can instead rely on
a comparison between two TypeSize objects. This algorithm will
work fine with scalable vectors.

Reviewed By: DavidTruby

Differential Revision: https://reviews.llvm.org/D90146
2020-10-26 17:40:32 +00:00
Joe Ellis
fa3637aa5e [SVE][InstCombine] Fix TypeSize warning in canReplaceGEPIdxWithZero
The warning would fire when calling canReplaceGEPIdxWithZero on a GEP
whose source element type is a scalable vector. The size of scalable
vector types is not known, so this optimization cannot be performed.

This patch fixes the issue by:

- bailing out early in this routine if the GEP instruction's source
  element type is a scalable vector.

- making use of getFixedSize -- this removes the dependency on the
  deprecated interface.

Reviewed By: fpetrogalli

Differential Revision: https://reviews.llvm.org/D89968
2020-10-26 17:40:26 +00:00
Joe Ellis
a376939c97 [SVE][AArch64] Fix TypeSize warning in GEP cost analysis
The warning would fire when calling getGEPCost for analyzing the cost of
a GEP instruction. This would result in the use of the now deprecated
implicit cast of TypeSize to uint64_t through the overloaded operator.

This patch fixes the issue by using getKnownMinSize instead of the
implicit cast. This is possible because the code is already
scalable-vector aware. The semantic behaviour of the code is unchanged
by this patch.

Reviewed By: sdesmalen, fpetrogalli

Differential Revision: https://reviews.llvm.org/D89872
2020-10-26 17:40:19 +00:00
Joe Ellis
18290a4a32 [SVE][AArch64] Fix TypeSize warning in loop vectorization legality
The warning would fire when calling isDereferenceableAndAlignedInLoop
with a scalable load. Calling isDereferenceableAndAlignedInLoop with a
scalable load would result in the use of the now deprecated implicit
cast of TypeSize to uint64_t through the overloaded operator.

This patch fixes this issue by:

- no longer considering vector loads as candidates in
  canVectorizeWithIfConvert. This doesn't make sense in the context of
  identifying scalar loads to vectorize.

- making use of getFixedSize inside isDereferenceableAndAlignedInLoop --
  this removes the dependency on the deprecated interface, and will
  trigger an assertion error if the function is ever called with a
  scalable type.
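
The difference between the two accessors named in these TypeSize commits can be modelled in a few lines. This is a simplified, hypothetical model (the class `TypeSizeModel` is not LLVM's `TypeSize`): the known-minimum size is always available, while asking for a fixed size on a scalable type fails an assertion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeSizeModel:
    """Toy stand-in for a size that may be a multiple of an unknown factor."""
    known_min: int
    scalable: bool = False

    def get_known_min_size(self):
        # Valid for fixed and scalable sizes alike.
        return self.known_min

    def get_fixed_size(self):
        # Only meaningful when the size is not scalable.
        assert not self.scalable, "fixed size requested for a scalable type"
        return self.known_min
```

Code that is already scalable-vector aware can use the known-minimum size, while code that only handles fixed-width types gets a loud failure instead of a silently wrong number.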

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D89798
2020-10-26 17:40:04 +00:00
Evgeny Leviant
2575bcd8cb [ARM][SchedModels] Convert IsLdstsoScaledNotOptimalPred to MCSchedPredicate
Differential revision: https://reviews.llvm.org/D90150
2020-10-26 20:22:41 +03:00
Evgeny Leviant
2ff9958f77 Fix issue in cortex-a57 sched model
Differential revision: https://reviews.llvm.org/D90152
2020-10-26 20:16:40 +03:00
Benjamin Kramer
6892b335d9 [AMDGPU] Avoid unused variable warning in Release builds. NFC.
SIRegisterInfo.cpp:480:19: error: unused variable 'SOffset'
2020-10-26 18:11:57 +01:00
Peter Waller
ff2b8ca717 [SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination
The modified code in visitSTORE was missing a scalable vector check, and still
using the now deprecated implicit cast of TypeSize to uint64_t through the
overloaded operator. This patch fixes these issues.

This brings the logic in line with the comment on the context line immediately
above the added precondition.

Add a test in sve-redundant-store.ll that the warning is not triggered.

Differential Revision: https://reviews.llvm.org/D89701
2020-10-26 16:37:48 +00:00
Peter Waller
752c121e75 Revert "[SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination"
This reverts commit 4604441386dc5fcd3165f4b39f5fa2e2c600f1bc.

Reverting because it was not the intended version of the patch, which
follows this patch.
2020-10-26 16:37:00 +00:00
Peter Waller
dec44ead4c [SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination
The modified code in visitSTORE was missing a scalable vector check, and still
using the now deprecated implicit cast of TypeSize to uint64_t through the
overloaded operator. This patch fixes these issues.

This brings the logic in line with the comment on the context line immediately
above the added precondition.

Add a test in Redundantstores.ll that the warning is not triggered.
2020-10-26 16:23:42 +00:00
Simon Pilgrim
23676156d9 [InstCombine] Add bswap test pattern using truncates 2020-10-26 16:11:03 +00:00
Florian Hahn
202ab143ec [AArch64] Add 2 cases where insertelement lowering could be improved. 2020-10-26 15:37:17 +00:00
Simon Pilgrim
260990f87d [X86] Use mtriple instead of march in MIR tests 2020-10-26 15:31:33 +00:00
Kazushi (Jam) Marukawa
4f90b31115 [VE] Add vector shift instructions
Add VSLL/VSLD/VSRL/VSLA/VSLAX/VSRA/VSRAX/VSFA instructions.  Add
additional AsmParser for VSLD special operand.  Also add regression
tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D90143
2020-10-27 00:30:27 +09:00
Kazushi (Jam) Marukawa
f2d171e4f2 [VE] Add vector logical instructions
Add VAND/VOR/VXOR/VEQV/VLDZ/VPCNT/VBRV/VSEQ instructions and regression
tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D90141
2020-10-27 00:29:33 +09:00
Kazushi (Jam) Marukawa
d1e71a6f1c [VE] Support atomic store
Support atomic store instructions and add a regression test.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D90137
2020-10-27 00:28:11 +09:00