mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00
Commit Graph

208151 Commits

Author SHA1 Message Date
Chris Sears
9ebc697f7d X86: Correcting X86OutgoingValueHandler typo (NFC)
https://reviews.llvm.org/D92631
2020-12-12 20:28:37 -05:00
Nico Weber
eb4dd1efc0 fix typos to cycle bots 2020-12-12 20:19:33 -05:00
Nico Weber
fad391b5a4 mac/arm: XFAIL the last 2 failing check-llvm tests
We should fix them, but let's XFAIL them for now so that we can start
running check-llvm on bots and lock in the passing tests.

Part of PR46647.
2020-12-12 20:12:02 -05:00
Nico Weber
4abfbbe941 [mac/arm] skip MappedMemoryTest that try to map w+x
macOS/arm is w^x, so these tests don't work. Fixes these failures:

  LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.AllocAndRelease/5
  LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.AllocAndReleaseHuge/5
  LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.BasicWrite/5
  LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.DuplicateNear/5
  LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.EnabledWrite/3
  LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.EnabledWrite/4
  LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.EnabledWrite/5
  LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.MultipleAllocAndRelease/5
  LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.MultipleWrite/5
  LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.SuccessiveNear/5
  LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.UnalignedNear/5
  LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.ZeroNear/5
  LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.ZeroSizeNear/5

Part of PR46647.
2020-12-12 19:46:32 -05:00
Craig Topper
6db8e7e4ea [X86] Autogenerate complete checks. NFC 2020-12-12 16:37:28 -08:00
Amara Emerson
7bc095bd88 [GlobalISel][IRTranslator] Fix a crash when the use of an extractvalue is a non-dominated metadata use.
We don't expect uses to come before defs in the CFG, so allocateVRegs() asserted.

Fixes PR48211
2020-12-12 14:58:54 -08:00
Roman Lebedev
cc68932a0f [SimplifyCFG] FoldBranchToCommonDest(): bonus instrns must only be used by PHI nodes in successors (PR48450)
In particular, if the successor block, which is about to get a new
predecessor block, currently only has a single predecessor,
then the bonus instructions will be directly used within said successor,
which is fine, since the block with bonus instructions dominates that
successor. But once there's a new predecessor, the IR is no longer valid,
and we don't fix it, because we only update PHI nodes.

Which means, the live-out bonus instructions must be exclusively used
by the PHI nodes in successor blocks. So we have to form trivial PHI nodes,
which will then be successfully updated to receive cloned bonus instructions.

This all works fine, except for the fact that we don't have access to
the dominator tree, and we don't ignore unreachable code,
so we sometimes do end up having to deal with some weird IR.

Fixes https://bugs.llvm.org/show_bug.cgi?id=48450
2020-12-13 00:06:57 +03:00
Zarko Todorovski
1ca33f4559 [PPC] Check for PPC64 when emitting 64bit specific VSX nodes when pattern matching built vectors
Some of the pattern matching in PPCInstrVSX.td and node lowering involving vectors assumes 64bit mode.  This patch disables some of the unsafe pattern matching and lowering of BUILD_VECTOR in 32bit mode.

Reviewed By: Xiangling_L

Differential Revision: https://reviews.llvm.org/D92789
2020-12-12 15:28:28 -05:00
Nikita Popov
021d70e8fc [CVP] Simplify and generalize switch handling
CVP currently handles switches by checking an equality predicate
on all edges from predecessor blocks. Of course, this can only
work if the value being switched over is defined in a different block.

Replace this implementation with a call to getPredicateAt(), which
also does the predecessor edge predicate check (if not defined in
the same block), but can also do quite a bit more: It can reason
about phi-nodes by checking edge predicates for incoming values,
it can reason about assumes, and it can reason about block values.

As such, this makes the implementation both simpler and more
powerful. The compile-time impact on CTMark is in the noise.
2020-12-12 21:12:27 +01:00
Nikita Popov
37db5c8083 [CVP] Add additional switch tests (NFC)
These cover cases handled by getPredicateAt(), but not by the
current implementation:

 * Assumes based on context instruction.
 * Value from phi node in same block (using per-pred reasoning).
 * Value from non-phi node in same block (using block-val reasoning).
2020-12-12 20:58:00 +01:00
Krzysztof Parzyszek
577e1b0232 [Hexagon] Reconsider getMask fix, return original mask, convert later
The getPayload/getMask/getPassThrough functions should return values
that could be composed into a masked load/store without any additional
type casts. The previous fix violated that.
Instead, convert scalar mask to a vector right before rescaling.
2020-12-12 13:27:22 -06:00
Tony
313d9ab376 [NFC][AMDGPU] AMDGPUUsage updates
- Document which processors are supported by which runtimes.
- Add missing mappings for code object V2 note records

Differential Revision: https://reviews.llvm.org/D93016
2020-12-12 18:19:02 +00:00
Kazu Hirata
dc01966102 [Analysis/Interval] Remove isLoop (NFC)
The last use of isLoop was removed on Apr 29, 2002 in commit
09bbb5c015c6e40b3d45da057f955ddb7c8f8485 as part of an effort to
remove "old induction variable canonicalization pass built on top of
interval analysis".
2020-12-12 10:09:35 -08:00
Kazu Hirata
aca797bfd1 [Transforms] Use is_contained (NFC) 2020-12-12 09:37:49 -08:00
Krzysztof Parzyszek
988ff0aa45 [Hexagon] Create vector masks for scalar loads/stores
AlignVectors treats all loaded/stored values as vectors of bytes,
and masks as corresponding vectors of booleans, so make getMask
produce a 1-element vector for scalars from the start.
2020-12-12 11:12:17 -06:00
Harald van Dijk
90e4a4c68b [UpdateTestChecks] Add --(no-)x86_scrub_sp option.
This makes it possible to use update_llc_test_checks to manage tests
that check for incorrect x86 stack offsets. It does not yet modify any
test to make use of this new option.
2020-12-12 17:11:13 +00:00
Harald van Dijk
15a28c0a8f [X86] Avoid data16 prefix for lea in x32 mode
The ABI demands a data16 prefix for lea in 64-bit LP64 mode, but not in
64-bit ILP32 mode. In both modes this prefix would ordinarily be
ignored, but the instructions may be changed by the linker to
instructions that are affected by the prefix.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D93157
2020-12-12 17:05:24 +00:00
David Green
5ed162a969 [ARM] Add basic masked load/store costs
This adds some basic MVE masked load/store costs, notably changing the
cost of legal loads/stores to the MVECostFactor and the cost of
scalarized instructions to 8*NumElts.
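
The rule described above can be sketched as a small cost function. This is only an illustration, assuming a hypothetical helper name and that legality has already been decided elsewhere. The value of the MVE cost factor is not given in the message, so it is passed in as a parameter.

```python
def masked_mem_cost(is_legal: bool, num_elts: int, mve_cost_factor: int) -> int:
    """Illustrative cost of one masked load/store (hypothetical helper, not LLVM's API)."""
    if is_legal:
        # A legal masked load/store costs the MVE cost factor.
        return mve_cost_factor
    # A scalarized masked load/store costs 8 per vector element.
    return 8 * num_elts
```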

Differential Revision: https://reviews.llvm.org/D86538
2020-12-12 15:26:32 +00:00
David Green
68a0f80009 [LV] Fix scalar cost for tail predicated loops
When it comes to the scalar cost of any predicated block, the loop
vectorizer by default regards this predication as a sign that it is
looking at an if-conversion and divides the scalar cost of the block by
2, assuming it would only be executed half the time. This however makes
no sense if the predication has been introduced to tail predicate the
loop.
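
A minimal sketch of the distinction, assuming hypothetical names and integer costs (neither is taken from the vectorizer's code): the halving applies to blocks predicated by an if-conversion, but not to blocks whose only predication comes from tail predication.

```python
def scalar_block_cost(raw_cost: int, predicated: bool,
                      predicated_only_by_tail_folding: bool) -> int:
    """Illustrative scalar cost of a block (hypothetical helper, not LLVM's API)."""
    if predicated and not predicated_only_by_tail_folding:
        # If-converted block: assume it executes only half the time.
        return raw_cost // 2
    # Unpredicated block, or predication that exists only to tail predicate
    # the loop: keep the full scalar cost.
    return raw_cost
```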

Original patch by Anna Welker

Differential Revision: https://reviews.llvm.org/D86452
2020-12-12 14:21:40 +00:00
Nikita Popov
cf4871bd6a [BasicAA] Make non-equal index handling simpler to extend (NFC) 2020-12-12 15:00:47 +01:00
Nikita Popov
36c749f8f1 [BasicAA] Add tests for non-zero var index (NFC) 2020-12-12 15:00:46 +01:00
Luo, Yuanke
826a8b01a7 [X86] Add chain in ISel for x86_tdpbssd_internal intrinsic. 2020-12-12 21:14:38 +08:00
Nathan James
b4d64251fd [YAML] Support extended spellings when parsing bools.
Support all the spellings of boolean datatypes according to https://yaml.org/type/bool.html
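
For illustration, the spellings listed on that page can be matched as below. This is a sketch of the specification's list, not of LLVM's YAML parser. The function name and the choice to return None for unrecognised input are assumptions.

```python
from typing import Optional

# Spellings listed at https://yaml.org/type/bool.html
_TRUE = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
_FALSE = {"n", "N", "no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}


def parse_yaml_bool(scalar: str) -> Optional[bool]:
    """Return True or False for a listed spelling, or None for anything else."""
    if scalar in _TRUE:
        return True
    if scalar in _FALSE:
        return False
    # Mixed-case forms such as "yEs" are not in the list.
    return None
```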

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D92755
2020-12-12 12:50:34 +00:00
David Green
109a6a32fa [ARM] Test for showing scalar vector costs. NFC 2020-12-12 11:43:14 +00:00
Jan Svoboda
fa6c9b63ca [clang][cli] Add flexible TableGen multiclass for boolean options
This introduces a more flexible multiclass for declaring two flags controlling the same boolean keypath.

Compared to existing Opt{In,Out}FFlag multiclasses, the new syntax makes it easier to read option declarations and reason about the keypath.

This also makes specifying common properties of both flags possible.

I'm open to suggestions on the class names. Not 100% sure the benefits are worth the added complexity.

Depends on D92774.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D92775
2020-12-12 10:53:28 +01:00
Jan Svoboda
2bf0908f96 [clang][cli] Don't always emit -f[no-]legacy-pass-manager
We don't need to always generate `-f[no-]experimental-new-pass-manager`.

This patch does not change the behavior of any other command line flag. (For example `-triple` is still being always generated.)

Reviewed By: dexonsmith, Bigcheese

Differential Revision: https://reviews.llvm.org/D92857
2020-12-12 10:11:23 +01:00
Kazu Hirata
6f6801a6fa [Analysis] Use is_contained (NFC) 2020-12-11 21:19:31 -08:00
Mircea Trofin
44ff9e7909 [MLGO] Fix build break as result of new InstructionCost (D91174) 2020-12-11 20:28:39 -08:00
Fangrui Song
506848f563 [llvm-cov gcov] Replace Donald B. Johnson's cycle enumeration with iterative cycle finding
gcov computes the line execution count as the sum of (a) counts from
predecessors on other lines and (b) the sum of loop execution counts of blocks
on the same line (think of loops on one line).

For (b), we use Donald B. Johnson's cycle enumeration algorithm and perform
cycle cancelling for each cycle. The number of candidate cycles was
exponential, and D93036 made it polynomial by skipping zero-count cycles. The
time complexity is high, O(V*E^2) (it could be O(E^2), but the linear `Blocks`
check made it higher), and the implementation is complex.

We could just identify loops and sum all back edges. However, this requires a
dominator tree construction which is more complex. The time complexity can be
decreased to almost linear, though.

This patch just performs cycle cancelling iteratively. Add two members
`traversable` and `incoming` to GCOVArc. There are 3 states:

* `!traversable`: blocks not on this line or explored blocks
* `traversable && incoming == nullptr`: unexplored blocks
* `traversable && incoming != nullptr`: blocks which are being explored (on the stack)

If an arc points to a block being explored, a cycle has been found.

Let E be the number of arcs. Every time a cycle is found, at least one arc is
saturated (`edgeCount` reduced to 0), so there are at most E cycles. Finding one
cycle takes O(E) time, so the overall time complexity is O(E^2). Note that we
always augment through a back edge and never need to augment its reverse edge so
reverse edges in traditional flow networks are not needed.
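
The iterative scheme can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the gcov implementation: blocks are plain strings, arcs are a dict from (src, dst) pairs to counts (so parallel arcs are not modelled), and the function names are hypothetical. The "being explored" state is tracked with an `on_stack` set rather than an `incoming` pointer.

```python
from typing import Dict, List, Tuple

Arc = Tuple[str, str]


def cycles_count(blocks: List[str], arc_counts: Dict[Arc, int]) -> int:
    """Sum the counts of cycles among `blocks` by repeated cycle cancelling."""
    counts = dict(arc_counts)  # residual counts; the caller's dict is untouched
    succ: Dict[str, List[str]] = {b: [] for b in blocks}
    for src, dst in counts:
        # Keep only arcs between blocks on this line, and ignore self arcs.
        if src in succ and dst in succ and src != dst:
            succ[src].append(dst)

    def augment_one_cycle(start: str, traversable: Dict[str, bool]) -> int:
        incoming: Dict[str, str] = {}  # block -> its predecessor on the DFS path
        on_stack = {start}             # blocks currently being explored
        stack: List[Tuple[str, int]] = [(start, 0)]
        while stack:
            u, i = stack[-1]
            if i == len(succ[u]):
                traversable[u] = False  # fully explored
                on_stack.discard(u)
                stack.pop()
                continue
            stack[-1] = (u, i + 1)
            v = succ[u][i]
            if counts[(u, v)] == 0 or not traversable[v]:
                continue  # saturated arc or already explored block
            if v not in on_stack:
                incoming[v] = u
                on_stack.add(v)
                stack.append((v, 0))
                continue
            # v is being explored: the path v -> ... -> u plus (u, v) is a cycle.
            cycle = [(u, v)]
            w = u
            while w != v:
                cycle.append((incoming[w], w))
                w = incoming[w]
            d = min(counts[a] for a in cycle)
            for a in cycle:
                counts[a] -= d  # saturates at least one arc
            return d
        return 0

    total = 0
    while True:
        traversable = {b: True for b in blocks}
        d = 0
        for b in blocks:
            if traversable[b]:
                d = augment_one_cycle(b, traversable)
                if d > 0:
                    break
        if d == 0:
            return total
        total += d
```

Each successful call saturates at least one arc, which is what bounds the number of rounds by the number of arcs.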

Reviewed By: xinhaoyuan

Differential Revision: https://reviews.llvm.org/D93073
2020-12-11 18:28:16 -08:00
Fangrui Song
3e1d1f4661 [Kaleidoscope] Migrate DebugInfo::get to DILocation::get 2020-12-11 18:01:04 -08:00
Jonas Paulsson
160287755d Reapply "[SystemZFrameLowering] Don't overwrite R1D (backchain) when probing."
Fixed to properly compute the live-in lists of new blocks.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D92803
2020-12-11 18:25:47 -06:00
Jonas Paulsson
0962109bbf [SystemZTTIImpl] Allow some non-prefetched accesses in getMinPrefetchStride().
The performance improvement on LBM previously achieved with improved software
prefetching (36d4421) has recently been lost with e00f189. There is now one
memory access in the loop that LoopDataPrefetch cannot handle (while before
there was none), which the heuristic rejects.

This patch adds a small margin by allowing 1 non-prefetched memory access for
every 32 prefetched ones, so that the heuristic doesn't bail in this type of
case.
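
The margin can be sketched as a simple ratio check. The function and parameter names are hypothetical, and the exact comparison used by the heuristic is an assumption. The message only states the 1-in-32 allowance.

```python
def within_prefetch_margin(num_mem_accesses: int, num_prefetched: int,
                           ratio: int = 32) -> bool:
    """True if at most one access per `ratio` prefetched ones is not prefetched."""
    non_prefetched = num_mem_accesses - num_prefetched
    # Multiply instead of dividing to avoid rounding: 1 per 32, 2 per 64, ...
    return non_prefetched * ratio <= num_prefetched
```

With this form, a loop with 32 prefetched accesses tolerates one access that cannot be prefetched, but not two.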

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D92985
2020-12-11 18:06:07 -06:00
diggerlin
071d4d3232 [AIX] Fixed a link error.
Summary:

 "Speculative fix for link failure on bots" with a mention of "the clang-ppc64le-rhel bot fails on link: http://lab.llvm.org:8011/#/builders/57/builds/2307/steps/6/logs/stdio".

PPCAsmPrinter.cpp:(.text._ZN12_GLOBAL__N_116PPCAIXAsmPrinter19emitFunctionBodyEndEv+0x2f8): undefined reference to `llvm::XCOFF::getNameForTracebackTableLanguageId(llvm::XCOFF::TracebackTable::LanguageID)'
PPCAsmPrinter.cpp:(.text._ZN12_GLOBAL__N_116PPCAIXAsmPrinter19emitFunctionBodyEndEv+0x2170): undefined reference to `llvm::XCOFF::parseParmsType(unsigned int, unsigned int)'
2020-12-11 18:53:10 -05:00
Craig Topper
08d4def9b7 [LoopIdiomRecognize] Autogenerate complete checks for the X86 ctlz/cttz tests. NFC
Preparation for D92745 which will add more tests to these files.
2020-12-11 15:35:37 -08:00
diggerlin
24fcc0af13 [AIX][XCOFF] emit traceback table for function in aix
SUMMARY:
 1. Added a new option -xcoff-traceback-table to control whether to generate a traceback table for a function.
 2. Implemented emission of the traceback table for a function.

Reviewers: hubert.reinterpretcast, Jason Liu
Differential Revision: https://reviews.llvm.org/D92398
2020-12-11 17:50:25 -05:00
Sanjay Patel
5daaa0b448 [InstCombine][x86] fix insertion point bug in vector demanded elts fold (PR48476)
This transform was added at:
c63799fc52ff

From what I see, it's the first demanded elements transform that adds
a new instruction using the IRBuilder. There are similar folds in
the generic demanded bits chunk of instcombine that also use the
InsertPointGuard code pattern.

The tests here would assert/crash because the new instruction was
being added at the start of the demanded elements analysis rather
than at the instruction that is being replaced.
2020-12-11 17:23:35 -05:00
Krzysztof Parzyszek
bc08eac614 [Hexagon] Workaround for compilation error with VS2017 2020-12-11 15:11:44 -06:00
Fangrui Song
0140db71cc Migrate deprecated DebugLoc::get to DILocation::get
This migrates all LLVM (except Kaleidoscope and
CodeGen/StackProtector.cpp) DebugLoc::get to DILocation::get.

The CodeGen/StackProtector.cpp usage may have a nullptr Scope
and can trigger an assertion failure, so I don't migrate it.

Reviewed By: #debug-info, dblaikie

Differential Revision: https://reviews.llvm.org/D93087
2020-12-11 12:45:22 -08:00
Nikita Popov
022645293f [BasicAA] Add extra check in phi-spec-order.ll (NFC)
The (scevgep, scevgep5) relation regressed with a patch I was
trying, but wasn't tested.
2020-12-11 21:20:51 +01:00
Florian Hahn
a480e97ce0 Revert "[AArch64] Lower calls with rv_marker attribute ."
This reverts commit a87fccb3ff9c11986d3110d9f77fb0ccea0daf79.

A test appears to fail with expensive checks. Reverting while I
investigate.
2020-12-11 20:12:59 +00:00
Florian Hahn
e5edf9f654 [LV] Precommit test for PR48429. 2020-12-11 19:56:48 +00:00
Florian Hahn
6d7047b1da [AArch64] Lower calls with rv_marker attribute .
This patch adds support for lowering function calls with the
rv_marker attribute. The goal is to expand such calls to the
following sequence of instructions:

    BL @fn
    mov x29, x29

This sequence of instructions triggers Objective-C runtime optimizations,
hence we want to ensure no instructions get moved in between them.
This patch achieves that by adding a new CALL_RVMARKER ISD node,
which gets turned into the BLR_RVMARKER pseudo, which eventually gets
expanded into the sequence mentioned above. The sequence is then marked
as an instruction bundle, to avoid anything being moved in between.

@ahatanak is working on using this attribute in the front- & middle-end.

Together with the front- & middle-end changes, this should address
PR31925 for AArch64.

Reviewed By: t.p.northover

Differential Revision: https://reviews.llvm.org/D92569
2020-12-11 19:45:44 +00:00
Scott Linder
d0f7c7f0f0 [SmallVector][NFC] Link to ProgrammersManual from SmallVector docs
Add a "see also" link from the condensed doxygen description of
`SmallVector` to the more complete description in the ProgrammersManual.
2020-12-11 19:34:10 +00:00
Fangrui Song
701bea2805 [MCAsmInfo] Delete unused doesSupportExceptionHandling
ExceptionHandling:: is a bit misleading - we actually use the term for both
exceptions and non-exception .eh_frame usage.
2020-12-11 11:08:16 -08:00
LLVM GN Syncbot
3a9d02d48e [gn build] Port b577d2df7bd 2020-12-11 18:37:39 +00:00
Craig Topper
be7810833d [RISCV] Add a pass to remove duplicate VSETVLI instructions in a basic block.
Add simple pass for removing redundant vsetvli instructions within a basic block. This handles the case where the AVL register and VTYPE immediate are the same and no other instructions that change VTYPE or VL are between them.
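
A minimal sketch of this kind of block-local cleanup, assuming a toy instruction model (the `Instr` class and its fields are hypothetical, not LLVM's MachineInstr). It also assumes a register name always denotes the same value within the block, and it ignores the register that vsetvli itself defines.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    opcode: str
    avl: Optional[str] = None        # AVL register, for vsetvli only
    vtype: Optional[int] = None      # VTYPE immediate, for vsetvli only
    clobbers_vl_vtype: bool = False  # any other instruction that changes VL or VTYPE


def remove_redundant_vsetvli(block: List[Instr]) -> List[Instr]:
    """Drop a vsetvli that repeats the (AVL, VTYPE) pair already in effect."""
    kept: List[Instr] = []
    current: Optional[Tuple[Optional[str], Optional[int]]] = None
    for instr in block:
        if instr.opcode == "vsetvli":
            state = (instr.avl, instr.vtype)
            if state == current:
                continue  # same AVL and VTYPE, nothing changed in between
            current = state
        elif instr.clobbers_vl_vtype:
            current = None  # state unknown: the next vsetvli must be kept
        kept.append(instr)
    return kept
```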

There are going to be more opportunities for improvement in this space as we develop more complex tests.

Differential Revision: https://reviews.llvm.org/D92679
2020-12-11 10:35:37 -08:00
Michael Kruse
f2e5d1dd3e [tests][OpenMPIRBuilder] Use EXPECT_EQ instead of ASSERT_EQ.
Test execution can continue even if previous cases failed.
2020-12-11 11:49:50 -06:00
Nikita Popov
b4e8c6bdf0 [BasicAA] Handle two unknown sizes for GEPs
If we have two unknown sizes and one GEP operand and one non-GEP
operand, then we currently simply return MayAlias. The comment says
we can't do anything useful ... but we can! We can still check that
the underlying objects are different (and do so for the GEP-GEP case).

To reduce the compile-time impact, this a) checks this early, before
doing the relatively expensive GEP decomposition that will not be
used and b) doesn't do the check if the other operand is a phi or
select. In that case, the phi/select will already recurse, so this
would just do two slightly different recursive walks that arrive at
the same roots.

Compile-time is still a bit of a mixed bag: https://llvm-compile-time-tracker.com/compare.php?from=624af932a808b363a888139beca49f57313d9a3b&to=845356e14adbe651a553ed11318ddb5e79a24bcd&stat=instructions
On average this is a small improvement, but sqlite with ThinLTO has
a 0.5% regression (lencod has a 1% improvement).

The BasicAA test case checks this by using two memsets with unknown
size. However, the more interesting case where this is useful is
the LoopVectorize test case, as analysis of accesses in loops tends
to always use unknown sizes.

Differential Revision: https://reviews.llvm.org/D92401
2020-12-11 18:45:53 +01:00
Hiroshi Yamauchi
a4a8865fc2 [PGO] Adjust -vp-counters-per-site under dynamic linking.
Addressing clang bootstrap under the dynamic linking mode running out of static
allocation of value profile nodes, reported in D81682.

Differential Revision: https://reviews.llvm.org/D92669
2020-12-11 09:42:53 -08:00
Michael Kruse
15b1228eb3 [OpenMPIRBuilder] Various changes required for tileLoops.
Extract some changes not directly related to tileLoops out of D92974:
 * Refactor `createLoopSkeleton` out of `createCanonicalLoop`.
 * Introduce `ComputeIP` parameter to the `createCanonicalLoop` overload that inserts instructions to compute the trip count. Specifying the location is necessary to make these instructions appear before the outermost loop of a loop nest that is tiled.
 * Introduce `Name` parameter to `createCanonicalLoop`. This can help to better understand the origin of values of basic blocks with many loops. The default value is "loop" instead of "for", which could be confused with the "for directive" (aka worksharing-loop) and does not apply to Fortran.
 * Remove `CanonicalLoopInfo::eraseFromParent`, which is currently unused and untested and was added in anticipation of being used by `tileLoops`. `eraseFromParent` has been shown to be insufficient when more than a single loop is involved and is replaced by `removeUnusedBlocksFromParent` in D92974.

Reviewed By: SouraVX

Differential Revision: https://reviews.llvm.org/D93088
2020-12-11 11:37:45 -06:00