1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

200822 Commits

Author SHA1 Message Date
Vy Nguyen
602a8b1df5 [llvm-exegesis] Check perf_branch_entry for field cycles
Summary: Follow up to breakages reported in D77422

Reviewers: ondrasej, gchatelet

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84076
2020-07-27 11:31:13 -04:00
jasonliu
f9434ab9b2 [XCOFF][AIX] Handle llvm.used and llvm.compiler.used global array
For now, just return and do nothing when we see llvm.used and
llvm.compiler.used global array.
Hopefully, we could come up with a good solution later to prevent
linker from eliminating symbols in llvm.used array.

Reviewed By: DiggerLin, daltenty

Differential Revision: https://reviews.llvm.org/D84363
2020-07-27 15:28:32 +00:00
Jon Roelofs
c0e6798650 [AArch64] fjcvtzs,rmif,cfinv,setf* all clobber nzcv
Differential Revision: https://reviews.llvm.org/D83818
2020-07-27 09:17:53 -06:00
Sergej Jaskiewicz
8b9bd0fa18 [lit] Don't expand escapes until all substitutions have been applied
Otherwise, if a Lit script contains escaped substitutions (like %%p in this test https://github.com/llvm/llvm-project/blob/master/compiler-rt/test/asan/TestCases/Darwin/asan-symbolize-partial-report-with-module-map.cpp#L10), they are unescaped during recursive application of substitutions, and the results are unexpected.

We solve it using the fact that double percent signs are first replaced with #_MARKER_#, and only after all the other substitutions have been applied, #_MARKER_# is replaced with a single percent sign. The only change is that instead of replacing #_MARKER_# at each recursion step, we replace it once after the last recursion step.

Differential Revision: https://reviews.llvm.org/D83894
2020-07-27 18:09:00 +03:00
Amy Kwan
fc21604c62 Revert "Re-apply:" Emit DW_OP_implicit_value for Floating point constants""
This patch reverts commit `59a76d957a26` as it has caused failure on the
big endian PowerPC buildbots (as well as the SystemZ buildbots).
2020-07-27 09:44:13 -05:00
Luofan Chen
023f114f4c [Attributor] Fix qualifier warning of the unittest
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D84532
2020-07-27 22:28:39 +08:00
Guillaume Chatelet
d037bcfdd4 [NFC] Replace ".size() < 1" with ".empty()" 2020-07-27 13:54:53 +00:00
Simon Pilgrim
fbf73d6d1b [X86][AVX] Fold extract_subvector(truncate(x),0) -> truncate(extract_subvector(x),0)
This is currently only supported for VLX targets where the op should be legal.
2020-07-27 14:51:29 +01:00
Simon Pilgrim
b2a0b1f988 [X86] combineExtractSubvector - pull out repeated getSizeInBits() calls. NFCI. 2020-07-27 14:51:28 +01:00
Simon Pilgrim
6a70e74d4b IRPrintingPasses.h - simplify unnecessary header with forward declarations. NFC.
Remove duplicate PassManager.h include in IRPrintingPasses.cpp
2020-07-27 14:51:28 +01:00
Sergey Dmitriev
02505ee23a [CallGraph] Preserve call records vector when replacing call edge
Summary:
Try not to resize vector of call records in a call graph node when
replacing call edge. That would prevent invalidation of iterators
stored in the CG SCC pass manager's scc_iterator.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84295
2020-07-27 06:02:55 -07:00
Tim Northover
bb96b301c9 AArch64: avoid UB shift of negative value
Left shifting a negative value is undefined behaviour, so this just moves the
negation afterwards to avoid it.
2020-07-27 13:49:50 +01:00
Roman Lebedev
cd8f899c56 [JumpThreading] ProcessBranchOnXOR(): bailout if any pred ends in indirect branch (PR46857)
SplitBlockPredecessors() can not split blocks that have such terminators,
and in two other places we already ensure that we don't end up calling
SplitBlockPredecessors() on such blocks. Do so in one more place.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46857
2020-07-27 15:39:03 +03:00
Roman Lebedev
aa265a31e9 [Reduce] Argument reduction: shoe-horn new function into remaining uses of old function
Much like with function reduction, there may be remaining unhandled uses
of function, in particular in blockaddress. And in constants we can't
RAUW it with undef, because undef is not a function.
Instead, let's try to pretent that in the remaining cases, the new
signature didn't change, by bitcasting it.

A new (previously crashing) test case added.
2020-07-27 15:39:03 +03:00
Roman Lebedev
49caf4bfe4 [Reduce] Function reduction: replace all users of function with undef
There may be other users of a function other than CallInsts,
but what's more important, we can't actually replace function pointer
with undef, because for constants, that would not preserve the type
and RAUW would assert.

In particular, that affects blockaddress, however it proves to be
prohibitively complex to come up with a good test involving blockaddress:
we'd need to both ensure that the function body survives until
this pass, and is not interesting in this pass.
2020-07-27 15:39:02 +03:00
Nathan James
0c0782b95f [llvm][NFC] Silence unused variable warning by using isa over dyn_cast 2020-07-27 13:37:21 +01:00
Sanjay Patel
1555485a98 [InstSimplify] add tests for min/max intrinsics; NFC 2020-07-27 08:26:27 -04:00
Hans Wennborg
e73cfa27dc [gn] Set CLANGD_ENABLE_REMOTE=0
To fix the build after 37ac559fccd4.
2020-07-27 14:05:03 +02:00
Tim Northover
ebd705668f AArch64: diagnose out of range relocation addends on MachO.
MachO only has 24-bit addends for most relocations, small enough that it can
overflow in semi-reasonable functions and cause insidious bugs if compiled
without assertions enabled. Switch it to an actual error instead.

The condition isn't quite identical because ld64 treats the addend as a signed
number.
2020-07-27 13:01:22 +01:00
Juneyoung Lee
19be62e78d [JumpThreading] Add a test case that has a phi with undef; NFC 2020-07-27 19:08:45 +09:00
Juneyoung Lee
b6d1184186 [JumpThreading] Add a test that threads jumps with frozen branch conditions 2020-07-27 19:04:50 +09:00
Afanasyev Ivan
2f03196312 [Docs] remove unused arguments in documentation examples on vectorization passes
Reviewers: nadav, tyler.nowicki

Reviewed By: nadav

Differential Revision: https://reviews.llvm.org/D83851
2020-07-27 10:20:26 +01:00
Guillaume Chatelet
7ab8564b32 [Alignment][NFC] Update Bitcodewriter to use Align
Differential Revision: https://reviews.llvm.org/D83533
2020-07-27 08:16:45 +00:00
Juneyoung Lee
9587ec7520 [InstCombine] Fold freeze into phi if one operand is not undef
This patch adds folding freeze into phi if it has only one operand to target.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D84601
2020-07-27 17:07:27 +09:00
Piotr Sobczak
2559516452 [AMDGPU] Make generating cache invalidating instructions optional
Summary:
D78800 skipped generating cache invalidating instrucions altogether
on AMDPAL. However, this is sometimes too restrictive - we want a
more flexible option to be able to toggle this behaviour on and off
while we work towards developing a correct implementation of the
alternative memory model.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, dexonsmith, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84448
2020-07-27 09:24:11 +02:00
David Sherwood
7010946aee [SVE] Don't use LocalStackAllocation for SVE objects
I have introduced a new TargetFrameLowering query function:

  isStackIdSafeForLocalArea

that queries whether or not it is safe for objects of a given stack
id to be bundled into the local area. The default behaviour is to
always bundle regardless of the stack id, however for AArch64 this is
overriden so that it's only safe for fixed-size stack objects.
There is future work here to extend this algorithm for multiple local
areas so that SVE stack objects can be bundled together and accessed
from their own virtual base-pointer.

Differential Revision: https://reviews.llvm.org/D83859
2020-07-27 08:22:01 +01:00
Roman Lebedev
2e1aaf1aa5 [XRay] Account: recursion detection
Summary:
Recursion detection can be non-trivial. Currently, the state-of-the-art for LLVM,
as far as i'm concerned, is D72362 `[clang-tidy] misc-no-recursion: a new check`.
However, it is quite limited:
* It does very basic call-graph based analysis, in the sense it will report even dynamically-unreachable recursion.
* It is inherently limited to a single TU
* It is hard to gauge how problematic each recursion is in practice.

Some of that can be addressed by adding clang analyzer-based check,
then it would at least support multiple TU's.

However, we can approach this problem from another angle - dynamic run-time analysis.
We already have means to capture a run-time callgraph (XRay, duh),
and there are already means to reconstruct it within `llvm-xray` tool.

This proposes to add a `-recursive-calls-only` switch to the `account` tool.
When the switch is on, when re-constructing callgraph for latency reconstruction,
each time we enter/leave some function, we increment/decrement an entry for the function
in a "recursion depth" map. If, when we leave the function, said entry was at `1`,
then that means the function didn't call itself, however if it is at `2` or more,
then that means the function (possibly indirectly) called itself.

If the depth is 1, we don't account the time spent there,
unless within this call stack the function already recursed into itself.
Note that we don't pay for recursion depth tracking when `recursive-calls-only` is not on,
and the perf impact is insignificant (+0.3% regression)

The overhead of the option is actually negative, around -5.26% user time on a medium-sized (3.5G) XRay log.
As a practical example, that 3.5G log is a capture of the entire middle-end opt pipeline
at `-O3` for RawSpeed unity build. There are total of `5500` functions in the log,
however `-recursive-calls-only` says that `269`, or 5%, are recursive.

Having this functionality could be helpful for recursion eradication.

Reviewers: dberris, mboerger

Reviewed By: dberris

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84582
2020-07-27 10:15:44 +03:00
Yuanfang Chen
8d83600e34 [NewPM] NFC. remove obsolete TODO comment
The deleted TODO was implemented in D82344.
2020-07-26 22:32:24 -07:00
biplmish
d6124d6963 [PowerPC] Add Vector Extract Double Instruction Definitions and MC tests.
This patch adds the td definitions and asm/disasm tests for the following instructions:

Vector Extract Double Left Index - vextdubvlx, vextduhvlx, vextduwvlx, vextddvlx
Vector Extract Double Right Index - vextdubvrx, vextduhvrx, vextduwvrx, vextddvrx

Differential Revision: https://reviews.llvm.org/D84384
2020-07-26 23:56:19 -05:00
Fangrui Song
4d15fd3d7d [gcov] Simplify/speed up CFG hash calculation 2020-07-26 21:15:33 -07:00
Matt Arsenault
e77b1fa382 AMDGPU/GlobalISel: Don't assert in LegalizerInfo constructor
We don't really need these asserts. The LegalizerInfo is also
overly-aggressivly constructed, even when not in use. It needs to not
assert on dummy targets that have manually specified, unrelated
features.
2020-07-26 23:01:28 -04:00
QingShan Zhang
ab3ffdcb1e [Scheduling] Improve group algorithm for store cluster
Store Addr and Store Addr+8 are clusterable pair. They have memory(ctrl) dependency on different loads.
Current implementation will put these two stores into different group and miss to cluster them.

Reviewed By: evandro

Differential Revision: https://reviews.llvm.org/D84139
2020-07-27 02:02:40 +00:00
Juneyoung Lee
d03424ed2f [InstCombine] Add more tests to freeze-phi.ll; NFC 2020-07-27 09:43:00 +09:00
Lang Hames
c3522a840c [ORC] Remove a redundant call to getTargetMemory. 2020-07-26 17:34:31 -07:00
Craig Topper
7bfd2a49b9 [X86] Turn X86DAGToDAGISel::tryVPTERNLOG into a fully custom instruction selector that can handle bitcasts between logic ops
Previously we just matched the logic ops and replaced with an
X86ISD::VPTERNLOG node that we would send through the normal
pattern match. But that approach couldn't handle a bitcast
between the logic ops. Extending that approach would require us
to peek through the bitcasts and emit new bitcasts to match
the types. Those new bitcasts would then have to be properly
topologically sorted.

This patch instead switches to directly emitting the
MachineSDNode and skips the normal tablegen pattern matching.
We do have to handle load folding and broadcast load folding
ourselves now. Which also means commuting the immediate control.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D83630
2020-07-26 12:19:08 -07:00
Craig Topper
5a15b82ff5 [X86] Move getGatherOverhead/getScatterOverhead into X86TargetTransformInfo.
These cost methods don't make much sense in X86Subtarget. Make
them methods in X86's TTI and move the feature checks from the
X86Subtarget constructor into these methods.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D84594
2020-07-26 10:38:42 -07:00
Juneyoung Lee
8ca15c912d [InstCombine] Add a test for folding freeze into phi; NFC 2020-07-27 02:24:00 +09:00
Simon Pilgrim
1d52f2e8fe [X86][SSE] lowerV2I64Shuffle - use undef elements in PSHUFD mask widening
If we lower a v2i64 shuffle to PSHUFD, we currently clamp undef elements to 0, (elements 0,1 of the v4i32) which can result in the shuffle referencing more elements of the source vector than expected, affecting later shuffle combines and KnownBits/SimplifyDemanded calls.

By ensuring we widen the undef mask element we allow getV4X86ShuffleImm8 to use inline elements as the default, which are more likely to fold.
2020-07-26 16:04:22 +01:00
Matt Arsenault
d87103aee2 AMDGPU/GlobalISel: Fix not constraining ds_append/consume operands 2020-07-26 10:17:36 -04:00
Matt Arsenault
0cd077be9e GlobalISel: Handle G_PTR_ADD in narrowScalar 2020-07-26 10:08:17 -04:00
Matt Arsenault
e78438845a GlobalISel: Handle fewerElementsVector for G_PTR_ADD 2020-07-26 10:08:09 -04:00
Matt Arsenault
df5be19ad2 AMDGPU/GlobalISel: Reorder G_CONSTANT legality rules
The legal cases should be the first rules.
2020-07-26 10:05:05 -04:00
Matt Arsenault
98abcbc05e AMDGPU/GlobalISel: Make sure <2 x s1> phis are scalarized 2020-07-26 10:04:47 -04:00
Matt Arsenault
51d987f63c AMDGPU/GlobalISel: Legalize GDS atomics
I noticed these don't use the _gfx9, non-m0 reading variants but not
sure if that's a bug or not. It's the same in the DAG.
2020-07-26 10:03:34 -04:00
Matt Arsenault
2ff38d1a0d AMDGPU/GlobalISel: Pack constant G_BUILD_VECTOR_TRUNCs when selecting 2020-07-26 09:55:34 -04:00
Sanjay Patel
3c7b643ee6 [InstSimplify] fold integer min/max intrinsics with limit constant 2020-07-26 09:41:54 -04:00
Matt Arsenault
4250551b6d GlobalISel: Handle 'n' inline asm constraint 2020-07-26 09:30:41 -04:00
Matt Arsenault
ae1e0f7a68 AMDGPU/GlobalISel: Sign extend integer constants
This matches the DAG behavior and fixes immediate folding
2020-07-26 09:30:14 -04:00
Matt Arsenault
876b5f69da AMDGPU/GlobalISel: Replace selection tests for G_CONSTANT/G_FCONSTANT
Split into separate tests and make more consistent with the others.
2020-07-26 09:30:09 -04:00
Xing GUO
33fdabb077 [DWARFYAML] Rename getUsedSectionNames() to getNonEmptySectionNames().
This patch renames getUsedSectionNames() to getNonEmptySectionNames.
NFC.
2020-07-26 21:10:38 +08:00