1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

202588 Commits

Author SHA1 Message Date
Florian Hahn
c351843892 [DSE,MemorySSA] Remove short-cut to check if all paths are covered.
The post-order number early continue does not work in some cases, e.g.
if a path from EarlierAccess to an exit includes a node that dominates
EarlierAccess in a cycle.

The short-cut only has very minor impact on compile-time, so it seems
straight-forward to remove it for now:

http://llvm-compile-time-tracker.com/compare.php?from=062412e79fcfedf2cf004433e42036b0333e3f83&to=d7386016a77ce1387bdbbf360f1de157faea9d31&stat=instructions

Fixes PR47285.
2020-08-27 12:42:40 +01:00
OCHyams
5a27fa8d7e Revert "[DWARF] Add cuttoff guarding quadratic validThroughout behaviour"
This reverts commit b9d977b0ca60c54f11615ca9d144c9f08b29fd85.

This cutoff is no longer required. The commit 34ffa7fc501 (D86153) introduces a
performance improvement which was tested against the motivating case for this
patch.

Discussed in differential revision: https://reviews.llvm.org/D86153
2020-08-27 11:52:30 +01:00
OCHyams
3e2347db56 [DwarfDebug] Improve validThroughout performance (4/4)
Almost NFC (see end).

The backwards scan in validThroughout significantly contributed to compile time
for a pathological case, causing the 'X86 Assembly Printer' pass to account for
roughly 70% of the run time. This patch guards the loop against running
unnecessarily, bringing the pass contribution down to 4%.

Almost NFC: There is a hack in validThroughout which promotes single constant
value DBG_VALUEs in the prologue to be live throughout the function. We're more
likely to hit this code path with this patch applied. Similarly to the parent
patches there is a small coverage change reported in the order of 10s of bytes.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D86153
2020-08-27 11:52:30 +01:00
OCHyams
4d679a7eae [DwarfDebug] Improve multi-BB single location detection in validThroughout (3/4)
With the changes introduced in D86151 we can now check for single locations
which span multiple blocks for inlined scopes and blocks.

D86151 introduced the InstructionOrdering parameter, replacing a scan through
MBB instructions. The functionality to compare instruction positions across
blocks was add there, and this patch just removes the exit checks that were
previously (but no longer) required.

CTMark shows a geomean binary size reduction of 2.2% for RelWithDebInfo builds.
llvm-locstats (using D85636) shows a very small variable location coverage
change in 5 of 10 binaries, but just like in D86151 it is only in the order of
10s of bytes.

Reviewed By: djtodoro

Differential Revision: https://reviews.llvm.org/D86152
2020-08-27 11:52:29 +01:00
OCHyams
463930d1cb [DwarfDebug] Improve single location detection in validThroughout (2/4)
With this patch we're now accounting for two more cases which should be
considered 'valid throughout': First, where RangeEnd is ScopeEnd. Second, where
RangeEnd comes before ScopeEnd when including meta instructions, but are both
preceded by the same non-meta instruction.

CTMark shows a geomean binary size reduction of 1.5% for RelWithDebInfo builds.
`llvm-locstats` (using D85636) shows a very small variable location coverage
change in 2 of 10 binaries, but it is in the order of 10s of bytes which lines
up with my expectations.

I've added a test which checks both of these new cases. The first check in the
test isn't strictly necessary for this patch. But I'm not sure that it is
explicitly tested anywhere else, and is useful for the final patch in the
series.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D86151
2020-08-27 11:52:29 +01:00
OCHyams
adfe449a17 [NFC][DebugInfo] Create InstructionOrdering helper class (1/4)
Group the map and methods used to query instruction ordering for trimVarLocs
(D82129) into a class. This will make it easier to reuse the functionality
upcoming patches.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D86150
2020-08-27 11:52:29 +01:00
Florian Hahn
40dcb7eec6 [DSE,MemorySSA] Add test for PR47285. 2020-08-27 11:27:06 +01:00
Vitaly Buka
c66ff34562 [NFC][ValueTracking] Cleanup a test 2020-08-27 03:25:10 -07:00
Mikhail Maltsev
63df72fb38 [AArch64] Optimize instruction selection for certain vector shuffles
This patch adds code to recognize vector shuffles which can be
represented as VDUP (splat) of a vector lane with of a different
(wider) type than the original vector lane type.

For example:
    shufflevector <4 x i16> %v, <4 x i16> undef, <4 x i32> <i32 0, i32 1, i32 0, i32 1>
is essentially:
    shufflevector <2 x i32> %v, <2 x i32> undef, <2 x i32> <i32 0, i32 0>

Such patterns are generated by the SelectionDAG machinery in some cases
(see DAGCombiner::visitBITCAST in DAGCombiner.cpp, the "Remove double
bitcasts from shuffles" part).

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D86225
2020-08-27 11:06:49 +01:00
Vitaly Buka
368c7592ca [NFC][ValueTracking] Fix typo in test 2020-08-27 03:01:30 -07:00
Paul Walker
a71aa114da [SVE] Fallback to default expansion when lowering SIGN_EXTEN_INREG from non-byte based source.
Differential Revision: https://reviews.llvm.org/D86394
2020-08-27 10:57:37 +01:00
Sander de Smalen
33c91e76d2 [AArch64][SVE] Add missing debug info for ACLE types.
This patch adds type information for SVE ACLE vector types,
by describing them as vectors, with a lower bound of 0, and
an upper bound described by a DWARF expression using the
AArch64 Vector Granule register (VG), which contains the
runtime multiple of 64bit granules in an SVE vector.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86101
2020-08-27 10:56:42 +01:00
Sjoerd Meijer
7ef96b82c5 Follow up of rGca243b07276a: fixed a typo. NFC. 2020-08-27 10:53:41 +01:00
Alex Richardson
1928b323b7 [RISC-V] fmv.s/fmv.d should be as cheap as a move
Since the canonical floatig-point move is fsgnj rd, rs, rs, we should
handle this case in RISCVInstrInfo::isAsCheapAsAMove().

Reviewed By: lenary

Differential Revision: https://reviews.llvm.org/D86518
2020-08-27 10:32:23 +01:00
Alex Richardson
aa5340b34a [RISC-V] Mark C_MV as a move instruction
Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D86517
2020-08-27 10:32:23 +01:00
Alex Richardson
7a907ec504 [RISC-V] ADDI/ORI/XORI x, 0 should be as cheap as a move
The isTriviallyRematerializable hook is only called for instructions that are
tagged as isAsCheapAsAMove. Since ADDI 0 is used for "mv" it should definitely
be marked with "isAsCheapAsAMove". This change avoids one stack spill in most of
the atomic-rmw.ll tests functions. It also avoids stack spills in two of our
out-of-tree CHERI tests.
ORI/XORI with zero may or may not be the same as a move micro-architecturally,
but since we are already doing it for register == x0, we might as well
do the same if the immediate is zero.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D86480
2020-08-27 10:32:22 +01:00
Vitaly Buka
afd0abc1f8 [ValueTracking] Support select in findAllocaForValue 2020-08-27 02:13:52 -07:00
Georgii Rymar
0c993fd620 [unittests/Object] - Simplify the code in ELFObjectFileTest.cpp, NFCI.
This refactors/rewrites the code to remove duplication.

Differential revision: https://reviews.llvm.org/D86623
2020-08-27 12:06:04 +03:00
Florian Hahn
b3984cec53 [DSE,MemorySSA] Traverse use-def chain without MemSSA Walker.
For DSE with MemorySSA it is beneficial to manually traverse the
defining access, instead of using a MemorySSA walker, so we can
better control the number of steps together with other limits and
also weed out invalid/unprofitable paths early on.

This patch requires a follow-up patch to be most effective, which I will
share soon after putting this patch up.

This temporarily XFAIL's the limit tests, because we now explore more
MemoryDefs that may not alias/clobber the killing def. This will be
improved/fixed by the follow-up patch.

This patch also renames some `Dom*` variables to `Earlier*`, because the
dominance relation is not really used/important here and potentially
confusing.

This patch allows us to aggressively cut down compile time, geomean
-O3 -0.64%, ReleaseThinLTO -1.65%, at the expense of fewer stores
removed. Subsequent patches will increase the number of removed stores
again, while keeping compile-time in check.

http://llvm-compile-time-tracker.com/compare.php?from=d8e3294118a8c5f3f97688a704d5a05b67646012&to=0a929b6978a068af8ddb02d0d4714a2843dd8ba9&stat=instructions

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D86486
2020-08-27 10:02:02 +01:00
Vitaly Buka
f918804878 [NFC] Add unittests for findAllocaForValue 2020-08-27 01:54:19 -07:00
Sjoerd Meijer
38fc834d2c Revert "[Verifier] Additional check for intrinsic get.active.lane.mask"
This reverts commit 8d5f64c4edbc190a5a8790157fa1d99cfac34016.

Thanks to Eli Friedma for pointing out that this check is not appropiate here,
this check will be moved to the Lint pass.
2020-08-27 09:27:05 +01:00
Piotr Sobczak
0a08f60d7f [AMDGPU] Preserve vcc_lo when shrinking V_CNDMASK
There is no justification for changing vcc_lo to vcc
when shrinking V_CNDMASK, and such a change could
later confuse live variable analysis.

Make sure the original register is preserved.

Differential Revision: https://reviews.llvm.org/D86541
2020-08-27 10:22:50 +02:00
Sjoerd Meijer
fe5ce52722 [LangRef] get.active.lane.mask can produce poison value
We had already specified that second argument `n` of this intrinsic is `n > 0`,
but now add to this that the result is a poison value if this is not the case.

Differential Revision: https://reviews.llvm.org/D86637
2020-08-27 08:57:35 +01:00
Shinji Okumura
cc1db81009 [Attributor] Add flag for undef value to the state of AAPotentialValues
Currently, an undef value is reduced to 0 when it is added to a set of potential values.
This patch introduces a flag for under values. By this, for example, we can merge two states `{undef}`, `{1}` to `{1}` (because we can reduce the undef to 1).

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D85592
2020-08-27 16:30:29 +09:00
Sam Parker
e6053bfbc3 [ARM] Enable outliner at -Oz for M-class
Enable default outlining when the function has the minsize attribute
and we're targeting an m-class core.

Differential Revision: https://reviews.llvm.org/D82951
2020-08-27 08:02:56 +01:00
Martin Storsjö
edc5aafa50 Revert "Reapply D70800: Fix AArch64 AAPCS frame record chain"
This reverts commit 9936455204fd6ab72715cc9d67385ddc93e072ed.

That commit caused failed assertions e.g. like this:

$ cat alloca.c
a;
b() {
  float c;
  d();
  a = __builtin_alloca(d);
  c = e();
  f(a);
  return c;
}
$ clang -target aarch64-linux-gnu -c alloca.c -O2
clang: ../lib/Target/AArch64/AArch64InstrInfo.cpp:3446: void
llvm::emitFrameOffset(llvm::MachineBasicBlock&,
llvm::MachineBasicBlock::iterator, const llvm::DebugLoc&, unsigned int,
unsigned int, llvm::StackOffset, const llvm::TargetInstrInfo*,
llvm::MachineInstr::MIFlag, bool, bool, bool*):
Assertion `(DestReg != AArch64::SP || Bytes % 16 == 0) &&
"SP increment/decrement not 16-byte aligned"' failed.
2020-08-27 09:39:56 +03:00
luxufan
8dd24026c0 [RISCV] add the MC layer support of riscv vector Zvamo extension
Implements the assemble and disassemble support of RISCV Vector
extension zvamo instructions, base on the 0.9 spec version.

Reviewed  by HsiangKai

Differential Revision: https://reviews.llvm.org/D85069
2020-08-27 14:11:38 +08:00
Sam Parker
f65ec601b0 [ARM] Make MachineVerifier more strict about terminators
Fix the ARM backend's analyzeBranch so it doesn't ignore predicated
return instructions, and make the MachineVerifier rule more strict.

Differential Revision: https://reviews.llvm.org/D40061
2020-08-27 07:10:20 +01:00
LLVM GN Syncbot
23d0649d2e [gn build] Port cf918c809bb 2020-08-27 05:59:55 +00:00
LLVM GN Syncbot
752234b1b6 [gn build] Port 7394460d875 2020-08-27 05:59:55 +00:00
QingShan Zhang
f50d2b00a4 [NFC][Test] Update the test with utils/update_llc_test_checks.py 2020-08-27 05:02:55 +00:00
Jianzhou Zhao
0ded22b2ea Fix an overflow issue at BackpatchWord
This happens when generating a huge file by LTO, for example, with -gmlt.
When BitNo is > 2^35, ByteNo is overflowed, and an incorrect output offset is overwritten.
This generates ill-formed bitcodes.

Reviewed-by: tejohnson, vitalybuka

Differential Revision: https://reviews.llvm.org/D86645
2020-08-27 04:46:19 +00:00
Amy Kwan
daa2ee7b26 [PowerPC] Implement Vector Multiply High/Divide Extended Builtins in LLVM/Clang
This patch implements the function prototypes vec_mulh and vec_dive in order to
utilize the vector multiply high (vmulh[s|u][w|d]) and vector divide extended
(vdive[s|u][w|d]) instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82609
2020-08-26 23:14:34 -05:00
LLVM GN Syncbot
98083a3146 [gn build] Port 7a457593efe 2020-08-27 01:24:30 +00:00
Matt Arsenault
5f7f259e00 GlobalISel: IRTranslate minimum of pointer sizes on memcpy
I forgot to squash this with 0b7f6cc71a72a85f8a0cbee836a7a8e31876951a
2020-08-26 20:10:00 -04:00
Matt Arsenault
a03ce058a1 GlobalISel: Add generic instructions for memory intrinsics
AArch64, X86 and Mips currently directly consumes these and custom
lowering to produce a libcall, but really these should follow the
normal legalization process through the libcall/lower action.
2020-08-26 20:08:45 -04:00
Lang Hames
02bae0a8ff [ORC][JITLink] Switch to unique ownership for EHFrameRegistrars.
This will make stateful registrars (e.g. a future TargetProcessControl based
registrar) easier to deal with.
2020-08-26 16:59:45 -07:00
Craig Topper
7a5934b151 [X86] Update release notes for -mtune support. 2020-08-26 16:16:56 -07:00
Arthur Eubanks
dbabc094e5 Fix OCaml bindings
Caused by https://reviews.llvm.org/D85159
2020-08-26 16:11:11 -07:00
Arthur Eubanks
d74ec65308 [ConstProp] Remove ConstantPropagation
As discussed in
http://lists.llvm.org/pipermail/llvm-dev/2020-July/143801.html.

Currently no users outside of unit tests.

Replace all instances in tests of -constprop with -instsimplify.
Notable changes in tests:
* vscale.ll - @llvm.sadd.sat.nxv16i8 is evaluated by instsimplify, use a fake intrinsic instead
* InsertElement.ll - insertelement undef is removed by instsimplify in @insertelement_undef
llvm/test/Transforms/ConstProp moved to llvm/test/Transforms/InstSimplify/ConstProp

Reviewed By: lattner, nikic

Differential Revision: https://reviews.llvm.org/D85159
2020-08-26 15:51:30 -07:00
Craig Topper
47ba73f241 [X86] Change pentium4 tuning settings and scheduler model back to their values before D83913.
Clang now defaults to -march=pentium4 -mtune=generic so we don't
need modern tune settings on pentium4.
2020-08-26 15:38:12 -07:00
Alina Sbirlea
9e04c74adc Use properlyDominates in RDFLiveness when sorting on dominance.
Summary:
When looking for all reaching definitions, we sort basic blocks on dominance. When sorting looking for properlyDominates() handles the case A == B.

Authored by: pranavb

Differential Revision: https://reviews.llvm.org/D86661
2020-08-26 15:16:40 -07:00
Adrian Prantl
cf37209c05 Bring llvm::Optional data formatter back in sync with the implementation. 2020-08-26 15:10:39 -07:00
Juneyoung Lee
ecc3e51e55 [IR] Remove noundef from masked store/load/gather/scatter's pointer operands
As discussed in D86576, noundef attribute is removed from masked store/load/gather/scatter's
pointer operands.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86656
2020-08-27 06:41:43 +09:00
Ahmed Bougacha
dee1317373 [AArch64] Use CCAssignFnForReturn helper in more spots. NFC.
It was added for GISel, but SDAG could use it too!
2020-08-26 14:39:11 -07:00
Juneyoung Lee
caa42f61a0 [LangRef] Memset/memcpy/memmove can take undef/poison pointer if the size is 0
According to the current LangRef, Memset/memcpy/memmove can take a
null/dangling pointer if the size is zero.
(Relevant thread: http://lists.llvm.org/pipermail/llvm-dev/2017-July/115665.html )
This patch expands it and allows the functions to take undef/poison pointers
too.

This required the updates in the align attribute since it isn't specified
what is the alignment of undef/poison pointers.
This patch states that their alignment is 1.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86643
2020-08-27 06:19:28 +09:00
Sanjay Patel
cb49e0d880 [VectorCombine] adjust test for better coverage; NFC
A >2x insert might crash if we do not generate the shuffle mask carefully.

D86160
2020-08-26 16:52:48 -04:00
Nikita Popov
d8d374b2b3 [InstSimplify] Fold min/max intrinsic based on icmp of operands
This is a reboot of D84655, now performing the inner icmp
simplification query without undef folds.

It should be possible to handle the current foldMinMaxSharedOp()
fold based on this, by moving the logic into icmp of min/max instead,
making it more general. We can't drop the folds for constant operands,
because those also allow undef, which we exclude here.

The tests use assumes for exhaustive coverage, and have a few
more examples of misc folds we get based on icmp simplification.

Differential Revision: https://reviews.llvm.org/D85929
2020-08-26 22:02:57 +02:00
Nikita Popov
9af50bfcdc [InstSimplify] Add additional umax tests (NFC)
A sample of some folds we get if we perform icmp simplification
on min/max intrinsics.
2020-08-26 22:02:56 +02:00
Muhammad Asif Manzoor
9eaab3b90f [AArch64][SVE] Add lowering for llvm fceil
Add the functionality to lower fceil for passthru variant

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D84548
2020-08-26 15:59:44 -04:00