mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00
Commit Graph

216611 Commits

Author SHA1 Message Date
Andrew Kelley
08aca8b420 WindowsSupport.h: do not depend on private config header
WindowsSupport.h is a public header, but including it causes a compile error indicating that llvm/Config/config.h cannot be found, because config.h is a private header. Nothing in WindowsSupport.h actually depends on the private definitions, so it can include the public config header instead.

Reviewed By: amccarth

Differential Revision: https://reviews.llvm.org/D103370
2021-06-01 23:05:03 +03:00
Sanjay Patel
6012aaaacf [InstCombine] add tests for cast folding; NFC
https://llvm.org/PR49543
2021-06-01 16:03:24 -04:00
Anirudh Prasad
2991e3d38e [SystemZ][z/OS] Stricter condition for HLASM class instantiation
- A lot of lit tests simply specify the arch minus the triple. On z/OS, this could result in a scenario of some-other-triple-unknown-ibm-zos. This points to an incorrect triple + arch combo.
- To prevent this, isOSzOS change is switched in favour of isOSBinFormatGOFF.
- This is because the GOFF format is set only if the triple is systemz and the operating system is z/OS. Currently, no other architectures or OSes use the GOFF file format.
- An argument could be made that the problematic tests be fixed to explicitly specify the arch-vendor-triple string, but there's a large number of these tests, and adding this stricter scope ensures that we aren't instantiating the incorrect instance of the AsmParser for other platforms when run on z/OS.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D103343
2021-06-01 15:56:50 -04:00
LLVM GN Syncbot
92d9ca26bc [gn build] Port 5671ff20d92b 2021-06-01 19:37:29 +00:00
madhur13490
8a89499a33 [AMDGPU][NFC] Remove author's name from codebase
This must have made it into the code by accident.

Differential Revision: https://reviews.llvm.org/D103484
2021-06-02 00:51:48 +05:30
Harald van Dijk
fb6500dfb1 [SLPVectorizer] Ignore unreachable blocks
As the existing test unreachable.ll shows, we should be doing more
work to avoid entering unreachable blocks: we should not stop
vectorization just because a PHI incoming value from an unreachable
block cannot be vectorized. We know that particular value will never
be used so we can just replace it with poison.
2021-06-01 20:21:04 +01:00
Jessica Paquette
852a8449e7 [GlobalISel][AArch64] Combine and (lshr x, cst), mask -> ubfx x, cst, width
Also add a target hook which allows us to get around custom legalization on
AArch64.

Differential Revision: https://reviews.llvm.org/D99283
2021-06-01 10:56:17 -07:00
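The equivalence behind the combine above can be checked with a small model. This is a minimal sketch, assuming the mask is a contiguous run of low bits (`2**width - 1`); the function names `and_of_lshr`, `ubfx` and `mask_width` are hypothetical and are not taken from the patch.

```python
from typing import Optional


def and_of_lshr(x: int, shift: int, mask: int, bits: int = 64) -> int:
    """Model `and (lshr x, shift), mask` on a `bits`-wide register."""
    x &= (1 << bits) - 1
    return (x >> shift) & mask


def ubfx(x: int, lsb: int, width: int, bits: int = 64) -> int:
    """Model an unsigned bit-field extract of `width` bits starting at `lsb`."""
    x &= (1 << bits) - 1
    return (x >> lsb) & ((1 << width) - 1)


def mask_width(mask: int) -> Optional[int]:
    """Return w if mask == 2**w - 1 (contiguous low bits), otherwise None."""
    if mask != 0 and (mask & (mask + 1)) == 0:
        return mask.bit_length()
    return None


x = 0xDEADBEEF
width = mask_width(0xFF)  # 0xFF is eight contiguous low bits
assert width is not None
assert and_of_lshr(x, 8, 0xFF) == ubfx(x, 8, width)
```

A mask such as `0xF0` is rejected by `mask_width`, which is the case where the pattern cannot be expressed as a single bit-field extract in this model.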
Guozhi Wei
b2dfe60e88 [X86FixupLEAs] Transform the sequence LEA/SUB to SUB/SUB
This patch transforms the sequence

    lea (reg1, reg2), reg3
    sub reg3, reg4

to two sub instructions

    sub reg1, reg4
    sub reg2, reg4

Similar optimization can also be applied to LEA/ADD sequence.
The modification to TwoAddressInstructionPass ensures the operands of the ADD
instruction have the expected order (the dest register of LEA should be the src register of ADD).

Differential Revision: https://reviews.llvm.org/D101970
2021-06-01 10:31:30 -07:00
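The arithmetic behind the LEA/SUB rewrite above can be sketched as follows. This is only a model of the final value left in `reg4` under 64-bit wraparound, using AT&T operand order as in the commit message; it says nothing about flags or about whether `reg3` is still needed afterwards. The function names are hypothetical.

```python
MASK64 = (1 << 64) - 1  # model 64-bit registers with wraparound


def lea_then_sub(reg1: int, reg2: int, reg4: int) -> int:
    reg3 = (reg1 + reg2) & MASK64  # lea (reg1, reg2), reg3
    return (reg4 - reg3) & MASK64  # sub reg3, reg4


def sub_then_sub(reg1: int, reg2: int, reg4: int) -> int:
    reg4 = (reg4 - reg1) & MASK64  # sub reg1, reg4
    return (reg4 - reg2) & MASK64  # sub reg2, reg4


# Both sequences leave the same value in reg4, including on overflow.
for args in [(1, 2, 10), (MASK64, 5, 0), (2**63, 2**63, 7)]:
    assert lea_then_sub(*args) == sub_then_sub(*args)
```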
Jonas Paulsson
5e3b55a7dc [SystemZ] Return true from hasBitPreservingFPLogic().
This is currently NFC on benchmarks and tests.

Review: Ulrich Weigand
2021-06-01 11:52:50 -05:00
Eli Friedman
a4632c066a [polly] Fix SCEVLoopAddRecRewriter to avoid invalid AddRecs.
When we're remapping an AddRec, the AddRec constructed by a partial
rewrite might not make sense.  This triggers an assertion complaining
it's not loop-invariant.

Instead of constructing the partially rewritten AddRec, just skip
straight to calling evaluateAtIteration.

Testcase was automatically reduced using llvm-reduce, so it's a little
messy, but hopefully makes sense.

Differential Revision: https://reviews.llvm.org/D102959
2021-06-01 09:51:05 -07:00
Nikita Popov
0d55b59b6a [ADT] Move DenseMapInfo for APInt into APInt.h (PR50527)
As suggested in https://bugs.llvm.org/show_bug.cgi?id=50527, this
moves the DenseMapInfo for APInt and APSInt into the respective
headers, removing the need to include APInt.h and APSInt.h from
DenseMapInfo.h.

We could probably do the same for StringRef and ArrayRef as well.

Differential Revision: https://reviews.llvm.org/D103422
2021-06-01 18:31:41 +02:00
Craig Topper
ec4f4175eb [RISCV] Remove earlyclobber from vnsrl/vnsra/vnclip(u) when the source and dest are a single vector register.
This guarantees they meet this overlap exception:

"The destination EEW is smaller than the source EEW and the overlap
is in the lowest-numbered part of the source register group"

Being a single register guarantees the overlap is always in the
lowest-numbered part of the group.

Reviewed By: frasercrmck, khchen

Differential Revision: https://reviews.llvm.org/D103351
2021-06-01 09:17:52 -07:00
Craig Topper
7b6a74df3c [RISCV] Remove earlyclobber from compares with LMUL<=1.
Compares are considered a narrowing operation for register overlap.
I believe for LMUL<=1 they meet this exception to allow overlap

"The destination EEW is smaller than the source EEW and the overlap is in the
lowest-numbered part of the source register group"

Both the result and the sources will occupy a single register for
LMUL<=1 so the overlap would always be in the "lowest-numbered part".

Reviewed By: frasercrmck, HsiangKai

Differential Revision: https://reviews.llvm.org/D103336
2021-06-01 09:08:11 -07:00
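One way to read the overlap exception quoted in the two RISC-V commits above is as a check on register-number ranges. The sketch below is an interpretation, not the backend's implementation: it assumes register groups are contiguous ranges of vector register numbers, and `narrowing_overlap_ok` is a hypothetical name.

```python
def narrowing_overlap_ok(src_base: int, src_regs: int,
                         dst_base: int, dst_regs: int) -> bool:
    """Return True if a narrowing op's destination may overlap its source.

    Overlap is accepted only when it covers the lowest-numbered
    registers of the source group (or when there is no overlap at all).
    """
    src = set(range(src_base, src_base + src_regs))
    dst = set(range(dst_base, dst_base + dst_regs))
    overlap = src & dst
    if not overlap:
        return True
    lowest_part = set(range(src_base, src_base + len(overlap)))
    return overlap == lowest_part


# Source group v0-v1: destination v0 is fine, destination v1 is not.
assert narrowing_overlap_ok(0, 2, 0, 1)
assert not narrowing_overlap_ok(0, 2, 1, 1)
# A single-register source can only overlap in its lowest-numbered part.
assert narrowing_overlap_ok(8, 1, 8, 1)
```

With a single-register source the overlapping register is always the group's base register, which matches the reasoning given in both commit messages.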
Sanjay Patel
28aa6ad4db [x86] add test for sext-of-setcc; NFC 2021-06-01 11:12:52 -04:00
Xun Li
1825d50d24 Simplify coro-zero-alloca.ll
D101841 added this test. It appears to produce different output on different platforms.
Make it call only -coro-split instead of the entire O2 pipeline to simplify the test flow.
Hopefully this will make the test more robust.

Reviewed By: djtodoro

Differential Revision: https://reviews.llvm.org/D103418
2021-06-01 08:12:35 -07:00
Alexey Bataev
8c3d0ae3df [SLP]Better detection of perfect/shuffles matches for gather nodes.
Implemented a better scheme for perfect/shuffled matches of the gather
nodes, which fixes the performance regressions introduced by
earlier patches. Started detecting matches for broadcast nodes and
extractelement gathering.

Differential Revision: https://reviews.llvm.org/D102920
2021-06-01 07:08:07 -07:00
gbreynoo
081a4bd35e [llvm-dwarfdump][test] Add missing dedicated tests for some options
This change adds tests specifically for --parent-recurse-depth, --quiet
and -o. The test for -o found a typo in an error message which is also
fixed in this change.

Differential Revision: https://reviews.llvm.org/D103250
2021-06-01 14:57:00 +01:00
Daniil Seredkin
764783428b [InstCombine] Relax constraints of uses for exp(X) * exp(Y) -> exp(X + Y)
InstCombine didn't perform the transformation when fmul's operands were
the same instruction, because it required each operand to have one use,
which is false in that case. This patch fixes that, adds tests for it,
and introduces a new function isOnlyUserOfAnyOperand to check these cases
in a single place.

This patch is a result of discussion in D102574.

Differential Revision: https://reviews.llvm.org/D102698
2021-06-01 08:33:23 -04:00
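The identity behind the fold above, including the case where both operands are the same value, can be checked numerically. This is a minimal sketch with hypothetical function names; the two forms agree only up to rounding, which is why a fold like this is generally acceptable only when relaxed floating-point semantics are allowed.

```python
import math


def product_of_exps(x: float, y: float) -> float:
    return math.exp(x) * math.exp(y)  # exp(X) * exp(Y)


def exp_of_sum(x: float, y: float) -> float:
    return math.exp(x + y)  # exp(X + Y)


# Distinct operands, and the "same instruction used twice" case (x == y).
for x, y in [(0.5, 1.25), (1.5, 1.5), (-2.0, 3.0)]:
    assert math.isclose(product_of_exps(x, y), exp_of_sum(x, y), rel_tol=1e-12)
```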
Florian Hahn
f40584cef6 [LoopDeletion] Consider infinite loops alive, unless mustprogress.
The current loop or any of its sub-loops may be infinite. Unless the
function or the loops are marked as mustprogress, this in itself makes
the loop *not* dead.

This patch moves the logic to check whether the current loop is finite
or mustprogress to `isLoopDead` and also extends it to check the
sub-loops. This should fix PR50511.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D103382
2021-06-01 13:07:36 +01:00
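The rule described in the LoopDeletion commit above can be sketched as a recursive check over a loop and its sub-loops. This is a toy model only: the `Loop` class and `passes_infinite_loop_check` are hypothetical, and the real `isLoopDead` performs further checks that are not modelled here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Loop:
    known_finite: bool          # assumption: some analysis proved termination
    mustprogress: bool = False  # loop-level mustprogress marking
    subloops: List["Loop"] = field(default_factory=list)


def passes_infinite_loop_check(loop: Loop, function_mustprogress: bool = False) -> bool:
    """A possibly infinite loop counts as alive unless mustprogress applies."""
    if not (loop.known_finite or loop.mustprogress or function_mustprogress):
        return False
    return all(passes_infinite_loop_check(sub, function_mustprogress)
               for sub in loop.subloops)


outer = Loop(known_finite=True, subloops=[Loop(known_finite=False)])
assert not passes_infinite_loop_check(outer)                          # inner loop may spin forever
assert passes_infinite_loop_check(outer, function_mustprogress=True)  # mustprogress lifts the restriction
```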
Sanjay Patel
5888c31732 [SDAG] add helper function for sext-of-setcc folds; NFC
Try to make this easier to read as noted in D103280
2021-06-01 08:07:17 -04:00
Florian Hahn
00bbac35a4 [VectorCombine] Freeze index unless it is known to be non-poison.
If the index itself is already poison, the poison propagates through
instructions clamping the index to a valid range. This still
introduces a load of poison, as flagged by Alive2 and pointed out
at 575e2aff5574.

This patch updates the code to freeze the index, unless it is proven to
not be poison.

Reviewed By: nlopes

Differential Revision: https://reviews.llvm.org/D103378
2021-06-01 10:40:57 +01:00
Fraser Cormack
9fec15e2c5 [RISCV] Support vector types in combination with fastcc
This patch extends the RISC-V lowering of the 'fastcc' calling
convention to vector types, both fixed-length and scalable. Without this
patch, any function passing or returning vector types by value would
throw a compiler error.

Vectors are handled in 'fastcc' much as they are in the default calling
convention, the noticeable difference being the extended set of scalar
GPR registers that can be used to pass vectors indirectly.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D102505
2021-06-01 10:31:18 +01:00
Andy Wingo
a2b88794ad [WebAssembly][CodeGen] IR support for WebAssembly local variables
This patch adds TargetStackID::WasmLocal.  This stack holds locations of
values that are only addressable by name -- not via a pointer to memory.
For the WebAssembly target, these objects are lowered to WebAssembly
local variables, which are managed by the WebAssembly run-time and are
not addressable by linear memory.

For the WebAssembly target, IR indicates that an AllocaInst should be put
on TargetStackID::WasmLocal by putting it in the non-integral address
space WASM_ADDRESS_SPACE_WASM_VAR, with value 1.  SROA will mostly lift
these allocations to SSA locals, but any alloca that reaches instruction
selection (usually in non-optimized builds) will be assigned the new
TargetStackID there.  Loads and stores to those values are transformed
to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes,
which then lower to the type-specific LOCAL_GET_I32 etc instructions via
tablegen patterns.

Differential Revision: https://reviews.llvm.org/D101140
2021-06-01 11:31:39 +02:00
Florian Hahn
49cafe1d7b [VectorCombine] Add tests with multiple noundef indices for scalarization. 2021-06-01 10:17:50 +01:00
Douglas Yung
1ab0dec8b7 Mark test as requiring asserts. 2021-06-01 02:01:01 -07:00
Roman Lebedev
0573b00888 [X86] AMD Zen 3 has fast variable per-lane shuffles
... but lane-crossing shuffles are slow.
2021-06-01 10:46:05 +03:00
Roman Lebedev
19a9e819da [X86] Split FeatureFastVariableShuffle tuning into Lane-Crossing and Per-Lane variants
Currently, X86 backend only has a global one-size-fits-all `FeatureFastVariableShuffle` feature,
which controls profitability of both the cross-lane and per-lane variable shuffles.
I guess, this has been fine so far.

But at least on AMD Zen 3, per-lane variable shuffles (e.g. `VPSHUFB`)
are as fast as shuffles with a fixed/immediate mask,
while lane-crossing shuffles (e.g. `VPERMPS`) perform worse.

So to get the benefits of variable-mask shuffles, but not the drawbacks of lane-crossing shuffles,
as suggested by @RKSimon, split the feature flag into two.

Differential Revision: https://reviews.llvm.org/D103274
2021-06-01 10:39:36 +03:00
Martin Storsjö
89b11af641 [libcxx] [test] Fix the _supportsVerify check on Windows by fixing quoting
The pipes.quote function quotes using single quotes, the same goes
for the newer shlex.quote (which is the preferred form in Python 3).
This isn't suitable for quoting in command lines on Windows (and the
documentation for shlex.quote even says it's only usable for Unix
shells).

In general, the python subprocess.list2cmdline function should do
proper quoting for the platform's current shell. However, it doesn't
quote the ';' char, which we pass within some arguments to run.py.
Therefore use the custom reimplementation from lit.TestRunner which
is amended to quote ';' too.

The fact that arguments were quoted with single quotes didn't matter
for command lines that were executed by either bash or the lit internal
shell, but if executing things directly using subprocess.call, as in
_supportsVerify, the quoted path to %{cxx} fails to be resolved by the
Windows shell.

This unlocks 114 tests that previously were skipped on Windows.

Differential Revision: https://reviews.llvm.org/D103310
2021-06-01 09:51:41 +03:00
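The quoting difference described in the libcxx commit above can be observed directly with the Python standard library. This is a small illustration, not the code from the patch; the compiler path and the `a;b` argument are made-up example values.

```python
import shlex
import subprocess

# Hypothetical compiler path containing a space.
path = r"C:\Program Files\LLVM\bin\clang++.exe"

posix_quoted = shlex.quote(path)                   # POSIX-shell style: single quotes
windows_quoted = subprocess.list2cmdline([path])   # Windows style: double quotes

# list2cmdline only adds quotes for whitespace or empty arguments,
# so an argument containing ';' is passed through unquoted.
semicolon_quoted = subprocess.list2cmdline(["a;b"])

print(posix_quoted)
print(windows_quoted)
print(semicolon_quoted)
```

The single-quoted form is what a Windows shell fails to interpret as a quoted path, and the unquoted `;` is the gap that the commit message says the lit.TestRunner reimplementation closes.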
Serge Pavlov
fbabf77db8 [PowerPC] Split tests for constrained intrinsics
The test CodeGen/PowerPC/vector-constrained-fp-intrinsics.ll checks code
generation for constrained floating point intrinsics. Many test cases in
it were implemented using operations on constants. Constant folding of
constrained intrinsics would make these test cases almost useless,
because they would check only constant loading.

To keep the tests useful, operations on constants were replaced with
operations on function parameters.

Differential Revision: https://reviews.llvm.org/D103259
2021-06-01 12:30:17 +07:00
Max Kazantsev
20503a71a6 [Test] Add one more loop deletion irreducible CFG test 2021-06-01 11:11:15 +07:00
Nathan Chancellor
048a1209a8 Revert "[InstCombine] Fix miscompile on GEP+load to icmp fold (PR45210)"
This reverts commit 4f2fd3818b0eb26806f366bc37369349aeedcaf9.

The Linux kernel fails to build after this commit. See
https://reviews.llvm.org/D99481 for a reproducer.

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2021-05-31 20:21:26 -07:00
Arthur Eubanks
8f3353aa63 [OpaquePtr] Remove some uses of PointerType::getElementType() 2021-05-31 16:11:25 -07:00
Albion Fung
ad749c2d32 [PowerPC] Improve f32 to i32 bitcast code gen
The code gen for f32 to i32 bitcast is currently not the most efficient;
this patch removes some unnecessary generated instructions.

Differential revision: https://reviews.llvm.org/D100782
2021-05-31 16:00:58 -05:00
Congzhe Cao
d250d24521 [LoopInterchange] Handle movement of reduction phis appropriately
This patch fixes pr43326 and pr48212.

Currently when we move reduction phis to the right place,
loop interchange assumes the first phi in loop headers is
an induction phi, skips the first phi and assumes the rest
of phis are candidate reduction phis to move. However, it
may not always be the case.

This patch loops over all phis in loop headers and considers
a phi node as a candidate reduction phi to move only when it
is indeed a reduction phi across outer and inner loop.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D102743
2021-05-31 16:27:38 -04:00
Florian Hahn
71c6cc7ea0 [LoopDeletion] Add additional test cases with more nested loops.
Also remove mustprogress function attribute from one of the tests

Extends test coverage for D103382.
2021-05-31 20:27:07 +01:00
Florian Hahn
fd8a91542c [LV] Try to sink users recursively for first-order recurrences.
Update isFirstOrderRecurrence to  explore all uses of a recurrence phi
and check if we can sink them. If there are multiple users to sink, they
are all mapped to the previous instruction.

Fixes PR44286 (and another PR or two).

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D84951
2021-05-31 19:55:33 +01:00
Nico Weber
13cf3435ae [gn build] make libunwind build on macOS 2021-05-31 13:17:16 -04:00
Arthur Eubanks
cfc7e26853 [OpaquePtr] Clean up some uses of Type::getPointerElementType()
These depend on pointee types.
2021-05-31 09:54:57 -07:00
Arthur Eubanks
8967f0ab59 [test] Properly match parameter/argument ABI attributes
These were found with D103412.
2021-05-31 09:12:18 -07:00
Andrea Di Biagio
44ebb7579c [MCA][NFCI] Minor changes to InstrBuilder and Instruction.
This is based on the assumption that most simulated instructions don't define
more than one or two registers. This is true for example on x86, where
most instruction definitions don't declare more than one register write.

The default code region size has been increased from 8 to 16. This is based on
the assumption that, for small microbenchmarks, the typical code snippet size is
often less than 16 instructions.

mca::Instruction now uses bitfields to pack flags.
No functional change intended.
2021-05-31 17:05:13 +01:00
Arthur Eubanks
0bb0030f11 [test] Fix addr-label.ll after D99707
Needs REQUIRES.
2021-05-31 09:02:07 -07:00
Arthur Eubanks
7178161bf7 Remove "Rewrite Symbols" from codegen pipeline
It breaks up the function pass manager in the codegen pipeline.

With empty parameters, it looks at the -mllvm flag -rewrite-map-file.
This is likely not in use.

Add a check that we only have one function pass manager in the codegen
pipeline.

Some tests relied on the fact that we had a module pass somewhere in the
codegen pipeline.

addr-label.ll crashes on ARM due to this change. This is because a
ARMConstantPoolConstant containing a BasicBlock to represent a
blockaddress may hold an invalid pointer to a BasicBlock if the
blockaddress is invalidated by its BasicBlock getting removed. In that
case all referencing blockaddresses are RAUW a constant int. Making
ARMConstantPoolConstant::CVal a WeakVH fixes the crash, but I'm not sure
that's the right fix. As a workaround, create a barrier right before
ISel so that IR optimizations can't happen while a
ARMConstantPoolConstant has been created.

Reviewed By: rnk, MaskRay, compnerd

Differential Revision: https://reviews.llvm.org/D99707
2021-05-31 08:32:36 -07:00
Anirudh Prasad
37fdc69b24 [AsmParser][SystemZ][z/OS] Introducing HLASM Parser support to AsmParser - Part 2
- This patch is the second (and hopefully final) part of providing HLASM syntax for inline asm statements for z/OS to LLVM (continuing on from https://reviews.llvm.org/D98276)
- This second part deals with providing label support
- As mentioned in https://reviews.llvm.org/D98276, if the first token is not a space we process the first token as a label, and the remaining tokens as a possible machine instruction
- To achieve this, a new `parseAsHLASMLabel` function is introduced. This function processes the first token, validates whether it is an "acceptable" label according to HLASM standards, and then emits it
- After handling and emitting the label, call the `parseAsMachineInstruction` function to process the remaining tokens as a machine instruction.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D103320
2021-05-31 11:27:02 -04:00
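The label rule described in the HLASM commit above (a statement that does not start with a space begins with a label) can be sketched as below. This is a simplification under stated assumptions: tokens are split on whitespace, no validation of the label against HLASM rules is attempted, and `split_hlasm_statement` is a hypothetical name unrelated to the parser's actual functions.

```python
from typing import List, Optional, Tuple


def split_hlasm_statement(line: str) -> Tuple[Optional[str], List[str]]:
    """Split a statement into (label, remaining tokens).

    If the line starts with a space there is no label; otherwise the
    first token is treated as the label and the rest as a possible
    machine instruction.
    """
    tokens = line.split()
    if not line or line[0] == " ":
        return None, tokens
    return tokens[0], tokens[1:]


assert split_hlasm_statement("LOOP LR 1,2") == ("LOOP", ["LR", "1,2"])
assert split_hlasm_statement(" LR 1,2") == (None, ["LR", "1,2"])
```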
Daniil Fukalov
47fca931a0 [NFC] MemoryDependenceAnalysis cleanup.
1. Removed redundant includes.
2. Removed `releaseMemory()`, which was never defined or used.
3. Fixed the first-letter case of member function names.
4. Renamed duplicate (in nested struct `NonLocalPointerInfo`) name
   `NonLocalDeps` to `NonLocalDepsMap`.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D102358
2021-05-31 18:07:55 +03:00
Sanjay Patel
4f81b668e8 [SDAG] add check to sext-of-setcc fold to bypass changing a legal op
I accidentally pushed a draft of D103280 that was discussed
during the review, but it was not supposed to be the final
version.

Rather than revert and recommit, I'm updating the existing
code. This way we have a record of the codegen diff that
would result if we decide to remove this predicate in the
future.
2021-05-31 08:58:11 -04:00
Roman Lebedev
471f397407 [NFC] ScalarEvolution: apply SSO to the ExprValueMap value
ExprValueMap is a map from SCEV * to a set-vector of (Value *, ConstantInt *) pair,
and while the map itself will likely be big-ish (have many keys),
it is a reasonable assumption that each key will refer to a small-ish
number of pairs.

In particular looking at n=512 case from
https://bugs.llvm.org/show_bug.cgi?id=50384,
the small-size of 4 appears to be the sweet spot,
it results in the least allocations while minimizing memory footprint.
```
$ for i in $(ls heaptrack.opt.*.gz); do echo $i; heaptrack_print $i | tail -n 6; echo ""; done
heaptrack.opt.0-orig.gz
total runtime: 14.32s.
calls to allocation functions: 8222442 (574192/s)
temporary memory allocations: 2419000 (168924/s)
peak heap memory consumption: 190.98MB
peak RSS (including heaptrack overhead): 239.65MB
total memory leaked: 67.58KB

heaptrack.opt.1-n1.gz
total runtime: 13.72s.
calls to allocation functions: 7184188 (523705/s)
temporary memory allocations: 2419017 (176338/s)
peak heap memory consumption: 191.38MB
peak RSS (including heaptrack overhead): 239.64MB
total memory leaked: 67.58KB

heaptrack.opt.2-n2.gz
total runtime: 12.24s.
calls to allocation functions: 6146827 (502355/s)
temporary memory allocations: 2418997 (197695/s)
peak heap memory consumption: 163.31MB
peak RSS (including heaptrack overhead): 211.01MB
total memory leaked: 67.58KB

heaptrack.opt.3-n4.gz
total runtime: 12.28s.
calls to allocation functions: 6068532 (494260/s)
temporary memory allocations: 2418985 (197017/s)
peak heap memory consumption: 155.43MB
peak RSS (including heaptrack overhead): 201.77MB
total memory leaked: 67.58KB

heaptrack.opt.4-n8.gz
total runtime: 12.06s.
calls to allocation functions: 6068042 (503321/s)
temporary memory allocations: 2418992 (200646/s)
peak heap memory consumption: 166.03MB
peak RSS (including heaptrack overhead): 213.55MB
total memory leaked: 67.58KB

heaptrack.opt.5-n16.gz
total runtime: 12.14s.
calls to allocation functions: 6067993 (499958/s)
temporary memory allocations: 2418999 (199307/s)
peak heap memory consumption: 187.24MB
peak RSS (including heaptrack overhead): 233.69MB
total memory leaked: 67.58KB
```

While that test may be an edge worst-case scenario,
https://llvm-compile-time-tracker.com/compare.php?from=dee85d47d9f15fc268f7b18f279dac2774836615&to=98a57e31b1947d5bcdf4a5605ac2ab32b4bd5f63&stat=instructions
agrees that this also results in improvements in the usual situations.
2021-05-31 15:34:03 +03:00
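The "sweet spot" claim in the ScalarEvolution commit above can be read off the heaptrack figures with a few lines of arithmetic. The numbers below are copied from that output; the `small_size` value of `None` for the `orig` run is an assumption meaning "no inline storage", and the variable names are hypothetical.

```python
# (label, inline small-size, calls to allocation functions, peak heap in MB)
RUNS = [
    ("orig", None, 8222442, 190.98),
    ("n1", 1, 7184188, 191.38),
    ("n2", 2, 6146827, 163.31),
    ("n4", 4, 6068532, 155.43),
    ("n8", 8, 6068042, 166.03),
    ("n16", 16, 6067993, 187.24),
]


def relative_reduction(before: float, after: float) -> float:
    return (before - after) / before


baseline = RUNS[0]
best = min(RUNS, key=lambda run: run[3])  # lowest peak heap consumption

alloc_reduction = relative_reduction(baseline[2], best[2])
heap_reduction = relative_reduction(baseline[3], best[3])

print(f"best run: {best[0]}")
print(f"allocation calls reduced by {alloc_reduction:.1%}")
print(f"peak heap reduced by {heap_reduction:.1%}")
```

Larger inline sizes barely reduce the allocation count further but raise peak heap again, which is consistent with the choice of 4 described in the commit message.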
Alexey Lapshin
7ec130df24 [llvm-objcopy][NFC] Refactor CopyConfig structure - remove lazy options processing.
While reviewing D102277, it was decided to remove lazy options processing
from llvm-objcopy CopyConfig structure. This patch transforms processing of ELF
lazy options into the in-place processing.

Differential Revision: https://reviews.llvm.org/D103260
2021-05-31 14:40:27 +03:00
Sanjay Patel
dceff7ed2b [SDAG] try harder to fold casts into vector compare
sext (vsetcc X, Y) --> vsetcc (zext X), (zext Y) --
(when the zexts are free and a bunch of other conditions)

We have a couple of similar folds to this already for vector selects,
but this pattern slips through because it is only a setcc.

The tests are based on the motivating case from:
https://llvm.org/PR50055
...but we need extra logic to get that example, so I've left that as
a TODO for now.

Differential Revision: https://reviews.llvm.org/D103280
2021-05-31 07:14:01 -04:00
Djordje Todorovic
c362f0a17e [LiveDebugVariables] Stop trimming locations of non-inlined vars
D35953, D62650 and D73691 introduced trimming of variable locations
in the LiveDebugVariables pass, because in some cases the number of
DBG_VALUEs created for some inlined variables exploded after
virtregrewrite. All the problematic cases appear to involve
inlined variables, so it seems reasonable to stop trimming the location
ranges for non-inlined variables.
This has a very good impact on the llvm-locstats report.

Differential Revision: https://reviews.llvm.org/D102917
2021-05-31 02:59:19 -07:00
Fraser Cormack
0cd9460e0b [RISCV] Scale scalably-typed split argument offsets by VSCALE
This patch fixes a bug in lowering scalable-vector types in RISC-V's
main calling convention. When scalable-vector types are split and passed
indirectly, the target is responsible for scaling the offset --
initially set to the known-minimum store size -- by the scalable factor.

Before this we were issuing overlapping loads or stores to the different
parts, leading to incorrect codegen.

Credit to @HsiangKai for spotting this.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D103262
2021-05-31 10:43:13 +01:00
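The overlap described in the RISC-V split-argument commit above follows from simple offset arithmetic. The sketch below is a toy model with hypothetical names and example sizes: each part of a split scalable vector occupies `known_min_store_size * vscale` bytes at run time, so offsets computed from the known-minimum size alone place the parts too close together.

```python
from typing import List


def part_offsets(num_parts: int, known_min_store_size: int,
                 vscale: int, scale_by_vscale: bool) -> List[int]:
    """Byte offsets of the parts of a split scalable-vector argument."""
    step = known_min_store_size * (vscale if scale_by_vscale else 1)
    return [i * step for i in range(num_parts)]


def parts_overlap(offsets: List[int], actual_size: int) -> bool:
    """True if any part starts before the previous part has ended."""
    return any(b - a < actual_size for a, b in zip(offsets, offsets[1:]))


vscale = 2                  # example run-time scale factor
min_size = 16               # example known-minimum store size in bytes
actual = min_size * vscale  # bytes each part really occupies

assert parts_overlap(part_offsets(2, min_size, vscale, scale_by_vscale=False), actual)
assert not parts_overlap(part_offsets(2, min_size, vscale, scale_by_vscale=True), actual)
```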