1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00
Commit Graph

215079 Commits

Author SHA1 Message Date
Craig Topper
3b8bc4cd8f [RISCV] Store SEW in RISCV vector pseudo instructions in log2 form.
This shrinks the immediate that isel table needs to emit for these
instructions. Hoping this allows me to change OPC_EmitInteger to
use a better variable length encoding for representing negative
numbers. Similar to what was done a few months ago for OPC_CheckInteger.

The alternative encoding uses less bytes for negative numbers, but
increases the number of bytes need to encode 64 which was a very
common number in the RISCV table due to SEW=64. By using Log2 this
becomes 6 and is no longer a problem.
2021-05-02 12:09:20 -07:00
Arthur Eubanks
2b3350003c [NFC] Use Aliasee to determine Type and AddrSpace in GlobalAlias::create()
As opposed to going through the Aliasee type.

For opaque pointers, we're trying to remove uses of PointerType::getElementType().

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D101715
2021-05-02 11:50:08 -07:00
Florian Hahn
b7eb3dfd3b [VPlan] Add VPBasicBlock::phis() helper (NFC).
This patch introduces a helper to obtain an iterator range for the
PHI-like recipes in a block.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D100101
2021-05-02 19:20:13 +01:00
Nikita Popov
afad6987f7 [SCEV] Add test for non-unit stride with multiple exits (NFC)
We currently can't determine any exit counts here, because there
is no "controlling exit".
2021-05-02 18:14:05 +02:00
Juneyoung Lee
a6a81be470 [InstCombine] Add a few more patterns for folding select of select
This is a patch that folds select of select to salvage some optimizations after select -> and/or folding is disabled.

```
select (select a, true, b), c, false -> select a, c, false
select c, (select a, true, b), false -> select c, a, false
  if c implies that b is false (isImpliedCondition).
```
https://alive2.llvm.org/ce/z/ANatjt, https://alive2.llvm.org/ce/z/rv8zTB

```
sel (sel c, a, false), true, (sel !c, b, false) -> sel c, a, b
sel (sel !c, a, false), true, (sel c, b, false) -> sel c, b, a
```
https://alive2.llvm.org/ce/z/U2kp-t, https://alive2.llvm.org/ce/z/bc88EE

See D101191

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D101375
2021-05-02 19:00:42 +09:00
Juneyoung Lee
5ffbd4dbc2 [InstCombine] Precommit tests for D101375 (NFC) 2021-05-02 19:00:42 +09:00
Juneyoung Lee
33fda542f4 Fix MSan crash after 1977c53b 2021-05-02 13:44:43 +09:00
Arthur Eubanks
0086e8ade4 [NFC] Use getParamByValType instead of pointee type
To reduce dependence on pointee types for opaque pointers.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D101706
2021-05-01 21:22:41 -07:00
Juneyoung Lee
0884a1465d run update_test_checks.py for the tests in D101191 (NFC)
This is an NFC that reruns update_test_checks.py on the tests that are
going to be updated in D101191.
2021-05-02 13:11:57 +09:00
Juneyoung Lee
b4b7d6d0cc [ValueTracking] ctpop propagates poison
This is a patch that adds ctpop intrinsics to propagatesPoison.

Splitted from D101191
2021-05-02 13:04:37 +09:00
Juneyoung Lee
75528017f8 [ValueTracking] Improve impliesPoison to look into overflow intrinsics
This update supports the following transformation:

```
select(extract(mul_with_overflow(a, _), _), (a == 0), false)
=>
and(extract(mul_with_overflow(a, _), _), (a == 0))
```

which is correct because if `a` was poison the select's condition was
also poison.

This update is splitted from D101423.
2021-05-02 12:03:55 +09:00
LLVM GN Syncbot
6038981d86 [gn build] Port 1977c53b2ae4 2021-05-02 02:55:47 +00:00
Juneyoung Lee
c112624b0b [InstCombine] Fold overflow bit of [u|s]mul.with.overflow in a poison-safe way
As discussed in D101191, this patch adds a poison-safe folding of overflow bit check:
```
  %Op0 = icmp ne i4 %X, 0
  %Agg = call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %Y)
  %Op1 = extractvalue { i4, i1 } %Agg, 1
  %ret = select i1 %Op0, i1 %Op1, i1 false
=>
  %Y.fr = freeze %Y
  %Agg = call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %Y.fr)
  %Op1 = extractvalue { i4, i1 } %Agg, 1
  %ret = %Op1
```

https://alive2.llvm.org/ce/z/zgPUGT
https://alive2.llvm.org/ce/z/h2gZ_6

Note that there are cases where inserting freeze is not necessary: e.g. %Y is `noundef`.
In this case, LLVM is already good because `%ret` is already successfully folded into `and`,
triggering the pre-existing optimization in InstSimplify: https://godbolt.org/z/v6qena15K

Differential Revision: https://reviews.llvm.org/D101423
2021-05-02 11:54:12 +09:00
Juneyoung Lee
e820cde371 [InstCombine] Precommit tests for D101423 (NFC) 2021-05-02 11:41:47 +09:00
Harald van Dijk
9a4efe9fab [X32][CET] Fix size and alignment of .note.gnu.property section
X32 uses 32-bit ELF object files with 32-bit alignment, so the
.note.gnu.property section needs to be emitted as it is for X86.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D101689
2021-05-01 22:17:04 +01:00
Nikita Popov
adfe3b159b [LVI] Handle mask not equal zero conditions
If V & Mask != 0, we know that at least one of the bits in Mask
must be set, so the value must be >= the lowest bit in Mask.
2021-05-01 23:08:49 +02:00
Nikita Popov
1339e10140 [CVP] Add tests for mask not equal zero guard (NFC) 2021-05-01 23:08:48 +02:00
Roman Lebedev
fed21911ba [X86] AMD Zen 3 Scheduler Model
Introduce basic schedule model for AMD Zen 3 CPU's, a.k.a `znver3`.

This is fully built from scratch, from llvm-mca measurements
and documented reference materials.
Nothing was copied from `znver2`/`znver1`.

I believe this is in a reasonable state of completion for inclusion,
probably better than D52779 `bdver2` was :)

Namely:
* uops are pretty spot-on (at least what llvm-mca can measure)
  {F16422596}
* latency is also pretty spot-on (at least what llvm-mca can measure)
  {F16422601}
* throughput is within reason
  {F16422607}

I haven't run much benchmarks with this,
however RawSpeed benchmarks says this is beneficial:
{F16603978}
{F16604029}

I'll call out the obvious problems there:
* i didn't really bother with X87 instructions
* i didn't really bother with obviously-microcoded/system instructions
* There are large discrepancy in throughput for `mr` and `rm` instructions.
  I'm not really sure if it's a modelling defect that needs to be fixed,
  or it's a defect of measurments.
* Pipe distributions are probably bad :)
  I can't do much here until AMD allows that to be fixed
  by documenting the appropriate counters and updating libpfm

That being said, as @RKSimon notes:
>>! In D94395#2647381, @RKSimon wrote:
> I'll mention again that all the znver* models appear to be very inaccurate wrt SIMD/FPU instructions <...>
so how much worse this could possibly be?!

Things that aren't there:
* Various tunings: zero idioms, etc. That is follow-ups.

Differential Revision: https://reviews.llvm.org/D94395
2021-05-01 22:08:13 +03:00
Nikita Popov
b39114fa53 [SCEV] Simplify backedge count clearing (NFC)
This seems to be a leftover from when the BackedgeTakenInfo
stored multiple exit counts with manual memory management. At
some point this was switchted to a simple vector, and there should
be no need to micro-manage the clearing anymore. We can simply
drop the loop from the map and the the destructor do its job.
2021-05-01 17:50:01 +02:00
Nikita Popov
26eb9e64e1 [IndVars] Remove redundant loop invariance check (NFC)
This is checked again directly below this condition.
2021-05-01 17:22:00 +02:00
LemonBoy
f58a03724a [AArch64] Prevent spilling between ldxr/stxr pairs
Apply the same logic used to check if CMPXCHG nodes should be expanded
at -O0: the register allocator may end up spilling some register in
between the atomic load/store pairs, breaking the atomicity and possibly
stalling the execution.

Fixes PR48017

Reviewed By: efriedman

Differential Revision: https://reviews.llvm.org/D101163
2021-05-01 17:17:05 +02:00
Nikita Popov
0c935681c7 [SCEV] Add tests for and/or loop guards (NFC) 2021-05-01 17:10:23 +02:00
LemonBoy
c556195c00 [NFC][ARM] Regenerate arm64-atomic.ll test
Pre-requisite for D101163, the `NOLSE-0O` case shows registers being
spilled inside the rmw loop.

Use two separate prefixes for the `LSE-O0` case as some outputs differ
only by a comment that update_llc_test_checks.py ignores but lit does
not, causing the test to fail unexpectedly when run.
2021-05-01 16:29:03 +02:00
LemonBoy
2056e830c7 Revert "[NFC][ARM] Regenerate arm64-atomic.ll test"
This reverts commit d9856b12f2be257f1b7aaccde71eb0421a1aaaf3.
2021-05-01 13:00:45 +02:00
LemonBoy
72ba5f53d1 [NFC][ARM] Regenerate arm64-atomic.ll test
Pre-requisite for D101163, the NOLSE-0O case shows registers being
spilled inside the rmw loop.
2021-05-01 12:09:14 +02:00
Nikita Popov
7ce20b169c [InstCombine] Add eq-of-parts tests using or (NFC)
or-ne is the conjugated pattern for and-eq.
2021-05-01 11:57:09 +02:00
Nathan Chancellor
58eb64c4bd Revert "Re-reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands""
This reverts commit 791930d74087b8ae8901172861a0fd21a211e436, as per
https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy.

I observed breakage with the Linux kernel, as reported at
https://reviews.llvm.org/D91722#2724321

Fixes exist at
https://reviews.llvm.org/D101523
https://reviews.llvm.org/D101540

but they have not landed so to unbreak the tree for the weekend, revert
this commit.

Commit b11e4c990771 ("Revert "[DebugInfo] Drop DBG_VALUE_LISTs with an
excessive number of debug operands"") only reverted one follow-up fix,
not the original patch that broke the kernel.

e
2021-04-30 20:23:21 -07:00
Nemanja Ivanovic
315bbd239b [PowerPC] Add missing requirement to test case
Commit 70c433a184a54819835e54c62c3e6891e7069861 added this
test case that has -stop-before that mentions a pass that is
only added for non-release builds. Add the requirement for asserts.
2021-04-30 19:36:58 -05:00
LLVM GN Syncbot
43818b0f1b [gn build] Port 02c5ba867987 2021-05-01 00:09:37 +00:00
Adrian Prantl
32b1ee25e7 Revert "[VP,Integer,#2] ExpandVectorPredication pass"
This reverts commit 43bc584dc05e24c6d44ece8e07d4bff585adaf6d.

The commit broke the -DLLVM_ENABLE_MODULES=1 builds.

http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/31603/consoleFull#2136199809a1ca8a51-895e-46c6-af87-ce24fa4cd561
2021-04-30 17:02:28 -07:00
Nick Desaulniers
af68671197 Revert "[DebugInfo] Drop DBG_VALUE_LISTs with an excessive number of debug operands"
This reverts commit b623df3c93983c4512aa54f2c706716bdf865a90, as per
https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy.

Breakages observed downstream reported in:
https://reviews.llvm.org/D91722#2724321

Fixes exist in:
https://reviews.llvm.org/D101523
https://reviews.llvm.org/D101540

but haven't landed yet going into the weekend.
2021-04-30 16:45:37 -07:00
George Balatsouras
db10f9f59c [dfsan] Fix origin tracking for fast8
The problem is the following. With fast8, we broke an important
invariant when loading shadows.  A wide shadow of 64 bits used to
correspond to 4 application bytes with fast16; so, generating a single
load was okay since those 4 application bytes would share a single
origin.  Now, using fast8, a wide shadow of 64 bits corresponds to 8
application bytes that should be backed by 2 origins (but we kept
generating just one).

Let’s say our wide shadow is 64-bit and consists of the following:
0xABCDEFGH. To check if we need the second origin value, we could do
the following (on the 64-bit wide shadow) case:

 - bitwise shift the wide shadow left by 32 bits (yielding 0xEFGH0000)
 - push the result along with the first origin load to the shadow/origin vectors
 - load the second 32-bit origin of the 64-bit wide shadow
 - push the wide shadow along with the second origin to the shadow/origin vectors.

The combineOrigins would then select the second origin if the wide
shadow is of the form 0xABCDE0000.  The tests illustrate how this
change affects the generated bitcode.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D101584
2021-04-30 15:57:33 -07:00
Jon Roelofs
738d110269 [EarlyIfConversion] Avoid producing selects with identical operands
This extends the early-ifcvt pass to avoid a few more cases where the resulting
select instructions would have matching operands.  Additionally, we now use TII
to determine "sameness" of the operands so that as TII gets smarter, so too
will ifcvt.

The attached test case was bugpoint-reduced down from CINT2000/252.eon in the
test-suite. See: https://clang.godbolt.org/z/WvnrcrGEn

Differential Revision: https://reviews.llvm.org/D101508
2021-04-30 15:51:14 -07:00
Jon Roelofs
0386626f22 [PowerPC] modernize test via update_llc_test_checks.py. NFC 2021-04-30 15:51:13 -07:00
Dávid Bolvanský
79c3f9a4d0 [X86] Promote 16-bit CTTZ_ZERO_UNDEF to 32-bit variant
Related to PR50172.

Protects us against regressions after we will start doing cttz(zext(x)) -> zext(cttz(x)) transformation in the middle-end.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D101662
2021-05-01 00:42:15 +02:00
Jon Roelofs
8f8ffa214f Revert "[EarlyIfConversion] Avoid producing selects with identical operands"
This reverts commit 3d27b5d28aabf8516aa1fefc78a6878b89a992f0.

Broke one of the PPC tests, which I didn't see because I usually build with
only the x86/AARch64 targets enabled... oops.

https://lab.llvm.org/buildbot#builders/109/builds/13834

llvm/test/CodeGen/PowerPC/expand-foldable-isel.ll
2021-04-30 14:55:34 -07:00
Amara Emerson
ccca686898 [AArch64][GlobalISel] Use a single MachineIRBuilder for most of isel. NFC.
This is a long overdue cleanup. Not every use is eliminated, I stuck to uses
that were directly being called from select(), and not the render functions.

Differential Revision: https://reviews.llvm.org/D101590
2021-04-30 14:49:41 -07:00
Jon Roelofs
af8be2e6a5 [EarlyIfConversion] Avoid producing selects with identical operands
This extends the early-ifcvt pass to avoid a few more cases where the resulting
select instructions would have matching operands.  Additionally, we now use TII
to determine "sameness" of the operands so that as TII gets smarter, so too
will ifcvt.

The attached test case was bugpoint-reduced down from CINT2000/252.eon in the
test-suite. See: https://clang.godbolt.org/z/WvnrcrGEn

Differential Revision: https://reviews.llvm.org/D101508
2021-04-30 14:42:39 -07:00
Jez Ng
3cbe568195 [llvm-readobj] Recognize N_THUMB_DEF as a symbol flag
The right symbol flag mask is ~0x7, not ~0xf.

Also emit string names for the other flags (we were missing some).

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D101548
2021-04-30 17:39:56 -04:00
Gulfem Savrun Yeniceri
127755a578 [NewPM] Disable RelLookupTableConverter pass in LTO
Relative look table converter pass caused an issue when full lto
is enabled (reported in https://reviews.llvm.org/D94355).
This patch disables that pass from full lto pre-link phase optimization
pipeline until the issue is fixed.

Differential Revision: https://reviews.llvm.org/D101664
2021-04-30 21:23:40 +00:00
Jay Foad
b4ca76eab1 [AMDGPU] Remove set_gpr_idx instructions in conditional blocks
SIPreEmitPeephole did not try to remove redundant s_set_gpr_idx_*
instructions in blocks that end with a conditional branch instruction.
This seems like a simple oversight.

Differential Revision: https://reviews.llvm.org/D101629
2021-04-30 22:15:45 +01:00
Nikita Popov
8d71021388 [ValueTracking] Slightly clean up programUndefinedIfUndefOrPoison() (NFC)
Use contains() to check set membership, and adjust an oddly
structured loop.
2021-04-30 23:05:41 +02:00
Nikita Popov
1aee99e8ca [ValueTracking] Limit scan when checking poison UB (PR50155)
The current code can scan an unlimited number of instructions,
if the containing basic block is very large. The test case from
PR50155 contains a basic block with approximately 100k instructions.

To avoid this, limit the number of instructions we inspect. At
the same time, drop the limit on the number of basic blocks, as
this will be implicitly limited by the number of instructions as
well.
2021-04-30 23:04:49 +02:00
Justin Bogner
61b0781814 Add support for llvm.assume intrinsic to the LoadStoreVectorizer pass
Patch by Viacheslav Nikolaev. Thanks!
2021-04-30 13:39:46 -07:00
LLVM GN Syncbot
586fbcca5e [gn build] Port 2d28100bf2e4 2021-04-30 20:17:55 +00:00
Daniil Fukalov
3f292e8925 [TTI] NFC: Change getTypeLegalizationCost to return InstructionCost.
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: sdesmalen, kparzysz

Differential Revision: https://reviews.llvm.org/D101533
2021-04-30 22:51:51 +03:00
Dávid Bolvanský
8ceccff3f0 [InstCombine] Added tests for PR50172, NFC 2021-04-30 21:26:26 +02:00
Guozhi Wei
c7561ad6e9 [MachineFunction] Make comment for TracksLiveness more clearer
As discussed in https://lists.llvm.org/pipermail/llvm-dev/2021-April/150225.html,
the current comments for TracksLiveness property and isKill flag are confusing.
This patch makes the comments more clearer.

Differential Revision: https://reviews.llvm.org/D101500
2021-04-30 12:10:36 -07:00
Arthur Eubanks
956b0ff1ba [llvm-reduce] Don't unset dso_local on implicitly dso_local GVs
This introduces a flag that aborts if we ever reduce to IR that fails
the verifier.

Reviewed By: swamulism, arichardson

Differential Revision: https://reviews.llvm.org/D101279
2021-04-30 11:57:22 -07:00
Arthur Eubanks
62d0262970 [llvm-reduce] Add flag to only run specific passes
Reviewed By: fhahn, hans

Differential Revision: https://reviews.llvm.org/D101278
2021-04-30 11:51:01 -07:00