1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

16650 Commits

Author SHA1 Message Date
Roman Lebedev
e5adaf4abf [NFC][LoopIdiom] Add basic test coverage for 'left-shift until bittest` idiom 2020-11-08 22:35:41 +03:00
Sanjay Patel
de04fc5560 [InstSimplify] allow vector folds for (Pow2C << X) == NonPow2C
Existing pre-conditions seem to be correct:
https://rise4fun.com/Alive/lCLB

  Name: non-zero C1
  Pre: !isPowerOf2(C1) && isPowerOf2(C2) && C1 != 0
  %sub = shl i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false

  Name: one == C2
  Pre: !isPowerOf2(C1) && isPowerOf2(C2) && C2 == 1
  %sub = shl i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false

  Name: nuw
  Pre: !isPowerOf2(C1) && isPowerOf2(C2)
  %sub = shl nuw i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false

  Name: nsw
  Pre: !isPowerOf2(C1) && isPowerOf2(C2)
  %sub = shl nsw i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false
2020-11-08 09:52:05 -05:00
Sanjay Patel
0bda6c3b3e [InstSimplify] add tests for icmp with power-of-2 operand; NFC 2020-11-08 09:52:05 -05:00
Simon Pilgrim
2ab654ea1c [SLPVectorizer][X86] Remove unused check-prefixes 2020-11-08 14:03:55 +00:00
Simon Pilgrim
94abf31408 [GVN] Remove unused check-prefixes 2020-11-08 13:36:42 +00:00
Simon Pilgrim
3887b4f429 [InstCombine] Fix malformed CHECK32/64 test checks. 2020-11-08 13:34:07 +00:00
Simon Pilgrim
da7a617b84 [InstCombine] Remove unused check-prefixes
Just use default CHECK
2020-11-08 13:33:21 +00:00
Simon Pilgrim
ef176bfe6c [PhaseOrdering] Remove unused check-prefixes
Just use default CHECK in most cases.
2020-11-08 13:30:18 +00:00
Simon Pilgrim
6d91edb95a [PhaseOrdering][X86] Remove unused check-prefixes
Just use default CHECK
2020-11-08 13:20:17 +00:00
Simon Pilgrim
3bc6fadf68 [InstCombine] foldSelectFunnelShift - block poison in funnel shift value
As raised by @nlopes on D90382 - if this is not a rotate then the select was blocking poison from the 'shift-by-zero' non-TVal, but a funnel shift won't - so freeze it.
2020-11-08 12:58:30 +00:00
Florian Hahn
738feb9acf [LoopInterchange] Skip non SCEV-able operands in cost function.
This fixes a crash when trying to get a SCEV expression for operands
that are not SCEV-able.
2020-11-08 11:41:19 +00:00
Pedro Tammela
b59848c68a [Reg2Mem] add support for the new pass manager
This patch refactors the pass to accomodate the new pass manager
boilerplate.

Differential Revision: https://reviews.llvm.org/D91005
2020-11-08 11:14:05 +00:00
Atmn Patel
3dd5790777 Revert "[LoopDeletion] Allows deletion of possibly infinite side-effect free loops"
This reverts commit 0b17c6e4479d62bd4ff05c48d6cdf340b198832f. This patch
causes a compile-time error in SCEV.
2020-11-07 00:32:12 -05:00
Atmn Patel
51ad1efef5 [LoopDeletion] Allows deletion of possibly infinite side-effect free loops
From C11 and C++11 onwards, a forward-progress requirement has been
introduced for both languages. In the case of C, loops with non-constant
conditionals that do not have any observable side-effects (as defined by
6.8.5p6) can be assumed by the implementation to terminate, and in the
case of C++, this assumption extends to all functions. The clang
frontend will emit the `mustprogress` function attribute for C++
functions (D86233, D85393, D86841) and emit the loop metadata
`llvm.loop.mustprogress` for every loop in C11 or later that has a
non-constant conditional.

This patch modifies LoopDeletion so that only loops with
the `llvm.loop.mustprogress` metadata or loops contained in functions
that are required to make progress (`mustprogress` or `willreturn`) are
checked for observable side-effects. If these loops do not have an
observable side-effect, then we delete them.

Loops without observable side-effects that do not satisfy the above
conditions will not be deleted.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86844
2020-11-06 22:06:58 -05:00
Atmn Patel
2dde5b4c98 [Inliner] Handle mustprogress functions
When inlining `mustprogress` functions, if the caller or the callee has
the attribute, we drop the function attribute. The loops that have the
`llvm.loop.mustprogress` metadata keep their metadata. We do not need to
add new loop metadata to inlined functions because the patch in D86841
already adds the relevant loop metadata in all of the necessary places.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D87262
2020-11-06 20:03:46 -05:00
Atmn Patel
f15e3a4579 [LoopDeletion] Remove dead loops with no exit blocks
Currently, LoopDeletion refuses to remove dead loops with no exit blocks
because it cannot statically determine the control flow after it removes
the block. This leads to miscompiles if the loop is an infinite loop and
should've been removed.

Differential Revision: https://reviews.llvm.org/D90115
2020-11-06 17:08:34 -05:00
Quentin Colombet
c2874d6a67 Prevent LICM and machineLICM from hoisting convergent operations
Results of convergent operations are implicitly affected by the
enclosing control flows and should not be hoisted out of arbitrary
loops.

Patch by Xiaoqing Wu <xiaoqing_wu@apple.com>

Differential Revision: https://reviews.llvm.org/D90361
2020-11-06 10:26:39 -08:00
Simon Pilgrim
964dad51f4 [SLP][AMDGPU] Regenerate packed-math tests and remove unused check prefix 2020-11-06 17:27:13 +00:00
Simon Pilgrim
1274be49b8 [VectorCombine][X86] Removed unused check prefixes 2020-11-06 17:27:12 +00:00
Roman Lebedev
bba9249b3e [NFC][InstCombine] Update few comment updates i missed in 0ac56e8eaaeb
As pointed out in post-commit review in that commit
2020-11-06 17:38:00 +03:00
Arnold Schwaighofer
3fe61868a9 llvm.coro.id.async lowering: Parameterize how-to restore the current's continutation context and restart the pipeline after splitting
The `llvm.coro.suspend.async` intrinsic takes a function pointer as its
argument that describes how-to restore the current continuation's
context from the context argument of the continuation function. Before
we assumed that the current context can be restored by loading from the
context arguments first pointer field (`first_arg->caller_context`).

This allows for defining suspension points that reuse the current
context for example.

Also:

llvm.coro.id.async lowering: Add llvm.coro.preprare.async intrinsic

Blocks inlining until after the async coroutine was split.

Also, change the async function pointer's context size position

   struct async_function_pointer {
     uint32_t relative_function_pointer_to_async_impl;
     uint32_t context_size;
   }

And make the position of the `async context` argument configurable. The
position is specified by the `llvm.coro.id.async` intrinsic.

rdar://70097093

Differential Revision: https://reviews.llvm.org/D90783
2020-11-06 06:22:46 -08:00
Florian Hahn
c9d167829a [SLP] Also try to vectorize incoming values of PHIs .
Currently we do not consider incoming values of PHIs as roots for SLP
vectorization. This means we miss scenarios like the one in the test
case and PR47670.

It appears quite straight-forward to consider incoming values of PHIs as
roots for vectorization, but I might be missing something that makes
this problematic.

In terms of vectorized instructions, this applies to quite a few
benchmarks across MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto

    Same hash: 185 (filtered out)
    Remaining: 52
    Metric: SLP.NumVectorInstructions

    Program                                        base    patch   diff
     test-suite...ProxyApps-C++/HPCCG/HPCCG.test     9.00   27.00  200.0%
     test-suite...C/CFP2000/179.art/179.art.test     8.00   22.00  175.0%
     test-suite...T2006/458.sjeng/458.sjeng.test    14.00   30.00  114.3%
     test-suite...ce/Benchmarks/PAQ8p/paq8p.test    11.00   18.00  63.6%
     test-suite...s/FreeBench/neural/neural.test    12.00   18.00  50.0%
     test-suite...rimaran/enc-3des/enc-3des.test    65.00   95.00  46.2%
     test-suite...006/450.soplex/450.soplex.test    63.00   89.00  41.3%
     test-suite...ProxyApps-C++/CLAMR/CLAMR.test   177.00  250.00  41.2%
     test-suite...nchmarks/McCat/18-imp/imp.test    13.00   18.00  38.5%
     test-suite.../Applications/sgefa/sgefa.test    26.00   35.00  34.6%
     test-suite...pplications/oggenc/oggenc.test   100.00  133.00  33.0%
     test-suite...6/482.sphinx3/482.sphinx3.test   103.00  134.00  30.1%
     test-suite...oxyApps-C++/miniFE/miniFE.test   169.00  213.00  26.0%
     test-suite.../Benchmarks/Olden/tsp/tsp.test    59.00   73.00  23.7%
     test-suite...TimberWolfMC/timberwolfmc.test   503.00  622.00  23.7%
     test-suite...T2006/456.hmmer/456.hmmer.test    65.00   79.00  21.5%
     test-suite...libquantum/462.libquantum.test    58.00   68.00  17.2%
     test-suite...ternal/HMMER/hmmcalibrate.test    84.00   98.00  16.7%
     test-suite...ications/JM/ldecod/ldecod.test   351.00  401.00  14.2%
     test-suite...arks/VersaBench/dbms/dbms.test    52.00   57.00   9.6%
     test-suite...ce/Benchmarks/Olden/bh/bh.test   118.00  128.00   8.5%
     test-suite.../Benchmarks/Bullet/bullet.test   6355.00 6880.00  8.3%
     test-suite...nsumer-lame/consumer-lame.test   480.00  519.00   8.1%
     test-suite...000/183.equake/183.equake.test   226.00  244.00   8.0%
     test-suite...chmarks/Olden/power/power.test   105.00  113.00   7.6%
     test-suite...6/471.omnetpp/471.omnetpp.test    92.00   99.00   7.6%
     test-suite...ications/JM/lencod/lencod.test   1173.00 1261.00  7.5%
     test-suite...0/253.perlbmk/253.perlbmk.test    55.00   59.00   7.3%
     test-suite...oxyApps-C/miniAMR/miniAMR.test    92.00   98.00   6.5%
     test-suite...chmarks/MallocBench/gs/gs.test   446.00  473.00   6.1%
     test-suite.../CINT2006/403.gcc/403.gcc.test   464.00  491.00   5.8%
     test-suite...6/464.h264ref/464.h264ref.test   998.00  1055.00  5.7%
     test-suite...006/453.povray/453.povray.test   5711.00 6007.00  5.2%
     test-suite...FreeBench/distray/distray.test   102.00  107.00   4.9%
     test-suite...:: External/Povray/povray.test   4184.00 4378.00  4.6%
     test-suite...DOE-ProxyApps-C/CoMD/CoMD.test   112.00  117.00   4.5%
     test-suite...T2006/445.gobmk/445.gobmk.test   104.00  108.00   3.8%
     test-suite...CI_Purple/SMG2000/smg2000.test   789.00  819.00   3.8%
     test-suite...yApps-C++/PENNANT/PENNANT.test   233.00  241.00   3.4%
     test-suite...marks/7zip/7zip-benchmark.test   417.00  428.00   2.6%
     test-suite...arks/mafft/pairlocalalign.test   627.00  643.00   2.6%
     test-suite.../Benchmarks/nbench/nbench.test   259.00  265.00   2.3%
     test-suite...006/447.dealII/447.dealII.test   4641.00 4732.00  2.0%
     test-suite...lications/ClamAV/clamscan.test   106.00  108.00   1.9%
     test-suite...CFP2000/177.mesa/177.mesa.test   1639.00 1664.00  1.5%
     test-suite...oxyApps-C/RSBench/rsbench.test    66.00   65.00  -1.5%
     test-suite.../CINT2000/252.eon/252.eon.test   3416.00 3444.00  0.8%
     test-suite...CFP2000/188.ammp/188.ammp.test   1846.00 1861.00  0.8%
     test-suite.../CINT2000/176.gcc/176.gcc.test   152.00  153.00   0.7%
     test-suite...CFP2006/444.namd/444.namd.test   3528.00 3544.00  0.5%
     test-suite...T2006/473.astar/473.astar.test    98.00   98.00   0.0%
     test-suite...frame_layout/frame_layout.test    NaN     39.00   nan%

On ARM64, there appears to be a slight regression on SPEC2006, which
might be interesting to investigate:

   test-suite...T2006/473.astar/473.astar.test   0.9%

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D88735
2020-11-06 12:50:32 +00:00
Simon Pilgrim
fbc5698619 [InstCombine] Regenerate narrow-math.ll tests 2020-11-06 11:35:54 +00:00
Simon Moll
9bff67230d [VE][TTI] don't advertise vregs/vops
Claim to not have any vector support to dissuade SLP, LV and friends
from generating SIMD IR for the VE target.  We will take this back once
vector isel is stable.

Reviewed By: kaz7, fhahn

Differential Revision: https://reviews.llvm.org/D90462
2020-11-06 11:12:10 +01:00
Max Kazantsev
d6f4f83809 [Test] One more test on IndVars with negative step 2020-11-06 14:55:50 +07:00
Max Kazantsev
3151371b49 [Test] Run test with expensive SE inference. NFC
The planned changes require expensive inference to kick in
2020-11-06 14:23:44 +07:00
Anna Thomas
751d510c01 Revert "[CaptureTracking] Avoid overly restrictive dominates check"
This reverts commit 15694fd6ad955c6a16b446a6324364111a49ae8b.
Need to investigate and fix a failing clang test: synchronized.m.
Might need a test update.
2020-11-05 12:27:15 -05:00
Anna Thomas
4cbfba9811 [CaptureTracking] Avoid overly restrictive dominates check
CapturesBefore tracker has an overly restrictive dominates check when
the `BeforeHere` and the capture point are in different basic blocks.
All we need to check is that there is no path from the capture point
to `BeforeHere` (which is less stricter than the dominates check).
See added testcase in one of the users of CapturesBefore.

Reviewed-By: jdoerfert
Differential Revision: https://reviews.llvm.org/D90688
2020-11-05 11:38:50 -05:00
Florian Hahn
0fbc387566 [GVN] Fix MemorySSA update when replacing assume(false) with stores.
When replacing an assume(false) with a store, we have to be more careful
with the order we insert the new access. This patch updates the code to
look at the accesses in the block to find a suitable insertion point.

Alterantively we could check the defining access of the assume, but IIRC
there has been some discussion about making assume() readnone, so
looking at the access list might be more future proof.

Fixes PR48072.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90784
2020-11-05 12:09:32 +00:00
Arnold Schwaighofer
d90984c1dd Start of an llvm.coro.async implementation
This patch adds the `async` lowering of coroutines.

This will be used by the Swift frontend to lower async functions. In
contrast to the `retcon` lowering the frontend needs to be in control
over control-flow at suspend points as execution might be suspended at
these points.

This is very much work in progress and the implementation will change as
it evolves with the frontend. As such the documentation is lacking
detail as some of it might change.

rdar://70097093

Reapply with fix for memory sanitizer failure and sphinx failure.

Differential Revision: https://reviews.llvm.org/D90612
2020-11-04 10:29:21 -08:00
Arnold Schwaighofer
c8e9566a32 Revert "Start of an llvm.coro.async implementation"
This reverts commit ea606cced0583d1dbd4c44680601d1d4e9a56e58.

This patch causes memory sanitizer failures sanitizer-x86_64-linux-fast.
2020-11-04 08:26:20 -08:00
Arnold Schwaighofer
3e8facdd39 Start of an llvm.coro.async implementation
This patch adds the `async` lowering of coroutines.

This will be used by the Swift frontend to lower async functions. In
contrast to the `retcon` lowering the frontend needs to be in control
over control-flow at suspend points as execution might be suspended at
these points.

This is very much work in progress and the implementation will change as
it evolves with the frontend. As such the documentation is lacking
detail as some of it might change.

rdar://70097093

Differential Revision: https://reviews.llvm.org/D90612
2020-11-04 07:32:29 -08:00
Sanjay Patel
75c976d15a [InstSimplify] allow vector folds for icmp Pred (1 << X), 0x80 2020-11-04 08:12:48 -05:00
Sanjay Patel
9993666887 [InstSimplify] add vector cmp tests; NFC 2020-11-04 08:12:47 -05:00
Roman Lebedev
a30754006b [Reassociate] Guard add-like or conversion into an add with profitability check
This is slightly better compile-time wise,
since we avoid potentially-costly knownbits analysis that will
ultimately not allow us to actually do anything with said `add`.
2020-11-04 16:10:34 +03:00
Martin Storsjö
3bd74bda76 Revert "[AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts"
This reverts commit 59b22e495c15d2830f41381a327f5d6bf49ff416.

That commit broke building for ARM and AArch64, reproducible like this:

$ cat apedec-reduced.c
a;
b(e) {
  int c;
  unsigned d = f();
  c = d >> 32 - e;
  return c;
}
g() {
  int h = i();
  if (a)
    h = h << a | b(a);
  return h;
}
$ clang -target aarch64-linux-gnu -w -c -O3 apedec-reduced.c
clang: ../lib/Transforms/InstCombine/InstructionCombining.cpp:3656: bool llvm::InstCombinerImpl::run(): Assertion `DT.dominates(BB, UserParent) && "Dominance relation broken?"' failed.

Same thing for e.g. an armv7-linux-gnueabihf target.
2020-11-04 08:39:32 +02:00
Sanjay Patel
4b28ca8f6f [ARM] remove cost-kind predicate for most math op costs
This is based on the same idea that I am using for the basic model implementation
and what I have partly already done for x86: throughput cost is number of
instructions/uops, so size/blended costs are identical except in special cases
(for example, fdiv or other known-expensive machine instructions or things like
MVE that may require cracking into >1 uop)).

Differential Revision: https://reviews.llvm.org/D90692
2020-11-03 17:23:46 -05:00
Roman Lebedev
3e46fb4064 [Reassociate] Convert add-like or's into an add's to allow reassociation
InstCombine is quite aggressive in doing the opposite transform,
folding `add` of operands with no common bits set into an `or`,
and that not many things support that new pattern..

In this case, teaching Reassociate about it is easy,
there's preexisting art for `sub`/`shl`:
just convert such an `or` into an `add`:
https://rise4fun.com/Alive/Xlyv
2020-11-03 22:30:51 +03:00
Roman Lebedev
0d7ab60f8c [NFC][Reassociate] Add tests with add-like or (w/ no common bits set) 2020-11-03 22:30:51 +03:00
Joe Ellis
ec11b50ddf [SVE][InstCombine] Improve specificity of InstCombine TypeSize test
The test was using -O2, where -instcombine will suffice.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D90684
2020-11-03 15:24:44 +00:00
Anton Afanasyev
6cdccb2414 [SLP][X86][Test] Extend test coverage for PR47629
Add two cases for `<i32 x 8>`. Precommit for PR47629 and D90445. NFC
2020-11-03 17:51:24 +03:00
Roman Lebedev
67e1befafc [InstCombine] Perform C-(X+C2) --> (C-C2)-X transform before using Negator
In particular, it makes it fire for C=0, because negator doesn't want
to perform that fold since in general it's not beneficial.
2020-11-03 16:06:52 +03:00
Roman Lebedev
ca79221be4 [InstCombine] Negator: - (C - %x) --> %x - C (PR47997)
This relaxes one-use restriction on that `sub` fold,
since apparently the addition of Negator broke
preexisting `C-(C2-X) --> X+(C-C2)` (with C=0) fold.
2020-11-03 16:06:51 +03:00
Roman Lebedev
7eb9baa95b [NFC][InstCombine] Negator: add test coverage for (?? - (%y + C)) pattern (PR47997) 2020-11-03 16:06:51 +03:00
Roman Lebedev
dd44c5f9ca [NFC][InstCombine] Negator: add test coverage for (?? - (C - %y)) pattern (PR47997) 2020-11-03 16:06:51 +03:00
Roman Lebedev
8c309b9c21 [NFC][InstCombine] Add test coverage for PR47997 2020-11-03 16:06:50 +03:00
Florian Hahn
8985e25f47 [SCCP] Handle bitcast of vector constants.
Vectors where all elements have the same known constant range are treated as a
single constant range in the lattice. When bitcasting such vectors, there is a
mis-match between the width of the lattice value (single constant range) and
the original operands (vector). Go to overdefined in that case.

Fixes PR47991.
2020-11-03 12:58:39 +00:00
Simon Pilgrim
63512b001b [AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts
The fold currently only handles rotation patterns, but with the maturation of backend funnel shift handling we can now realistically handle all funnel shift patterns.

This should allow us to begin resolving PR46896 et al.

Differential Revision: https://reviews.llvm.org/D90625
2020-11-03 10:49:49 +00:00
Florian Hahn
3a9ee235ce [SLP] Pass VecPred argument to getCmpSelInstrCost.
Check if all compares in VL have the same predicate and pass it to
getCmpSelInstrCost, to improve cost-modeling on targets that only
support compare/select combinations for certain uniform predicates.

This leads to additional vectorization in some cases

```
Same hash: 217 (filtered out)
Remaining: 19
Metric: SLP.NumVectorInstructions

Program                                        base    slp2    diff
 test-suite...marks/SciMark2-C/scimark2.test    11.00   26.00  136.4%
 test-suite...T2006/445.gobmk/445.gobmk.test    79.00  135.00  70.9%
 test-suite...ediabench/gsm/toast/toast.test    54.00   71.00  31.5%
 test-suite...telecomm-gsm/telecomm-gsm.test    54.00   71.00  31.5%
 test-suite...CI_Purple/SMG2000/smg2000.test   426.00  542.00  27.2%
 test-suite...ch/g721/g721encode/encode.test    30.00   24.00  -20.0%
 test-suite...000/186.crafty/186.crafty.test   116.00  138.00  19.0%
 test-suite...ications/JM/ldecod/ldecod.test   697.00  765.00   9.8%
 test-suite...6/464.h264ref/464.h264ref.test   822.00  886.00   7.8%
 test-suite...chmarks/MallocBench/gs/gs.test   154.00  162.00   5.2%
 test-suite...nsumer-lame/consumer-lame.test   621.00  651.00   4.8%
 test-suite...lications/ClamAV/clamscan.test   223.00  231.00   3.6%
 test-suite...marks/7zip/7zip-benchmark.test   680.00  695.00   2.2%
 test-suite...CFP2000/177.mesa/177.mesa.test   2121.00 2129.00  0.4%
 test-suite...:: External/Povray/povray.test   2406.00 2412.00  0.2%
 test-suite...TimberWolfMC/timberwolfmc.test   634.00  634.00   0.0%
 test-suite...CFP2006/433.milc/433.milc.test   1036.00 1036.00  0.0%
 test-suite.../Benchmarks/nbench/nbench.test   321.00  321.00   0.0%
 test-suite...ctions-flt/Reductions-flt.test    NaN      5.00   nan%
```

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D90124
2020-11-03 10:16:43 +00:00
Max Kazantsev
b0c85eefb0 [IndVars] Use knowledge about execution on last iteration when removing checks
If we know that some check will not be executed on the last iteration, we can use this
fact to eliminate its check.

Differential Revision: https://reviews.llvm.org/D88210
Reviwed By: ebrevnov
2020-11-03 13:38:58 +07:00