1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00
Commit Graph

201540 Commits

Author SHA1 Message Date
Christian Kühnel
e33dafd15e Revert "[CMake] Simplify CMake handling for zlib"
This reverts commit 1adc494bce44f6004994deed61b30d4b71fe1d05.
This patch broke the Windows compilation on buildbot and pre-merge testing:
http://lab.llvm.org:8011/builders/mlir-windows/builds/5945
https://buildkite.com/llvm-project/llvm-master-build/builds/780
2020-08-07 09:36:49 +02:00
Max Kazantsev
2a20f7c161 [Test] Add one more test on IndVars that was failing on one of older builds 2020-08-07 14:23:55 +07:00
QingShan Zhang
abbd439db7 [NFC] Add the stats for load/store cluster
We have the stats for MacroFusion but miss it for load/store cluster.
2020-08-07 07:09:48 +00:00
David Sherwood
396a61103c [SVE][CodeGen] Fix bug with store of unpacked FP scalable vectors
Fixed an incorrect pattern in lib/Target/AArch64/AArch64SVEInstrInfo.td
for storing out <vscale x 2 x f32> unpacked scalable vectors. Added
a couple of tests to

  test/CodeGen/AArch64/sve-st1-addressing-mode-reg-imm.ll

Differential Revision: https://reviews.llvm.org/D85441
2020-08-07 07:19:09 +01:00
biplmish
f87266ec05 [PowerPC] Implement Vector Extract Low/High Order Builtins in LLVM/Clang
This patch implements the function prototypes vec_extractl and vec_extracth in altivec.h to utilize the vector extract double element instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D84622
2020-08-07 01:02:29 -05:00
QingShan Zhang
b0aa16911a [PowerPC] Support constrained fp operation for setcc
The constrained fp operation fcmp was added by https://reviews.llvm.org/D69281.
This patch is trying to add the support for PowerPC backend.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D81727
2020-08-07 05:16:36 +00:00
QingShan Zhang
f8c2a4c0d9 [Scheduling] Create the missing dependency edges for store cluster
If it is load cluster, we don't need to create the dependency edges(SUb->reg) from SUb to SUa
as they both depend on the base register "reg"

     +-------+
+---->  reg  |
|    +---+---+
|        ^
|        |
|        |
|        |
|    +---+---+
|    |  SUa  |  Load 0(reg)
|    +---+---+
|        ^
|        |
|        |
|    +---+---+
+----+  SUb  |  Load 4(reg)
     +-------+

But if it is store cluster, we need to create it as follow shows to avoid the instruction store
depend on scheduled in-between SUb and SUa.

     +-------+
+---->  reg  |
|    +---+---+
|        ^
|        |         Missing       +-------+
|        | +-------------------->+   y   |
|        | |                     +---+---+
|    +---+-+-+                       ^
|    |  SUa  |  Store x 0(reg)       |
|    +---+---+                       |
|        ^                           |
|        |  +------------------------+
|        |  |
|    +---+--++
+----+  SUb  |  Store y 4(reg)
     +-------+

Reviewed By: evandro, arsenm, rampitec, foad, fhahn

Differential Revision: https://reviews.llvm.org/D72031
2020-08-07 04:58:03 +00:00
Vitaly Buka
f8fc07d99e [StackSafety,NFC] Fix tests in debug 2020-08-06 20:46:39 -07:00
Shinji Okumura
ae45682066 [Attributor] Check violation of returned position nonnull and noundef attribute in AAUndefinedBehavior
This patch is a follow up of D84733.
If a function has noundef attribute in returned position, instructions that return undef or poison value cause UB.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D85178
2020-08-07 12:02:42 +09:00
Vitaly Buka
4cdf5a3f95 [StackSafety,NFC] Add more tests 2020-08-06 19:50:05 -07:00
Vitaly Buka
156223b38e [StackSafety,NFC] Sort llvm-lto2 resolutions in tests 2020-08-06 19:46:52 -07:00
Vitaly Buka
15a533fc32 [StackSafety,NFC] Add debug counters 2020-08-06 19:24:02 -07:00
Vitaly Buka
4124f20201 [StackSafety,NFC] Use CHECK-EMPTY in tests 2020-08-06 19:19:51 -07:00
Vitaly Buka
68e4929619 [LLParser,NFC] Simplify forward GV refs update
Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D85238
2020-08-06 19:18:51 -07:00
Vitaly Buka
3b944733de [StackSafety] Skip ambiguous lifetime analysis
If we can't identify alloca used in lifetime marker we
need to assume to worst case scenario.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D84630
2020-08-06 19:10:33 -07:00
Vitaly Buka
567e88646c [LTO,NFC] Skip generateParamAccessSummary when empty
addGlobalValueSummary can check newly added FunctionSummary
and set HasParamAccess to mark that generateParamAccessSummary
is needed.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D85182
2020-08-06 19:01:19 -07:00
Arthur Eubanks
f51f94852f [NewPM] Add callback for skipped passes
Parallel to https://reviews.llvm.org/D84772.

Will use this for printing when a pass is skipped.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D85478
2020-08-06 18:58:59 -07:00
Nico Weber
20496ec614 fix doc typo to cycle bots 2020-08-06 21:02:41 -04:00
Kazushi (Jam) Marukawa
847b3d4c65 [VE] Optimize trunc related instructions
Change to not generate truncate instructions if all use of a truncate
operation don't care about higher bits.  For example, an i32 add
instruction doesn't care about higher 32 bits in 64 bit registers.
Updates regression tests also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D85418
2020-08-07 09:21:05 +09:00
Jessica Paquette
e43ddd00f6 [GlobalISel] Fix computing known bits for loads with range metadata
In GlobalISel, if you have a load into a small type with a range, you'll hit
an assert if you try to compute known bits on it starting at a larger type.

e.g.

```
%x:_(s8) = G_LOAD %whatever(p0) :: (load 1 ... !range !n)
...
%y:_(s32) = G_SOMETHING %x
```

When we walk through G_SOMETHING and hit the load, the width of our known bits
is 32. However, the width of the range is going to be 8. This will cause us
to hit an assert.

To fix this, make computeKnownBitsFromRangeMetadata zero extend or truncate
the range type to match the bitwidth of the known bits we're calculating.

Add a testcase in CodeGen/GlobalISel/KnownBitsTest.cpp to reflect that this
works now.

https://reviews.llvm.org/D85375
2020-08-06 16:47:07 -07:00
Matt Arsenault
b1038416e3 GlobalISel: Implement lower for G_INSERT_VECTOR_ELT 2020-08-06 19:29:17 -04:00
Mark Mentovai
5eedc0d9cb [gn build] mac: use frameworks instead of libs where appropriate
As of GN 3028c6a426a4, the hack that transformed "libs" ending in
".framework" from -l arguments to -framework arguments has been removed.
Instead, "frameworks" must be used, and the toolchain must provide
support.

Differential Revision: https://reviews.llvm.org/D84219
2020-08-06 18:58:57 -04:00
Arthur Eubanks
de96a56470 [NewPM][GuardWidening] Fix loop guard widening tests under NPM
Reviewed By: ychen, asbirlea

Differential Revision: https://reviews.llvm.org/D85394
2020-08-06 15:32:59 -07:00
Yonghong Song
9cb523a893 BPF: fix libLLVMBPFCodeGen.so build failure
Buildbot reported a build failure when building shared
library libLLVMBPFCodeGen.so with unknown reference to
"createCFGSimplificationPass".

Commit 87cba434027b ("BPF: add a SimplifyCFG IR pass during
generic Scalar/IPO optimization") added an IR pass SimplifyCFG
by BPF target. The commit called function
createCFGSimplificationPass() defined in "Scalar" library.
Add this library in Target/BPF/LLVMBuild.txt so
shared library build can succeed.
2020-08-06 15:27:15 -07:00
Tony
9357771d35 [AMDGPU] Correct missing sram-ecc target feature for gfx906
Differential Revision: https://reviews.llvm.org/D85476
2020-08-06 22:12:25 +00:00
Matt Arsenault
c97959510d AMDGPU/GlobalISel: Enable s_{and|or}n2_{b32|b64} patterns 2020-08-06 18:00:38 -04:00
Roman Lebedev
7abe7a3b10 [InstCombine] Fold (x + C1) * (-1<<C2) --> (-C1 - x) * (1<<C2)
Negator knows how to do this, but the one-use reasoning is getting
a bit muddy here, we don't really want to increase instruction count,
so we need to both lie that "IsNegation" and have an one-use check
on the outermost LHS value.
2020-08-06 23:40:16 +03:00
Roman Lebedev
962c037575 [InstCombine] Generalize %x * (-1<<C) --> (-%x) * (1<<C) fold
Multiplication is commutative, and either of operands can be negative,
so if the RHS is a negated power-of-two, we should try to make it
true power-of-two (which will allow us to turn it into a left-shift),
by trying to sink the negation down into LHS op.

But, we shouldn't re-invent the logic for sinking negation,
let's just use Negator for that.

Tests and original patch by: Simon Pilgrim @RKSimon!

Differential Revision: https://reviews.llvm.org/D85446
2020-08-06 23:39:53 +03:00
Roman Lebedev
961feee4d7 [NFC][InstCombine] Add some more tests for negation sinking into mul 2020-08-06 23:37:17 +03:00
Roman Lebedev
6ce2a55160 [InstCombine] Fold sdiv exact X, -1<<C --> -(ashr exact X, C)
While that does increases instruction count,
shift is obviously better than a division.

Name: base
Pre: (1<<C1) >= 0
%o0 = shl i8 1, C1
%r = sdiv exact i8 C0, %o0
  =>
%r = ashr exact i8 C0, C1

Name: neg
%o0 = shl i8 -1, C1
%r = sdiv exact i8 C0, %o0
  =>
%t0 = ashr exact i8 C0, C1
%r = sub i8 0, %t0

Name: reverse
Pre: C1 != 0 && C1 u< 8
%t0 = ashr exact i8 C0, C1
%r = sub i8 0, %t0
  =>
%o0 = shl i8 -1, C1
%r = sdiv exact i8 C0, %o0

https://rise4fun.com/Alive/MRplf
2020-08-06 23:37:16 +03:00
Roman Lebedev
29f3ac95a1 [NFC][InstCombine] Negator: add a comment about negating exact arithmentic shift 2020-08-06 23:37:16 +03:00
Roman Lebedev
2b5f9109d7 [InstCombine] Generalize sdiv exact X, 1<<C --> ashr exact X, C fold to handle non-splat vectors 2020-08-06 23:37:15 +03:00
Roman Lebedev
70733c0997 [NFC][InstCombine] Better tests for x s/EXACT (1 << y) pattern 2020-08-06 23:37:15 +03:00
Roman Lebedev
2fc42182de [NFC][InstCombine] Tests for x s/EXACT (-1 << y) pattern 2020-08-06 23:37:15 +03:00
Craig Topper
b793ada8aa [LegalTypes] Move VSELECT node creation out of WidenVSELECTAndMask and push to 2 of the 3 callers.
One of the callers only wants the condition, but the vselect can
be simplified by getNode making it hard or impossible to retrieve
the condition.

Instead, return the condition and make the other 2 callers
responsible for creating the vselect node using the condition.
Rename the function to WidenVSELECTMask accordingly.

Differential Revision: https://reviews.llvm.org/D85468
2020-08-06 13:18:16 -07:00
Yonghong Song
36ce966578 BPF: add a SimplifyCFG IR pass during generic Scalar/IPO optimization
The following bpf linux kernel selftest failed with latest
llvm:
  $ ./test_progs -n 7/10
  ...
  The sequence of 8193 jumps is too complex.
  verification time 126272 usec
  stack depth 320
  processed 114799 insns (limit 1000000)
  ...
  libbpf: failed to load object 'pyperf600_nounroll.o'
  test_bpf_verif_scale:FAIL:110
  #7/10 pyperf600_nounroll.o:FAIL
  #7 bpf_verif_scale:FAIL

After some investigation, I found the following llvm patch
  https://reviews.llvm.org/D84108
is responsible. The patch disabled hoisting common instructions
in SimplifyCFG by default. Later on, the code changes and a
SimplifyCFG phase with hoisting on cannot do the work any more.

A test is provided to demonstrate the problem.
The IR before simplifyCFG looks like:
  for.cond:
    %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
    %cmp = icmp ult i32 %i.0, 6
    br i1 %cmp, label %for.body, label %for.cond.cleanup

  for.cond.cleanup:
    %2 = load i8*, i8** %frame_ptr, align 8, !tbaa !2
    %cmp2 = icmp eq i8* %2, null
    %conv = zext i1 %cmp2 to i32
    call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %1) #3
    call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3
    ret i32 %conv

  for.body:
    %3 = load i8*, i8** %frame_ptr, align 8, !tbaa !2
    %tobool.not = icmp eq i8* %3, null
    br i1 %tobool.not, label %for.inc, label %land.lhs.true

The first two insns of `for.cond.cleanup` and `for.body`, load and
icmp, can be hoisted to `for.cond` block. With Patch D84108, the
optimization is delayed. But unfortunately, later on loop rotation
added addition phi nodes to `for.body` and hoisting cannot
be done any more.

Note such a hoisting is beneficial to bpf programs as
bpf verifier does path sensitive analysis and verification.
The hoisting preverts reloading from stack which will assume
conservative value and increase exploited insns. In this case,
it caused verifier failure.

To fix this problem, I added an IR pass from bpf target
to performance additional simplifycfg with hoisting common inst
enabled.

Differential Revision: https://reviews.llvm.org/D85434
2020-08-06 13:16:00 -07:00
Snehasish Kumar
bfbcf062be [NFC] Rename BBSectionsPrepare -> BasicBlockSections.
Rename the BBSectionsPrepare pass as suggested by the review comment in
https://reviews.llvm.org/D85368.

Differential Revision: https://reviews.llvm.org/D85380
2020-08-06 13:12:06 -07:00
Sanjay Patel
01843e6c65 [InstSimplify] avoid crashing by trying to rem-by-zero
Bug was noted in the post-commit comments for:
rGe8760bb9a8a3
2020-08-06 16:06:31 -04:00
Sanjay Patel
5a10b27f37 [VectorCombine] add tests for load+insert; NFC 2020-08-06 15:45:02 -04:00
Anton Afanasyev
c038d11ee3 [SLP] Fix order of insertelement/insertvalue seed operands
Summary:
This patch takes the indices operands of `insertelement`/`insertvalue`
into account while generation of seed elements for `findBuildAggregate()`.
This function has kept the original order of `insert`s before.
Also this patch optimizes `findBuildAggregate()` preventing it from
redundant temporary vector allocations and its multiple reversing.

Fixes llvm.org/pr44067

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83779
2020-08-06 22:09:24 +03:00
Matt Arsenault
302794de3c Add freeze keyword to IR emacs mode 2020-08-06 14:58:28 -04:00
dfukalov
36b3187e5e [AMDGPU][CostModel] Add f16, f64 and contract cases to fused costs estimation.
Add cases of fused fmul+fadd/fsub with f16 and f64 operands to cost model.
Also added operations with contract attribute.

Fixed line endings in test.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D84995
2020-08-06 21:43:27 +03:00
Matt Arsenault
fc03bd4465 GlobalISel: Implement fewerElementsVector for G_EXTRACT_VECTOR_ELT
Use the same basic strategy as LegalizeVectorTypes. Try to index into
smaller pieces if there's a constant index, and otherwise fall back to
a stack temporary.
2020-08-06 14:33:16 -04:00
Arthur Eubanks
54c69d097e [NewPM][LoopUnswitch] Pin loop-unswitch to legacy PM or use simple-loop-unswitch
As mentioned in
http://lists.llvm.org/pipermail/llvm-dev/2020-July/143395.html,
loop-unswitch has not been ported to the NPM. Instead people are using
simple-loop-unswitch.

Pin all tests in Transforms/LoopUnswitch to legacy PM and replace all
other uses of loop-unswitch with simple-loop-unswitch.

One test that didn't fit into the above was
2014-06-21-congruent-constant.ll which seems to only pass with
loop-unswitch. That is also pinned to legacy PM.

Now all tests containing "-loop-unswitch" anywhere in the test succeed with
NPM turned on by default.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D85360
2020-08-06 10:56:00 -07:00
Arthur Eubanks
96980a0b56 [NewPM] Pin -assumption-cache-tracker tests to legacy PM
All tests have corresponding NPM RUN lines.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D85395
2020-08-06 10:38:03 -07:00
Matt Arsenault
f305dea485 AMDGPU: Define raw/struct variants of buffer atomic fadd
Somehow the new FP atomic buffer intrinsics ended up using the legacy
style for buffer intrinsics.
2020-08-06 13:36:19 -04:00
Matt Arsenault
2f97a6b627 AArch64/GlobalISel: Fix verifier error after selecting returnaddress
This was caching the wrong register to re-use later.
2020-08-06 13:18:05 -04:00
Simon Pilgrim
32846a6328 [SLP][X86] Regenerate sdiv test noticed in D83779. NFC. 2020-08-06 18:00:21 +01:00
Mircea Trofin
1f12cdca6a [NFC]{MLInliner] Point out the tests' model dependencies 2020-08-06 09:57:26 -07:00
Matt Arsenault
b212016cb2 AMDGPU: Fix spilling of 96-bit AGPRs 2020-08-06 12:42:07 -04:00