1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

64077 Commits

Author SHA1 Message Date
Akira Hatanaka
b441a22f21 Do not call replaceAllUsesWith to upgrade calls to ARC runtime functions
to intrinsic calls

This fixes a bug in r368311.

It turns out that the ARC runtime functions in the IR can have pointer
parameter types that are not i8* or i8**. Instead of RAUWing normal
functions with intrinsics, manually bitcast the arguments before passing
them to the intrinsic functions and bitcast the return value back to the
type of the original call instruction.

rdar://problem/54125406

llvm-svn: 368634
2019-08-12 23:53:23 +00:00
Reid Kleckner
13cbb221ad [WinEH] Fix catch block parent frame pointer offset
r367088 made it so that funclets store XMM registers into their local
frame instead of storing them to the parent frame. However, that change
forgot to update the parent frame pointer offset for catch blocks. This
change does that.

Fixes crashes when an exception is rethrown in a catch block that saves
XMMs, as described in https://crbug.com/992860.

llvm-svn: 368631
2019-08-12 23:02:00 +00:00
Craig Topper
28248b6f86 [X86] Allow combineTruncateWithSat to use pack instructions for i16->i8 without AVX512BW.
We need AVX512BW to be able to truncate an i16 vector. If we don't
have that we have to extend i16->i32, then trunc, i32->i8. But we
won't be able to remove the min/max if we do that. At least not
without more special handling.

llvm-svn: 368623
2019-08-12 22:18:23 +00:00
Kang Zhang
56462c7a94 [NFC][PowerPC] Add the test case shrink-wrap.mir and shrink-wrap.ll for PPC
llvm-svn: 368597
2019-08-12 17:50:01 +00:00
Roman Lebedev
9944d0bc51 [CostModel][X86][AArch64] Check all 3 cost kinds in aggregates.ll
llvm-svn: 368595
2019-08-12 17:45:12 +00:00
Craig Topper
1a2bd102eb [X86] Disable use of zmm registers for varargs musttail calls under prefer-vector-width=256 and min-legal-vector-width=256.
Under this config, the v16f32 type we try to use isn't to a register
class so the getRegClassFor call will fail.

llvm-svn: 368594
2019-08-12 17:43:26 +00:00
David Green
8254d827f7 [ARM] sext of a load is free
This teaches the cost model that the sext or zext of a load is going to be
free.

Differential Revision: https://reviews.llvm.org/D66006

llvm-svn: 368593
2019-08-12 17:39:56 +00:00
Stanislav Mekhanoshin
7306f3fe48 [AMDGPU] Printf runtime binding pass
This pass is a port of the according pass from the HSAIL compiler.
It parses printf calls and setup runtime printf buffer.
After that it copies printf arguments to the buffer and fills in
module metadata for runtime.

Differential Revision: https://reviews.llvm.org/D24035

llvm-svn: 368592
2019-08-12 17:12:29 +00:00
David Green
1d8ffb16d1 [ARM] MVE shuffle broadcast costs
A VDUP will perform a vector broadcast in a single instruction. Update the cost
model for MVE accordingly.

Code originally by David Sherwood.

Differential Revision: https://reviews.llvm.org/D63448

llvm-svn: 368589
2019-08-12 16:54:07 +00:00
David Green
3072ad5cf5 [ARM] Put some of the TTI costmodel behind hasNeon calls.
This puts some of the calls in ARMTargetTransformInfo.cpp behind hasNeon()
checks, now that we have MVE, and updates all the tests accordingly.

Differential Revision: https://reviews.llvm.org/D63447

llvm-svn: 368587
2019-08-12 15:59:52 +00:00
David Green
18f1bc6e4e [ARM] Add or update a number of costmodel tests. NFC
This adds a number of cost model tests for ARM, useful for MVE. It also re-jigs
some of the existing tests to make them easier to update and read.

llvm-svn: 368586
2019-08-12 15:40:27 +00:00
Sanjay Patel
6c32ebc5f9 [InstCombine] add tests for scalar-select-of-vectors; NFC
llvm-svn: 368583
2019-08-12 15:21:11 +00:00
Hans Wennborg
1c46c7087d Revert r368339 "[MBP] Disable aggressive loop rotate in plain mode"
It caused assertions to fire when building Chromium:

  lib/CodeGen/LiveDebugValues.cpp:331: bool
  {anonymous}::LiveDebugValues::OpenRangesSet::empty() const: Assertion
  `Vars.empty() == VarLocs.empty() && "open ranges are inconsistent"' failed.

See https://crbug.com/992871#c3 for how to reproduce.

> Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information(generated by BranchProbabilityInfo.cpp) is used. If user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse.
>
> To be conservative this patch restores the original layout algorithm in plain mode. But user can still try the aggressive layout optimization with -force-precise-rotation-cost=true.
>
> Differential Revision: https://reviews.llvm.org/D65673

llvm-svn: 368579
2019-08-12 14:23:13 +00:00
Jordan Rupprecht
0d351d4f7c [llvm-readobj] Downgrade 'PT_DYNAMIC segment offset + size exceeds the size of the file' from an error to a warning
Summary: This allows llvm-readobj to print other useful information for truncated files instead of giving up.

Reviewers: jhenderson, grimar, MaskRay

Reviewed By: jhenderson, grimar, MaskRay

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66036

llvm-svn: 368576
2019-08-12 14:05:37 +00:00
Simon Pilgrim
83efc11093 [X86][SSE] Add test showing missing demanded elts PSADBW handling
llvm-svn: 368575
2019-08-12 14:01:16 +00:00
Kang Zhang
6c18f8cea5 Revert r368565: [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks
llvm-svn: 368574
2019-08-12 14:00:31 +00:00
Owen Reynolds
dc9ffeae80 [llvm-ar] Accept file paths with windows format slashes
The internal representation of llvm-ar archives uses linux style slashes
for paths, no matter the OS. In the case of windows this meant file
paths input intending to match existing members would only match if
linux style slashes where used. This change allows either slash
direction to be input by the user.

This change includes removing an unnecessary call to normalisePath and
moving the call of another.

Differential Revision: https://reviews.llvm.org/D65743

llvm-svn: 368573
2019-08-12 14:00:28 +00:00
Sam Elliott
8662789349 [RISCV] Fix ICE in isDesirableToCommuteWithShift
Summary:
Ana Pazos reported a bug where we were not checking that an APInt would
fit into 64-bits before calling `getSExtValue()`. This caused asserts when
compiling large constants, such as i128s, as happens when compiling compiler-rt.

This patch adds a testcase and makes the callback less error-prone.

Reviewers: apazos, asb, luismarques

Reviewed By: luismarques

Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66081

llvm-svn: 368572
2019-08-12 13:51:00 +00:00
David Bolvansky
2430b215ca [InstCombine] x /c fabs(x) -> copysign(1.0, x)
Summary:
x / fabs(x) -> copysign(1.0, x)
fabs(x) / x -> copysign(1.0, x)

Reviewers: spatel, foad, RKSimon, efriedma

Reviewed By: spatel

Subscribers: lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65898

llvm-svn: 368570
2019-08-12 13:43:35 +00:00
David Stenberg
3746324622 [DebugInfo] Remove call sites when eliminating unreachable blocks
Summary:
When eliminating an unreachable block we must remove any call site
information for calls residing in the block.

This was originally found on a downstream target, and the attached x86
test case was produced by hand-modifying some MIR.

Reviewers: aprantl, asowda, NikolaPrica, djtodoro, ivanbaev, vsk

Reviewed By: NikolaPrica, vsk

Subscribers: vsk, hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D64500

llvm-svn: 368566
2019-08-12 13:22:29 +00:00
Kang Zhang
cafaf9bdfb [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks
Summary:

In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun.
But the `early-ret` pass is before `block-placement`, we don't want to run it again.
This patch is to do the simple early return to optimize the blocks at the last of `block-placement`.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D63972

llvm-svn: 368565
2019-08-12 13:15:31 +00:00
Owen Reynolds
dc3eda837d [llvm-ar][test] Correct tests marked as expected fails
In diff D64802 I marked three tests as expected failures for darwin but
James Nagurne saw these fail on his downstream embedded ARM cross
compiler.
I believe XFAIL: system-darwin should be used instead of using XFAIL:
darwin due to the problem being related to the darwin host and not the
target.

Differential Revision: https://reviews.llvm.org/D65745

llvm-svn: 368564
2019-08-12 13:04:02 +00:00
Hans Wennborg
85ec3cf9ad Revert r368509 "[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks"
> In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun.
> But the `early-ret` pass is before `block-placement`, we don't want to run it again.
> This patch is to do the simple early return to optimize the blocks at the last of `block-placement`.
>
> Reviewed By: efriedma
>
> Differential Revision: https://reviews.llvm.org/D63972

This also revertes follow-ups r368514 and r368532.

llvm-svn: 368560
2019-08-12 12:43:51 +00:00
Simon Pilgrim
f6ff10d8c4 [X86][SSE] ComputeKnownBits - add basic PSADBW handling
llvm-svn: 368558
2019-08-12 12:19:19 +00:00
Simon Pilgrim
f01bfd8575 [X86][SSE] Add test showing missing compute known bits PSADBW handling
The upper 48-bits of each i64 element is guaranteed to be zero.

llvm-svn: 368557
2019-08-12 12:13:08 +00:00
James Henderson
6f6eb9e7e1 NFC. Remove trailing whitespace in test
llvm-svn: 368556
2019-08-12 11:39:54 +00:00
James Henderson
9faf6091cf [llvm-strings] Improve testing of llvm-strings
This patch tidies up the llvm-strings testing by:

1. Adding comments to every test.
2. Getting rid of canned input files, and having the tests generate
   them on the fly (this makes the tests self-contained).
3. Adding missing test coverage.
4. Renaming some tests that weren't clear as to their purpose.
5. Adding extra checking of various cases, formatting etc.
6. Removing a test that didn't seem to have any useful purpose for
   testing llvm-strings.

Reviewed by: rupprecht, grimar, MaskRay

Differential Revision: https://reviews.llvm.org/D66015

llvm-svn: 368555
2019-08-12 11:36:11 +00:00
Roman Lebedev
caf7add21b [InstCombine] foldShiftIntoShiftInAnotherHandOfAndInICmp(): avoid constantexpr pitfail (PR42962)
Instead of matching value and then blindly casting to BinaryOperator
just to get the opcode, just match instruction and do no cast.

Fixes https://bugs.llvm.org/show_bug.cgi?id=42962

llvm-svn: 368554
2019-08-12 11:28:02 +00:00
Simon Pilgrim
70688517d3 [TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::TRUNCATE
llvm-svn: 368553
2019-08-12 10:56:05 +00:00
Roman Lebedev
4d08b96366 [CostModel][X86][AArch64] Add some tests for extractvalue
In https://reviews.llvm.org/D65148 it is suggested that
it should have zero cost, always.

llvm-svn: 368548
2019-08-12 09:24:33 +00:00
Craig Topper
4a675ce557 [X86] Add some reduction add test cases that show sub-optimal code on avx2 and later.
For v4i8 and v8i8 when the reduction starts with a load we end up
shifting the data in the scalar domain and copying to the vector
domain a second time using a broadcast.

We already copied it to the vector domain once. It's better to
just shuffle it there.

llvm-svn: 368544
2019-08-12 06:55:58 +00:00
Pengfei Wang
c35df9a0ee [X86] Support -march=tigerlake
Support -march=tigerlake for x86.
Compare with Icelake Client, It include 4 more new features ,they are
avx512vp2intersect, movdiri, movdir64b, shstk.

Patch by Xiang Zhang (xiangzhangllvm)

Differential Revision: https://reviews.llvm.org/D65840

llvm-svn: 368543
2019-08-12 01:29:46 +00:00
Bjorn Pettersson
47fce18f1a [X86] Remove redundant ';' chars ending IR lines in lit tests. NFC
Reviewers: RKSimon, craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66053

llvm-svn: 368541
2019-08-11 19:27:14 +00:00
Bjorn Pettersson
729026b2c4 [SelectionDAG] Widen vector results of SMULFIX/UMULFIX/SMULFIXSAT
Summary:
After the commits that changed x86 backend to widen vectors
instead of using promotion some of our downstream tests
started to fail. It was noticed that WidenVectorResult has
been missing support for SMULFIX/UMULFIX/SMULFIXSAT. This
patch adds the missing functionality.

Reviewers: craig.topper, RKSimon

Reviewed By: craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66051

llvm-svn: 368540
2019-08-11 19:27:06 +00:00
David Green
15e5321ece [ARM] MVE spill vector test. NFC
llvm-svn: 368531
2019-08-11 09:12:57 +00:00
David Green
7afcef84c7 [MVE] Don't try to unroll vectorised MVE loops
Due to the nature of the beat system in the MVE architecture, along with tail
predication and low-overhead loops, unrolling has less benefit compared to
normal loops. You can not, for example, hide the latency of a load with other
instructions as you can for scalar code. Preventing unrolling also makes the
code easier to read and reason about.

So if a loop contains vector code, don't enable the runtime unrolling. At least
for the time being.

Differential Revision: https://reviews.llvm.org/D65803

llvm-svn: 368530
2019-08-11 08:53:18 +00:00
David Green
5b0a3c0432 [ARM] Permit auto-vectorization using MVE
With enough codegen complete, we can now correctly report the number and size
of vector registers for MVE, allowing auto vectorisation. This also allows FP
auto-vectorization for MVE without -Ofast/-ffast-math, due to support for IEEE
FP arithmetic and parity between scalar and vector FP behaviour.

Patch by David Sherwood.

Differential Revision: https://reviews.llvm.org/D63728

llvm-svn: 368529
2019-08-11 08:42:57 +00:00
Heejin Ahn
01f5818544 Fix __clang_call_termiante's argument for foreign exceptions
Summary:
When exceptions are repeatedly thrown in the middle of handling another
exception, we call `__clang_call_terminate` with the exception pointer
(i32) as an argument. But in case of foreign exceptions, we don't have
the pointer, so we call the function with 0. (This requires
`__clang_call_terminate` can deal with 0 argument, which will be done
later)

But previously the 0 argument was not added as a `i32.const 0` but an
immediate by mistake, causing the `call` instruction to take not an i32
but rather an exnref, because an `exnref` is left on top of the value
stack if `br_on_exn` is not taken.

```
block i32
  br_on_exn 0, __cpp_exception
                               ;; exnref is on top of stack now
  i32.const 0                  ;; This was missing!
  call __clang_call_terminate
  unreachable
end
call __clang_call_terminate    ;; This takes i32 extracted by br_on_exn
```

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65475

llvm-svn: 368527
2019-08-11 06:24:07 +00:00
Wenlei He
c096312f12 [LICM] Make Loop ICM profile aware
Summary:
Hoisting/sinking instruction out of a loop isn't always beneficial. Hoisting an instruction from a cold block inside a loop body out of the loop could hurt performance. This change makes Loop ICM profile aware - it now checks block frequency to make sure hoisting/sinking anly moves instruction to colder block.

Test Plan:

ninja check

Reviewers: asbirlea, sanjoy, reames, nikic, hfinkel, vsk

Reviewed By: asbirlea

Subscribers: fhahn, vsk, davidxl, xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65060

llvm-svn: 368526
2019-08-11 06:05:35 +00:00
Craig Topper
09d1f0ad6d [X86] Remove some code from combineShuffle that seems largely unnecessary with widening legalization.
The test case that changed is probably better served through
allowing combineTruncatedArithmetic to create narrow vectors. It
also appears InstCombine would have simplified this test case
to remove the zext and trunc anyway.

llvm-svn: 368522
2019-08-11 02:08:38 +00:00
Roman Lebedev
e10873abd8 [NFC][InstCombine] Tests for shift amount reassociation in bittest with truncated shl (PR42399)
trunc-of-shl:
  https://rise4fun.com/Alive/zGx
  https://rise4fun.com/Alive/sl0L
I.e. no extra legality check needed.

https://bugs.llvm.org/show_bug.cgi?id=42399

llvm-svn: 368520
2019-08-10 19:29:03 +00:00
Roman Lebedev
8ad5ce7b36 [InstCombine] Shift amount reassociation in bittest: relax one-use check when shifting constant
If one of the values being shifted is a constant, since the new shift
amount is known-constant, the new shift will end up being constant-folded
so, we don't need that one-use restriction then.

llvm-svn: 368519
2019-08-10 19:28:54 +00:00
Roman Lebedev
4fb511a51d [InstCombine] Shift amount reassociation in bittest: drop pointless one-use restriction
That one-use restriction is not needed for correctness - we have already
ensured that one of the shifts will go away, so we know we won't increase
the instruction count. So there is no need for that restriction.

llvm-svn: 368518
2019-08-10 19:28:44 +00:00
Roman Lebedev
b3bb26537d [NFC][InstCombine] Tests for shift amount reassociation in bittest with shift of const
llvm-svn: 368517
2019-08-10 19:28:12 +00:00
Simon Pilgrim
41fb8c9c5c [X86][SSE] Lower shuffle as ANY_EXTEND_VECTOR_INREG
On SSE41+ targets we always lower vector shuffles to ZERO_EXTEND_VECTOR_INREG, even if we don't need the extended bits.

This patch relaxes this so that we lower to ANY_EXTEND_VECTOR_INREG if we can, meaning that shuffle combines have a better idea of what elements need to be kept zero. This helps the multiple reduction code as we can now combine away a lot more of the pack+extend codes.

Differential Revision: https://reviews.llvm.org/D65741

llvm-svn: 368515
2019-08-10 16:46:07 +00:00
Michael Liao
b0f9c14a99 [TableGen] Correct the shift to the proper bit width.
- Replace the previous 32-bit shift with 64-bit one matching `OpInit`.

llvm-svn: 368513
2019-08-10 16:15:06 +00:00
Sanjay Patel
691f754300 [Reassociate] try harder to convert negative FP constants to positive
This is an extension of a transform that tries to produce positive floating-point
constants to improve canonicalization (and hopefully lead to more reassociation
and CSE).

The original patches were:
D4904
D5363 (rL221721)

But as the test diffs show, these were limited to basic patterns by walking from
an instruction to its single user rather than recursively moving up the def-use
sequence. No fast-math is required here because we're only rearranging implicit
FP negations in intermediate ops.

A motivating bug is:
https://bugs.llvm.org/show_bug.cgi?id=32939

Differential Revision: https://reviews.llvm.org/D65954

llvm-svn: 368512
2019-08-10 13:17:54 +00:00
Kang Zhang
b37f16cb07 [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks
Summary:

In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun.
But the `early-ret` pass is before `block-placement`, we don't want to run it again.
This patch is to do the simple early return to optimize the blocks at the last of `block-placement`.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D63972

llvm-svn: 368509
2019-08-10 09:58:52 +00:00
Craig Topper
b7845c5bbc [X86] Match the IR pattern form movmsk on SSE1 only targets where v4i32 isn't legal
Summary:
This patch adds a special DAG combine for SSE1 to recognize the IR pattern InstCombine gives us for movmsk. This only does the recognition for a few cases where its obvious the input won't be scalarized resulting in building a vector just do to the movmsk. I've made it separate from our existing matching for movmsk since that's called in multiple places and I didn't spend time to see if the other callers would make sense here. Plus the restrictions and additional checks would complicate that.

This fixes the case from PR42870. Buts its probably still broken the presence of logic ops feeding the movmsk pattern which would further hide the v4f32 type.

Reviewers: spatel, RKSimon, xbolva00

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65689

llvm-svn: 368506
2019-08-10 07:51:13 +00:00
Craig Topper
d1665a0e76 [X86] Improve the diagnostic for larger than 4-bit immediate for vpermil2pd/ps. Only allow MCConstantExprs.
llvm-svn: 368505
2019-08-10 04:28:52 +00:00