1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

48176 Commits

Author SHA1 Message Date
Yaxun Liu
db879213ed Fix assembler for alloca of multiple elements in non-zero addr space
Currently llvm assembler emits parsing error for valid IR assembly

alloca i32, i32 9, addrspace(5)
when alloca addr space is 5.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D38713

llvm-svn: 315791
2017-10-14 03:23:18 +00:00
Daniel Sanders
d400a290e5 [globalisel][tablegen] Simplify named operand/operator lookups and fix a wrong-code bug this revealed.
Summary:
Operand variable lookups are now performed by the RuleMatcher rather than
searching the whole matcher hierarchy for a match. This revealed a wrong-code
bug that currently affects ARM and X86 where patterns that use a variable more
than once in the match pattern will be imported but won't check that the
operands are identical. This can cause the tablegen-erated matcher to
accept matches that should be rejected.

Depends on D36569

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Subscribers: aemerson, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D36618

llvm-svn: 315780
2017-10-14 00:31:58 +00:00
Craig Topper
97770df374 [X86] Use X86ISD::VBROADCAST in place of v2f64 X86ISD::MOVDDUP when AVX2 is available
This is particularly important for AVX512VL where we are better able to recognize the VBROADCAST loads to fold with other operations.

For AVX512VL we now use X86ISD::VBROADCAST for all of the patterns and remove the 128-bit X86ISD::VMOVDDUP.

We may be able to use this for AVX1 as well which would allow us to remove more isel patterns.

I also had to add X86ISD::VBROADCAST as a node to call combineShuffle for so that we treat it similar to X86ISD::MOVDDUP.

Differential Revision: https://reviews.llvm.org/D38836

llvm-svn: 315768
2017-10-13 21:56:48 +00:00
Craig Topper
e302360d53 [X86] Use fsub in the movddup scheduling tests to prevent a future patch from folding movddup as a broadcast load.
llvm-svn: 315767
2017-10-13 21:56:45 +00:00
Daniel Sanders
eb33df745e [globalisel][tablegen] Add support for fpimm and import of APInt/APFloat based ImmLeaf.
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D36569

llvm-svn: 315761
2017-10-13 21:28:03 +00:00
Matt Arsenault
083a3bce83 AMDGPU: Implement hasBitPreservingFPLogic
llvm-svn: 315754
2017-10-13 21:10:22 +00:00
Peter Collingbourne
e0db0dbafd LowerTypeTests: Give imported symbols a type with size 0 so that they are not assumed not to alias.
It is possible for both a base and a derived class to be satisfied
with a unique vtable. If a program contains casts of the same pointer
to both of those types, the CFI checks will be lowered to this
(with ThinLTO):

if (p != &__typeid_base_global_addr)
  trap();
if (p != &__typeid_derived_global_addr)
  trap();

The optimizer may then use the first condition combined
with the assumption that __typeid_base_global_addr and
__typeid_derived_global_addr may not alias to optimize away the second
comparison, resulting in an unconditional trap.

This patch fixes the bug by giving imported globals the type [0 x i8]*,
which prevents the optimizer from assuming that they do not alias.

Differential Revision: https://reviews.llvm.org/D38873

llvm-svn: 315753
2017-10-13 21:02:16 +00:00
Sanjay Patel
0209a3929b [Reassociate] auto-generate better checks; NFC
These would fail if the created variable names changed.

llvm-svn: 315752
2017-10-13 20:56:35 +00:00
Matt Arsenault
46d9fa7220 AMDGPU: Look for src mods before fp_extend
When selecting modifiers for mad_mix instructions,
look at fneg/fabs that occur before the conversion.

llvm-svn: 315748
2017-10-13 20:45:49 +00:00
Sanjay Patel
a6b70ea7c0 [InstCombine] move code to remove repeated constant check; NFCI
Also, consolidate tests for this fold in one place.

llvm-svn: 315745
2017-10-13 20:29:11 +00:00
Matt Arsenault
adee5d7b69 AMDGPU: Implement isFPExtFoldable
This helps match v_mad_mix* in some cases.

llvm-svn: 315744
2017-10-13 20:18:59 +00:00
Krzysztof Parzyszek
fb2c0df237 [Hexagon] Minimize number of repeated constant extenders
Each constant extender requires an extra instruction, which adds to the
code size and also reduces the number of available slots in an instruction
packet. In most cases, the value of a repeated constant extender could be
loaded into a register, and the instructions using the extender could be
replaced with their counterparts that use that register instead.

This patch adds a pass that tries to reduce the number of constant
extenders, including extenders which differ only in an immediate offset
known at compile time, e.g. @global and @global+12.

llvm-svn: 315735
2017-10-13 19:02:59 +00:00
Craig Topper
da7db73bdc [X86] Add initial skeleton support for knm cpu
This adds Intel's Knights Mill CPU to valid CPU names for the backend. For now its an alias of "knl", but ultimately we need to support AVX5124FMAPS and AVX5124VNNIW instruction sets for it.

Differential Revision: https://reviews.llvm.org/D38811

llvm-svn: 315722
2017-10-13 18:10:17 +00:00
Sanjay Patel
d9c40769a5 [InstCombine] add hasOneUse check to add-zext-add fold to prevent increasing instructions
llvm-svn: 315718
2017-10-13 17:47:25 +00:00
Sanjay Patel
4531168c33 [InstCombine] add tests for add (zext (add nuw X, C2)), C --> zext (add nuw X, C2 + C); NFC
llvm-svn: 315717
2017-10-13 17:42:12 +00:00
Max Moroz
b7a95bc5e9 [llvm-cov] Reland sources-specified.test with addition of "-path-equivalence".
Summary: This version of tests should be working properly.

Reviewers: vsk

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D38889

llvm-svn: 315714
2017-10-13 17:27:39 +00:00
Simon Pilgrim
76e0c2b01f [X86] Test scalar integer absolutes on 32-bit targets with/without CMOV
llvm-svn: 315711
2017-10-13 17:09:20 +00:00
Reid Kleckner
5ca47fa941 Not all buildbots seem to dump the nuw flag in SDAG
llvm-svn: 315710
2017-10-13 17:00:49 +00:00
Simon Pilgrim
c147d37d65 [X86] Updated scalar integer absolute tests to cover i8/i16/i32/i64
llvm-svn: 315706
2017-10-13 16:53:07 +00:00
Sanjay Patel
dfbf86bd23 [InstCombine] allow zext(bool) + C --> select bool, C+1, C for vector types
The backend should be prepared for this transform after:
https://reviews.llvm.org/rL311731

llvm-svn: 315701
2017-10-13 16:29:38 +00:00
Reid Kleckner
6c1d1da003 Update test to expect nuw flag in SDAG dump, fixes test after r315690
llvm-svn: 315698
2017-10-13 16:13:23 +00:00
Daniel Neilson
2ff4466d0b [RS4GC] Look through vector bitcasts when looking for base pointer
Summary:
 In RS4GC it is possible that a base pointer is contained in a vector that
has undergone a bitcast from one element-pointertype to another. We teach
RS4GC how to look through bitcasts of vector types when looking for a base
pointer.

Reviewers: anna

Reviewed By: anna

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38849

llvm-svn: 315694
2017-10-13 15:59:13 +00:00
Max Moroz
30d967dfc0 [llvm-cov] Temporary delete sources-specified.test, it is failing on some bots.
Summary: http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/5950/steps/test-stage1-compiler/logs/stdio

Reviewers: vsk, Dor1s

Reviewed By: Dor1s

Subscribers: mehdi_amini

Differential Revision: https://reviews.llvm.org/D38888

llvm-svn: 315693
2017-10-13 15:58:58 +00:00
Krzysztof Parzyszek
b3ac829dc5 [Hexagon] Add patterns for cmpb/cmph with immediate arguments
Patch by Sumanth Gundapaneni.

llvm-svn: 315692
2017-10-13 15:43:12 +00:00
Max Moroz
4bc4d86ade [llvm-cov] Fix sources-specified.test so it ignores the order of files printed.
Summary: https://reviews.llvm.org/D38884#896964

Reviewers: vsk, Dor1s

Reviewed By: Dor1s

Differential Revision: https://reviews.llvm.org/D38887

llvm-svn: 315691
2017-10-13 15:41:51 +00:00
Max Moroz
1d9817aa71 [llvm-cov] An attempt to fix sources_specified.test failing on some buildbots.
Summary: https://reviews.llvm.org/rL315685#115380

Reviewers: vsk, Dor1s

Reviewed By: Dor1s

Differential Revision: https://reviews.llvm.org/D38884

llvm-svn: 315687
2017-10-13 15:30:24 +00:00
Max Moroz
80a27c3a17 [llvm-cov] Generate "report" for given source paths if sources are specified.
Summary:
Documentation says that user can specify sources for both "show" and
"report" commands. "Show" command respects specified sources, but "report" does
not. It is useful to have both "show" and "report" generated for specified
sources. Also added tests to for both commands with sources specified.

Reviewers: vsk, kcc

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D38860

llvm-svn: 315685
2017-10-13 14:44:51 +00:00
Jonas Devlieghere
4d22ee666b Re-land "[dsymutil] Timestmap verification for __swift_ast"
This patch adds timestamp verification for swiftmodule files. A new flag
is provided to allows us to disable this check in order to allow testing
of this feature.

Differential revision: https://reviews.llvm.org/D38686

llvm-svn: 315684
2017-10-13 14:41:23 +00:00
Anna Thomas
b4afc533f7 [SCEV] Teach SCEV to find maxBECount when loop endbound is variant
Summary:
This patch teaches SCEV to calculate the maxBECount when the end bound
of the loop can vary. Note that we cannot calculate the exactBECount.

This will only be done when both conditions are satisfied:
1. the loop termination condition is strictly LT.
2. the IV is proven to not overflow.

This provides more information to users of SCEV and can be used to
improve identification of finite loops.

Reviewers: sanjoy, mkazantsev, silviu.baranga, atrick

Reviewed by: mkazantsev

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38825

llvm-svn: 315683
2017-10-13 14:30:43 +00:00
Sanjay Patel
2f4fc5080e [InstCombine] add tests for boolean extend + add; NFC
llvm-svn: 315681
2017-10-13 14:09:45 +00:00
Daniel Jasper
3863a9d4f9 Revert r314923: "Recommit : Use the basic cost if a GEP is not used as addressing mode"
Significantly reduces performancei (~30%) of gipfeli
(https://github.com/google/gipfeli)

I have not yet managed to reproduce this regression with the open-source
version of the benchmark on github, but will work with others to get a
reproducer to you later today.

llvm-svn: 315680
2017-10-13 14:04:21 +00:00
Craig Topper
543c89beee [X86] Remove patterns that select unmasked vbroadcastf2x32/vbroadcasti2x32. Prefer vbroadcastsd/vpbroadcastq instead.
There's no advantage to using these instructions when they aren't masked. This enables some additional execution domain switching without needing to update the table.

llvm-svn: 315674
2017-10-13 06:07:10 +00:00
Craig Topper
4ad72ddd85 [X86] Add the test case for r315613 that I forgot to 'git add'.
llvm-svn: 315649
2017-10-13 00:20:47 +00:00
Matt Morehouse
e184e29f2e [llvm-isel-fuzzer] Use "--" as separator rather than '='.
Summary: OSS-Fuzz doesn't support '=' in filenames.

Reviewers: bogner, kcc

Reviewed By: kcc

Subscribers: javed.absar, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D38866

llvm-svn: 315647
2017-10-13 00:18:32 +00:00
Justin Bogner
a4398c87a4 llvm-isel-fuzzer: Use the right REQUIRES line for r315599
I'd mixed up ENABLE_SHARED and BUILD_SHARED_LIBS before, so these
tests were being disabled in too many places.

llvm-svn: 315646
2017-10-13 00:17:54 +00:00
Anna Thomas
8906552cf1 [CVP] Process binary operations even when def is local
Summary:
This patch adds processing of binary operations when the def of operands are in
the same block (i.e. local processing).

Earlier we bailed out in such cases (the bail out was introduced in rL252032)
because LVI at that time was more precise about context at the end of basic
blocks, which implied local def and use analysis didn't benefit CVP.

Since then we've added support for LVI in presence of assumes and guards. The
test cases added show how local def processing in CVP helps adding more
information to the ashr, sdiv, srem and add operators.

Note: processCmp which suffers from the same problem will
be handled in a later patch.

Reviewers: philip, apilipenko, SjoerdMeijer, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38766

llvm-svn: 315634
2017-10-12 22:39:52 +00:00
Artur Pilipenko
04bd1d7c25 [LoopPredication] Check whether the loop is already guarded by the first iteration check condition
llvm-svn: 315623
2017-10-12 21:21:17 +00:00
Bruno Cardoso Lopes
00a4cd9317 Revert "Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP.""
This reverts commit r315593: still affect two bots:

http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/5308
http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21751/

llvm-svn: 315618
2017-10-12 20:52:34 +00:00
Artur Pilipenko
c44c3f779d [LoopPredication] Support ule, sle latch predicates
This is a follow up for the loop predication change 313981 to support ule, sle latch predicates.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D38177

llvm-svn: 315616
2017-10-12 20:40:27 +00:00
Wei Ding
725c541dbe Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.
Differential Revision: http://reviews.llvm.org/D37348

llvm-svn: 315610
2017-10-12 19:37:14 +00:00
Craig Topper
9c484ab111 [X86] Add a bunch of -mcpu strings to the cpus.ll test.
We were missing most of the "core" aliases as well as skylake, cannonlake, and knights landing.

llvm-svn: 315606
2017-10-12 18:55:57 +00:00
Artem Belevich
848056d8ad [NVPTX] Implemented wmma intrinsics and instructions.
WMMA = "Warp Level Matrix Multiply-Accumulate".
These are the new instructions introduced in PTX6.0 and available
on sm_70 GPUs.

Differential Revision: https://reviews.llvm.org/D38645

llvm-svn: 315601
2017-10-12 18:27:55 +00:00
Reid Kleckner
9f8f153184 [codeview] Don't emit FPO data in funclet prologues
Attempt 3 to work around bugs in FPO data with funclets.

llvm-svn: 315600
2017-10-12 18:20:35 +00:00
Justin Bogner
8ad5a761f8 llvm-isel-fuzzer: Work around BUILD_SHARED_LIBS testing issues
Building with BUILD_SHARED_LIBS makes it tricky to copy around
executables at will, since they won't be able to find the LLVM
libraries any more. This makes testing a feature that's based on the
executable name problematic, so we'll just disable these two tests in
that configuration.

We could potentially fix this by symlinking the lib directory into the
test directory, but that wouldn't work on windows, and losing testing
on windows would be far worse than losing testing on a configuration
that's barely even supported.

llvm-svn: 315599
2017-10-12 18:10:22 +00:00
Artem Belevich
db5b3a9018 [TableGen] Allow intrinsics to have up to 8 return values.
Differential Revision: https://reviews.llvm.org/D38633

llvm-svn: 315598
2017-10-12 17:40:00 +00:00
Sanjay Patel
0ac027152c [ValueTracking] return zero when there's conflict in known bits of a shift (PR34838)
Poison allows us to return a better result than undef.

llvm-svn: 315595
2017-10-12 17:31:46 +00:00
Bruno Cardoso Lopes
cc3be5ce55 Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP."
This is r315288 & r315294, which were reverted due to stage2 bot
failures.

Summary:
This updates the SCCP solver to use of the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.

For the following function, f() can be optimized to `ret i32 2` with
this change

  source_filename = "sccp.c"
  target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
  target triple = "x86_64-unknown-linux-gnu"

  ; Function Attrs: norecurse nounwind readnone uwtable
  define i32 @main() local_unnamed_addr #0 {
  entry:
    %call = tail call fastcc i32 @f(i32 1)
    %call1 = tail call fastcc i32 @f(i32 47)
    %add3 = add nsw i32 %call, %call1
    ret i32 %add3
  }

  ; Function Attrs: noinline norecurse nounwind readnone uwtable
  define internal fastcc i32 @f(i32 %x) unnamed_addr #1 {
  entry:
    %c1 = icmp sle i32 %x, 100

    %cmp = icmp sgt i32 %x, 300
    %. = select i1 %cmp, i32 1, i32 2
    ret i32 %.
  }

  attributes #1 = { noinline }

Reviewers: davide, sanjoy, efriedma, dberlin

Reviewed By: davide, dberlin

Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D36656

llvm-svn: 315593
2017-10-12 16:54:11 +00:00
Lei Huang
7e1b2bd8a7 [PowerPC] Add profitablilty check for conversion to mtctr loops
Add profitability checks for modifying counted loops to use the mtctr instruction.

The latency of mtctr is only justified if there are more than 4 comparisons that
will be removed as a result.  Usually counted loops are formed relatively early
and before unrolling, so most low trip count loops often don't survive.  However
we want to ensure that if they do, we do not mistakenly update them to mtctr loops.

Use CodeMetrics to ensure we are only doing this for small loops with small trip counts.

Differential Revision: https://reviews.llvm.org/D38212

llvm-svn: 315592
2017-10-12 16:43:33 +00:00
Tim Renouf
790653e3d0 [AMDGPU] For amdpal, widen interpolation mode workaround
Summary:
The interpolation mode workaround ensures that at least one
interpolation mode is enabled in PSInputAddr. It does not also check
PSInputEna on the basis that the user might enable bits in that
depending on run-time state.

However, for amdpal os type, the user does not enable some bits after
compilation based on run-time states; the register values being
generated here are the final ones set in the hardware. Therefore, apply
the workaround to PSInputAddr and PSInputEnable together. (The case
where a bit is set in PSInputAddr but not in PSInputEnable is where the
frontend set up an input arg for a particular interpolation mode, but
nothing uses that input arg. Really we should have an earlier pass that
removes such an arg.)

Reviewers: arsenm, nhaehnle, dstuttard

Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D37758

llvm-svn: 315591
2017-10-12 16:16:41 +00:00
Mikael Holmen
bc25b9378a [RegisterCoalescer] Don't set read-undef in pruneValues, only clear
Summary:
The comments in the code said

 // Remove <def,read-undef> flags. This def is now a partial redef.

but the code didn't just remove read-undef, it could introduce new ones which
could cause errors.

E.g. if we have something like

%vreg1<def> = IMPLICIT_DEF
%vreg2:subreg1<def, read-undef> = op %vreg3, %vreg4
%vreg2:subreg2<def> = op %vreg6, %vreg7

and we merge %vreg1 and %vreg2 then we should not set undef on the second subreg
def, which the old code did.

Now we solve this by actually do what the code comment says. We remove
read-undef flags rather than remove or introduce them.

Reviewers: qcolombet, MatzeB

Reviewed By: MatzeB

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38616

llvm-svn: 315564
2017-10-12 06:21:28 +00:00