1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

149123 Commits

Author SHA1 Message Date
Matthew Simpson
f9ca5aa639 Make test target-specific
llvm-svn: 303178
2017-05-16 15:33:22 +00:00
Matthew Simpson
2b38ec1ab7 Fix test case to unbreak bots
llvm-svn: 303176
2017-05-16 15:20:27 +00:00
Matthew Simpson
fdeda43e2f [LV] Avoid potentential division by zero when selecting IC
llvm-svn: 303174
2017-05-16 14:43:55 +00:00
Gor Nishanov
e2a5e02b38 [coroutines] Handle unwind edge splitting
Summary:
RewritePHIs algorithm used in building of CoroFrame inserts a placeholder
```
%placeholder = phi [%val]
```
on every edge leading to a block starting with PHI node with multiple incoming edges,
so that if one of the incoming values was spilled and need to be reloaded, we have a
place to insert a reload. We use SplitEdge helper function to split the incoming edge.

SplitEdge function does not deal with unwind edges comping into a block with an EHPad.

This patch adds an ehAwareSplitEdge function that can correctly split the unwind edge.

For landing pads, we clone the landing pad into every edge block and replace the original
landing pad with a PHI collection the values from all incoming landing pads.

For WinEH pads, we keep the original EHPad in place and insert cleanuppad/cleapret in the
edge blocks.

Reviewers: majnemer, rnk

Reviewed By: majnemer

Subscribers: EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D31845

llvm-svn: 303172
2017-05-16 14:11:39 +00:00
George Rimar
a9d15a6b2e [DWARF] - Add RelocAddrEntry for cleanup. NFCi.
Was mentioned as possible cleanup during review of D33184.

llvm-svn: 303171
2017-05-16 14:05:45 +00:00
Igor Breger
7b3a99c110 [GlobalISel][X86] Split memop test file. NFC
llvm-svn: 303169
2017-05-16 13:37:31 +00:00
Chad Rosier
66c9889ab8 Fix an improperly placed curly bracket. NFC.
llvm-svn: 303165
2017-05-16 12:43:23 +00:00
George Rimar
0219cd8ab5 [DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector.
Recommit of r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector"
All places were shitched to use DWARFAddressRange now.

Suggested during review of D33184.

llvm-svn: 303163
2017-05-16 12:30:59 +00:00
George Rimar
847ba26f13 Revert r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector."
Something went wrong, it broke BB.
http://green.lab.llvm.org/green//job/clang-stage1-cmake-RA-incremental_build/38477/consoleFull#-200034420049ba4694-19c4-4d7e-bec5-911270d8a58c

llvm-svn: 303162
2017-05-16 12:05:03 +00:00
George Rimar
cc6169b5bf [DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector.
Suggested during review of D33184.

llvm-svn: 303159
2017-05-16 11:54:19 +00:00
James Henderson
2de1ac6a79 [LTO] Print time-passes information at conclusion of LTO codegen
The information collected when requested by -time-passes is only printed when
llvm_shutdown is called at the moment. This means that when linking against the LTO
library dynamically and using the C interface, it is not possible to see the timing
information, because llvm_shutdown cannot be called. This change modifies the LTO
code generation functions for both regular LTO and thin LTO to explicitly print and
reset the timing information.

I have tested that this works with our proprietary linker. However, as this relies
on a specific method of building and linking against the LTO library, I'm not sure
how or if this can be tested in the LLVM testsuite.

Reviewed by: mehdi_amini

Differential Revision: https://reviews.llvm.org/D32803

llvm-svn: 303152
2017-05-16 09:43:21 +00:00
Max Kazantsev
66886e6d12 [SCEV] Fix sorting order for AddRecExprs
The existing sorting order in defined CompareSCEVComplexity sorts AddRecExprs
by loop depth, but does not pay attention to dominance of loops. This can
lead us to the following buggy situation:

for (...) { // loop1
  op1 = {A,+,B}
}
for (...) { // loop2
  op2 = {A,+,B}
  S = add op1, op2
}

In this case there is no guarantee that in operand list of S the op2 comes
before op1 (loop depth is the same, so they will be sorted just
lexicographically), so we can incorrectly treat S as a recurrence of loop1,
which is wrong.

This patch changes the sorting logic so that it places the dominated recs
before the dominating recs. This ensures that when we pick the first recurrency
in the operands order, it will be the bottom-most in terms of domination tree.
The attached test set includes some tests that produce incorrect SCEV
estimations and crashes with oldlogic.

Reviewers: sanjoy, reames, apilipenko, anna

Reviewed By: sanjoy

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33121

llvm-svn: 303148
2017-05-16 07:27:06 +00:00
Craig Topper
60604f3f0f [CorrelatedValuePropagation] Don't use -> to call a static method of ConstantRange. NFC
llvm-svn: 303147
2017-05-16 07:05:38 +00:00
Daniel Berlin
e7cb68c616 NewGVN: Use StoreExpression StoredValue instead of looking it up again, since it was already looked up when it was created
llvm-svn: 303144
2017-05-16 06:06:15 +00:00
Daniel Berlin
877b776e58 NewGVN: Formatting fixes
llvm-svn: 303143
2017-05-16 06:06:12 +00:00
Davide Italiano
c968f4e36b Revert "[NewGVN] Replace predicate info leftovers."
It's breaking the bots.

llvm-svn: 303142
2017-05-16 05:51:21 +00:00
Davide Italiano
08da355c1b [NewGVN] Replace predicate info leftovers.
Fixes PR32945.

Differential Revision:  https://reviews.llvm.org/D33226

llvm-svn: 303141
2017-05-16 05:23:23 +00:00
NAKAMURA Takumi
96f4bdd6f9 AMDGPUCodeGen: Fix warnings in r303111. [-Wunused-variable]
llvm-svn: 303137
2017-05-16 04:01:23 +00:00
Peter Collingbourne
b218f1407c IR: Give function GlobalValue::getRealLinkageName() a less misleading name: dropLLVMManglingEscape().
This function gives the wrong answer on some non-ELF platforms in some
cases. The function that does the right thing lives in Mangler.h. To try to
discourage people from using this function, give it a different name.

Differential Revision: https://reviews.llvm.org/D33162

llvm-svn: 303134
2017-05-16 00:39:01 +00:00
Sanjay Patel
3f32289f72 [InstCombine] add tests for PR32791; NFC
llvm-svn: 303133
2017-05-15 23:59:28 +00:00
Francis Visoiu Mistrih
6212dd268d [ShrinkWrapping] Handle restores on no-return paths
Shrink-wrapping uses post-dominators to find a restore point that
post-dominates all the uses of CSR / stack.

The way dominator trees are modeled in LLVM today is that unreachable
blocks are not present in a generic dominator tree, so, an unreachable node is
dominated by anything: include/llvm/Support/GenericDomTree.h:467.

Since for post-dominators, a no-return block is considered
"unreachable", calling findNearestCommonDominator on an unreachable node
A and a non-unreachable node B, will return B, which can be false. If we
find such node, we bail out since there is no good restore point
available.

rdar://problem/30186931

llvm-svn: 303130
2017-05-15 23:13:35 +00:00
Kostya Serebryany
44299ec1d0 [libFuzzer] fix tests on Windows
llvm-svn: 303128
2017-05-15 22:55:00 +00:00
Sanjay Patel
6a5f74bb94 [InstSimplify] add tests for unnecessary mask of shifted values; NFC
llvm-svn: 303127
2017-05-15 22:54:37 +00:00
Xinliang David Li
ad47f7247e Fix memory leak
llvm-svn: 303126
2017-05-15 22:43:52 +00:00
Kostya Serebryany
c641303c34 [libFuzzer] improve the afl driver and it's tests. Make it possible to run individual inputs with afl driver
llvm-svn: 303125
2017-05-15 22:38:29 +00:00
Rui Ueyama
93bd489654 Fix git command line in the Getting Started guide.
By default, git creates "llvm-project-20170507" directory,
but we want to create "llvm-project" directory.

llvm-svn: 303124
2017-05-15 22:32:34 +00:00
Justin Bogner
e9f4448b87 Add "REQUIRES:" to the last few tests that use target specific intrinsics
llvm-svn: 303123
2017-05-15 22:15:22 +00:00
Davide Italiano
07480546a7 [AMDGPU] Kill now unused phiInfoElementGetDebugLoc(). NFCI.
llvm-svn: 303122
2017-05-15 22:10:15 +00:00
Craig Topper
caecd51865 [APInt] Simplify a for loop initialization based on the fact that 'n' is known to be 1 by an earlier 'if'.
llvm-svn: 303120
2017-05-15 22:01:03 +00:00
Eugene Zelenko
659a1e517a [IR] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC).
llvm-svn: 303119
2017-05-15 21:57:41 +00:00
Tim Northover
cfd3f40830 AArch64: use linker-private symbols for globals in MachO.
We don't use section-relative relocations on AArch64, so all symbols must be at
least visible to the linker (i.e. properly global or l_whatever, but not
L_whatever).

llvm-svn: 303118
2017-05-15 21:51:38 +00:00
David Blaikie
cbf90e5613 PR32288: Describe a bool parameter's DWARF location with a simple register
There's no need (& a bit incorrect) to mask off the high bits of the
register reference when describing a simple bool value.

Reviewers: aprantl

Differential Revision: https://reviews.llvm.org/D31062

llvm-svn: 303117
2017-05-15 21:34:01 +00:00
Adam Nemet
f9607f0660 [SLP] Enable 64-bit wide vectorization on AArch64
ARM Neon has native support for half-sized vector registers (64 bits).  This
is beneficial for example for 2D and 3D graphics.  This patch adds the option
to lower MinVecRegSize from 128 via a TTI in the SLP Vectorizer.

*** Performance Analysis

This change was motivated by some internal benchmarks but it is also
beneficial on SPEC and the LLVM testsuite.

The results are with -O3 and PGO.  A negative percentage is an improvement.
The testsuite was run with a sample size of 4.

** SPEC

* CFP2006/482.sphinx3  -3.34%

A pretty hot loop is SLP vectorized resulting in nice instruction reduction.
This used to be a +22% regression before rL299482.

* CFP2000/177.mesa     -3.34%
* CINT2000/256.bzip2   +6.97%

My current plan is to extend the fix in rL299482 to i16 which brings the
regression down to +2.5%.  There are also other problems with the codegen in
this loop so there is further room for improvement.

** LLVM testsuite

* SingleSource/Benchmarks/Misc/ReedSolomon               -10.75%

There are multiple small SLP vectorizations outside the hot code.  It's a bit
surprising that it adds up to 10%.  Some of this may be code-layout noise.

* MultiSource/Benchmarks/VersaBench/beamformer/beamformer -8.40%

The opt-viewer screenshot can be seen at F3218284.  We start at a colder store
but the tree leads us into the hottest loop.

* MultiSource/Applications/lambda-0.1.3/lambda            -2.68%
* MultiSource/Benchmarks/Bullet/bullet                    -2.18%

This is using 3D vectors.

* SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists +6.67%

Noise, binary is unchanged.

* MultiSource/Benchmarks/Ptrdist/anagram/anagram          +4.90%

There is an additional SLP in the cold code.  The test runs for ~1sec and
prints out over 2000 lines. This is most likely noise.

* MultiSource/Applications/aha/aha                        +1.63%
* MultiSource/Applications/JM/lencod/lencod               +1.41%
* SingleSource/Benchmarks/Misc/richards_benchmark         +1.15%

Differential Revision: https://reviews.llvm.org/D31965

llvm-svn: 303116
2017-05-15 21:15:01 +00:00
Hans Wennborg
247e13c637 Revert r302678 "[AArch64] Enable use of reduction intrinsics."
This caused PR33053.

Original commit message:

> The new experimental reduction intrinsics can now be used, so I'm enabling this
> for AArch64. We will need this for SVE anyway, so it makes sense to do this for
> NEON reductions as well.
>
> The existing code to match shufflevector patterns are replaced with a direct
> lowering of the reductions to AArch64-specific nodes. Tests updated with the
> new, simpler, representation.
>
> Differential Revision: https://reviews.llvm.org/D32247

llvm-svn: 303115
2017-05-15 20:59:32 +00:00
Evgeniy Stepanov
89f3e2930b [asan] Better workaround for gold PR19002.
See the comment for more details. Test in a follow-up CFE commit.

llvm-svn: 303113
2017-05-15 20:43:42 +00:00
Jan Sjodin
6fb09ca5d3 Re-submit AMDGPUMachineCFGStructurizer.
Differential Revision: https://reviews.llvm.org/D23209

llvm-svn: 303111
2017-05-15 20:18:37 +00:00
Tim Northover
77c86e2d11 AArch64: diagnose unrecognized features in .cpu directive.
We were silently ignoring any features we couldn't match up, which led to
errors in an inline asm block missing the conventional "\n\t".

llvm-svn: 303108
2017-05-15 19:42:15 +00:00
Davide Italiano
8d0d993d42 [NewGVN] Remove unused setDefiningExpr(). NFCI.
llvm-svn: 303107
2017-05-15 19:35:40 +00:00
Sanjay Patel
612c21f9a9 [InstCombine] restrict icmp fold with 2 sdiv exact operands (PR32949)
This is the InstCombine counterpart to D32954. 
I added some comments about the code duplication in:
rL302436

Alive-based verification:
http://rise4fun.com/Alive/dPw

This is a 2nd fix for the problem reported in:
https://bugs.llvm.org/show_bug.cgi?id=32949

Differential Revision: https://reviews.llvm.org/D32970

llvm-svn: 303105
2017-05-15 19:27:53 +00:00
Sanjay Patel
6116bcb0ac [InstSimplify] restrict icmp fold with 2 sdiv exact operands (PR32949)
These folds were introduced with https://reviews.llvm.org/rL127064 as part of solving:
https://bugs.llvm.org/show_bug.cgi?id=9343

As shown here:
http://rise4fun.com/Alive/C8
...however, the sdiv exact case needs a stronger predicate.

I opted for duplicated code instead of adding another fallthrough because I think that's 
easier to read (and edit in case we need/want to restrict/loosen the predicates any more).

This should fix:
https://bugs.llvm.org/show_bug.cgi?id=32949
https://bugs.llvm.org/show_bug.cgi?id=32948

Differential Revision: https://reviews.llvm.org/D32954

llvm-svn: 303104
2017-05-15 19:16:49 +00:00
Evgeny Stupachenko
d11ab9e578 The patch adds CTLZ idiom recognition.
Summary:

The following loops should be recognized:
i = 0;
while (n) {
  n = n >> 1;
  i++;
  body();
}
use(i);

And replaced with builtin_ctlz(n) if body() is empty or
for CPUs that have CTLZ instruction converted to countable:

for (j = 0; j < builtin_ctlz(n); j++) {
  n = n >> 1;
  i++;
  body();
}
use(builtin_ctlz(n));

Reviewers: rengolin, joerg

Differential Revision: http://reviews.llvm.org/D32605

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 303102
2017-05-15 19:08:56 +00:00
Davide Italiano
a3335b259b [NewGVN] Fix verification of MemoryPhis in verifyMemoryCongruency().
verifyMemoryCongruency() filters out trivially dead MemoryDef(s),
as we find them immediately dead, before moving from TOP to a new
congruence class.
This fixes the same problem for PHI(s) skipping MemoryPhis if all
the operands are dead.

Differential Revision:  https://reviews.llvm.org/D33044

llvm-svn: 303100
2017-05-15 18:50:53 +00:00
Geoff Berry
db69675c48 [AArch64][Falkor] Fix sched details for FMOV
llvm-svn: 303099
2017-05-15 18:50:22 +00:00
Jan Sjodin
4f52af0e05 Revert 303091.
llvm-svn: 303098
2017-05-15 18:39:47 +00:00
Teresa Johnson
b00b861ff8 Add support for handling ifuncs to GlobalValue::getBaseObject
Summary:
All GlobalIndirectSymbol types (not just GlobalAlias) should return
their base object.

Without this patch LTO would warn "Unable to determine comdat of
alias!" for an ifunc.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D33202

llvm-svn: 303096
2017-05-15 18:28:29 +00:00
Craig Topper
7b02caa3a3 [SCEV] Use copy initialization of APInts instead of direct initialization.
This is based on post commit feed back from r302769.

llvm-svn: 303092
2017-05-15 18:14:16 +00:00
Jan Sjodin
53e05436a9 Add AMDGPUMachineCFGStructurizer.
Differential Revision: https://reviews.llvm.org/D23209

llvm-svn: 303091
2017-05-15 18:13:56 +00:00
Sanjay Patel
394d9de5e2 [InstCombine] use m_OneUse to reduce code; NFCI
llvm-svn: 303090
2017-05-15 18:08:17 +00:00
Kostya Serebryany
fcb4370c0a [libFuzzer] fix a warning from Wunreachable-code-loop-increment reported by Christian Holler. This also fixes a logical bug, which however does not affect the libFuzzer's ability too much (I wasn't able to create a differentiating test)
llvm-svn: 303087
2017-05-15 17:39:42 +00:00
Kyle Butt
0bcf661a2a CodeGen: BlockPlacement: Increase tail duplication size for O3.
At O3 we are more willing to increase size if we believe it will improve
performance. The current threshold for tail-duplication of 2 instructions is
conservative, and can be relaxed at O3.

Benchmark results:
llvm test-suite:
6% improvement in aha, due to duplication of loop latch
3% improvement in hexxagon

2% slowdown in lpbench. Seems related, but couldn't completely diagnose.

Internal google benchmark:
Produces 4% improvement on internal google protocol buffer serialization
benchmarks.

Differential-Revision: https://reviews.llvm.org/D32324
llvm-svn: 303084
2017-05-15 17:30:47 +00:00