1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

14527 Commits

Author SHA1 Message Date
Matthew Simpson
a066e2421a [LoopUtils, LV] Fix PR26734
The vectorization of first-order recurrences (r261346) caused PR26734. When
detecting these recurrences, we need to ensure that the previous value is
actually defined inside the loop. This patch includes the fix and test case.

llvm-svn: 262624
2016-03-03 16:12:01 +00:00
Amaury Sechet
09dedb75e8 Explode store of arrays in instcombine
Summary: This is the last step toward supporting aggregate memory access in instcombine. This explodes stores of arrays into a serie of stores for each element, allowing them to be optimized.

Reviewers: joker.eph, reames, hfinkel, majnemer, mgrang

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17828

llvm-svn: 262530
2016-03-02 22:36:45 +00:00
Amaury Sechet
7169395ec4 Unpack array of all sizes in InstCombine
Summary: This is another step toward improving fca support. This unpack load of array in a series of load to array's elements.

Reviewers: chandlerc, joker.eph, majnemer, reames, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15890

llvm-svn: 262521
2016-03-02 21:28:30 +00:00
Daniel Berlin
e9b459fbed Really fix ASAN leak/etc issues with MemorySSA unittests
llvm-svn: 262519
2016-03-02 21:16:28 +00:00
Daniel Berlin
c49f8e9e7a Revert "Fix ASAN detected errors in code and test" (it was not meant to be committed yet)
This reverts commit 890bbccd600ba1eb050353d06a29650ad0f2eb95.

llvm-svn: 262512
2016-03-02 20:36:22 +00:00
Daniel Berlin
1e51dad27e Fix ASAN detected errors in code and test
llvm-svn: 262511
2016-03-02 20:27:29 +00:00
Chandler Carruth
e597ed0112 [AA] Hoist the logic to reformulate various AA queries in terms of other
parts of the AA interface out of the base class of every single AA
result object.

Because this logic reformulates the query in terms of some other aspect
of the API, it would easily cause O(n^2) query patterns in alias
analysis. These could in turn be magnified further based on the number
of call arguments, and then further based on the number of AA queries
made for a particular call. This ended up causing problems for Rust that
were actually noticable enough to get a bug (PR26564) and probably other
places as well.

When originally re-working the AA infrastructure, the desire was to
regularize the pattern of refinement without losing any generality.
While I think it was successful, that is clearly proving to be too
costly. And the cost is needless: we gain no actual improvement for this
generality of making a direct query to tbaa actually be able to
re-use some other alias analysis's refinement logic for one of the other
APIs, or some such. In short, this is entirely wasted work.

To the extent possible, delegation to other API surfaces should be done
at the aggregation layer so that we can avoid re-walking the
aggregation. In fact, this significantly simplifies the logic as we no
longer need to smuggle the aggregation layer into each alias analysis
(or the TargetLibraryInfo into each alias analysis just so we can form
argument memory locations!).

However, we also have some delegation logic inside of BasicAA and some
of it even makes sense. When the delegation logic is baking in specific
knowledge of aliasing properties of the LLVM IR, as opposed to simply
reformulating the query to utilize a different alias analysis interface
entry point, it makes a lot of sense to restrict that logic to
a different layer such as BasicAA. So one aspect of the delegation that
was in every AA base class is that when we don't have operand bundles,
we re-use function AA results as a fallback for callsite alias results.
This relies on the IR properties of calls and functions w.r.t. aliasing,
and so seems a better fit to BasicAA. I've lifted the logic up to that
point where it seems to be a natural fit. This still does a bit of
redundant work (we query function attributes twice, once via the
callsite and once via the function AA query) but it is *exactly* twice
here, no more.

The end result is that all of the delegation logic is hoisted out of the
base class and into either the aggregation layer when it is a pure
retargeting to a different API surface, or into BasicAA when it relies
on the IR's aliasing properties. This should fix the quadratic query
pattern reported in PR26564, although I don't have a stand-alone test
case to reproduce it.

It also seems general goodness. Now the numerous AAs that don't need
target library info don't carry it around and depend on it. I think
I can even rip out the general access to the aggregation layer and only
expose that in BasicAA as it is the only place where we re-query in that
manner.

However, this is a non-trivial change to the AA infrastructure so I want
to get some additional eyes on this before it lands. Sadly, it can't
wait long because we should really cherry pick this into 3.8 if we're
going to go this route.

Differential Revision: http://reviews.llvm.org/D17329

llvm-svn: 262490
2016-03-02 15:56:53 +00:00
George Burgess IV
872ab3815a Attempt to fix ASAN failure in a MemorySSA test.
llvm-svn: 262452
2016-03-02 02:35:04 +00:00
Sanjay Patel
afbc4f1551 revert r262424 because there's a *clang test* for AArch64 that checks -O3 asm output
that is broken by this change

llvm-svn: 262440
2016-03-02 01:04:09 +00:00
Sanjay Patel
7266aa16e2 [InstCombine] convert 'isPositive' and 'isNegative' vector comparisons to shifts (PR26701)
As noted in the code comment, I don't think we can do the same transform that we do for
*scalar* integers comparisons to *vector* integers comparisons because it might pessimize
the general case. 

Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT
for integer vectors.

But we should now recognize all the variants of this construct and produce the optimal code
for the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26701
 

llvm-svn: 262424
2016-03-01 23:55:18 +00:00
Dehao Chen
9581e2d337 Perform InstructioinCombiningPass before SampleProfile pass.
Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17742

llvm-svn: 262419
2016-03-01 22:53:02 +00:00
Owen Anderson
999a9f171c Fix an issue where fast math flags were dropped during scalarization.
Most portions of InstCombine properly propagate fast math flags, but
apparently the vector scalarization section was overlooked.

llvm-svn: 262376
2016-03-01 19:35:52 +00:00
Daniel Berlin
07e4e63d22 Add the beginnings of an update API for preserving MemorySSA
Summary:
This adds the beginning of an update API to preserve MemorySSA.  In particular,
this patch adds a way to remove memory SSA accesses when instructions are
deleted.

It also adds relevant unit testing infrastructure for MemorySSA's API.

(There is an actual user of this API, i will make that diff dependent on this one.  In practice, a ton of opt passes remove memory instructions, so it's hopefully an obviously useful API :P)

Reviewers: hfinkel, reames, george.burgess.iv

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17157

llvm-svn: 262362
2016-03-01 18:46:54 +00:00
Petar Jovanovic
0302dd0999 Revert "calculate builtin_object_size if argument is a removable pointer"
Revert r262337 as "check-llvm ubsan" step failed on
sanitizer-x86_64-linux-fast buildbot.

llvm-svn: 262349
2016-03-01 16:50:08 +00:00
Petar Jovanovic
a1fb751763 calculate builtin_object_size if argument is a removable pointer
This patch fixes calculating correct value for builtin_object_size function
when pointer is used only in builtin_object_size function call and never
after that.

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D17337

llvm-svn: 262337
2016-03-01 14:39:55 +00:00
Sanjay Patel
aee22b5eed [x86, InstCombine] transform more x86 masked loads to LLVM intrinsics
Continuation of:
http://reviews.llvm.org/rL262269

llvm-svn: 262273
2016-02-29 23:59:00 +00:00
Adam Nemet
5ed52a9810 [LLE] Fix a comment
llvm-svn: 262270
2016-02-29 23:21:12 +00:00
Sanjay Patel
4d045a2e91 [x86, InstCombine] transform x86 AVX masked loads to LLVM intrinsics
The intended effect of this patch in conjunction with:
http://reviews.llvm.org/rL259392
http://reviews.llvm.org/rL260145

is that customers using the AVX intrinsics in C will benefit from combines when
the load mask is constant:

__m128 mload_zeros(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(0));
}

__m128 mload_fakeones(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(1));
}

__m128 mload_ones(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(0x80000000));
}

__m128 mload_oneset(float *f) {
  return _mm_maskload_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0));
}

...so none of the above will actually generate a masked load for optimized code.

This is the masked load counterpart to:
http://reviews.llvm.org/rL262064

llvm-svn: 262269
2016-02-29 23:16:48 +00:00
Adam Nemet
6307c054b4 [LLE] Fix SingleSource/Benchmarks/Polybench/stencils/jacobi-2d-imper with Polly
We can actually have dependences between accesses with different
underlying types.  Bail in this case.

A test will follow shortly.

llvm-svn: 262267
2016-02-29 22:53:59 +00:00
Adam Nemet
edfef38d92 Enable LoopLoadElimination by default
Summary:
I re-benchmarked this and results are similar to original results in
D13259:

On ARM64:
  SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog -59.27%
  SingleSource/Benchmarks/Polybench/stencils/adi                   -19.78%

On x86:
  SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog  -27.14%

And of course the original ~20% gain on SPECint_2006/456.hmmer with Loop
Distribution.

In terms of compile time, there is ~5% increase on both
SingleSource/Benchmarks/Misc/oourafft and
SingleSource/Benchmarks/Linkpack/linkpack-pc.  These are both very tiny
loop-intensive programs where SCEV computations dominates compile time.

The reason that time spent in SCEV increases has to do with the design
of the old pass manager.  If a transform pass does not preserve an
analysis we *invalidate* the analysis even if there was *no*
modification made by the transform pass.

This means that currently we don't take advantage of LLE and LV sharing
the same analysis (LAA) and unfortunately we recompute LAA *and* SCEV
for LLE.

(There should be a way to work around this limitation in the case of
SCEV and LAA since both compute things on demand and internally cache
their result.  Thus we could pretend that transform passes preserve
these analyses and manually invalidate them upon actual modification.
On the other hand the new pass manager is supposed to solve so I am not
sure if this is worthwhile.)

Reviewers: hfinkel, dberlin

Subscribers: dberlin, reames, mssimpso, aemerson, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16300

llvm-svn: 262250
2016-02-29 20:35:11 +00:00
Rong Xu
56b06540c0 Minor code cleanup. NFC
llvm-svn: 262242
2016-02-29 19:16:04 +00:00
Dehao Chen
a952200cd8 Move discriminator assignment to the right place.
Summary: Now discriminator is assigned per-function instead of per-module.

Reviewers: davidxl, dnovillo

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D17664

llvm-svn: 262240
2016-02-29 18:59:48 +00:00
Xinliang David Li
629bea6ef6 [PGO] Remove redundant counter copies for avail_extern functions.
Differential Revision: http://reviews.llvm.org/D17654

llvm-svn: 262157
2016-02-27 23:11:30 +00:00
Renato Golin
e3031ad0db Revert "[sancov] do not instrument nodes that are full pre-dominators"
This reverts commit r262103, as it broke all ARM and AArch64 bots.

llvm-svn: 262139
2016-02-27 14:19:19 +00:00
Sean Silva
460bbafe5e [instrprof] Use __{start,stop}_SECNAME on PS4 too.
Summary:
The PS4 linker seems to handle this fine.

Hi David, it seems that indeed most ELF linkers support
__{start,stop}_SECNAME, as our proprietary linker does as well.

This follows the pattern of r250679 w.r.t. the testing.

Maggie, Phillip, Paul: I've tested this with the PS4 SDK 3.5 toolchain
prerelease and it seems to work fine.

Reviewers: davidxl

Subscribers: probinson, phillip.power, MaggieYi

Differential Revision: http://reviews.llvm.org/D17672

llvm-svn: 262112
2016-02-27 06:01:26 +00:00
Mike Aizatsky
5bc3e758d6 [sancov] properly initializing pass.
llvm-svn: 262111
2016-02-27 05:50:40 +00:00
Kostya Serebryany
2e7a6d59f7 [libFuzzer] don't emit callbacks to sanitizer run-time in -fsanitize-coverage=trace-pc mode; update libFuzzer doc for previous commit
llvm-svn: 262110
2016-02-27 05:45:12 +00:00
Chandler Carruth
4def732ecd [LICM] Teach LICM how to handle cases where the alias set tracker was
merged into a loop that was subsequently unrolled (or otherwise nuked).

In this case it can't merge in the ASTs for any remaining nested loops,
it needs to re-add their instructions dircetly.

The fix is very isolated, but I've pulled the code for merging blocks
into the AST into a single place in the process. The only behavior
change is in the case which would have crashed before.

This fixes a crash reported by Mikael Holmen on the list after r261316
restored much of the loop pass pipelining and allowed us to actually do
this kind of nested transformation sequenc. I've taken that test case
and further reduced it into the somewhat twisty maze of loops in the
included test case. This does in fact trigger the bug even in this
reduced form.

llvm-svn: 262108
2016-02-27 04:34:07 +00:00
Mike Aizatsky
efd9fa98d1 [sancov] do not instrument nodes that are full pre-dominators
Summary:
Without tree pruning clang has 2,667,552 points.
Wiht only dominators pruning: 1,515,586.
With both dominators & predominators pruning: 1,340,534.

Differential Revision: http://reviews.llvm.org/D17671

llvm-svn: 262103
2016-02-27 02:10:27 +00:00
Reid Kleckner
43aeb743c9 [InstCombine] Be more conservative about removing stackrestore
We ended up removing a save/restore pair around an inalloca call,
leading to a miscompile in Chromium.

llvm-svn: 262095
2016-02-27 00:53:54 +00:00
Sanjay Patel
014ad4d329 [x86, InstCombine] transform x86 AVX2 masked stores to LLVM intrinsics
Replicate everything for integers...because x86.

Continuation of:
http://reviews.llvm.org/rL262064

llvm-svn: 262077
2016-02-26 21:51:44 +00:00
Sanjay Patel
d5185a0950 [x86, InstCombine] transform x86 AVX masked stores to LLVM intrinsics
The intended effect of this patch in conjunction with:
http://reviews.llvm.org/rL259392
http://reviews.llvm.org/rL260145

is that customers using the AVX intrinsics in C will benefit from combines when
the store mask is constant:

void mstore_zero_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set1_epi32(0), v);
}

void mstore_fake_ones_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set1_epi32(1), v);
}

void mstore_ones_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set1_epi32(0x80000000), v);
}

void mstore_one_set_elt_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0), v);
}

...so none of the above will actually generate a masked store for optimized code.

Differential Revision: http://reviews.llvm.org/D17485

llvm-svn: 262064
2016-02-26 21:04:14 +00:00
Haicheng Wu
5511101c3f [JumpThreading] Simplify Instructions first in ComputeValueKnownInPredecessors()
This change tries to find more opportunities to thread over basic blocks.

llvm-svn: 261981
2016-02-26 06:06:04 +00:00
Michael Zolotukhin
4826fdb5cd [LoopUnrollAnalyzer] Check that we're using SCEV for the same loop we're simulating.
Summary: Check that we're using SCEV for the same loop we're simulating. Otherwise, we might try to use the iteration number of the current loop in SCEV expressions for inner/outer loops IVs, which is clearly incorrect.

Reviewers: chandlerc, hfinkel

Subscribers: sanjoy, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17632

llvm-svn: 261958
2016-02-26 02:57:05 +00:00
Mike Aizatsky
b65bb6da73 [sancov] Pruning full dominator blocks from instrumentation.
Summary:
This is the first simple attempt to reduce number of coverage-
instrumented blocks.

If a basic block dominates all its successors, then its coverage
information is useless to us. Ingore such blocks if
santizer-coverage-prune-tree option is set.

Differential Revision: http://reviews.llvm.org/D17626

llvm-svn: 261949
2016-02-26 01:17:22 +00:00
Anna Zaks
22dfbced29 [asan] Do not instrument globals in the special "LLVM" sections
llvm-svn: 261794
2016-02-24 22:12:18 +00:00
David Majnemer
f106ac635e [SimplifyCFG] Use a more elegant solution than r261731
The cleanupret instruction has an invariant that it's 'from' operand be
a cleanuppad.  This invariant was violated when we removed a dead block
which removed a cleanuppad leaving behind a cleanupret with an undef
'from' operand.

This was solved in r261731 by staving off the removal of the dead block
to a later pass.

However, it occured to me that we do not need to do this.
Instead, we can simply avoid processing the cleanupret if it has an
undef 'from' operand because we know that it will be removed soon.

llvm-svn: 261754
2016-02-24 17:30:48 +00:00
Sanjay Patel
97801c676a [InstCombine] enable optimization of casted vector xor instructions
This is part of the payoff for the refactoring in:
http://reviews.llvm.org/rL261649
http://reviews.llvm.org/rL261707

In addition to removing a pile of duplicated code, the xor case was
missing the optimization for vector types because it checked
"SrcTy->isIntegerTy()" rather than "SrcTy->isIntOrIntVectorTy()"
like 'and' and 'or' were already doing.

This solves part of:
https://llvm.org/bugs/show_bug.cgi?id=26702

llvm-svn: 261750
2016-02-24 17:00:34 +00:00
Artur Pilipenko
5746dd289e NFC. Move isDereferenceable to Loads.h/cpp
This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In subsequent change I'm going to eliminate isDerferenceableAndAlignedPointer from Loads API, leaving isSafeToLoadSpecualtively the only function to check is load instruction can be speculated.   

Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D16180

llvm-svn: 261736
2016-02-24 12:49:04 +00:00
David Majnemer
f4887d90a6 [SimplifyCFG] Do not blindly remove unreachable blocks
DeleteDeadBlock was called indiscriminately, leading to cleanuprets with
undef cleanuppad references.

Instead, try to drain the BB of most of it's instructions if it is
unreachable.  We can then remove the BB if it solely consists of a
terminator (and maybe some phis).

llvm-svn: 261731
2016-02-24 10:02:16 +00:00
Sanjay Patel
4f966611aa [InstCombine] refactor visitOr() to use foldCastedBitwiseLogic()
Note: The 'and' case in foldCastedBitwiseLogic() is inheriting one extra
check from the nearly identical 'or' case:
  if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src))

But I'm not sure how to expose that difference in a regression test. 
Without that check, the 'or' path will infinite loop on:
test/Transforms/InstCombine/zext-or-icmp.ll
because the zext-or-icmp fold is attempting a reverse transform.

The refactoring should extend to the 'xor' case next to solve part of
PR26702.

llvm-svn: 261707
2016-02-23 23:56:23 +00:00
Sanjay Patel
e4ec3d519e [InstCombine] improve readability ; NFCI
Less indenting, named local variables, more descriptive names.

llvm-svn: 261659
2016-02-23 17:41:34 +00:00
David Majnemer
29ae1c3d76 [WinEH] Don't inline an 'unwinds to caller' cleanupret into funclets which locally unwind
It is problematic if the inlinee has a cleanupret which unwinds to
caller and we inline it into a call site which doesn't unwind.

If the funclet unwinds anywhere other than to the caller,
then we will give the funclet two unwind destinations.
This will result in a verifier failure.

Seeing as how the caller wasn't an invoke (which would locally unwind)
and that the funclet cannot unwind to caller, we must conclude that an
'unwind to caller' cleanupret is dynamically unreachable.

This fixes PR26698.

Differential Revision: http://reviews.llvm.org/D17536

llvm-svn: 261656
2016-02-23 17:11:04 +00:00
Sanjay Patel
4e682fcfe8 [InstCombine] less indenting; NFC
llvm-svn: 261652
2016-02-23 16:59:21 +00:00
Sanjay Patel
70dd21aa9d [InstCombine] add helper function to foldCastedBitwiseLogic() ; NFCI
This is a straight cut and paste of the existing code and is intended to
be the first step in solving part of PR26702:
https://llvm.org/bugs/show_bug.cgi?id=26702

We should be able to reuse most of this and delete the nearly identical 
existing code in visitOr(). Then, we can enhance visitXor() to use the
same code too.

llvm-svn: 261649
2016-02-23 16:36:07 +00:00
Michael Zolotukhin
7219052084 Follow up for r261597: Add the * to the auto.
llvm-svn: 261600
2016-02-23 00:57:48 +00:00
Michael Zolotukhin
3da31c17bb Follow-up for r261595: use range loop.
llvm-svn: 261597
2016-02-23 00:48:44 +00:00
Michael Zolotukhin
cb26e1de36 [LoopUnroll] Avoid unnecessary DT recomputation.
Summary:
When we completely unroll a loop, it's pretty easy to update DT in-place and
thus avoid rebuilding it. DT recalculation is one of the most time-consuming
tasks in loop-unroll, so avoiding it at least in case of full unroll should be
beneficial.

On some extreme (but still real-world) tests this patch improves compile time by
~2x.

Reviewers: escha, jmolloy, hfinkel, sanjoy, chandlerc

Subscribers: joker.eph, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D17473

llvm-svn: 261595
2016-02-23 00:30:50 +00:00
Dehao Chen
755e933005 Set function entry count as 0 if sample profile is not found for the function.
Summary: This change makes the sample profile's behavior consistent with instr profile.

Reviewers: davidxl, eraman, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17522

llvm-svn: 261587
2016-02-22 22:46:21 +00:00
Adam Nemet
6f1d2a2687 [LoopDataPrefetch] Make it testable with opt
Summary:
Since this is an IR pass it's nice to be able to write tests without
llc.  This is the counterpart of the llc test under
CodeGen/PowerPC/loop-data-prefetch.ll.

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17464

llvm-svn: 261578
2016-02-22 21:41:22 +00:00