1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 21:13:02 +02:00
Commit Graph

16675 Commits

Author SHA1 Message Date
Adam Nemet
a4f167bba1 [GVN] Basic optimization remark support
Follow-on patches will add more interesting cases.

The goal of this patch-set is to get the GVN messages printed in
opt-viewer from Dhrystone as was presented in my Dev Meeting talk.  This
is the optimization view for the function (the last remark in the
function has a bug which is fixed in this series):
http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L430

Differential Revision: https://reviews.llvm.org/D26488

llvm-svn: 288046
2016-11-28 17:45:28 +00:00
Sanjay Patel
b2bc8129f9 [InstSimplify] allow integer vector types to use computeKnownBits
Note that the non-splat lshr+lshr test folded, but that does not
work in general. Something is missing or wrong in computeKnownBits
as the non-splat shl+shl test still shows.

llvm-svn: 288005
2016-11-27 21:07:28 +00:00
Sanjay Patel
1e0354882f fix formatting; NFC
llvm-svn: 287997
2016-11-27 15:53:48 +00:00
Sanjay Patel
2f5991c226 [InstCombine] don't drop metadata in FoldOpIntoSelect()
llvm-svn: 287980
2016-11-26 15:23:20 +00:00
Sanjay Patel
29cd3b551c add optional param to copy metadata when creating selects; NFC
There are other spots where we can use this; we're currently dropping 
metadata in some places, and there are proposed changes where we will
want to propagate metadata.

IRBuilder's CreateSelect() already has a parameter like this, so this
change makes the regular 'Create' API line up with that.

llvm-svn: 287976
2016-11-26 15:01:59 +00:00
David Majnemer
1f4efba2a1 Replace some callers of setTailCall with setTailCallKind
We were a little sloppy with adding tailcall markers.  Be more
consistent by using setTailCallKind instead of setTailCall.

llvm-svn: 287955
2016-11-25 22:35:09 +00:00
Abhilash Bhandari
7cab062130 [Loop Unswitch] Patch to selective unswitch only the reachable branch instructions.
Summary:
The iterative algorithm for Loop Unswitching may render some of the branches unreachable in the unswitched loops.
Given the exponential nature of the algorithm, this is quite an overhead.
This patch fixes this problem by selectively unswitching only those branches within a loop that are reachable from the loop header.

Reviewers: Michael Zolothukin, Anna Thomas, Weiming Zhao.
Subscribers: llvm-commits.

Differential Revision: http://reviews.llvm.org/D26299

llvm-svn: 287925
2016-11-25 14:07:44 +00:00
Haicheng Wu
fe4a4f4937 [LoopUnroll] Move code to exit early. NFC.
Just to save some compilation time.

Differential Revision: https://reviews.llvm.org/D26784

llvm-svn: 287800
2016-11-23 19:39:26 +00:00
Chandler Carruth
dad102bcc9 [PM] Change the static object whose address is used to uniquely identify
analyses to have a common type which is enforced rather than using
a char object and a `void *` type when used as an identifier.

This has a number of advantages. First, it at least helps some of the
confusion raised in Justin Lebar's code review of why `void *` was being
used everywhere by having a stronger type that connects to documentation
about this.

However, perhaps more importantly, it addresses a serious issue where
the alignment of these pointer-like identifiers was unknown. This made
it hard to use them in pointer-like data structures. We were already
dodging this in dangerous ways to create the "all analyses" entry. In
a subsequent patch I attempted to use these with TinyPtrVector and
things fell apart in a very bad way.

And it isn't just a compile time or type system issue. Worse than that,
the actual alignment of these pointer-like opaque identifiers wasn't
guaranteed to be a useful alignment as they were just characters.

This change introduces a type to use as the "key" object whose address
forms the opaque identifier. This both forces the objects to have proper
alignment, and provides type checking that we get it right everywhere.
It also makes the types somewhat less mysterious than `void *`.

We could go one step further and introduce a truly opaque pointer-like
type to return from the `ID()` static function rather than returning
`AnalysisKey *`, but that didn't seem to be a clear win so this is just
the initial change to get to a reliably typed and aligned object serving
is a key for all the analyses.

Thanks to Richard Smith and Justin Lebar for helping pick plausible
names and avoid making this refactoring many times. =] And thanks to
Sean for the super fast review!

While here, I've tried to move away from the "PassID" nomenclature
entirely as it wasn't really helping and is overloaded with old pass
manager constructs. Now we have IDs for analyses, and key objects whose
address can be used as IDs. Where possible and clear I've shortened this
to just "ID". In a few places I kept "AnalysisID" to make it clear what
was being identified.

Differential Revision: https://reviews.llvm.org/D27031

llvm-svn: 287783
2016-11-23 17:53:26 +00:00
Alina Sbirlea
40a0781572 [LoadStoreVectorizer] Enable vectorization of stores in the presence of an aliasing load
Summary:
The "getVectorizablePrefix" method would give up if it found an aliasing load for a store chain.
In practice, the aliasing load can be treated as a memory barrier and all stores that precede it
are a valid vectorizable prefix.
Issue found by volkan in D26962. Testcase is a pruned version of the one in the original patch.

Reviewers: jlebar, arsenm, tstellarAMD

Subscribers: mzolotukhin, wdng, nhaehnle, anna, volkan, llvm-commits

Differential Revision: https://reviews.llvm.org/D27008

llvm-svn: 287781
2016-11-23 17:43:15 +00:00
Justin Lebar
639bde8815 [StructurizeCFG] Refactor OrderNodes.
Summary:
No need to copy the RPOT vector before using it.  Switch from std::map
to SmallDenseMap.  Get rid of an unused variable (TempVisited).  Get rid
of a typedef, RNVector, which is now used only once.

Differential Revision: https://reviews.llvm.org/D26997

llvm-svn: 287721
2016-11-22 23:14:11 +00:00
Justin Lebar
556ff6c269 [StructurizeCFG] Add whitespace in getAnalysisUsage.
Summary:
"addRequired" and "addPreserved" look very similar when squished up next
to each other -- without the newline this code looked to me like it was
addRequired'ing DominatorTreeWrapperPass twice.

Reviewers: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D26996

llvm-svn: 287720
2016-11-22 23:14:07 +00:00
Justin Lebar
74d146e2de [StructurizeCFG] Remove unnecessary "using" in class.
Reviewers: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D26995

llvm-svn: 287719
2016-11-22 23:13:49 +00:00
Justin Lebar
ebfbd1c05a [StructurizeCFG] Merge the two constructors into one.
Reviewers: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D26994

llvm-svn: 287718
2016-11-22 23:13:44 +00:00
Justin Lebar
025800b1ba [StructurizeCFG] Use a for-each loop instead of iterators in runOnRegion.
Summary:

Reviewers: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D26993

llvm-svn: 287717
2016-11-22 23:13:37 +00:00
Justin Lebar
c229bf7dc3 [StructurizeCFG] Make hasOnlyUniformBranches a non-member function.
Summary: Lets us get rid of one member variable too.

Reviewers: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D26992

llvm-svn: 287716
2016-11-22 23:13:33 +00:00
Sanjay Patel
51362a8242 add and use isBitwiseLogicOp() helper function; NFCI
llvm-svn: 287712
2016-11-22 22:54:36 +00:00
Dehao Chen
ab1280a397 Before sample pgo annotation, do not inline a function that has no debug info. (NFC)
If there is no debug info in the callee, inlining it will not help annotator. This avoids infinite loop as reported in PR/31119.

llvm-svn: 287710
2016-11-22 22:50:01 +00:00
Davide Italiano
ab7c4be9a7 [SCCP] Remove code in visitBinaryOperator (and add tests).
We visit and/or, we try to derive a lattice value for the
instruction even if one of the operands is overdefined.
If the non-overdefined value is still 'unknown' just return and wait
for ResolvedUndefsIn to "plug in" the correct value. This simplifies
the logic a bit. While I'm here add tests for missing cases.

llvm-svn: 287709
2016-11-22 22:11:25 +00:00
Sanjay Patel
bfd2798cf9 [InstCombine] change bitwise logic type to eliminate bitcasts
In PR27925:
https://llvm.org/bugs/show_bug.cgi?id=27925

...we proposed adding this fold to eliminate a bitcast. In D20774, there was 
some concern about changing the type of a bitwise op as well as creating 
bitcasts that might not be free for a target. However, if we're strictly 
eliminating an instruction (by limiting this to one-use ops), then we should 
be able to do this in InstCombine.

But we're cautiously restricting the transform for now to vector types to
avoid possible backend problems. A transform to make sure the logic op is
legal for the target should be added to reverse this transform and improve
codegen.

Differential Revision: https://reviews.llvm.org/D26641

llvm-svn: 287707
2016-11-22 22:05:48 +00:00
Vyacheslav Klochkov
5f6a7e4c4b Fixed the lost FastMathFlags in GVN(Global Value Numbering).
Reviewer: Hal Finkel.
Differential Revision: https://reviews.llvm.org/D26952

llvm-svn: 287700
2016-11-22 20:52:53 +00:00
Vyacheslav Klochkov
491284f0da Fixed the lost FastMathFlags in Reassociate optimization.
Reviewer: Hal Finkel.
Differential Revision: https://reviews.llvm.org/D26957

llvm-svn: 287695
2016-11-22 20:23:04 +00:00
Eli Friedman
6b340b2b35 [LoopReroll] Make root-finding more aggressive.
Allow using an instruction other than a mul or phi as the base for
root-finding. For example, the included testcase includes a loop
which requires using a getelementptr as the base for root-finding.

Differential Revision: https://reviews.llvm.org/D26529

llvm-svn: 287588
2016-11-21 22:35:34 +00:00
Sanjay Patel
187c215247 [InstCombine] canonicalize min/max constant to select's false value
This is a first step towards canonicalization and improved folding/codegen
for integer min/max as discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html

Here, we're just matching the simplest min/max patterns and adjusting the
icmp predicate while swapping the select operands.

I've included FIXME tests in test/Transforms/InstCombine/select_meta.ll
so it's easier to see how this might be extended (corresponds to the TODO
comment in the code). That's also why I'm using matchSelectPattern()
rather than a simpler check; once the backend is patched, we can just 
remove some of the restrictions to allow the obfuscated min/max patterns
in the FIXME tests to be matched.

Differential Revision: https://reviews.llvm.org/D26525

llvm-svn: 287585
2016-11-21 22:04:14 +00:00
Evgeny Stupachenko
1ab8797749 LSR debug fix.
Summary:
Dump instruction instead of address.
Reviewers: hfinkel

Differential Revision: http://reviews.llvm.org/D26877

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 287584
2016-11-21 21:55:03 +00:00
Sanjay Patel
018583aa87 fix formatting; NFC
llvm-svn: 287582
2016-11-21 21:48:36 +00:00
Reid Kleckner
4642e9c80c [asan] Make ASan compatible with linker dead stripping on Windows
Summary:
This is similar to what was done for Darwin in rL264645 /
http://reviews.llvm.org/D16737, but it uses COFF COMDATs to achive the
same result instead of relying on new custom linker features.

As on MachO, this creates one metadata global per instrumented global.
The metadata global is placed in the custom .ASAN$GL section, which the
ASan runtime will iterate over during initialization. There are no other
references to the metadata, so normal linker dead stripping would
discard it. However, the metadata is put in a COMDAT group with the
instrumented global, so that it will be discarded if and only if the
instrumented global is discarded.

I didn't update the ASan ABI version check since this doesn't affect
non-Windows platforms, and the WinASan ABI isn't really stable yet.

Implementing this for ELF will require extending LLVM IR and MC a bit so
that we can use non-COMDAT section groups.

Reviewers: pcc, kcc, mehdi_amini, kubabrecka

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26770

llvm-svn: 287576
2016-11-21 20:40:37 +00:00
Mandeep Singh Grang
79dcf21663 [MemorySSA] Fix for non-determinism in codegen
This patch fixes the non-determinism caused due to iterating SmallPtrSet's
which was uncovered due to the experimental "reverse iteration order " patch:
https://reviews.llvm.org/D26718

The following unit tests failed because of the undefined order of iteration.
LLVM :: Transforms/Util/MemorySSA/cyclicphi.ll
LLVM :: Transforms/Util/MemorySSA/many-dom-backedge.ll
LLVM :: Transforms/Util/MemorySSA/many-doms.ll
LLVM :: Transforms/Util/MemorySSA/phi-translation.ll

Reviewers: dberlin, mgrang

Subscribers: dberlin, llvm-commits, david2050

Differential Revision: https://reviews.llvm.org/D26704

llvm-svn: 287563
2016-11-21 19:33:02 +00:00
Marcin Koscielnicki
84d15a6c72 [InstrProfiling] Mark __llvm_profile_instrument_target last parameter as i32 zeroext if appropriate.
On some architectures (s390x, ppc64, sparc64, mips), C-level int is passed
as i32 signext instead of plain i32.  Likewise, unsigned int may be passed
as i32, i32 signext, or i32 zeroext depending on the platform.  Mark
__llvm_profile_instrument_target properly (its last parameter is unsigned
int).

This (together with the clang change) makes compiler-rt profile testsuite pass
on s390x.

Differential Revision: http://reviews.llvm.org/D21736

llvm-svn: 287534
2016-11-21 11:57:19 +00:00
Davide Italiano
651649aedf [GlobalSplit] Port to the new pass manager.
llvm-svn: 287511
2016-11-21 00:28:23 +00:00
Simon Pilgrim
e116136251 Fix spelling mistakes in Transforms comments. NFC.
Identified by Pedro Giffuni in PR27636.

llvm-svn: 287488
2016-11-20 13:19:49 +00:00
Benjamin Kramer
7cad84ff09 Give some helper classes/functions internal linkage. NFC.
llvm-svn: 287462
2016-11-19 20:44:26 +00:00
Michael Zolotukhin
7d3922aa55 [LoopSimplify] Preserve LCSSA when removing edges from unreachable blocks.
This fixes PR30454.

llvm-svn: 287379
2016-11-18 21:01:12 +00:00
Florian Hahn
437ec3d7ba [simplifycfg][loop-simplify] Preserve loop metadata in 2 transformations.
insertUniqueBackedgeBlock in lib/Transforms/Utils/LoopSimplify.cpp now
propagates existing llvm.loop metadata to newly the added backedge.

llvm::TryToSimplifyUncondBranchFromEmptyBlock in lib/Transforms/Utils/Local.cpp
now propagates existing llvm.loop metadata to the branch instructions in the
predecessor blocks of the empty block that is removed.

Differential Revision: https://reviews.llvm.org/D26495

llvm-svn: 287341
2016-11-18 13:12:07 +00:00
Craig Topper
3e2e2b550a [InstCombine][AVX-512] Teach InstCombineCalls how to handle the intrinsics for variable shift with 16-bit elements.
This is a straightforward extension of the existing support for 32/64-bit element types. Just needed to add the additional instrinsics to the switches.

llvm-svn: 287316
2016-11-18 06:04:33 +00:00
Anna Zaks
37f1e6770c [asan] Turn on Mach-O global metadata liveness tracking by default
This patch turns on the metadata liveness tracking since all known issues
have been resolved. The future has been implemented in
https://reviews.llvm.org/D16737 and enables support of dead code stripping
option on Mach-O platforms.

As part of enabling the feature, I also plan on reverting the following
patch to compiler-rt:

http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160704/369910.html

Differential Revision: https://reviews.llvm.org/D26772

llvm-svn: 287235
2016-11-17 16:55:40 +00:00
Chris Bieneman
488842a066 [CMake] NFC. Updating CMake dependency specifications
This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system.

llvm-svn: 287206
2016-11-17 04:36:50 +00:00
Dehao Chen
63de4725df Use profile info to adjust loop unroll threshold.
Summary:
For flat loop, even if it is hot, it is not a good idea to unroll in runtime, thus we set a lower partial unroll threshold.
For hot loop, we set a higher unroll threshold and allows expensive tripcount computation to allow more aggressive unrolling.

Reviewers: davidxl, mzolotukhin

Subscribers: sanjoy, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D26527

llvm-svn: 287186
2016-11-17 01:17:02 +00:00
Peter Collingbourne
b981e1c7e5 Introduce GlobalSplit pass.
This pass splits globals into elements using inrange annotations on
getelementptr indices.

Differential Revision: https://reviews.llvm.org/D22295

llvm-svn: 287178
2016-11-16 23:40:26 +00:00
Sanjay Patel
cad5938290 [InstCombine] replace unreachable with assert and remove unreachable code; NFCI
llvm-svn: 287147
2016-11-16 20:40:02 +00:00
Sanjay Patel
b1b45afff1 [InstCombine] fix formatting and add FIXMEs to foldOperationIntoSelectOperand(); NFC
llvm-svn: 287145
2016-11-16 20:18:34 +00:00
Mandeep Singh Grang
757d2ffc9c [LoopVectorize] Fix for non-determinism in codegen
Summary: This patch fixes issues in codegen uncovered due to https://reviews.llvm.org/D26718

Reviewers: mssimpso

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D26727

llvm-svn: 287135
2016-11-16 18:53:17 +00:00
Reid Kleckner
7f2f1a4146 [sancov] Name the global containing the main source file name
If the global name doesn't start with __sancov_gen, ASan will insert
unecessary red zones around it.

llvm-svn: 287117
2016-11-16 16:50:43 +00:00
Craig Topper
9352cc47c5 [X86] Remove the scalar intrinsics for fadd/fsub/fdiv/fmul
Summary: These intrinsics have been unused for clang for a while. This patch removes them. We auto upgrade them to extractelements, a scalar operation and then an insertelement. This matches the sequence used by clangs intrinsic file.

Reviewers: zvi, delena, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26660

llvm-svn: 287083
2016-11-16 05:24:10 +00:00
Vyacheslav Klochkov
90d5b65bd8 Fixed the lost FastMathFlags for CALL operations in SLPVectorizer.
Reviewer: Michael Zolotukhin.
Differential Revision: https://reviews.llvm.org/D26575

llvm-svn: 287064
2016-11-16 00:55:50 +00:00
Justin Lebar
84413d2a5a [BypassSlowDivision] Handle division by constant numerators better.
Summary:
We don't do BypassSlowDivision when the denominator is a constant, but
we do do it when the numerator is a constant.

This patch makes two related changes to BypassSlowDivision when the
numerator is a constant:

 * If the numerator is too large to fit into the bypass width, don't
   bypass slow division (because we'll never run the smaller-width
   code).

 * If we bypass slow division where the numerator is a constant, don't
   OR together the numerator and denominator when determining whether
   both operands fit within the bypass width.  We need to check only the
   denominator.

Reviewers: tra

Subscribers: llvm-commits, jholewinski

Differential Revision: https://reviews.llvm.org/D26699

llvm-svn: 287062
2016-11-16 00:44:47 +00:00
Justin Lebar
d79331f606 [BypassSlowDivision] Simplify partially-tautological if statement.
if (A || (B && A)) --> if (A).

llvm-svn: 287061
2016-11-16 00:44:43 +00:00
Filipe Cabecinhas
0a0aeaf19e [AddressSanitizer] Add support for (constant-)masked loads and stores.
This patch adds support for instrumenting masked loads and stores under
ASan, if they have a constant mask.

isInterestingMemoryAccess now supports returning a mask to be applied to
the loads, and instrumentMop will use it to generate additional checks.

Added tests for v4i32 v8i32, and v4p0i32 (~v4i64) for both loads and
stores (as well as a test to verify we don't add checks to non-constant
masks).

Differential Revision: https://reviews.llvm.org/D26230

llvm-svn: 287047
2016-11-15 22:37:30 +00:00
Kostya Serebryany
e0b6dd6efa [sanitizer-coverage] make sure asan does not instrument coverage guards (reported in https://github.com/google/oss-fuzz/issues/84)
llvm-svn: 287030
2016-11-15 21:12:50 +00:00
Wei Mi
906934d091 Revert r286999 which caused buildbot test failures. Some testcases need to be made target specific.
llvm-svn: 287014
2016-11-15 19:42:05 +00:00
Wei Mi
c58f7f97a3 [LSR] Allow formula containing Reg for SCEVAddRecExpr related with outerloop.
In RateRegister of existing LSR, if a formula contains a Reg which is a SCEVAddRecExpr,
and this SCEVAddRecExpr's loop is an outerloop, the formula will be marked as Loser
and dropped.

Suppose we have an IR that %for.body is outerloop and %for.body2 is innerloop. LSR only
handle inner loop now so only %for.body2 will be handled.

Using the logic above, formula like
reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) will be dropped
no matter what because reg({1,+, %size}<%for.body>) is a SCEVAddRecExpr type reg related
with outerloop. Only formula like
reg(%array) + 1*reg({{1,+, %size}<%for.body>,+,1}<nuw><nsw><%for.body2>) will be kept
because the SCEVAddRecExpr related with outerloop is folded into the initial value of the
SCEVAddRecExpr related with current loop.

But in some cases, we do need to share the basic induction variable
reg{0 ,+, 1}<%for.body2> among LSR Uses to reduce the final total number of induction
variables used by LSR, so we don't want to drop the formula like
reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) unconditionally.

From the existing comment, it tries to avoid considering multiple level loops at the same time.
However, existing LSR only handles innermost loop, so for any SCEVAddRecExpr with a loop other
than current loop, it is an invariant and will be simple to handle, and the formula doesn't have
to be dropped.

Differential Revision: https://reviews.llvm.org/D26429

llvm-svn: 286999
2016-11-15 18:35:53 +00:00
Wei Mi
7ba0bf14e8 [IndVars] Change the order to compute WidenAddRec in widenIVUse.
When both WidenIV::getWideRecurrence and WidenIV::getExtendedOperandRecurrence
return non-null but different WideAddRec, if getWideRecurrence is called
before getExtendedOperandRecurrence, we won't bother to call
getExtendedOperandRecurrence again. But As we know it is possible that after
SCEV folding, we cannot prove the legality using the SCEVAddRecExpr returned
by getWideRecurrence. Meanwhile if getExtendedOperandRecurrence returns non-null
WideAddRec, we know for sure that it is legal to do widening for current instruction.
So it is better to put getExtendedOperandRecurrence before getWideRecurrence, which
will increase the chance of successful widening.

Differential Revision: https://reviews.llvm.org/D26059

llvm-svn: 286987
2016-11-15 17:34:52 +00:00
Craig Topper
a6d5cfa05b [AVX-512] Add AVX-512 vector shift intrinsics to memory santitizer.
Just needed to add the intrinsics to the exist switch. The code is generic enough to support the wider vectors with no changes.

llvm-svn: 286980
2016-11-15 16:27:33 +00:00
Pablo Barrio
3057e0746e Revert "[JumpThreading] Unfold selects that depend on the same condition"
This reverts commit ac54d0066c478a09c7cd28d15d0f9ff8af984afc.

llvm-svn: 286976
2016-11-15 15:42:23 +00:00
Pablo Barrio
1c058118ef Revert "[JumpThreading] Prevent non-deterministic use lists"
This reverts commit f2c2f5354070469dac253373c66527ca971ddc66.

llvm-svn: 286975
2016-11-15 15:42:17 +00:00
Robert Lougher
75ef160b89 [LoopVectorizer] When estimating reg usage, unused insts may "end" another use
The register usage algorithm incorrectly treats instructions whose value is
not used within the loop (e.g. those that do not produce a value).

The algorithm first calculates the usages within the loop.  It iterates over
the instructions in order, and records at which instruction index each use
ends (in fact, they're actually recorded against the next index, as this is
when we want to delete them from the open intervals).

The algorithm then iterates over the instructions again, adding each
instruction in turn to a list of open intervals.  Instructions are then
removed from the list of open intervals when they occur in the list of uses
ended at the current index.

The problem is, instructions which are not used in the loop are skipped.
However, although they aren't used, the last use of a value may have been
recorded against that instruction index.  In this case, the use is not deleted
from the open intervals, which may then bump up the estimated register usage.

This patch fixes the issue by simply moving the "is used" check after the loop
which erases the uses at the current index.

Differential Revision: https://reviews.llvm.org/D26554

llvm-svn: 286969
2016-11-15 14:27:33 +00:00
Florian Hahn
ed556af3ee Test commit, remove trailing space.
This commit is used to test commit access.

llvm-svn: 286957
2016-11-15 13:28:42 +00:00
Kuba Brecka
06fe9815da [tsan] Add support for C++ exceptions into TSan (call __tsan_func_exit during unwinding), LLVM part
This adds support for TSan C++ exception handling, where we need to add extra calls to __tsan_func_exit when a function is exitted via exception mechanisms. Otherwise the shadow stack gets corrupted (leaked). This patch moves and enhances the existing implementation of EscapeEnumerator that finds all possible function exit points, and adds extra EH cleanup blocks where needed.

Differential Revision: https://reviews.llvm.org/D26177

llvm-svn: 286893
2016-11-14 21:41:13 +00:00
Teresa Johnson
1297c7a3c0 [ThinLTO] Only promote exported locals as marked in index
Summary:
We have always speculatively promoted all renamable local values
(except const non-address taken variables) for both the exporting
and importing module. We would then internalize them back based on
the ThinLink results if they weren't actually exported. This is
inefficient, and results in unnecessary renames. It also meant we
had to check the non-renamability of a value in the summary, which
was already checked during function importing analysis in the ThinLink.

Made renameModuleForThinLTO (which does the promotion/renaming) instead
use the index when exporting, to avoid unnecessary renames/promotions.
For importing modules, we can simply promoted all values as any local
we import by definition is exported and needs promotion.

This required changes to the method used by the FunctionImport pass
(only invoked from 'opt' for testing) and when invoked from llvm-link,
since neither does a ThinLink. We simply conservatively mark all locals
in the index as promoted, which preserves the current aggressive
promotion behavior.

I also needed to change an llvm-lto based test where we had previously
been aggressively promoting values that weren't importable (aliasees),
but now will not promote.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26467

llvm-svn: 286871
2016-11-14 19:21:41 +00:00
Teresa Johnson
68406a2c5d [ThinLTO] Make inline assembly handling more efficient in summary
Summary:
The change in r285513 to prevent exporting of locals used in
inline asm added all locals in the llvm.used set to the reference
set of functions containing inline asm. Since these locals were marked
NoRename, this automatically prevented importing of the function.

Unfortunately, this caused an explosion in the summary reference lists
in some cases. In my particular example, it happened for a large protocol
buffer generated C++ file, where many of the generated functions
contained an inline asm call. It was exacerbated when doing a ThinLTO
PGO instrumentation build, where the PGO instrumentation included
thousands of private __profd_* values that were added to llvm.used.

We really only need to include a single llvm.used local (NoRename) value
in the reference list of a function containing inline asm to block it
being imported. However, it seems cleaner to add a flag to the summary
that explicitly describes this situation, which is what this patch does.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26402

llvm-svn: 286840
2016-11-14 16:40:19 +00:00
Simon Pilgrim
30b038ab28 Remove redundant condition (PR28352) NFCI.
We were already testing is the op was not a leaf, so need to then test if it was a leaf (added it to the assert instead).

llvm-svn: 286817
2016-11-14 12:00:46 +00:00
Pablo Barrio
70dbdfe526 [JumpThreading] Prevent non-deterministic use lists
Summary:
Unfolding selects was previously done with the help of a vector
of pointers that was then sorted to be able to remove duplicates.
As this sorting depends on the memory addresses, it was
non-deterministic. A SetVector is used now so that duplicates are
removed without the need of sorting first.

Reviewers: mgrang, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26450

llvm-svn: 286807
2016-11-14 10:24:26 +00:00
Craig Topper
e327b5db7f [InstCombine][AVX-512] Teach InstCombineCalls to handle the new unmasked AVX-512 variable shift intrinsics.
llvm-svn: 286755
2016-11-13 07:26:19 +00:00
Peter Collingbourne
286be4f135 Bitcode: Change module reader functions to return an llvm::Expected.
Differential Revision: https://reviews.llvm.org/D26562

llvm-svn: 286752
2016-11-13 07:00:17 +00:00
Peter Collingbourne
3b997324c8 Analysis: Simplify the ScalarEvolution::getGEPExpr() interface. NFCI.
All existing callers were manually extracting information out of an existing
GEP instruction and passing it to getGEPExpr(). Simplify the interface by
changing it to take a GEPOperator instead.

llvm-svn: 286751
2016-11-13 06:59:50 +00:00
Craig Topper
a0f3a18b4d [InstCombine][AVX-512] Expand vector shift handling to work on the AVX-512 shift by immediate and shift by single value.
This does not include support for the AVX-512 variable shifts. That will be coming in a future patch.

llvm-svn: 286739
2016-11-13 01:51:55 +00:00
Sanjay Patel
5f0794a91c [InstCombine] use dyn_cast rather isa+cast; NFC
Follow-up to r286664 cleanup as suggested by Eli. Thanks!

llvm-svn: 286671
2016-11-11 23:20:01 +00:00
Sanjay Patel
2499c998f5 [InstCombine] clean up foldSelectOpOp(); NFC
llvm-svn: 286664
2016-11-11 23:01:20 +00:00
Anna Zaks
d013821982 [tsan][llvm] Implement the function attribute to disable TSan checking at run time
This implements a function annotation that disables TSan checking for the
function at run time. The benefit over attribute((no_sanitize("thread")))
is that the accesses within the callees will also be suppressed.

The motivation for this attribute is a guarantee given by the objective C
language that the calls to the reference count decrement and object
deallocation will be synchronized. To model this properly, we would need
to intercept all ref count decrement calls (which are very common in ObjC
due to use of ARC) and also every single message send. Instead, we propose
to just ignore all accesses made from within dealloc at run time. The main
downside is that this still does not introduce any synchronization, which
means we might still report false positives if the code that relies on this
synchronization is not executed from within dealloc. However, we have not seen
this in practice so far and think these cases will be very rare.

Differential Revision: https://reviews.llvm.org/D25858

llvm-svn: 286663
2016-11-11 23:01:02 +00:00
Adam Nemet
2d303ea2a2 [LV] Stop saying "use -Rpass-analysis=loop-vectorize"
This is PR28376.

Unfortunately given the current structure of optimization diagnostics we
lack the capability to tell whether the user has
passed -Rpass-analysis=loop-vectorize since this is local to the
front-end (BackendConsumer::OptimizationRemarkHandler).

So rather than printing this even if the user has already
passed -Rpass-analysis, this patch just punts and stops recommending
this option.  I don't think that getting this right is worth the
complexity.

Differential Revision: https://reviews.llvm.org/D26563

llvm-svn: 286662
2016-11-11 22:51:46 +00:00
Erik Eckstein
e69fd0701d FunctionComparator: don't rely on argument evaluation order.
This is a follow-up on the recent refactoring of the FunctionMerge pass.
It should fix a fail of the new FunctionComparator unittest whe compiling with MSVC.

llvm-svn: 286648
2016-11-11 22:21:39 +00:00
Evgeniy Stepanov
5ca3586069 [cfi] Fix weak functions handling.
When a function pointer is replaced with a jumptable pointer, special
case is needed to preserve the semantics of extern_weak functions.
Since a jumptable entry can not be extern_weak, we emulate that
behaviour by replacing all references to F (the extern_weak function)
with the following expression: F != nullptr ? JumpTablePtr : nullptr.

Extra special care is needed for global initializers, since most (or
probably all) backends can not lower an initializer that includes
this kind of constant expression. Initializers like that are replaced
with a global constructor (i.e. a runtime initializer).

llvm-svn: 286636
2016-11-11 21:39:26 +00:00
Erik Eckstein
8bab9d126f Make the FunctionComparator of the MergeFunctions pass a stand-alone utility.
This is pure refactoring. NFC.

This change moves the FunctionComparator (together with the GlobalNumberState
utility) in to a separate file so that it can be used by other passes.
For example, the SwiftMergeFunctions pass in the Swift compiler:
https://github.com/apple/swift/blob/master/lib/LLVMPasses/LLVMMergeFunctions.cpp

Details of the change:

*) The big part is just moving code out of MergeFunctions.cpp into FunctionComparator.h/cpp
*) Make FunctionComparator member functions protected (instead of private)
   so that a derived comparator class can use them.

Following refactoring helps to share code between the base FunctionComparator
class and a derived class:

*) Add a beginCompare() function
*) Move some basic function property comparisons into a separate function compareSignature()
*) Do the GEP comparison inside cmpOperations() which now has a new
   needToCmpOperands reference parameter

https://reviews.llvm.org/D25385

llvm-svn: 286632
2016-11-11 21:15:13 +00:00
Vyacheslav Klochkov
6d9c315fe7 Fixed the lost FastMathFlags for FCmp operations in SLPVectorizer.
Reviewer: Michael Zolotukhin.
Differential Revision: https://reviews.llvm.org/D26543

llvm-svn: 286626
2016-11-11 19:55:29 +00:00
Peter Collingbourne
a426ef0167 Bitcode: Change getModuleSummaryIndex() to return an llvm::Expected.
Differential Revision: https://reviews.llvm.org/D26539

llvm-svn: 286624
2016-11-11 19:50:39 +00:00
Reid Kleckner
1200af0389 [sancov] Don't instrument MSVC CRT stdio config helpers
They get called before initialization, which is a problem for winasan.

Test coming in compiler-rt.

llvm-svn: 286615
2016-11-11 19:18:45 +00:00
Evgeniy Stepanov
74dd9de11e [cfi] Implement cfi-icall using inline assembly.
The current implementation is emitting a global constant that happens
to evaluate to the same bytes + relocation as a jump instruction on
X86. This does not work for PIE executables and shared libraries
though, because we end up with a wrong relocation type. And it has no
chance of working on ARM/AArch64 which use different relocation types
for jump instructions (R_ARM_JUMP24) that is never generated for
data.

This change replaces the constant with module-level inline assembly
followed by a hidden declaration of the jump table. Works fine for
ARM/AArch64, but has some drawbacks.
* Extra symbols are added to the static symbol table, which inflate
the size of the unstripped binary a little. Stripped binaries are not
affected. This happens because jump table declarations must be
external (because their body is in the inline asm).
* Original functions that were anonymous are now named
<original name>.cfi, and it affects symbolization sometimes. This is
necessary because the only user of these functions is the (inline
asm) jump table, so they had to be added to @llvm.used, which does
not allow unnamed functions.

llvm-svn: 286611
2016-11-11 18:49:09 +00:00
Sanjay Patel
12b989ab94 [InstCombine] fix formatting of FoldOpIntoSelect(); NFCI
llvm-svn: 286604
2016-11-11 17:42:16 +00:00
Dehao Chen
334f136dbd Add comments about why we put LoopSink pass at the very late stage.
llvm-svn: 286480
2016-11-10 17:42:18 +00:00
Sanjay Patel
3a42ddfbd2 [InstCombine] avoid infinite loop from shuffle-extract-insert sequence (PR30923)
Removing the limitation in visitInsertElementInst() causes several regressions
because we're not prepared to fold sequences of shuffles or inserts and extracts
separated by shuffles. Fixing that appears to be a difficult mission because we
are purposely trying to avoid creating shuffles with arbitrary shuffle masks
because some targets may choke on those.

https://llvm.org/bugs/show_bug.cgi?id=30923

llvm-svn: 286423
2016-11-10 00:15:14 +00:00
Eli Friedman
4486441607 Preserve assumption cache in loop-rotate.
No testcase included because I can't figure out how to reduce it.
(It's easy to write a testcase where rotation clones an assume,
but that doesn't actually seem to trigger the crash in opt on
its own; maybe an issue with the laziness?)

Differential Revision: https://reviews.llvm.org/D26434

llvm-svn: 286410
2016-11-09 23:05:01 +00:00
Evgeny Stupachenko
61643c59a4 Minor unroll pass refacoring.
Summary:
Unrolled Loop Size calculations moved to a function.
Constant representing number of optimized instructions
 when "back edge" becomes "fall through" replaced with
 variable.
Some comments added.

Reviewers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D21719

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 286389
2016-11-09 19:56:39 +00:00
Peter Collingbourne
bcab72e19e Bitcode: Change the materializer interface to return llvm::Error.
Differential Revision: https://reviews.llvm.org/D26439

llvm-svn: 286382
2016-11-09 17:49:19 +00:00
Pavel Labath
f1b5ddf287 Remove TimeValue usage from Scalar/SROA.cpp. NFC.
llvm-svn: 286361
2016-11-09 12:07:12 +00:00
Alexandros Lamprineas
8a98bf69b0 [ARM] Loop Strength Reduction crashes when targeting ARM or Thumb.
Scalar Evolution asserts when not all the operands of an Add Recurrence
Expression are loop invariants. Loop Strength Reduction should only
create affine Add Recurrences, so that both the start and the step of
the expression are loop invariants.

Differential Revision: https://reviews.llvm.org/D26185

llvm-svn: 286347
2016-11-09 08:53:07 +00:00
Dehao Chen
8606f5bffd Enable Loop Sink pass for functions that has profile.
Summary: For functions with profile data, we are confident that loop sink will be optimal in sinking code.

Reviewers: davidxl, hfinkel

Subscribers: mehdi_amini, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26155

llvm-svn: 286325
2016-11-09 00:58:19 +00:00
Sanjay Patel
b8d6170e09 [InstCombine] fix profitability equation for max-of-nots transform
As the test change shows, we can increase the critical path by adding
a 'not' instruction, so make sure that we're actually removing an
instruction if we do this transform.

This transform could also cause us to miss folds of min/max pairs.

llvm-svn: 286315
2016-11-09 00:13:11 +00:00
Sanjay Patel
8021633552 [InstCombine] reduce indentation; NFC
llvm-svn: 286314
2016-11-08 23:49:15 +00:00
Kuba Brecka
0b44510f74 [asan] Speed up compilation of large C++ stringmaps (tons of allocas) with ASan
This addresses PR30746, <https://llvm.org/bugs/show_bug.cgi?id=30746>. The ASan pass iterates over entry-block instructions and checks each alloca whether it's in NonInstrumentedStaticAllocaVec, which is apparently slow. This patch gathers the instructions to move during visitAllocaInst.

Differential Revision: https://reviews.llvm.org/D26380

llvm-svn: 286296
2016-11-08 21:30:41 +00:00
Davide Italiano
a13e2c7107 [LoopDistribute] Preserve GlobalsAA also in the new Pass Manager.
Differential Revision:  https://reviews.llvm.org/D26408

llvm-svn: 286280
2016-11-08 19:52:32 +00:00
Davide Italiano
ca68c4db42 [LibcallsShrinkWrap] This pass doesn't preserve the CFG.
For example, it invalidates the domtree, causing assertions
in later passes which need dominator infos. Make it preserve
GlobalsAA, as suggested by Eli.

Differential Revision:  https://reviews.llvm.org/D26381

llvm-svn: 286271
2016-11-08 19:18:20 +00:00
Chad Rosier
4e7dddafdc Fix typo in comment. NFC.
llvm-svn: 286270
2016-11-08 19:10:25 +00:00
Chad Rosier
736ff7dc57 Remove unused include. NFC.
llvm-svn: 286250
2016-11-08 16:51:19 +00:00
Dehao Chen
9ef7dd39fb Use the last 7 bits to represent the discriminator to fit it in 1 byte ULEB128 (NFC).
From experiments, discriminator is rarely greater than 127. Here we enforce it to be no greater than 127 so that it will always fit in 1 byte.

llvm-svn: 286245
2016-11-08 16:32:32 +00:00
Pablo Barrio
81c02096c5 [JumpThreading] Unfold selects that depend on the same condition
Summary:
These are good candidates for jump threading. This enables later opts
(such as InstCombine) to combine instructions from the selects with
instructions out of the selects. SimplifyCFG will fold the select
again if unfolding wasn't worth it.

Patch by James Molloy and Pablo Barrio.

Reviewers: rengolin, haicheng, sebpop

Subscribers: jojo, jmolloy, llvm-commits

Differential Revision: https://reviews.llvm.org/D26391

llvm-svn: 286236
2016-11-08 14:53:30 +00:00
Sanjoy Das
406b239485 [TRE] Remove dead code
Address review by Eli Friedman on rL286147.

llvm-svn: 286165
2016-11-07 22:17:37 +00:00
Dehao Chen
823f52c5eb Reset debug loc to OldInduction in InnerLoopVectorizer::createInductionVariable. (NFC)
This is to prevent SetInsertionPoint from setting debug loc to Latch->getTerminator().

llvm-svn: 286159
2016-11-07 21:59:40 +00:00
Sanjoy Das
582475344b Avoid tail recursion elimination across calls with operand bundles
Summary:
In some specific scenarios with well understood operand bundle types
(like `"deopt"`) it may be possible to go ahead and convert recursion to
iteration, but TailRecursionElimination does not have that logic today
so avoid doing the right thing for now.

I need some input on whether `"funclet"` operand bundles should also
block tail recursion elimination.  If not, I'll allow TRE across calls
with `"funclet"` operand bundles and add a test case.

Reviewers: rnk, majnemer, nlewycky, ahatanak

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D26270

llvm-svn: 286147
2016-11-07 21:01:49 +00:00
Evgeniy Stepanov
3100834e94 Use -fsanitize-recover instead of -mllvm -msan-keep-going.
Summary: Use -fsanitize-recover instead of -mllvm -msan-keep-going.

Reviewers: eugenis

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26352

llvm-svn: 286145
2016-11-07 21:00:10 +00:00
Kuba Brecka
912a036573 [tsan] Cast floating-point types correctly when instrumenting atomic accesses, LLVM part
Although rare, atomic accesses to floating-point types seem to be valid, i.e. `%a = load atomic float ...`. The TSan instrumentation pass however tries to emit inttoptr, which is incorrect, we should use a bitcast here. Anyway, IRBuilder already has a convenient helper function for this.

Differential Revision: https://reviews.llvm.org/D26266

llvm-svn: 286135
2016-11-07 19:09:56 +00:00
Benjamin Kramer
5edd1d8287 [MemCpyOpt] Don't emit IR in an unspecified order
Argument evaluation order is one of the edge cases where Clang differs
from GCC, yielding different IR depending on which compiler LLVM was
built with. Make the order deterministic and tune the test to actually
verify the order instead of trying to hide it.

llvm-svn: 286126
2016-11-07 17:47:28 +00:00
Chad Rosier
656735ea6e Fix 80-column violations. NFC.
llvm-svn: 286117
2016-11-07 16:28:04 +00:00
Sanjay Patel
e3709a01b8 [InstCombine] allow splat vector folds in adjustMinMax() (retry r285732)
This was reverted at r285866 because there was a crash handling a scalar
select of vectors. I added a check for that pattern and a test case based
on the example provided in the post-commit thread for r285732.

llvm-svn: 286113
2016-11-07 15:52:45 +00:00
Justin Lebar
805d56a218 [LoopStrengthReduce] Don't use a DenseSet<int64_t> when we might add any valid int64_t to the set.
Summary:
SmallSetVector uses DenseSet, but that means we need to reserve some
values for the empty and tombstone keys.

It seems to me we should have a general way to let us store full-range
ints inside of DenseSets, and furthermore that we probably shouldn't
silently let you add ints into DenseSets without explicitly promising
that they're in range.  But that's a battle for another day; for now,
just fix this code, since we currently do something Very Bad when
compiling ffmpeg.

Fixes PR30914.

Reviewers: jeremyhu

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D26323

llvm-svn: 286038
2016-11-05 16:47:25 +00:00
Chandler Carruth
e1d6e7170a Only log the visit of a return instruction if we in fact found a return
instruction.

This avoids dereferencing null in the debug logging if the instruction
was not in fact a return instruction. This potential bug was found by
PVS-Studio.

This actually fixes the last of the "dereferenced a pointer before
checking it for null" reports in the recent PVS-Studio run. However,
there are quite a few reports of this nature that I did not do anything
to fix because they are pretty glaring false positives. They usually
took the form of quite clear correlated checks or a check made in
a separate function. I've even added asserts anywhere this correlation
wasn't pretty obvious and fundamental to the code.

llvm-svn: 285988
2016-11-04 06:59:50 +00:00
Xinliang David Li
894d3c5ef1 Fix typo
llvm-svn: 285978
2016-11-04 03:00:52 +00:00
Chandler Carruth
091ce91384 Fix a bug found by inspection by PVS-Studio.
This condition is trivially always true prior to the change. The comment
at the call site makes it clear that we expect *all* of these to be '=',
'S', or 'I' so fix the code.

We have a bug I will update to track the fact that Clang doesn't warn on
this: http://llvm.org/PR13101

llvm-svn: 285930
2016-11-03 16:39:25 +00:00
Teresa Johnson
9ed6fc7a3c [ThinLTO] Handle distributed backend case when doing renaming
Summary:
The recent change I made to consult the summary when deciding whether to
rename (to handle inline asm) in r285513 broke the distributed build
case. In a distributed backend we will only have a portion of the
combined index, specifically for imported modules we only have the
summaries for any imported definitions. When renaming on import we were
asserting because no summary entry was found for a local reference being
linked in (def wasn't imported).

We only need to consult the summary for a renaming decision for the
exporting module. For imports, we would have prevented importing any
references to NoRename values already.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26250

llvm-svn: 285871
2016-11-03 01:07:16 +00:00
Greg Bedwell
316ba031a2 Revert "[InstCombine] allow splat vector folds in adjustMinMax()"
This reverts commit r285732.

This change introduced a new assertion failure in the following
testcase at -O2:

typedef short __v8hi __attribute__((__vector_size__(16)));
__v8hi foo(__v8hi &V1, __v8hi &V2, unsigned mask) {
  __v8hi Result = V1;
  if (mask & 0x80)
    Result[0] = V2[0];
  return Result;
}

llvm-svn: 285866
2016-11-02 23:17:05 +00:00
Eli Friedman
4062ca28fa DCE math library calls with a constant operand.
On platforms which use -fmath-errno, math libcalls without any uses
require some extra checks to figure out if they are actually dead.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30464 .

Differential Revision: https://reviews.llvm.org/D25970

llvm-svn: 285857
2016-11-02 20:48:11 +00:00
Bjorn Pettersson
f9a3ecf57f [Reassociate] Skip analysis of dead code to avoid infinite loop.
Summary:
It was detected that the reassociate pass could enter an inifite
loop when analysing dead code. Simply skipping to analyse basic
blocks that are dead avoids such problems (and as a side effect
we avoid spending time on optimising dead code).

The solution is using the same Reverse Post Order ordering of the
basic blocks when doing the optimisations, as when building the
precalculated rank map. A nice side-effect of this solution is
that we now know that we only try to do optimisations for blocks
with ranked instructions.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30818

Reviewers: llvm-commits, davide, eli.friedman, mehdi_amini

Subscribers: dberlin

Differential Revision: https://reviews.llvm.org/D26154

llvm-svn: 285793
2016-11-02 08:55:19 +00:00
George Burgess IV
1f539103d7 [MemorySSA] Tighten up types to make our API prettier. NFC.
Patch by bryant.

Differential Revision: https://reviews.llvm.org/D26126

llvm-svn: 285750
2016-11-01 21:17:46 +00:00
Sanjay Patel
4ecf444c54 [InstCombine] allow splat vector folds in adjustMinMax()
llvm-svn: 285732
2016-11-01 20:08:02 +00:00
Sanjay Patel
55ceba42f0 [InstCombine] Fold nuw left-shifts in ugt/ule comparisons.
This transforms

%a = shl nuw %x, c1
%b = icmp {ugt|ule} %a, c0

into

%b = icmp {ugt|ule} %x, (c0 >> c1)

z3:

(declare-const x (_ BitVec 64))
(declare-const c0 (_ BitVec 64))
(declare-const c1 (_ BitVec 64))

(push)
(assert (= x (bvlshr (bvshl x c1) c1)))  ; nuw
(assert (not (= (bvugt (bvshl x c1) c0)
                (bvugt x
                       (bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)

(push)
(assert (= x (bvlshr (bvshl x c1) c1)))  ; nuw
(assert (not (= (bvule (bvshl x c1) c0)
                (bvule x
                       (bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)

Patch by bryant!

Differential Revision: https://reviews.llvm.org/D25913

llvm-svn: 285729
2016-11-01 19:19:29 +00:00
Sanjay Patel
8ca7f20046 [InstCombine] clean up adjustMinMax(); NFCI
1. Change param names for readability
2. Change pointer param to ref
3. Early exit to reduce indent
4. Change switch to if/else

llvm-svn: 285718
2016-11-01 18:15:03 +00:00
Sanjay Patel
d2eeb98f3a [InstCombine] add helper function for adjustMinMax(); NFCI
This is just a cut and paste; clean-up and enhancements to follow.

llvm-svn: 285715
2016-11-01 17:46:08 +00:00
Simon Pilgrim
ba759cb550 [InstCombine] Folding of shifts by the sum of positive values
This patch introduces the combine:

(C1 shift (A add C2)) -> ((C1 shift C2) shift A)
iff A and C2 are both positive

If both A and C2 are know to be positive then we can safely split into 2 shifts, permitting the folding of the Inner shift.

Fix for the spec benchmark case mentioned by @nadav on PR15141 (assuming we can prove that the inputs as positive).

Differential Revision: https://reviews.llvm.org/D26000

llvm-svn: 285696
2016-11-01 15:40:30 +00:00
Evgeniy Stepanov
d6cc873271 Fix a typo.
Found with PVS-Studio here: http://www.viva64.com/en/b/0446/

llvm-svn: 285652
2016-10-31 22:42:39 +00:00
Kuba Brecka
7004ed37b6 [asan] Move instrumented null-terminated strings to a special section, LLVM part
On Darwin, simple C null-terminated constant strings normally end up in the __TEXT,__cstring section of the resulting Mach-O binary. When instrumented with ASan, these strings are transformed in a way that they cannot be in __cstring (the linker unifies the content of this section and strips extra NUL bytes, which would break instrumentation), and are put into a generic __const section. This breaks some of the tools that we have: Some tools need to scan all C null-terminated strings in Mach-O binaries, and scanning all the contents of __const has a large performance penalty. This patch instead introduces a special section, __asan_cstring which will now hold the instrumented null-terminated strings.

Differential Revision: https://reviews.llvm.org/D25026

llvm-svn: 285619
2016-10-31 18:51:58 +00:00
Dorit Nuzman
78297c81ee Second attempt at r285517.
llvm-svn: 285568
2016-10-31 13:17:31 +00:00
Dorit Nuzman
bea49f1580 Revert r285517 due to build failures.
llvm-svn: 285518
2016-10-30 14:34:57 +00:00
Dorit Nuzman
2fc6728513 [LoopVectorize] Make interleaved-accesses analysis less conservative about
possible pointer-wrap-around concerns, in some cases.

Before this patch, collectConstStridedAccesses (part of interleaved-accesses
analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when
examining all candidate pointers. This is too conservative. Instead, this
patch makes collectConstStridedAccesses use an optimistic approach, calling
getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the
candidate interleave groups have been formed, revisits the pointer-wrapping
analysis but only where it matters: namely, in groups that have gaps, and where
the gaps are not at the very end of the group (in which case the loop is
peeled). This second time getPtrStride is called with [Assume=false,
ShouldCheckWrap=true], but this could further be improved to using Assume=true,
once we also add the logic to track that we are not going to meet the scev
runtime checks threshold.

Differential Revision: https://reviews.llvm.org/D25276

llvm-svn: 285517
2016-10-30 12:23:26 +00:00
Teresa Johnson
d39ccf1724 [ThinLTO] Use per-summary flag to prevent exporting locals used in inline asm
Summary:
Instead of using the workaround of suppressing the entire index for
modules that call inline asm that may reference locals, use the
NoRename flag on the summary for any locals in the llvm.used set, and
add a reference edge from any functions containing inline asm.

This avoids issues from having no summaries despite the module defining
global values, which was preventing more aggressive index-based
optimization. It will be followed by a subsequent patch to make a
similar fix for local references in module level asm (to fix PR30610).

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26121

llvm-svn: 285513
2016-10-30 05:40:44 +00:00
Teresa Johnson
3c66837fec [ThinLTO] Rename doPromoteLocalToGlobal to shouldPromoteLocalToGlobal (NFC)
Rename as suggested in code review for D26063.

llvm-svn: 285508
2016-10-29 21:52:23 +00:00
Teresa Johnson
26a0340c9b [ThinLTO] Use NoPromote flag in summary during promotion
Summary:
Replace the check of whether a GV has a section with the flag check
in the summary. This is in preparation for using the NoPromote flag
to convey other situations when we can't promote (e.g. locals used in
inline asm).

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26063

llvm-svn: 285507
2016-10-29 21:31:48 +00:00
Sanjay Patel
d37ca92dd0 [InstCombine] re-use bitcasted compare operands in selects (PR28001)
These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>.

The bitcasts obfuscate the expected min/max forms as shown in PR28001:
https://llvm.org/bugs/show_bug.cgi?id=28001#c6

Differential Revision: https://reviews.llvm.org/D25943

llvm-svn: 285495
2016-10-29 15:22:04 +00:00
Justin Lebar
188600d04d Don't leave unused divs/rems sitting around in BypassSlowDivision.
Summary:
This "pass" eagerly creates div and rem instructions even when only one
is needed -- it relies on a later pass (machine DCE?) to clean them up.

This is problematic not just from a cleanliness perspective (this pass
is running during CodeGenPrepare, so should leave the IR in a better
state), but it also creates a problem for instruction selection.  If we
always have a div+rem, isel will always select a divrem instruction (if
possible), even when a single div or rem would do.

Specifically, in NVPTX, we want to compute rem from the output of div,
if available.  But if a div is not available, we want to leave the rem
alone.  This transformation is overeager if div is always available.

Because this code runs as part of CodeGenPrepare, it's nontrivial to
write a test for this change.  But this will effectively be tested by
a later patch which adds the aforementioned change to NVPTX isel.

Reviewers: tra

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26088

llvm-svn: 285460
2016-10-28 21:43:54 +00:00
Justin Lebar
b61890b5fc Don't claim the udiv created in BypassSlowDivision is exact.
Summary:
In BypassSlowDivision's short-dividend path, we would create e.g.

  udiv exact i32 %a, %b

"exact" here means that we are asserting that %a is a multiple of %b.
But we have no reason to believe this must be true -- this is just a
bug, as far as I can tell.

Reviewers: tra

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D26097

llvm-svn: 285459
2016-10-28 21:43:51 +00:00
Matt Arsenault
c44a8a92e5 SpeculativeExecution: Allow speculating more inst types
Partial step towards removing the whitelist and only
using TTI's cost.

llvm-svn: 285438
2016-10-28 20:00:33 +00:00
George Burgess IV
406ed15c52 [MemorySSA] Add const to getClobberingMemoryAccess.
Thanks to bryant for the patch!

Differential Revision: https://reviews.llvm.org/D26086

llvm-svn: 285432
2016-10-28 19:22:46 +00:00
Igor Laevsky
1777237add [LCSSA] Perform LCSSA verification only for the current loop nest.
Now LPPassManager will run LCSSA verification only for the top-level loop
which was processed on the current iteration.

Differential Revision: https://reviews.llvm.org/D25873

llvm-svn: 285394
2016-10-28 12:57:20 +00:00
Davide Italiano
aada583c07 [Reassociate] Removing instructions mutates the IR.
Fixes PR 30784. Discussed with Justin, who pointed out that
in the new PassManager infrastructure we can have more fine-grained
control on which analyses we want to preserve, but this is the
best we can do with the current infrastructure.

llvm-svn: 285380
2016-10-28 02:47:09 +00:00
Teresa Johnson
44b461069e [ThinLTO] Rename HasSection to NoRename (NFC)
Summary:
This is in preparation for a change to utilize this flag for symbols
referenced/defined in either inline or module level assembly.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26048

llvm-svn: 285376
2016-10-28 02:24:59 +00:00
Sanjay Patel
fc59e659b5 [InstCombine] fix foldSPFofSPF() to handle vector splats
llvm-svn: 285345
2016-10-27 21:19:40 +00:00
Haicheng Wu
78fd509204 [LoopUnroll] Check partial unrolling is enabled before initialization. NFC.
Differential Revision: https://reviews.llvm.org/D23891

llvm-svn: 285330
2016-10-27 18:40:02 +00:00
Sanjay Patel
6b6d270c74 [InstCombine] handle simple vector integer constants in IsFreeToInvert
llvm-svn: 285318
2016-10-27 17:30:50 +00:00
Dehao Chen
ecb41605f5 Add Loop Sink pass to reverse the LICM based of basic block frequency.
Summary: LICM may hoist instructions to preheader speculatively. Before code generation, we need to sink down the hoisted instructions inside to loop if it's beneficial. This pass is a reverse of LICM: looking at instructions in preheader and sinks the instruction to basic blocks inside the loop body if basic block frequency is smaller than the preheader frequency.

Reviewers: hfinkel, davidxl, chandlerc

Subscribers: anna, modocache, mgorny, beanz, reames, dberlin, chandlerc, mcrosier, junbuml, sanjoy, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D22778

llvm-svn: 285308
2016-10-27 16:30:08 +00:00
Alexey Bataev
b6c0260e86 [SLP] Fix for PR30626: Compiler crash inside SLP Vectorizer.
After successfull horizontal reduction vectorization attempt for PHI node
vectorizer tries to update root binary op by combining vectorized tree
and the ReductionPHI node. But during vectorization this ReductionPHI
can be vectorized itself and replaced by the `undef` value, while the
instruction itself is marked for deletion. This 'marked for deletion'
PHI node then can be used in new binary operation, causing "Use still
stuck around after Def is destroyed" crash upon PHI node deletion.

Also the test is fixed to make it perform actual testing.

Differential Revision: https://reviews.llvm.org/D25671

llvm-svn: 285286
2016-10-27 12:02:28 +00:00
Dehao Chen
291651da88 Introduce updateDiscriminator interface to DILocation to make it cleaner assigning discriminators.
Summary: This patch introduces updateDiscriminator to DILocation so that it can be directly called by AddDiscriminator. It also makes it easier to update the discriminator later.

Reviewers: dnovillo, dblaikie, aprantl, echristo

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D25959

llvm-svn: 285207
2016-10-26 15:48:45 +00:00
Sanjay Patel
bc2320e150 [InstCombine] clean up commonCastTransforms; NFC
1. Use 'auto' with dyn_cast.
2. Variables start with a capital letter.
3. Use proper punctuation in comments.

llvm-svn: 285200
2016-10-26 14:52:35 +00:00
Andrea Di Biagio
ca2ff9d43b [IndVarSimplify][DebugLoc] When widening the exit loop condition, correctly reuse the debug location of the original comparison.
When the loop exit condition is canonicalized as a != compaison, reuse the
debug location of the original (non canonical) comparison.

Before this patch, the debug location of the new icmp was obtained from the
loop latch terminator. This patch fixes the issue by correctly setting the
IRBuilder's "current debug location" to the location of the original compare.

Differential Revision: https://reviews.llvm.org/D25953

llvm-svn: 285185
2016-10-26 10:28:32 +00:00
Peter Collingbourne
f4cf97ceb0 Cloning: Also clone global variable attached metadata.
llvm-svn: 285161
2016-10-26 02:57:33 +00:00
Evgeniy Stepanov
2be3986f95 Utility functions for appending to llvm.used/llvm.compiler.used.
llvm-svn: 285143
2016-10-25 23:53:31 +00:00
Rong Xu
95093c2f32 [PGO] Fix select instruction annotation
Summary:
Select instruction annotation in IR PGO uses the edge count to infer the
branch count. It's currently placed in setInstrumentedCounts() where
no all the BB counts have been computed. This leads to wrong branch weights.
Move the annotation after all BB counts are populated.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25961

llvm-svn: 285128
2016-10-25 21:47:24 +00:00
Guozhi Wei
14476b73f7 [InstCombine] Resubmit the combine of A->B->A BitCast and fix for pr27996
The original patch of the A->B->A BitCast optimization was reverted by r274094 because it may cause infinite loop inside compiler https://llvm.org/bugs/show_bug.cgi?id=27996.

The problem is with following code

xB = load (type B); 
xA = load (type A); 
+yA = (A)xB; B -> A
+zAn = PHI[yA, xA]; PHI 
+zBn = (B)zAn; // A -> B
store zAn;
store zBn;

optimizeBitCastFromPhi generates

+zBn = (B)zAn; // A -> B

and expects it will be combined with the following store instruction to another

store zAn 

Unfortunately before combineStoreToValueType is called on the store instruction, optimizeBitCastFromPhi is called on the new BitCast again, and this pattern repeats indefinitely.

optimizeBitCastFromPhi only generates BitCast for load/store instructions, only the BitCast before store can cause the reexecution of optimizeBitCastFromPhi, and BitCast before store can easily be handled by InstCombineLoadStoreAlloca.cpp. So the solution to the problem is if all users of a CI are store instructions, we should not do optimizeBitCastFromPhi on it. Then optimizeBitCastFromPhi will not be called on the new BitCast instructions.

Differential Revision: https://reviews.llvm.org/D23896

llvm-svn: 285116
2016-10-25 20:43:42 +00:00
Sanjay Patel
c4399434a9 [InstCombine] Ensure that truncated int types are legal.
Fixes the FIXMEs in D25952 and rL285075.

Patch by bryant!

Differential Revision: https://reviews.llvm.org/D25955

llvm-svn: 285108
2016-10-25 20:11:47 +00:00
Matthew Simpson
47bcdf8c60 [LV] Sink scalar operands of predicated instructions
When we predicate an instruction (div, rem, store) we place the instruction in
its own basic block within the vectorized loop. If a predicated instruction has
scalar operands, it's possible to recursively sink these scalar expressions
into the predicated block so that they might avoid execution. This patch sinks
as much scalar computation as possible into predicated blocks. We previously
were able to sink such operands only if they were extractelement instructions.

Differential Revision: https://reviews.llvm.org/D25632

llvm-svn: 285097
2016-10-25 18:59:45 +00:00
Michael Ilseman
eba5480140 Add -strip-nonlinetable-debuginfo capability
This adds a new function to DebugInfo.cpp that takes an llvm::Module
as input and removes all debug info metadata that is not directly
needed for line tables, thus effectively stripping all type and
variable information from the module.

The primary motivation for this feature was the bitcode work flow
(cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html
for more background). This is not wired up yet, but will be in
subsequent patches.  For testing, the new functionality is exposed to
opt with a -strip-nonlinetable-debuginfo option.

The secondary use-case (and one that works right now!) is as a
reduction pass in bugpoint. I added two new bugpoint options
(-disable-strip-debuginfo and -disable-strip-debug-types) to control
the new features. By default it will first attempt to remove all debug
information, then only the type info, and then proceed to hack at any
remaining MDNodes.

Thanks to Adrian Prantl for stewarding this patch!

llvm-svn: 285094
2016-10-25 18:44:13 +00:00
Michael Kuperstein
cf60720a14 Fix 80-char violations. NFC.
llvm-svn: 285092
2016-10-25 18:31:23 +00:00
Dehao Chen
5f785dc222 Move discriminator assignment to where it is used. (NFC)
llvm-svn: 285084
2016-10-25 16:50:27 +00:00