1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00
Commit Graph

18852 Commits

Author SHA1 Message Date
Vlad Tsyrklevich
0b4abdf3cc Add section headers to SpecialCaseLists
Summary:
Sanitizer blacklist entries currently apply to all sanitizers--there
is no way to specify that an entry should only apply to a specific
sanitizer. This is important for Control Flow Integrity since there are
several different CFI modes that can be enabled at once. For maximum
security, CFI blacklist entries should be scoped to only the specific
CFI mode(s) that entry applies to.

Adding section headers to SpecialCaseLists allows users to specify more
information about list entries, like sanitizer names or other metadata,
like so:

  [section1]
  fun:*fun1*
  [section2|section3]
  fun:*fun23*

The section headers are regular expressions. For backwards compatbility,
blacklist entries entered before a section header are put into the '[*]'
section so that blacklists without sections retain the same behavior.

SpecialCaseList has been modified to also accept a section name when
matching against the blacklist. It has also been modified so the
follow-up change to clang can define a derived class that allows
matching sections by SectionMask instead of by string.

Reviewers: pcc, kcc, eugenis, vsk

Reviewed By: eugenis, vsk

Subscribers: vitalybuka, llvm-commits

Differential Revision: https://reviews.llvm.org/D37924

llvm-svn: 314170
2017-09-25 22:11:11 +00:00
Craig Topper
b5e7463711 [InstCombine] Move an optimization from foldICmpAndConstConst to foldICmpUsingKnownBits
All this optimization cares about is knowing how many low bits of LHS is known to be zero and whether that means that the result is 0 or greater than the RHS constant. It doesn't matter where the zeros in the low bits came from. So we don't need to specifically look for an AND. Instead we can use known bits.

Differential Revision: https://reviews.llvm.org/D38195

llvm-svn: 314153
2017-09-25 21:15:00 +00:00
Sanjay Patel
115f2aa1d5 [InstCombine] remove extract-of-select vector transform (2nd try)
The 1st attempt at this:
https://reviews.llvm.org/rL314117
was reverted at:
https://reviews.llvm.org/rL314118

because of bot fails for clang tests that were checking optimized IR. That should be fixed with:
https://reviews.llvm.org/rL314144
...so try again. 

Original commit message:

The transform to convert an extract-of-a-select-of-vectors was added at:
https://reviews.llvm.org/rL194013

And a question about the validity of this transform was raised in the review:
https://reviews.llvm.org/D1539:
...but not answered AFAICT>

Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with
the original commit, but they are not regressing even after we remove the transform in this patch.

The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do
those transforms as canonicalizations.

The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301:
https://bugs.llvm.org/show_bug.cgi?id=33301
...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see.

Differential Revision: https://reviews.llvm.org/D38006

llvm-svn: 314147
2017-09-25 20:30:53 +00:00
Hongbin Zheng
916a6deb46 [SimplifyIndvar] Minor change to refine r314125, NFC
llvm-svn: 314130
2017-09-25 18:10:36 +00:00
Hongbin Zheng
8c8d69325e [SimplifyIndvar] Replace the srem used by IV if we can prove both of its operands are non-negative
Since now SCEV can handle 'urem', an 'urem' is a better canonical form than an 'srem' because it has well-defined behavior

This is a follow up of D34598

Differential Revision: https://reviews.llvm.org/D38072

llvm-svn: 314125
2017-09-25 17:39:40 +00:00
Sanjay Patel
e61651e45e revert r314117 because there are bogus clang tests that depend on the optimizer
llvm-svn: 314118
2017-09-25 17:00:04 +00:00
Sanjay Patel
4fd934685a [InstCombine] remove extract-of-select vector transform
The transform to convert an extract-of-a-select-of-vectors was added at:
rL194013

And a question about the validity of this transform was raised in the review:
https://reviews.llvm.org/D1539:
...but not answered AFAICT>

Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with
the original commit, but they are not regressing even after we remove the transform in this patch.

The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do
those transforms as canonicalizations.

The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301:
https://bugs.llvm.org/show_bug.cgi?id=33301
...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see.

Differential Revision: https://reviews.llvm.org/D38006

llvm-svn: 314117
2017-09-25 16:41:34 +00:00
Alexey Bataev
0b1859ac23 [SLP] Support for horizontal min/max reduction.
Summary:
SLP vectorizer supports horizontal reductions for Add/FAdd binary operations. Patch adds support for horizontal min/max reductions.
Function getReductionCost() is split to getArithmeticReductionCost() for binary operation reductions and getMinMaxReductionCost() for min/max reductions.
Patch fixes PR26956.

Reviewers: spatel, mkuper, hfinkel, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27846

llvm-svn: 314101
2017-09-25 13:34:59 +00:00
Craig Topper
b01d5413e4 [InstCombine] Teach foldICmpUsingKnownBits to simplify SLE/SGE/ULE/UGE to equality comparisons when the min/max ranges intersect in a single value.
This is the inverse of what we do for SGT/SLT/UGT/ULT.

llvm-svn: 314032
2017-09-22 21:47:22 +00:00
Craig Topper
a151e88501 [InstCombine] Add constant splat handling to one of the ICMP_SLT/SGT cases in foldICmpUsingKnownBits.
llvm-svn: 314025
2017-09-22 19:54:15 +00:00
Craig Topper
9d5fb1b5c3 [InstCombine] Move the call to isSignBitCheck into getDemandedBitsLHSMask instead of calling it outside and passing its result through a flag. NFCI
The result of the isSignBitCheck isn't used anywhere else and this allows us to share the m_APInt call in the likely case that it isn't a sign bit check.

llvm-svn: 314018
2017-09-22 18:57:23 +00:00
Craig Topper
7e37824b8f [InstCombine] Simplify check for RHS being a splat constant in foldICmpUsingKnownBits by just checking Op1Min==Op1Max rather than going through m_APInt.
llvm-svn: 314017
2017-09-22 18:57:22 +00:00
Craig Topper
ef284eb207 [InstCombine] Make cases for ICMP_UGT/ICMP_ULT use similar formatting since they use similar code. NFC
llvm-svn: 314016
2017-09-22 18:57:20 +00:00
Artur Pilipenko
f5935363ed Rework loop predication pass
We've found a serious issue with the current implementation of loop predication.
The current implementation relies on SCEV and this turned out to be problematic.
To fix the problem we had to rework the pass substantially. We have had the
reworked implementation in our downstream tree for a while. This is the initial
patch of the series of changes to upstream the new implementation.

For now the transformation is limited to the following case:
  * The loop has a single latch with either ult or slt icmp condition.
  * The step of the IV used in the latch condition is 1.
  * The IV of the latch condition is the same as the post increment IV of the guard condition.
  * The guard condition is ult.

See the review or the LoopPredication.cpp header for the details about the
problem and the new implementation.

Reviewed By: sanjoy, mkazantsev

Differential Revision: https://reviews.llvm.org/D37569

llvm-svn: 313981
2017-09-22 13:13:57 +00:00
Sanjoy Das
e733e91370 Rename markAsErased to erase, as pointed out in a previous review; NFC
llvm-svn: 313951
2017-09-22 01:47:41 +00:00
Reid Kleckner
b408a0c51d Re-land r313825: "[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare"
The fix is to avoid invalidating our insertion point in
replaceDbgDeclare:
     Builder.insertDeclare(NewAddress, DIVar, DIExpr, Loc, InsertBefore);
+    if (DII == InsertBefore)
+      InsertBefore = &*std::next(InsertBefore->getIterator());
     DII->eraseFromParent();

I had to write a unit tests for this instead of a lit test because the
use list order matters in order to trigger the bug.

The reduced C test case for this was:
  void useit(int*);
  static inline void inlineme() {
    int x[2];
    useit(x);
  }
  void f() {
    inlineme();
    inlineme();
  }

llvm-svn: 313905
2017-09-21 19:52:03 +00:00
Daniel Jasper
9eea6fd83b Revert r313825: "[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare"
.. as well as the two subsequent changes r313826 and r313875.

This leads to segfaults in combination with ASAN. Will forward repro
instructions to the original author (rnk).

llvm-svn: 313876
2017-09-21 12:07:33 +00:00
Mikael Holmen
2e96ed2e60 [SROA] Really remove associated dbg.declare when removing dead alloca
Summary:
There already was code that tried to remove the dbg.declare, but that code
was placed after we had called
 I->replaceAllUsesWith(UndefValue::get(I->getType()));
on the alloca, so when we searched for the relevant dbg.declare, we
couldn't find it.

Now we do the search before we call RAUW so there is a chance to find it.

An existing testcase needed update due to this. Two dbg.declare with undef
were removed and then suddenly one of the two CHECKS failed.

Before this patch we got

  call void @llvm.dbg.declare(metadata i24* undef, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 32, 24)), !dbg !15
  call void @llvm.dbg.declare(metadata %struct.prog_src_register* undef, metadata !14, metadata !DIExpression()), !dbg !15
  call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 0, 32)), !dbg !15
  call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 32, 24)), !dbg !15

and with it we get

  call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 0, 32)), !dbg !15
  call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 32, 24)), !dbg !15

However, the CHECKs in the testcase checked things in a silly order, so
they only passed since they found things in the first dbg.declare. Now
we changed the order of the checks and the test passes.

Reviewers: rnk

Reviewed By: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37900

llvm-svn: 313875
2017-09-21 11:14:27 +00:00
Strahinja Petrovic
41f6e5a696 Fixed reverted commit rL312318
This patch contains fix for reverted commit
rL312318 which was causing failure due to use
of unchecked dyn_cast to CIInit.

Patch by: Nikola Prica.

llvm-svn: 313870
2017-09-21 10:04:02 +00:00
Serguei Katkov
9b3b402ab0 Revert "Re-enable "[IRCE] Identify loops with latch comparison against current IV value""
Revert the patch causing the functional failures.
The patch owner is notified with test cases which fail.
Test case has been provided to Maxim offline.

llvm-svn: 313857
2017-09-21 04:50:41 +00:00
Craig Topper
58fc4b3d14 [InstCombine] Teach getDemandedBitsLHSMask to handle constant splat vectors
This replaces a ConstantInt dyn_cast with m_APInt

Differential Revision: https://reviews.llvm.org/D38100

llvm-svn: 313840
2017-09-20 23:48:58 +00:00
Matt Morehouse
ec81f6718b [MSan] Disable sanitization for __sanitizer_dtor_callback.
Summary:
Eliminate unnecessary instrumentation at __sanitizer_dtor_callback
call sites.  Fixes https://github.com/google/sanitizers/issues/861.

Reviewers: eugenis, kcc

Reviewed By: eugenis

Subscribers: vitalybuka, llvm-commits, cfe-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D38063

llvm-svn: 313831
2017-09-20 22:53:08 +00:00
Sanjay Patel
7b84f5e824 [SimplifyCFG] don't create a no-op subtract
I noticed this inefficiency while investigating PR34603:
https://bugs.llvm.org/show_bug.cgi?id=34603

This fix will likely push another bug (we don't maintain state of 'LateSimplifyCFG') 
into hiding, but I'll try to clean that up with a follow-up patch anyway.

llvm-svn: 313829
2017-09-20 22:31:35 +00:00
Reid Kleckner
d558b9268e [IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare
Summary:
This implements the design discussed on llvm-dev for better tracking of
variables that live in memory through optimizations:
  http://lists.llvm.org/pipermail/llvm-dev/2017-September/117222.html

This is tracked as PR34136

llvm.dbg.addr is intended to be produced and used in almost precisely
the same way as llvm.dbg.declare is today, with the exception that it is
control-dependent. That means that dbg.addr should always have a
position in the instruction stream, and it will allow passes that
optimize memory operations on local variables to insert llvm.dbg.value
calls to reflect deleted stores. See SourceLevelDebugging.rst for more
details.

The main drawback to generating DBG_VALUE machine instrs is that they
usually cause LLVM to emit a location list for DW_AT_location. The next
step will be to teach DwarfDebug.cpp how to recognize more DBG_VALUE
ranges as not needing a location list, and possibly start setting
DW_AT_start_offset for variables whose lifetimes begin mid-scope.

Reviewers: aprantl, dblaikie, probinson

Subscribers: eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37768

llvm-svn: 313825
2017-09-20 21:52:33 +00:00
Craig Topper
8ad966131c [InstCombine] Handle (X & C2) < C1 --> (X & C2) == 0
We already did (X & C2) > C1 --> (X & C2) != 0, if any bit set in (X & C2) will produce a result greater than C1. But there is an equivalent inverse condition with <= C1 (which will be canonicalized to < C1+1)

Differential Revision: https://reviews.llvm.org/D38065

llvm-svn: 313819
2017-09-20 21:18:17 +00:00
Craig Topper
0f488ed700 [InstCombine] Use APInt::getActiveBits() to avoid creating an APInt from a trailing zero count to do a comparison. NFCI
llvm-svn: 313792
2017-09-20 18:49:29 +00:00
Hans Wennborg
44d1f6514a Revert r313771 "[SLP] Vectorize jumbled memory loads."
This broke the buildbots, e.g.
http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/391

> Summary:
> This patch tries to vectorize loads of consecutive memory accesses, accessed
> in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
> which was reverted back due to some basic issue with representing the 'use mask'
> jumbled accesses.
>
> This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
>
> Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df
>
> Subscribers: mzolotukhin
>
> Reviewed By: ayal
>
> Differential Revision: https://reviews.llvm.org/D36130
>
> Review comments updated accordingly
>
> Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0
>
> Added a TODO for sortLoadAccesses API
>
> Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58
>
> Modified the TODO for sortLoadAccesses API
>
> Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565
>
> Review comment update for using OpdNum to insert the mask in respective location
>
> Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce
>
> Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase
>
> Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b

llvm-svn: 313781
2017-09-20 18:00:03 +00:00
Quentin Colombet
fe9e785376 [InstCombine] Add select simplifications
In these cases, two selects have constant selectable operands for
both the true and false components and have the same conditional
expression.
We then create two arithmetic operations of the same type and feed a
final select operation using the result of the true arithmetic for the true
operand and the result of the false arithmetic for the false operand and reuse
the original conditionl expression.
The arithmetic operations are naturally folded as a consequence, leaving
only the newly formed select to replace the old arithmetic operation.

Patch by: Michael Berg <michael_c_berg@apple.com>
Differential Revision: https://reviews.llvm.org/D37019

llvm-svn: 313774
2017-09-20 17:32:16 +00:00
Mohammad Shahid
5a29343ec1 [SLP] Vectorize jumbled memory loads.
Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
which was reverted back due to some basic issue with representing the 'use mask'
jumbled accesses.

This patch fixes the mask representation by recording the 'use mask' in the usertree entry.

Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df

Subscribers: mzolotukhin

Reviewed By: ayal

Differential Revision: https://reviews.llvm.org/D36130

Review comments updated accordingly

Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0

Added a TODO for sortLoadAccesses API

Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58

Modified the TODO for sortLoadAccesses API

Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565

Review comment update for using OpdNum to insert the mask in respective location

Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce

Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase

Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b
llvm-svn: 313771
2017-09-20 17:19:57 +00:00
Teresa Johnson
4e8f623913 [ThinLTO] Fix dead stripping analysis for SamplePGO
Summary:
The fix for dead stripping analysis in the case of SamplePGO indirect
calls to local functions (r313151) introduced the possibility of an
infinite loop.

Make sure we check for the value being already live after we update it
for SamplePGO indirect call handling.

Reviewers: danielcdh

Subscribers: mehdi_amini, inglorion, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D38086

llvm-svn: 313766
2017-09-20 17:09:47 +00:00
Alexander Kornienko
694b27a0ce Revert r313736: "[SLP] Vectorize jumbled memory loads."
The revision breaks buildbots:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/6694/steps/test/logs/stdio

llvm-svn: 313758
2017-09-20 14:53:07 +00:00
Mohammad Shahid
33f617c7de [SLP] Vectorize jumbled memory loads.
Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
which was reverted back due to some basic issue with representing the 'use mask' of
jumbled accesses.

This patch fixes the mask representation by recording the 'use mask' in the usertree entry.

Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df

Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh

Reviewed By: Ayal

Subscribers: mzolotukhin

Differential Revision: https://reviews.llvm.org/D36130

Commit after rebase for patch D36130

Change-Id: I8add1c265455669ef288d880f870a9522c8c08ab
llvm-svn: 313736
2017-09-20 08:18:28 +00:00
Sanjoy Das
64884488b6 Tighten the invariants around LoopBase::invalidate
Summary:
With this change:
 - Methods in LoopBase trip an assert if the receiver has been invalidated
 - LoopBase::clear frees up the memory held the LoopBase instance

This change also shuffles things around as necessary to work with this stricter invariant.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38055

llvm-svn: 313708
2017-09-20 02:31:57 +00:00
Daniel Berlin
cb1e4ba6da GVNSink: Make ModelledPHIs constructor linear (and avoid edge case it worries about) by avoiding getIncomingValueForBlock
llvm-svn: 313702
2017-09-20 00:07:27 +00:00
Daniel Berlin
4f04848c69 Revert "[GVNSink] Remove dependency on SmallPtrSet iteration order."
This reverts commit r312156, because now the op and block arrays are not in the same order :(.

llvm-svn: 313701
2017-09-20 00:07:25 +00:00
Daniel Berlin
f1e316ecf3 NewGVN: Remove unused includes
llvm-svn: 313700
2017-09-20 00:07:12 +00:00
Sanjoy Das
1139050cb2 [LoopInfo] Make LoopBase and Loop destructors non-public
Summary:
See comment for why I think this is a good idea.

This change also:

 - Removes an SCEV test case.  The SCEV test was not testing anything useful (most of it was `#if 0` ed out) and it would need to be updated to deal with a private ~Loop::Loop.
 - Updates the loop pass manager test case to deal with a private ~Loop::Loop.
 - Renames markAsRemoved to markAsErased to contrast with removeLoop, via the usual remove vs. erase idiom we already have for instructions and basic blocks.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D37996

llvm-svn: 313695
2017-09-19 23:19:00 +00:00
Adam Nemet
8d2be3fa7c Allow ORE.emit to take a closure to delay building the remark object
In the lambda we are now returning the remark by value so we need to preserve
its type in the insertion operator.  This requires making the insertion
operator generic.

I've also converted a few cases to use the new API.  It seems to work pretty
well.  See the LoopUnroller for a slightly more interesting case.

llvm-svn: 313691
2017-09-19 23:00:55 +00:00
Dehao Chen
366d1f7fbc Import all inlined indirect call targets for SamplePGO.
Summary: In the ThinLTO compilation, if a function is inlined in the profiling binary, we need to inline it before annotation. If the callee is not available in the primary module, a first step is needed to import that callee function. For the current implementation, if the call is an indirect call, which has been promoted to >1 targets and inlined, SamplePGO will only import one target with the largest sample count. This patch fixed the bug to import all targets instead.

Reviewers: tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: sanjoy, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D36637

llvm-svn: 313678
2017-09-19 21:18:14 +00:00
Sanjay Patel
2ada5fa4ca [SimplifyCFG] fix typos/formatting; NFC
llvm-svn: 313671
2017-09-19 20:58:14 +00:00
Dehao Chen
f1f34755de Handle profile mismatch correctly for SamplePGO.
Summary: Fix the bug when promoted call return type mismatches with the promoted function, we should not try to inline it. Otherwise it may lead to compiler crash.

Reviewers: davidxl, tejohnson, eraman

Reviewed By: tejohnson

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D38018

llvm-svn: 313658
2017-09-19 18:26:54 +00:00
Reid Kleckner
9540ee28ed [gcov] Emit errors when opening the notes file fails
No time to write a test case, on to the next bug. =P

Discovered while investigating PR34659

llvm-svn: 313571
2017-09-18 21:31:48 +00:00
Sanjay Patel
b7a2b50f6a [SLP] clean up for vector store case; NFCI
llvm-svn: 313541
2017-09-18 16:20:15 +00:00
Craig Topper
9f5737a6bd [X86] Remove VPERM2F128/VPERM2I128 intrinsics and autoupgrade to native shuffles.
I've moved the test cases from the InstCombine optimizations to the backend to keep the coverage we had there. It covered every possible immediate so I've preserved the resulting shuffle mask for each of those immediates.

llvm-svn: 313450
2017-09-16 07:36:14 +00:00
Chandler Carruth
4a434efc2a [SLP] Revert r312791 and other necessary commits, except for TTI and
CostModel.

The original patch added support for horizontal min/max reductions to
the SLP vectorizer.

This patch causes LLVM to miscompile fairly simple signed min
reductions. I have attached a test progrom to http://llvm.org/PR34635
that shows the behavior change after this patch. We found this in a test
for the open source Eigen library, but also in other code.

Unfortunately, the revert is moderately challenging. It required
reverting:
r313042: [SLP] Test with multiple uses of conditional op and wrong parent.
r312853: [SLP] Fix buildbots, NFC.
r312793: [SLP] Fix the warning about paths not returning the value, NFC.
r312791: [SLP] Support for horizontal min/max reduction.

And even then, I had to completely skip reverting the changes to TTI and
CostModel because r312832 rewrote so much of this code. Plus, the cost
modeling changes aren implicated in the miscompile, so they should be
fine and will just not be used until this gets re-introduced.

llvm-svn: 313409
2017-09-15 22:23:27 +00:00
Vivek Pandya
dda17788af This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352
It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with
command line options.
The diagnostic handler used to be callback now this patch adds a class
DiagnosticHandler. It has virtual method to provide custom diagnostic handler
and methods to control which particular remarks are enabled. 
However LLVM-C API users can still provide callback function for diagnostic handler.

llvm-svn: 313390
2017-09-15 20:10:09 +00:00
Vivek Pandya
d4e6d8ad49 This reverts r313381
llvm-svn: 313387
2017-09-15 19:53:54 +00:00
Vivek Pandya
8ef565859a This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352
It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with
command line options.
The diagnostic handler used to be callback now this patch adds a class
DiagnosticHandler. It has virtual method to provide custom diagnostic handler
and methods to control which particular remarks are enabled. 
However LLVM-C API users can still provide callback function for diagnostic handler.

llvm-svn: 313382
2017-09-15 19:30:59 +00:00
Anna Thomas
9a38f41e5e [RuntimeUnroll] Add heuristic for unrolling multi-exit loop
Add a profitability heuristic to enable runtime unrolling of multi-exit
loop: There can be atmost two unique exit blocks for the loop and the
second exit block should be a deoptimizing block. Also, there can be one
other exiting block other than the latch exiting block. The reason for
the latter is so that we limit the number of branches in the unrolled
code to being at most the unroll factor.  Deoptimizing blocks are rarely
taken so these additional number of branches created due to the
unrolling are predictable, since one of their target is the deopt block.

Reviewers: apilipenko, reames, evstupac, mkuper

Subscribers: llvm-commits

Reviewed by: reames

Differential Revision: https://reviews.llvm.org/D35380

llvm-svn: 313363
2017-09-15 15:56:05 +00:00
Anna Thomas
b78a541c07 [RuntimeUnrolling] Populate the VMap entry correctly when default generated through lookup
During runtime unrolling on loops with multiple exits, we update the
exit blocks with the correct phi values from both original and remainder
loop.
In this process, we lookup the VMap for the mapped incoming phi values,
but did not update the VMap if a default entry was generated in the VMap
during the lookup. This default value is generated when constants or
values outside the current loop are looked up.
This patch fixes the assertion failure when null entries are present in
the VMap because of this lookup. Added a testcase that showcases the
problem.

llvm-svn: 313358
2017-09-15 13:29:33 +00:00
Ilya Biryukov
51785ee0c4 Revert "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops."
This reverts commit r313348.

Reason: it caused buildbot failures.
llvm-svn: 313352
2017-09-15 10:15:00 +00:00
Dinar Temirbulatov
abd1b29772 [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
Patch tries to improve vectorization of the following code:

void add1(int * __restrict dst, const int * __restrict src) {
  *dst++ = *src++;
  *dst++ = *src++ + 1;
  *dst++ = *src++ + 2;
  *dst++ = *src++ + 3;
}
Allows to vectorize even if the very first operation is not a binary add, but just a load.

Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide

Subscribers: llvm-commits, RKSimon

Differential Revision: https://reviews.llvm.org/D28907

llvm-svn: 313348
2017-09-15 06:56:39 +00:00
Dinar Temirbulatov
f74bdecbc6 [SLPVectorizer] Remove duplicated functionality code in initScheduleData function, NFCI.
llvm-svn: 313341
2017-09-15 04:31:54 +00:00
Alina Sbirlea
bf65988660 Refactor collectChildrenInLoop to LoopUtils [NFC]
Summary: Move to LoopUtils method that collects all children of a node inside a loop.

Reviewers: majnemer, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37870

llvm-svn: 313322
2017-09-15 00:04:16 +00:00
Dehao Chen
c6dcd59361 Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader.
Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues.

Reviewers: eraman, davidxl

Reviewed By: eraman

Subscribers: vitalybuka, sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D37779

llvm-svn: 313277
2017-09-14 17:29:56 +00:00
Alon Kom
f5b825ae49 [LV] Fix maximum legal VF calculation
This patch fixes pr34283, which exposed that the computation of
maximum legal width for vectorization was wrong, because it relied
on MaxInterleaveFactor to obtain the maximum stride used in the loop,
however not all strided accesses in the loop have an interleave-group
associated with them.
Instead of recording the maximum stride in the loop, which can be over
conservative (e.g. if the access with the maximum stride is not involved
in the dependence limitation), this patch tracks the actual maximum legal
width imposed by accesses that are involved in dependencies.

Differential Revision: https://reviews.llvm.org/D37507

llvm-svn: 313237
2017-09-14 07:40:02 +00:00
Vitaly Buka
01154d35f0 Revert "Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader."
Patch introduced uninitialized value.

This reverts commit r313195.

llvm-svn: 313230
2017-09-14 05:40:33 +00:00
Peter Collingbourne
e7f0383a82 Reland r313157, "ThinLTO: Correctly follow aliasee references when dead stripping." which was reverted in r313222.
This reland includes a fix for the LowerTypeTests pass so that it
looks past aliases when determining which type identifiers are live.

Differential Revision: https://reviews.llvm.org/D37842

llvm-svn: 313229
2017-09-14 05:02:59 +00:00
Dinar Temirbulatov
4f2a1f0cb0 [SLPVectorizer] Prefer auto over explicit type for VL0, NFCI.
llvm-svn: 313228
2017-09-14 04:28:35 +00:00
Hans Wennborg
66fcbcafce Revert r313157 "ThinLTO: Correctly follow aliasee references when dead stripping."
This broke Chromium's CFI build; see crbug.com/765004.

> We were previously handling aliases during dead stripping by adding
> the aliased global's "original name" GUID to the worklist. This will
> lead to incorrect behaviour if the global has local linkage because
> the original name GUID will not correspond to the global's GUID in
> the summary.
>
> Because an alias is just another name for the global that it
> references, there is no need to mark the referenced global as used,
> or to follow references from any other copies of the global. So all
> we need to do is to follow references from the aliasee's summary
> instead of the alias.
>
> Differential Revision: https://reviews.llvm.org/D37789

llvm-svn: 313222
2017-09-14 00:40:14 +00:00
Eugene Zelenko
6205a7f0f8 [Transforms] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 313198
2017-09-13 21:43:53 +00:00
Dehao Chen
def9c20da9 Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader.
Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues.

Reviewers: eraman, davidxl

Reviewed By: eraman

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D37779

llvm-svn: 313195
2017-09-13 21:22:55 +00:00
Anna Thomas
a8474917f4 [LV] Avoid computing the register usage for default VF. NFC
These are changes to reduce redundant computations when calculating a
feasible vectorization factor:
1. early return when target has no vector registers
2. don't compute register usage for the default VF.

Suggested during review for D37702.

llvm-svn: 313176
2017-09-13 19:35:45 +00:00
Hiroshi Yamauchi
3fdeae8ddd Add options to dump PGO counts in text.
Summary:
Added text options to -pgo-view-counts and -pgo-view-raw-counts that dump block frequency and branch probability info in text.

This is useful when the graph is very large and complex (the dot command crashes, lines/edges too close to tell apart, hard to navigate without textual search) or simply when text is preferred.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37776

llvm-svn: 313159
2017-09-13 17:20:38 +00:00
Peter Collingbourne
6bf7e6c622 ThinLTO: Correctly follow aliasee references when dead stripping.
We were previously handling aliases during dead stripping by adding
the aliased global's "original name" GUID to the worklist. This will
lead to incorrect behaviour if the global has local linkage because
the original name GUID will not correspond to the global's GUID in
the summary.

Because an alias is just another name for the global that it
references, there is no need to mark the referenced global as used,
or to follow references from any other copies of the global. So all
we need to do is to follow references from the aliasee's summary
instead of the alias.

Differential Revision: https://reviews.llvm.org/D37789

llvm-svn: 313157
2017-09-13 17:09:20 +00:00
Teresa Johnson
63341428f9 [ThinLTO] For SamplePGO, need to handle ICP targets consistently in thin link
Summary:
SamplePGO indirect call profiles record the target as the original GUID
for statics. The importer had special handling to map to the normal GUID
in that case. The dead global analysis needs the same treatment or
inconsistencies arise, resulting in linker unsats due to some dead
symbols being exported and kept, leaving in references to other dead
symbols that are removed.

This can happen when a SamplePGO profile collected by one binary is used
for a different binary, so the indirect call profiles may not accurately
reflect live targets.

Reviewers: danielcdh

Subscribers: mehdi_amini, inglorion, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D37783

llvm-svn: 313151
2017-09-13 15:16:38 +00:00
Ayal Zaks
303a302e55 [LV] Fix PR34523 - avoid generating redundant selects
When converting a PHI into a series of 'select' instructions to combine the
incoming values together according their edge masks, initialize the first
value to the incoming value In0 of the first predecessor, instead of
generating a redundant assignment 'select(Cond[0], In0, In0)'. The latter
fails when the Cond[0] mask is null, representing a full mask, which can
happen only when there's a single incoming value.

No functional changes intended nor expected other than surviving null Cond[0]'s.

This fix follows D35725, which introduced using null to represent full masks.

Differential Revision: https://reviews.llvm.org/D37619

llvm-svn: 313119
2017-09-13 06:28:37 +00:00
Aditya Kumar
3828d43959 [GVNHoist] Factor out reachability to search for anticipable instructions quickly
Factor out the reachability such that multiple queries to find reachability of values are fast. This is based on finding
the ANTIC points
in the CFG which do not change during hoisting. The ANTIC points are basically the dominance-frontiers in the inverse
graph. So we introduce a data structure (CHI nodes)
to keep track of values flowing out of a basic block. We only do this for values with multiple occurrences in the
function as they are the potential hoistable candidates.

This patch allows us to hoist instructions to a basic block with >2 successors, as well as deal with infinite loops in a
trivial way.
Relevant test cases are added to show the functionality as well as regression fixes from PR32821.

Regression from previous GVNHoist:
We do not hoist fully redundant expressions because fully redundant expressions are already handled by NewGVN

Differential Revision: https://reviews.llvm.org/D35918
Reviewers: dberlin, sebpop, gberry,

llvm-svn: 313116
2017-09-13 05:28:03 +00:00
Reid Kleckner
2279e85466 [InstCombine] Add a flag to disable LowerDbgDeclare
Summary:
This should improve optimized debug info for address-taken variables at
the cost of inaccurate debug info in some situations.

We patched this into clang and deployed this change to Chromium
developers, and this significantly improved debuggability of optimized
code. The long-term solution to PR34136 seems more and more like it's
going to take a while, so I would like to commit this change under a
flag so that it can be used as a stop-gap measure.

This flag should really help so for C++ aggregates like std::string and
std::vector, which are typically address-taken, even after inlining, and
cannot be SROA-ed.

Reviewers: aprantl, dblaikie, probinson, dberlin

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D36596

llvm-svn: 313108
2017-09-13 01:43:25 +00:00
Dehao Chen
59e2012cd4 Refactor the code to pass down ACT to SampleProfileLoader correctly.
Summary: This change passes down ACT to SampleProfileLoader for the new PM. Also remove the default value for SampleProfileLoader class as it is not used.

Reviewers: eraman, davidxl

Reviewed By: eraman

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D37773

llvm-svn: 313080
2017-09-12 21:55:55 +00:00
Alina Sbirlea
56413b9615 Make promoteLoopAccessesToScalars independent of AliasSet [NFC]
Summary:
The current promoteLoopAccessesToScalars method receives an AliasSet, but
the information used is in fact a list of Value*, known to must alias.
Create the list ahead of time to make this method independent of the AliasSet class.

While there is no functionality change, this adds overhead for creating
a set of Value*, when promotion would normally exit earlier.
This is meant to be as a first refactoring step in order to start replacing
AliasSetTracker with MemorySSA.
And while the end goal is to redesign LICM, the first few steps will focus on
adding MemorySSA as an alternative to the AliasSetTracker using most of the
existing functionality.

Reviewers: mkuper, danielcdh, dberlin

Subscribers: sanjoy, chandlerc, gberry, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D35439

llvm-svn: 313075
2017-09-12 21:18:44 +00:00
Anna Thomas
c739ce8170 [LV] Clamp the VF to the trip count
Summary:
When the MaxVectorSize > ConstantTripCount, we should just clamp the
vectorization factor to be the ConstantTripCount.
This vectorizes loops where the TinyTripCountThreshold >= TripCount < MaxVF.

Earlier we were finding the maximum vector width, which could be greater than
the trip count itself. The Loop vectorizer does all the work for generating a
vectorizable loop, but in the end we would always choose the scalar loop (since
the VF > trip count). This allows us to choose the VF keeping in mind the trip
count if available.

This is a fix on top of rL312472.

Reviewers: Ayal, zvi, hfinkel, dneilson

Reviewed by: Ayal

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37702

llvm-svn: 313046
2017-09-12 16:32:45 +00:00
Alexey Bataev
521acd5aad [SLP] Fix for PHINode during horizontal reduction scanning, NFC.
Reduces number of loops during instructions analysis.

llvm-svn: 313035
2017-09-12 15:13:50 +00:00
Peter Collingbourne
16a8857c47 LowerTypeTests: Add import/export support for targets without absolute symbol constants.
The rationale is the same as for r312967.

Differential Revision: https://reviews.llvm.org/D37408

llvm-svn: 312968
2017-09-11 22:49:10 +00:00
Peter Collingbourne
49fc5fcee6 WholeProgramDevirt: Add import/export support for targets without absolute symbol constants.
Not all targets support the use of absolute symbols to export
constants. In particular, ARM has a wide variety of constant encodings
that cannot currently be relocated by linkers. So instead of exporting
the constants using symbols, export them directly in the summary.
The values of the constants are left as zeroes on targets that support
symbolic exports.

This may result in more cache misses when targeting those architectures
as a result of arbitrary changes in constant values, but this seems
somewhat unavoidable for now.

Differential Revision: https://reviews.llvm.org/D37407

llvm-svn: 312967
2017-09-11 22:34:42 +00:00
Uriel Korach
32b30a50d0 Test commit
llvm-svn: 312878
2017-09-10 08:31:22 +00:00
Nuno Lopes
4e212e275d Merge isKnownNonNull into isKnownNonZero
It now knows the tricks of both functions.
Also, fix a bug that considered allocas of non-zero address space to be always non null

Differential Revision: https://reviews.llvm.org/D37628

llvm-svn: 312869
2017-09-09 18:23:11 +00:00
Sanjay Patel
554ba948c2 [DivRempairs] add a pass to optimize div/rem pairs (PR31028)
This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented 
as an independent pass, so there's no stretching of scope and feature creep for an existing pass. 
I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost 
this same functionality as an addition to CGP in the motivating example of PR31028:
https://bugs.llvm.org/show_bug.cgi?id=31028

The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow 
more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and
undo the hoisting that is done here.

Decomposing remainder may allow removing some code from the backend (PPC and possibly others).

Differential Revision: https://reviews.llvm.org/D37121 

llvm-svn: 312862
2017-09-09 13:38:18 +00:00
Kostya Serebryany
f6a82a7790 [sanitizer-coverage] call appendToUsed once per module, not once per function (which is too slow)
llvm-svn: 312855
2017-09-09 05:30:13 +00:00
Alexey Bataev
a8c33e3a37 [SLP] Fix buildbots, NFC.
llvm-svn: 312853
2017-09-09 02:08:45 +00:00
Dinar Temirbulatov
d6c1f050aa [SLPVectorizer] Add struct InstructionsState that holds information about analysis of vector to be vectorized.
Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D37212

llvm-svn: 312802
2017-09-08 17:08:17 +00:00
Alexey Bataev
c263c4735c [SLP] Fix the warning about paths not returning the value, NFC.
llvm-svn: 312793
2017-09-08 14:32:20 +00:00
Alexey Bataev
182cf8e3c6 [SLP] Support for horizontal min/max reduction.
SLP vectorizer supports horizontal reductions for Add/FAdd binary
operations. Patch adds support for horizontal min/max reductions.
Function getReductionCost() is split to getArithmeticReductionCost() for
binary operation reductions and getMinMaxReductionCost() for min/max
reductions.
Patch fixes PR26956.

Differential revision: https://reviews.llvm.org/D27846

llvm-svn: 312791
2017-09-08 13:49:36 +00:00
Max Kazantsev
428c559eff Re-enable "[IRCE] Identify loops with latch comparison against current IV value"
Re-applying after the found bug was fixed.

Differential Revision: https://reviews.llvm.org/D36215

llvm-svn: 312783
2017-09-08 10:15:05 +00:00
Max Kazantsev
6ad3fc9e75 diff --git a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
index f72a808..9fa49fd 100644
--- a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
+++ b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
@@ -450,20 +450,10 @@ struct LoopStructure {
   // equivalent to:
   //
   // intN_ty inc = IndVarIncreasing ? 1 : -1;
-  // pred_ty predicate = IndVarIncreasing
-  //                         ? IsSignedPredicate ? ICMP_SLT : ICMP_ULT
-  //                         : IsSignedPredicate ? ICMP_SGT : ICMP_UGT;
+  // pred_ty predicate = IndVarIncreasing ? ICMP_SLT : ICMP_SGT;
   //
-  //
-  // for (intN_ty iv = IndVarStart; predicate(IndVarBase, LoopExitAt);
-  //      iv = IndVarNext)
+  // for (intN_ty iv = IndVarStart; predicate(iv, LoopExitAt); iv = IndVarBase)
   //   ... body ...
-  //
-  // Here IndVarBase is either current or next value of the induction variable.
-  // in the former case, IsIndVarNext = false and IndVarBase points to the
-  // Phi node of the induction variable. Otherwise, IsIndVarNext = true and
-  // IndVarBase points to IV increment instruction.
-  //
 
   Value *IndVarBase;
   Value *IndVarStart;
@@ -471,13 +461,12 @@ struct LoopStructure {
   Value *LoopExitAt;
   bool IndVarIncreasing;
   bool IsSignedPredicate;
-  bool IsIndVarNext;
 
   LoopStructure()
       : Tag(""), Header(nullptr), Latch(nullptr), LatchBr(nullptr),
         LatchExit(nullptr), LatchBrExitIdx(-1), IndVarBase(nullptr),
         IndVarStart(nullptr), IndVarStep(nullptr), LoopExitAt(nullptr),
-        IndVarIncreasing(false), IsSignedPredicate(true), IsIndVarNext(false) {}
+        IndVarIncreasing(false), IsSignedPredicate(true) {}
 
   template <typename M> LoopStructure map(M Map) const {
     LoopStructure Result;
@@ -493,7 +482,6 @@ struct LoopStructure {
     Result.LoopExitAt = Map(LoopExitAt);
     Result.IndVarIncreasing = IndVarIncreasing;
     Result.IsSignedPredicate = IsSignedPredicate;
-    Result.IsIndVarNext = IsIndVarNext;
     return Result;
   }
 
@@ -841,42 +829,21 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE,
     return false;
   };
 
-  // `ICI` can either be a comparison against IV or a comparison of IV.next.
-  // Depending on the interpretation, we calculate the start value differently.
+  // `ICI` is interpreted as taking the backedge if the *next* value of the
+  // induction variable satisfies some constraint.
 
-  // Pair {IndVarBase; IsIndVarNext} semantically designates whether the latch
-  // comparisons happens against the IV before or after its value is
-  // incremented. Two valid combinations for them are:
-  //
-  // 1) { phi [ iv.start, preheader ], [ iv.next, latch ]; false },
-  // 2) { iv.next; true }.
-  //
-  // The latch comparison happens against IndVarBase which can be either current
-  // or next value of the induction variable.
   const SCEVAddRecExpr *IndVarBase = cast<SCEVAddRecExpr>(LeftSCEV);
   bool IsIncreasing = false;
   bool IsSignedPredicate = true;
-  bool IsIndVarNext = false;
   ConstantInt *StepCI;
   if (!IsInductionVar(IndVarBase, IsIncreasing, StepCI)) {
     FailureReason = "LHS in icmp not induction variable";
     return None;
   }
 
-  const SCEV *IndVarStart = nullptr;
-  // TODO: Currently we only handle comparison against IV, but we can extend
-  // this analysis to be able to deal with comparison against sext(iv) and such.
-  if (isa<PHINode>(LeftValue) &&
-      cast<PHINode>(LeftValue)->getParent() == Header)
-    // The comparison is made against current IV value.
-    IndVarStart = IndVarBase->getStart();
-  else {
-    // Assume that the comparison is made against next IV value.
-    const SCEV *StartNext = IndVarBase->getStart();
-    const SCEV *Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE));
-    IndVarStart = SE.getAddExpr(StartNext, Addend);
-    IsIndVarNext = true;
-  }
+  const SCEV *StartNext = IndVarBase->getStart();
+  const SCEV *Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE));
+  const SCEV *IndVarStart = SE.getAddExpr(StartNext, Addend);
   const SCEV *Step = SE.getSCEV(StepCI);
 
   ConstantInt *One = ConstantInt::get(IndVarTy, 1);
@@ -1060,7 +1027,6 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE,
   Result.IndVarIncreasing = IsIncreasing;
   Result.LoopExitAt = RightValue;
   Result.IsSignedPredicate = IsSignedPredicate;
-  Result.IsIndVarNext = IsIndVarNext;
 
   FailureReason = nullptr;
 
@@ -1350,9 +1316,8 @@ LoopConstrainer::RewrittenRangeInfo LoopConstrainer::changeIterationSpaceEnd(
                                       BranchToContinuation);
 
     NewPHI->addIncoming(PN->getIncomingValueForBlock(Preheader), Preheader);
-    auto *FixupValue =
-        LS.IsIndVarNext ? PN->getIncomingValueForBlock(LS.Latch) : PN;
-    NewPHI->addIncoming(FixupValue, RRI.ExitSelector);
+    NewPHI->addIncoming(PN->getIncomingValueForBlock(LS.Latch),
+                        RRI.ExitSelector);
     RRI.PHIValuesAtPseudoExit.push_back(NewPHI);
   }
 
@@ -1735,10 +1700,7 @@ bool InductiveRangeCheckElimination::runOnLoop(Loop *L, LPPassManager &LPM) {
   }
   LoopStructure LS = MaybeLoopStructure.getValue();
   const SCEVAddRecExpr *IndVar =
-      cast<SCEVAddRecExpr>(SE.getSCEV(LS.IndVarBase));
-  if (LS.IsIndVarNext)
-    IndVar = cast<SCEVAddRecExpr>(SE.getMinusSCEV(IndVar,
-                                                  SE.getSCEV(LS.IndVarStep)));
+      cast<SCEVAddRecExpr>(SE.getMinusSCEV(SE.getSCEV(LS.IndVarBase), SE.getSCEV(LS.IndVarStep)));
 
   Optional<InductiveRangeCheck::Range> SafeIterRange;
   Instruction *ExprInsertPt = Preheader->getTerminator();
diff --git a/test/Transforms/IRCE/latch-comparison-against-current-value.ll b/test/Transforms/IRCE/latch-comparison-against-current-value.ll
deleted file mode 100644
index afea0e6..0000000
--- a/test/Transforms/IRCE/latch-comparison-against-current-value.ll
+++ /dev/null
@@ -1,182 +0,0 @@
-; RUN: opt -verify-loop-info -irce-print-changed-loops -irce -S < %s 2>&1 | FileCheck %s
-
-; Check that IRCE is able to deal with loops where the latch comparison is
-; done against current value of the IV, not the IV.next.
-
-; CHECK: irce: in function test_01: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-; CHECK: irce: in function test_02: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-; CHECK-NOT: irce: in function test_03: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-; CHECK-NOT: irce: in function test_04: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-
-; SLT condition for increasing loop from 0 to 100.
-define void @test_01(i32* %arr, i32* %a_len_ptr) #0 {
-
-; CHECK:      test_01
-; CHECK:        entry:
-; CHECK-NEXT:     %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0
-; CHECK-NEXT:     [[COND2:%[^ ]+]] = icmp slt i32 0, %exit.mainloop.at
-; CHECK-NEXT:     br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit
-; CHECK:        loop:
-; CHECK-NEXT:     %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ]
-; CHECK-NEXT:     %idx.next = add nuw nsw i32 %idx, 1
-; CHECK-NEXT:     %abc = icmp slt i32 %idx, %exit.mainloop.at
-; CHECK-NEXT:     br i1 true, label %in.bounds, label %out.of.bounds.loopexit1
-; CHECK:        in.bounds:
-; CHECK-NEXT:     %addr = getelementptr i32, i32* %arr, i32 %idx
-; CHECK-NEXT:     store i32 0, i32* %addr
-; CHECK-NEXT:     %next = icmp slt i32 %idx, 100
-; CHECK-NEXT:     [[COND3:%[^ ]+]] = icmp slt i32 %idx, %exit.mainloop.at
-; CHECK-NEXT:     br i1 [[COND3]], label %loop, label %main.exit.selector
-; CHECK:        main.exit.selector:
-; CHECK-NEXT:     %idx.lcssa = phi i32 [ %idx, %in.bounds ]
-; CHECK-NEXT:     [[COND4:%[^ ]+]] = icmp slt i32 %idx.lcssa, 100
-; CHECK-NEXT:     br i1 [[COND4]], label %main.pseudo.exit, label %exit
-; CHECK-NOT: loop.preloop:
-; CHECK:        loop.postloop:
-; CHECK-NEXT:    %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ]
-; CHECK-NEXT:     %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1
-; CHECK-NEXT:     %abc.postloop = icmp slt i32 %idx.postloop, %exit.mainloop.at
-; CHECK-NEXT:     br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit
-
-entry:
-  %len = load i32, i32* %a_len_ptr, !range !0
-  br label %loop
-
-loop:
-  %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
-  %idx.next = add nsw nuw i32 %idx, 1
-  %abc = icmp slt i32 %idx, %len
-  br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
-  %addr = getelementptr i32, i32* %arr, i32 %idx
-  store i32 0, i32* %addr
-  %next = icmp slt i32 %idx, 100
-  br i1 %next, label %loop, label %exit
-
-out.of.bounds:
-  ret void
-
-exit:
-  ret void
-}
-
-; ULT condition for increasing loop from 0 to 100.
-define void @test_02(i32* %arr, i32* %a_len_ptr) #0 {
-
-; CHECK:      test_02
-; CHECK:        entry:
-; CHECK-NEXT:     %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0
-; CHECK-NEXT:     [[COND2:%[^ ]+]] = icmp ult i32 0, %exit.mainloop.at
-; CHECK-NEXT:     br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit
-; CHECK:        loop:
-; CHECK-NEXT:     %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ]
-; CHECK-NEXT:     %idx.next = add nuw nsw i32 %idx, 1
-; CHECK-NEXT:     %abc = icmp ult i32 %idx, %exit.mainloop.at
-; CHECK-NEXT:     br i1 true, label %in.bounds, label %out.of.bounds.loopexit1
-; CHECK:        in.bounds:
-; CHECK-NEXT:     %addr = getelementptr i32, i32* %arr, i32 %idx
-; CHECK-NEXT:     store i32 0, i32* %addr
-; CHECK-NEXT:     %next = icmp ult i32 %idx, 100
-; CHECK-NEXT:     [[COND3:%[^ ]+]] = icmp ult i32 %idx, %exit.mainloop.at
-; CHECK-NEXT:     br i1 [[COND3]], label %loop, label %main.exit.selector
-; CHECK:        main.exit.selector:
-; CHECK-NEXT:     %idx.lcssa = phi i32 [ %idx, %in.bounds ]
-; CHECK-NEXT:     [[COND4:%[^ ]+]] = icmp ult i32 %idx.lcssa, 100
-; CHECK-NEXT:     br i1 [[COND4]], label %main.pseudo.exit, label %exit
-; CHECK-NOT: loop.preloop:
-; CHECK:        loop.postloop:
-; CHECK-NEXT:    %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ]
-; CHECK-NEXT:     %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1
-; CHECK-NEXT:     %abc.postloop = icmp ult i32 %idx.postloop, %exit.mainloop.at
-; CHECK-NEXT:     br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit
-
-entry:
-  %len = load i32, i32* %a_len_ptr, !range !0
-  br label %loop
-
-loop:
-  %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
-  %idx.next = add nsw nuw i32 %idx, 1
-  %abc = icmp ult i32 %idx, %len
-  br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
-  %addr = getelementptr i32, i32* %arr, i32 %idx
-  store i32 0, i32* %addr
-  %next = icmp ult i32 %idx, 100
-  br i1 %next, label %loop, label %exit
-
-out.of.bounds:
-  ret void
-
-exit:
-  ret void
-}
-
-; Same as test_01, but comparison happens against IV extended to a wider type.
-; This test ensures that IRCE rejects it and does not falsely assume that it was
-; a comparison against iv.next.
-; TODO: We can actually extend the recognition to cover this case.
-define void @test_03(i32* %arr, i64* %a_len_ptr) #0 {
-
-; CHECK:      test_03
-
-entry:
-  %len = load i64, i64* %a_len_ptr, !range !1
-  br label %loop
-
-loop:
-  %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
-  %idx.next = add nsw nuw i32 %idx, 1
-  %idx.ext = sext i32 %idx to i64
-  %abc = icmp slt i64 %idx.ext, %len
-  br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
-  %addr = getelementptr i32, i32* %arr, i32 %idx
-  store i32 0, i32* %addr
-  %next = icmp slt i32 %idx, 100
-  br i1 %next, label %loop, label %exit
-
-out.of.bounds:
-  ret void
-
-exit:
-  ret void
-}
-
-; Same as test_02, but comparison happens against IV extended to a wider type.
-; This test ensures that IRCE rejects it and does not falsely assume that it was
-; a comparison against iv.next.
-; TODO: We can actually extend the recognition to cover this case.
-define void @test_04(i32* %arr, i64* %a_len_ptr) #0 {
-
-; CHECK:      test_04
-
-entry:
-  %len = load i64, i64* %a_len_ptr, !range !1
-  br label %loop
-
-loop:
-  %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
-  %idx.next = add nsw nuw i32 %idx, 1
-  %idx.ext = sext i32 %idx to i64
-  %abc = icmp ult i64 %idx.ext, %len
-  br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
-  %addr = getelementptr i32, i32* %arr, i32 %idx
-  store i32 0, i32* %addr
-  %next = icmp ult i32 %idx, 100
-  br i1 %next, label %loop, label %exit
-
-out.of.bounds:
-  ret void
-
-exit:
-  ret void
-}
-
-!0 = !{i32 0, i32 50}
-!1 = !{i64 0, i64 50}

llvm-svn: 312775
2017-09-08 04:26:41 +00:00
Peter Collingbourne
6ed275352a WholeProgramDevirt: When promoting for single-impl devirt, also rename the comdat.
This is required when targeting COFF, as the comdat name must match
one of the names of the symbols in the comdat.

Differential Revision: https://reviews.llvm.org/D37550

llvm-svn: 312767
2017-09-08 00:10:53 +00:00
Reid Kleckner
b5c890aa13 Sink some IntrinsicInst.h and Intrinsics.h out of llvm/include
Many of these uses can get by with forward declarations. Hopefully this
speeds up compilation after adding a single intrinsic.

llvm-svn: 312759
2017-09-07 23:27:44 +00:00
Richard Trieu
71428966b0 Revert r312318, r312325, r312424, r312489
r312318 - Debug info for variables whose type is shrinked to bool
r312325, r312424, r312489 - Test case for r312318

Revision 312318 introduced a null dereference bug.
Details in https://bugs.llvm.org/show_bug.cgi?id=34490

llvm-svn: 312758
2017-09-07 23:20:35 +00:00
Krzysztof Parzyszek
cae9ddd062 Disable jump threading into loop headers
Consider this type of a loop:
    for (...) {
      ...
      if (...) continue;
      ...
    }
Normally, the "continue" would branch to the loop control code that
checks whether the loop should continue iterating and which contains
the (often) unique loop latch branch. In certain cases jump threading
can "thread" the inner branch directly to the loop header, creating
a second loop latch. Loop canonicalization would then transform this
loop into a loop nest. The problem with this is that in such a loop
nest neither loop is countable even if the original loop was. This
may inhibit subsequent loop optimizations and be detrimental to
performance.

Differential Revision: https://reviews.llvm.org/D36404

llvm-svn: 312664
2017-09-06 19:36:58 +00:00
Sanjay Patel
1db23187f2 [ValueTracking, InstCombine] canonicalize fcmp ord/uno with non-NAN ops to null constants
This is a preliminary step towards solving the remaining part of PR27145 - IR for isfinite():
https://bugs.llvm.org/show_bug.cgi?id=27145

In order to solve that one more generally, we need to add matching for and/or of fcmp ord/uno
with a constant operand.

But while looking at those patterns, I realized we were missing a canonicalization for nonzero
constants. Rather than limiting to just folds for constants, we're adding a general value
tracking method for this based on an existing DAG helper.

By transforming everything to 0.0, we can simplify the existing code in foldLogicOfFCmps()
and pick up missing vector folds.

Differential Revision: https://reviews.llvm.org/D37427

llvm-svn: 312591
2017-09-05 23:13:13 +00:00
Davide Italiano
de3b2240d0 [GVNHoist] Move duplicated code to a helper function. NFCI.
llvm-svn: 312575
2017-09-05 20:49:41 +00:00
Craig Topper
6f75caf47a [InstCombine] Move foldSelectICmpAnd helper function earlier in the file to enable reuse in a future patch.
llvm-svn: 312518
2017-09-05 05:26:37 +00:00
Craig Topper
6033202ee4 [InstCombine] In foldSelectIntoOp, avoid creating a Constant before we know for sure we're going to use it and avoid an unnecessary call to m_APInt.
Instead of creating a Constant and then calling m_APInt with it (which will always return true). Just create an APInt initially, and use that for the checks in isSelect01 function. If it turns out we do need the Constant, create it from the APInt.

This is a refactor for a future patch that will do some more checks of the constant values here.

llvm-svn: 312517
2017-09-05 05:26:36 +00:00
Daniel Berlin
ea1236fd21 NewGVN: Fix PR 34430 - we need to look through predicateinfo copies to detect self-cycles of phi nodes. We also need to not ignore certain types of arguments when testing whether the phi has a backedge or was originally constant.
llvm-svn: 312510
2017-09-05 02:17:43 +00:00
Daniel Berlin
04ea683d08 NewGVN: Fix PR 34452 by passing instruction all the way down when we do aggregate value simplification
llvm-svn: 312509
2017-09-05 02:17:42 +00:00
Daniel Berlin
24e0209f20 NewGVN: Detect copies through predicateinfo
llvm-svn: 312508
2017-09-05 02:17:41 +00:00
Daniel Berlin
7030d893d8 NewGVN: Change where check for original instruction in phi of ops leader finding is done. Where we had it before, we would stop looking when we hit the original instruction, but skip it. Now we skip it and keep looking.
llvm-svn: 312507
2017-09-05 02:17:40 +00:00
Zvi Rackover
d91a5044bd LoopVectorize: MaxVF should not be larger than the loop trip count
Summary:
Improve how MaxVF is computed while taking into account that MaxVF should not be larger than the loop's trip count.

Other than saving on compile-time by pruning the possible MaxVF candidates, this patch fixes pr34438 which exposed the following flow:
1. Short trip count identified -> Don't bail out, set OptForSize:=True to avoid tail-loop and runtime checks.
2. Compute MaxVF returned 16 on a target supporting AVX512.
3. OptForSize -> choose VF:=MaxVF.
4. Bail out because TripCount = 8, VF = 16, TripCount % VF !=0 means we need a tail loop.

With this patch step 2. will choose MaxVF=8 based on TripCount.

Reviewers: Ayal, dorit, mkuper, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D37425

llvm-svn: 312472
2017-09-04 08:35:13 +00:00
Sam Parker
281954d2c5 [LoopUnroll][DebugInfo] Don't add metadata to unrolled remainder loop
Debug information can be, and was, corrupted when the runtime
remainder loop was fully unrolled. This is because a !null node can
be created instead of a unique one describing the loop. In this case,
the original node gets incorrectly updated with the NewLoopID
metadata.

In the case when the remainder loop is going to be quickly fully
unrolled, there isn't the need to add loop metadata for it anyway.

Differential Revision: https://reviews.llvm.org/D37338

llvm-svn: 312471
2017-09-04 08:12:16 +00:00
Sanjay Patel
cfa7e8ff1b [InstCombine] replace unnecessary fcmp fold with assert
See https://reviews.llvm.org/rL312411 for related InstSimplify tests.

llvm-svn: 312421
2017-09-02 18:10:29 +00:00
Sanjay Patel
9c342733b7 [InstCombine] combine foldAndOfFCmps and foldOrOfFcmps; NFCI
In addition to removing chunks of duplicated code, we don't
want these to diverge. If there's a fold for one, there
should be a fold of the other via DeMorgan's Laws.

llvm-svn: 312420
2017-09-02 17:53:33 +00:00
Sanjay Patel
8c9a3cc176 [InstCombine] fix misnamed locals and use them to reduce code; NFCI
We had these locals:
Value *Op0RHS = LHS->getOperand(1);
Value *Op1LHS = RHS->getOperand(0);
...so we confusingly transposed the meaning of left/right and op0/op1.

llvm-svn: 312418
2017-09-02 17:17:17 +00:00
Benjamin Kramer
3052672172 [LoopVectorize] Turn static DenseSet into switch.
LLVM transforms this into a bit test which is a lot faster and smaller.

llvm-svn: 312417
2017-09-02 16:41:55 +00:00
Sanjay Patel
f62d223574 [InstCombine] remove unnecessary code; NFC
llvm-svn: 312416
2017-09-02 16:32:37 +00:00
Sanjay Patel
058fe13af0 [InstCombine] move related functions next to each other; NFC
This makes it easier to see that they're almost duplicates.
As with the similar icmp functions, there should be identical 
folds for both logic ops because those are DeMorganized variants.

llvm-svn: 312415
2017-09-02 16:30:27 +00:00
Sanjay Patel
42dfc958f8 [InstCombine] use local variable to reduce code duplication; NFCI
llvm-svn: 312414
2017-09-02 15:11:55 +00:00
Daniel Berlin
f723f3f577 Fix PR/33305. caused by trying to simplify expressions in phi of ops that should have no leaders.
Summary:
After a discussion with Rekka, i believe this (or a small variant)
should fix the remaining phi-of-ops problems.

Rekka's algorithm for completeness relies on looking up expressions
that should have no leader, and expecting it to fail (IE looking up
expressions that can't exist in a predecessor, and expecting it to
find nothing).

Unfortunately, sometimes these expressions can be simplified to
constants, but we need the lookup to fail anyway.  Additionally, our
simplifier outsmarts this by taking these "not quite right"
expressions, and simplifying them into other expressions or walking
through phis, etc.  In the past, we've sometimes been able to find
leaders for these expressions, incorrectly.

This change causes us to not to try to phi of ops such expressions.
We determine safety by seeing if they depend on a phi node in our
block.

This is not perfect, we can do a bit better, but this should be a
"correctness start" that we can then improve.  It also requires a
bunch of caching that i'll eventually like to eliminate.

The right solution, longer term, to the simplifier issues, is to make
the query interface for the instruction simplifier/constant folder
have the flags we need, so that we can keep most things going, but
turn off the possibly-invalid parts (threading through phis, etc).
This is an issue in another wrong code bug as well.

Reviewers: davide, mcrosier

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D37175

llvm-svn: 312401
2017-09-02 02:18:44 +00:00
Eugene Zelenko
cbd8f32d28 [Analysis, Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 312383
2017-09-01 21:37:29 +00:00
Craig Topper
fe8f0ed2ee [InstCombine][InstSimplify] Teach decomposeBitTestICmp to look through truncate instructions
This patch teaches decomposeBitTestICmp to look through truncate instructions on the input to the compare. If a truncate is found it will now return the pre-truncated Value and appropriately extend the APInt mask.

This allows some code to be removed from InstSimplify that was doing this functionality.

This allows InstCombine's bit test combining code to match a pre-truncate Value with the same Value appear with an 'and' on another icmp. Or it allows us to combine a truncate to i16 and a truncate to i8. This also required removing the type check from the beginning of getMaskedTypeForICmpPair, but I believe that's ok because we still have to find two values from the input to each icmp that are equal before we'll do any transformation. So the type check was really just serving as an early out.

There was one user of decomposeBitTestICmp that didn't want to look through truncates, so I've added a flag to prevent that behavior when necessary.

Differential Revision: https://reviews.llvm.org/D37158

llvm-svn: 312382
2017-09-01 21:27:34 +00:00
Craig Topper
05d0465354 [InstCombine] Don't require the compare types to be the same in getMaskedTypeForICmpPair.
A future patch will make the code look through truncates feeding the compare. So the compares might be different types but the pretruncated types might be the same.

This should be safe because we still require the same Value* to be used truncated or not in both compares. So that serves to ensure the types are the same.

llvm-svn: 312381
2017-09-01 21:27:31 +00:00
Craig Topper
ccb2b3c89a [InstCombine] When converting decomposeBitTestICmp's APInt return to ConstantInt, make sure we use the type from the Value* that was also returned from decomposeBitTestICmp.
Previously we used the type from the LHS of the compare, but a future patch will change decomposeBitTestICmp to look through truncates so it will return a pretruncated Value* and the type needs to match that.

llvm-svn: 312380
2017-09-01 21:27:29 +00:00
Daniel Berlin
e854f4814d NewGVN: Make sure we don't incorrectly use PredicateInfo when doing PHI of ops
Summary: When we backtranslate expressions, we can't use the predicateinfo, since we are evaluating them in a different context.

Reviewers: davide, mcrosier

Subscribers: sanjoy, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D37174

llvm-svn: 312352
2017-09-01 19:20:18 +00:00
Manoj Gupta
1bbda74bca [LoopVectorizer] Use two step casting for float to pointer types.
Summary:
LoopVectorizer is creating casts between vec<ptr> and vec<float> types
on ARM when compiling OpenCV. Since, tIs is illegal to directly cast a
floating point type to a pointer type even if the types have same size
causing a crash. Fix the crash using a two-step casting by bitcasting
to integer and integer to pointer/float.
Fixes PR33804.

Reviewers: mkuper, Ayal, dlj, rengolin, srhines

Reviewed By: rengolin

Subscribers: aemerson, kristof.beyls, mkazantsev, Meinersbur, rengolin, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35498

llvm-svn: 312331
2017-09-01 15:36:00 +00:00
Clement Courbet
57b70dfd78 [MergeICmps] Fix build of rL312315 on clang-with-thin-lto-windows:
MergeICmps.cpp(68,15): error: chosen constructor is explicit in copy-initialization
      return {};
APInt.h(339,12): note: explicit constructor declared here
  explicit APInt() : BitWidth(1) { U.VAL = 0; }
             ^
MergeICmps.cpp(56,9): note: in implicit initialization of field 'Offset' with omitted
initializer
  APInt Offset;
          ^

llvm-svn: 312326
2017-09-01 11:51:23 +00:00
Clement Courbet
d92186722f Reland rL312315: [MergeICmps] MergeICmps is a new optimization pass that turns chains of integer
Add missing header.

This reverts commit 86dd6335cf7607af22f383a9a8e072ba929848cf.

llvm-svn: 312322
2017-09-01 10:56:34 +00:00
Strahinja Petrovic
169dc52e38 Debug info for variables whose type is shrinked to bool
This patch provides such debug information for integer
variables whose type is shrinked to bool by providing 
dwarf expression which returns either constant initial 
value or other value.

Patch by Nikola Prica.

Differential Revision: https://reviews.llvm.org/D35994

llvm-svn: 312318
2017-09-01 10:05:27 +00:00
Clement Courbet
e0f18240ac Revert "[MergeICmps] MergeICmps is a new optimization pass that turns chains of integer"
Break build

This reverts commit d07ab866f7f88f81e49046d691a80dcd32d7198b.

llvm-svn: 312317
2017-09-01 09:43:08 +00:00
Clement Courbet
7f476c0f6f [MergeICmps] MergeICmps is a new optimization pass that turns chains of integer
comparisons into memcmp.

Thanks to recent improvements in the LLVM codegen, the memcmp is typically
inlined as a chain of efficient hardware comparisons.
This typically benefits C++ member or nonmember operator==().

For now this is disabled by default until:
 - https://bugs.llvm.org/show_bug.cgi?id=33329 is complete
 - Benchmarks show that this is always useful.

Differential Revision:
https://reviews.llvm.org/D33987

llvm-svn: 312315
2017-09-01 09:07:05 +00:00
Eugene Zelenko
f25fa567b0 [Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. Also affected in files (NFC).
llvm-svn: 312289
2017-08-31 21:56:16 +00:00
Akira Hatanaka
149ba0c692 [ObjCARC] Pass the correct BasicBlock to fix assertion failure.
The BasicBlock passed to FindPredecessorRetainWithSafePath should be the
parent block of Autorelease. This fixes a crash that occurs in
FindDependencies when StartInst is not in StartBB.

rdar://problem/33866381

llvm-svn: 312266
2017-08-31 18:27:47 +00:00
Sanjay Patel
add1397ba7 [InstCombine] improve demanded vector elements analysis of insertelement
Recurse instead of returning on the first found optimization. Also, return early in the caller
instead of continuing because that allows another round of simplification before we might
potentially lose undef information from a shuffle mask by eliminating the shuffle.

As noted in the review, we could probably do better and be more efficient by moving all of
demanded elements into a separate pass, but this is yet another quick fix to instcombine.

Differential Revision: https://reviews.llvm.org/D37236

llvm-svn: 312248
2017-08-31 15:57:17 +00:00
Dinar Temirbulatov
727d1c6273 [SLPVectorizer] Move out Entry->NeedToGather check and assert of inner loop as invariant, NFCI.
llvm-svn: 312242
2017-08-31 14:10:07 +00:00
Max Kazantsev
a43a398580 [IRCE] Identify loops with latch comparison against current IV value
Current implementation of parseLoopStructure interprets the latch comparison as a
comarison against `iv.next`. If the actual comparison is made against the `iv` current value
then the loop may be rejected, because this misinterpretation leads to incorrect evaluation
of the latch start value.

This patch teaches the IRCE to distinguish this kind of loops and perform the optimization
for them. Now we use `IndVarBase` variable which can be either next or current value of the
induction variable (previously we used `IndVarNext` which was always the value on next iteration).

Differential Revision: https://reviews.llvm.org/D36215

llvm-svn: 312221
2017-08-31 07:04:20 +00:00
Max Kazantsev
bd3cb213bf [IRCE][NFC] Rename IndVarNext to IndVarBase
Renaming as a preparation step to generalizing IRCE for comparison not only against
the next value of an indvar, but also against the current.

Differential Revision: https://reviews.llvm.org/D36509

llvm-svn: 312215
2017-08-31 05:58:15 +00:00
Adrian Prantl
42340d8718 Don't add a fragment expression when GlobalSRA splits up a single-member struct
Fixes PR34390.

https://bugs.llvm.org/show_bug.cgi?id=34390

llvm-svn: 312196
2017-08-31 00:06:18 +00:00
Matt Morehouse
2af47f8696 [SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer
Summary:
- Don't sanitize __sancov_lowest_stack.
- Don't instrument leaf functions.
- Add CoverageStackDepth to Fuzzer and FuzzerNoLink.
- Only enable on Linux.

Reviewers: vitalybuka, kcc, george.karpenkov

Reviewed By: kcc

Subscribers: kubamracek, cfe-commits, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37156

llvm-svn: 312185
2017-08-30 22:49:31 +00:00
Adrian Prantl
6b1b2b3ca5 Refactor DIBuilder::createFragmentExpression into a static DIExpression member
NFC

llvm-svn: 312165
2017-08-30 20:04:17 +00:00
Daniel Berlin
e1681e6ee2 NewGVN: Make sure we add the correct user if we swapped the comparison operands
llvm-svn: 312162
2017-08-30 19:53:23 +00:00
Daniel Berlin
32f7caf670 NewGVN: Allow simplification into variables
llvm-svn: 312161
2017-08-30 19:52:39 +00:00
Benjamin Kramer
e94b55aedd [GVNSink] Remove dependency on SmallPtrSet iteration order.
Found by LLVM_ENABLE_REVERSE_ITERATION.

llvm-svn: 312156
2017-08-30 18:46:37 +00:00
Sanjay Patel
3d0c6d78ae [InstCombine] remove unnecessary vector select fold; NFCI
This code is double-dead:
1. We simplify all selects with constant true/false condition in InstSimplify.
   I've minimized/moved the tests to show that works as expected.
2. All remaining vector selects with a constant condition are canonicalized to
   shufflevector, so we really can't see this pattern.

llvm-svn: 312123
2017-08-30 14:04:57 +00:00
Florian Hahn
75fa314f27 [InstCombine] Fold insert sequence if first ins has multiple users.
Summary:
If the first insertelement instruction has multiple users and inserts at
position 0, we can re-use this instruction when folding a chain of
insertelement instructions. As we need to generate the first
insertelement instruction anyways, this should be a strict improvement.

We could get rid of the restriction of inserting at position 0 by
creating a different shufflemask, but it is probably worth to keep the
first insertelement instruction with position 0, as this is easier to do
efficiently than at other positions I think.

Reviewers: grosser, mkuper, fpetrogalli, efriedma

Reviewed By: fpetrogalli

Subscribers: gareevroman, llvm-commits

Differential Revision: https://reviews.llvm.org/D37064

llvm-svn: 312110
2017-08-30 10:54:21 +00:00
Mandeep Singh Grang
efb6beb731 [cfi] Fixed non-determinism in codegen due to DenseSet iteration order
llvm-svn: 312098
2017-08-30 04:47:21 +00:00
Evgeniy Stepanov
5ed30160a9 [cfi] Avoid branch veneers in jump tables when possible.
Summary:
When jumptable encoding does not match target code encoding (arm vs
thumb), a veneer is inserted by the linker. We can not avoid this
in all cases, because entries within one jumptable must have the same
encoding, but we can make it less common by selecting the jumptable
encoding to match the majority of its targets.

This change only covers FullLTO, and not ThinLTO.

Reviewers: pcc

Subscribers: aemerson, mehdi_amini, javed.absar, kristof.beyls, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37171

llvm-svn: 312054
2017-08-29 22:40:19 +00:00
Evgeniy Stepanov
8b7ba14751 [cfi] Build __cfi_check as Thumb when applicable.
Summary:
Cross-DSO CFI needs all __cfi_check exports to use the same encoding
(ARM vs Thumb).

Reviewers: pcc

Subscribers: aemerson, srhines, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37243

llvm-svn: 312052
2017-08-29 22:29:15 +00:00
Matt Morehouse
730a7b1ffd Revert "[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer"
This reverts r312026 due to bot breakage.

llvm-svn: 312047
2017-08-29 21:56:56 +00:00
Wei Mi
549c611992 [LoopUnswitch] Fix a simple bug which disables loop unswitch for select statement
This is to fix PR34257. rL309059 takes an early return when FindLIVLoopCondition
fails to find a loop invariant condition. This is wrong and it will disable loop
unswitch for select. The patch fixes the bug.

Differential Revision: https://reviews.llvm.org/D36985

llvm-svn: 312045
2017-08-29 21:45:11 +00:00
Benjamin Kramer
f41c3e4471 [FunctionImport] Avoid unused variable warnings in Release builds
Just skip the entire block in NDEBUG. No functionality change intended.

llvm-svn: 312031
2017-08-29 20:24:39 +00:00
Alexey Bataev
eaced7f8cb [SimplifyCFG] Fix for PR34219: Preserve alignment after merging conditional stores.
Summary:
If SimplifyCFG pass is able to merge conditional stores into single one,
it loses the alignment. This may lead to incorrect codegen. Patch
sets the alignment of the new instruction if it is set in the original
one.

Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36841

llvm-svn: 312030
2017-08-29 20:06:24 +00:00
Matt Morehouse
915473d0a1 [SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer
Summary:
- Don't sanitize __sancov_lowest_stack.
- Don't instrument leaf functions.
- Add CoverageStackDepth to Fuzzer and FuzzerNoLink.
- Disable stack depth tracking on Mac.

Reviewers: vitalybuka, kcc, george.karpenkov

Reviewed By: kcc

Subscribers: kubamracek, cfe-commits, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37156

llvm-svn: 312026
2017-08-29 19:48:12 +00:00
Craig Topper
56c044c7a8 [InstCombine] Support vector splats in transformZExtICmp
This patch adds splat support to transformZExtICmp. The test cases are vector versions of tests that failed when commenting out parts of the existing scalar code.

One test didn't vectorize optimize properly due to another bug so a TODO has been added.

Differential Revision: https://reviews.llvm.org/D37253

llvm-svn: 312023
2017-08-29 18:58:13 +00:00
Teresa Johnson
7b540b72b0 [ThinLTO] Clean up stale alias import handling
Summary:
Remove some code that was no longer needed. The first FIXME is
stale since we long ago started using the index to drive importing,
rather than doing force importing based on linkage type. And
now with r309278, we no longer import any aliases.

Reviewers: dblaikie

Subscribers: inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D37266

llvm-svn: 312019
2017-08-29 18:15:34 +00:00
Dehao Chen
b75a9f43bd Add null check for promoted direct call
Summary: We originally assume that in pgo-icp, the promoted direct call will never be null after strip point casts. However, stripPointerCasts is so smart that it could possibly return the value of the function call if it knows that the return value is always an argument. In this case, the returned value cannot cast to Instruction. In this patch, null check is added to ensure null pointer will not be accessed.

Reviewers: tejohnson, xur, davidxl, djasper

Reviewed By: tejohnson

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D37252

llvm-svn: 312005
2017-08-29 15:28:12 +00:00
Sanjay Patel
4aeffc1bf9 [Instruction] add moveAfter() convenience function; NFCI
As suggested in D37121, here's a wrapper for removeFromParent() + insertAfter(),
but implemented using moveBefore() for symmetry/efficiency.

Differential Revision: https://reviews.llvm.org/D37239

llvm-svn: 312001
2017-08-29 14:07:48 +00:00
Max Kazantsev
4e5297ed76 [LSR] Fix Shadow IV in case of integer overflow
When LSR processes code like

  int accumulator = 0;
  for (int i = 0; i < N; i++) {
    accummulator += i;
    use((double) accummulator);
  }

It may decide to replace integer `accumulator` with a double Shadow IV to get rid
of casts.  The problem with that is that the `accumulator`'s value may overflow.
Starting from this moment, the behavior of integer and double accumulators
will differ.

This patch strenghtens up the conditions of Shadow IV mechanism applicability.
We only allow it for IVs that are proved to be `AddRec`s with `nsw`/`nuw` flag.

Differential Revision: https://reviews.llvm.org/D37209

llvm-svn: 311986
2017-08-29 07:32:20 +00:00
Craig Topper
f15c02fe65 [InstCombine] Teach foldSelectICmpAndOr to handle vector splats
This was pretty close to working already. While I was here I went ahead and passed the ICmpInst pointer from the caller instead of doing a dyn_cast that can never fail.

Differential Revision: https://reviews.llvm.org/D37237

llvm-svn: 311960
2017-08-29 00:13:49 +00:00
Justin Bogner
ab955cada2 [sanitizer-coverage] Mark the guard and 8-bit counter arrays as used
In r311742 we marked the PCs array as used so it wouldn't be dead
stripped, but left the guard and 8-bit counters arrays alone since
these are referenced by the coverage instrumentation. This doesn't
quite work if we want the indices of the PCs array to match the other
arrays though, since elements can still end up being dead and
disappear.

Instead, we mark all three of these arrays as used so that they'll be
consistent with one another.

llvm-svn: 311959
2017-08-29 00:11:05 +00:00
Justin Bogner
3fbd1ce502 [sanitizer-coverage] Return the array from CreatePCArray. NFC
Be more consistent with CreateFunctionLocalArrayInSection in the API
of CreatePCArray, and assign the member variable in the caller like we
do for the guard and 8-bit counter arrays.

This also tweaks the order of method declarations to match the order
of definitions in the file.

llvm-svn: 311955
2017-08-28 23:46:11 +00:00
Justin Bogner
2c2f9c39d6 [sanitizer-coverage] Clean up trailing whitespace. NFC
llvm-svn: 311954
2017-08-28 23:38:12 +00:00
Kamil Rytarowski
c304eefca8 Define NetBSD/amd64 ASAN Shadow Offset
Summary:
Catch up after compiler-rt changes and define kNetBSD_ShadowOffset64
as (1ULL << 46).
 
Sponsored by <The NetBSD Foundation>

Reviewers: kcc, joerg, filcab, vitalybuka, eugenis

Reviewed By: eugenis

Subscribers: llvm-commits, #sanitizers

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D37234

llvm-svn: 311941
2017-08-28 22:13:52 +00:00