1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00
Commit Graph

18653 Commits

Author SHA1 Message Date
Chandler Carruth
f802228b0f Revert r311077: [LV] Using VPlan ...
This causes LLVM to assert fail on PPC64 and crash / infloop in other
cases. Filed http://llvm.org/PR34248 with reproducer attached.

llvm-svn: 311304
2017-08-20 23:17:11 +00:00
Benjamin Kramer
3023506a20 [Mem2Reg] Modernize code a bit.
No functionality change intended.

llvm-svn: 311290
2017-08-20 14:34:44 +00:00
Benjamin Kramer
b795ef1cb5 Move helper classes into anonymous namespaces.
No functionality change intended.

llvm-svn: 311288
2017-08-20 13:03:48 +00:00
Aditya Kumar
2eb341c166 [Loop Vectorize] Added a separate metadata
Added a separate metadata to indicate when the loop
has already been vectorized instead of setting width and count to 1.

Patch written by Divya Shanmughan and Aditya Kumar

Differential Revision: https://reviews.llvm.org/D36220

llvm-svn: 311281
2017-08-20 10:32:41 +00:00
Sam Elliott
27ac91642e Revert "Emit only A Single Opt Remark When Inlining"
Reverting due to clang build failure

llvm-svn: 311274
2017-08-20 06:55:10 +00:00
Sam Elliott
7849a007e5 Emit only A Single Opt Remark When Inlining
Summary:
This updates the Inliner to only add a single Optimization
Remark when Inlining, rather than an Analysis Remark and an
Optimization Remark.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33786

Reviewers: anemet, davidxl, chandlerc

Reviewed By: anemet

Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D36054

llvm-svn: 311273
2017-08-20 06:43:34 +00:00
Teresa Johnson
8827aa8f71 [ThinLTO] Fix ThinLTO crash
Summary:
Follow up to fix in r311023, which fixed the case where the combined
index is written to disk. The same samplePGO logic exists for the
in-memory index when computing imports, so we need to filter out
GlobalVariable summaries there too.

Reviewers: davidxl

Subscribers: inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D36919

llvm-svn: 311254
2017-08-19 18:04:25 +00:00
Chandler Carruth
9d308b85ac [Inliner] Fix a nasty bug when inlining a non-recursive trace of
a function into itself.

We tried to fix this before in r306495 but that got reverted as the
assert was actually hit.

This fixes the original bug (which we seem to have lost track of with
the revert) by blocking a second remapping when the function being
inlined is also the caller and the remapping could succeed but
erroneously.

The included test case would actually load from an inlined copy of the
alloca before this change, failing to load the stored value and
miscompiling.

Many thanks to Richard Smith for diagnosing a user miscompile to this
bug, and to Kyle for the first attempt and initial analysis and David Li
for remembering the issue and how to fix it and suggesting the patch.
I'm just stitching it together and landing it. =]

llvm-svn: 311229
2017-08-19 06:56:11 +00:00
Chandler Carruth
8d95cf5170 [SLP] Fix an unused variable warning in non-asserts builds.
llvm-svn: 311227
2017-08-19 05:06:23 +00:00
Dinar Temirbulatov
8641c659f9 [SLPVectorizer] Tighten up VLeft, VRight declaration, remove unnecessary testcase test/Transforms/SLPVectorizer/X86/reorder.ll, NFCI.
llvm-svn: 311223
2017-08-19 03:15:07 +00:00
Dinar Temirbulatov
58a3537457 [SLPVectorizer] Add opcode parameter to reorderAltShuffleOperands, reorderInputsAccordingToOpcode functions.
Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D36766

llvm-svn: 311221
2017-08-19 02:54:20 +00:00
Xinliang David Li
cfb9a3b007 Fix comment /NFC
llvm-svn: 311209
2017-08-18 23:08:50 +00:00
Xinliang David Li
39614caf8e [Profile] backward propagate profile info in JumpThreading
Differential Revsion: http://reviews.llvm.org/D36864

llvm-svn: 311208
2017-08-18 23:00:05 +00:00
Max Kazantsev
1040ba2c7d [IRCE] Fix buggy behavior in Clamp
Clamp function was too optimistic when choosing signed or unsigned min/max function for calculations.
In fact, `!IsSignedPredicate` guarantees us that `Smallest` and `Greatest` can be compared safely using unsigned
predicates, but we did not check this for `S` which can in theory be negative.

This patch makes Clamp use signed min/max for cases when it fails to prove `S` being non-negative,
and it adds a test where such situation may lead to incorrect conditions calculation.

Differential Revision: https://reviews.llvm.org/D36873

llvm-svn: 311205
2017-08-18 22:50:29 +00:00
Ana Pazos
c748485cc5 [PGO] Fixed assertion due to mismatched memcpy size type.
Summary:
Memcpy intrinsics have size argument of any integer type, like i32 or i64.
Fixed size type along with its value when cloning the intrinsic.

Reviewers: davidxl, xur

Reviewed By: davidxl

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D36844

llvm-svn: 311188
2017-08-18 19:17:08 +00:00
Matt Morehouse
38756d86aa [SanitizerCoverage] Add stack depth tracing instrumentation.
Summary:
Augment SanitizerCoverage to insert maximum stack depth tracing for
use by libFuzzer.  The new instrumentation is enabled by the flag
-fsanitize-coverage=stack-depth and is compatible with the existing
trace-pc-guard coverage.  The user must also declare the following
global variable in their code:
  thread_local uintptr_t __sancov_lowest_stack

https://bugs.llvm.org/show_bug.cgi?id=33857

Reviewers: vitalybuka, kcc

Reviewed By: vitalybuka

Subscribers: kubamracek, hiraditya, cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D36839

llvm-svn: 311186
2017-08-18 18:43:30 +00:00
Jakub Kuderski
4ff0daf0e1 [LoopRotate][Dominators] Use the incremental API to update DomTree
Summary: This patch teaches LoopRotate to use the new incremental API to update the DominatorTree.

Reviewers: dberlin, davide, grosser, sanjoy

Reviewed By: dberlin, davide

Subscribers: hiraditya, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D35581

llvm-svn: 311125
2017-08-17 21:48:19 +00:00
Jakub Kuderski
eb022aae37 [Dominators] Teach LoopUnswitch to use the incremental API
Summary:
This patch makes LoopUnswitch use new incremental API for updating dominators.
It also updates SplitCriticalEdge, as it is called in LoopUnswitch.

There doesn't seem to be any noticeable performance difference when bootstrapping clang with this patch.

Reviewers: dberlin, davide, sanjoy, grosser, chandlerc

Reviewed By: davide, grosser

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35528

llvm-svn: 311093
2017-08-17 16:45:35 +00:00
Simon Dardis
bf003e65ef [dfsan] Add explicit zero extensions for shadow parameters in function wrappers.
In the case where dfsan provides a custom wrapper for a function,
shadow parameters are added for each parameter of the function.
These parameters are i16s. For targets which do not consider this
a legal type, the lack of sign extension information would cause
LLVM to generate anyexts around their usage with phi variables
and calling convention logic.

Address this by introducing zero exts for each shadow parameter.

Reviewers: pcc, slthakur

Differential Revision: https://reviews.llvm.org/D33349

llvm-svn: 311087
2017-08-17 14:14:25 +00:00
Ayal Zaks
89896f022f [LV] Using VPlan to model the vectorized code and drive its transformation
VPlan is an ongoing effort to refactor and extend the Loop Vectorizer. This
patch introduces the VPlan model into LV and uses it to represent the vectorized
code and drive the generation of vectorized IR.

In this patch VPlan models the vectorized loop body: the vectorized control-flow
is represented using VPlan's Hierarchical CFG, with predication refactored from
being a post-vectorization-step into a vectorization planning step modeling
if-then VPRegionBlocks, and generating code inline with non-predicated code. The
vectorized code within each VPBasicBlock is represented as a sequence of
Recipes, each responsible for modelling and generating a sequence of IR
instructions. To keep the size of this commit manageable the Recipes in this
patch are coarse-grained and capture large chunks of LV's code-generation logic.
The constructed VPlans are dumped in dot format under -debug.

This commit retains current vectorizer output, except for minor instruction
reorderings; see associated modifications to lit tests.

For further details on the VPlan model see docs/Proposals/VectorizationPlan.rst
and its references.

Authors: Gil Rapaport and Ayal Zaks

Differential Revision: https://reviews.llvm.org/D32871

llvm-svn: 311077
2017-08-17 09:29:59 +00:00
Jakub Kuderski
45d778e4d9 Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators
Summary:
This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.

I didn't notice any performance impact when bootstrapping clang with this patch.

The patch was originally committed in r311039 and reverted in r311049.
This revision fixes the problem with not adding a dependency on the
DominatorTreeWrapperPass for the LegacyPassManager.

Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki

Reviewed By: davide

Subscribers: grandinj, zhendongsu, llvm-commits, david2050

Differential Revision: https://reviews.llvm.org/D35869

llvm-svn: 311057
2017-08-17 01:41:49 +00:00
Amjad Aboud
40d0dfb492 [InstCombine] Teach canEvaluateTruncated to handle arithmetic shift (including those with vector splat shift amount)
Differential Revision: https://reviews.llvm.org/D36784

llvm-svn: 311050
2017-08-16 22:42:38 +00:00
Jakub Kuderski
51e1d27f30 Revert "[ADCE][Dominators] Teach ADCE to preserve dominators"
This reverts commit r311039. The patch caused the
`test/Bindings/OCaml/Output/scalar_opts.ml` to fail.

llvm-svn: 311049
2017-08-16 22:10:53 +00:00
Craig Topper
6a4ddfbe1b [InstCombine] Make folding (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) + C1 support splat vectors
This also uses decomposeBitTestICmp to decode the compare.

Differential Revision: https://reviews.llvm.org/D36781

llvm-svn: 311044
2017-08-16 21:52:07 +00:00
Jakub Kuderski
374b3a728d [ADCE][Dominators] Teach ADCE to preserve dominators
Summary:
This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.

I didn't notice any performance impact when bootstrapping clang with this patch.

Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki

Reviewed By: davide

Subscribers: grandinj, zhendongsu, llvm-commits, david2050

Differential Revision: https://reviews.llvm.org/D35869

llvm-svn: 311039
2017-08-16 20:50:23 +00:00
Geoff Berry
d15ab7857f [LoopDataPrefetch][AArch64FalkorHWPFFix] Preserve ScalarEvolution
Summary:
Mark LoopDataPrefetch and AArch64FalkorHWPFFix passes as preserving
ScalarEvolution since they do not alter loop structure and should not
alter any SCEV values (though LoopDataPrefetch may introduce new
instructions that won't have cached SCEV values yet).

This can result in slight code differences, mainly w.r.t. nsw/nuw flags
on SCEVs, since these are computed somewhat lazily when a zext/sext
instruction is encountered.  As a result, passes after the modified
passes may see SCEVs with more nsw/nuw flags present.

Reviewers: sanjoy, anemet

Subscribers: aemerson, rengolin, mzolotukhin, javed.absar, kristof.beyls, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D36716

llvm-svn: 311032
2017-08-16 19:03:16 +00:00
Hal Finkel
d939b8436e [BDCE] Don't check demanded bits on unsized types
To clear assumptions that are potentially invalid after trivialization, we need
to walk the use/def chain. Normally, the only way to reach an instruction with
an unsized type is via an instruction that has side effects (or otherwise will
demand its input bits). That would stop the walk. However, if we have a
readnone function that returns an unsized type (e.g., void), we must avoid
asking for the demanded bits of the function call's return value. A
void-returning readnone function is always dead (and so we can stop walking the
use/def chain here), but the check is necessary to avoid asserting.

Fixes PR34211.

llvm-svn: 311014
2017-08-16 16:09:22 +00:00
Dehao Chen
cd1e91b8a4 Merge debug info when hoist then-else code to if.
Summary: When we move then-else code to if, we need to merge its debug info, otherwise the hoisted instruction may have inaccurate debug info attached.

Reviewers: aprantl, probinson, dblaikie, echristo, loladiro

Reviewed By: aprantl

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D36778

llvm-svn: 310985
2017-08-16 01:55:26 +00:00
Craig Topper
3cf4550e3b [InstCombine] Teach canEvaluateZExtd and canEvaluateTruncated to handle vector shifts with splat shift amount
We were only allowing ConstantInt before. This patch allows splat of ConstantInt too.

Differential Revision: https://reviews.llvm.org/D36763

llvm-svn: 310970
2017-08-15 22:48:41 +00:00
Amjad Aboud
b2c229e196 [InstCombine] Added support for (X >>s C) << C --> X & (-1 << C)
Differential Revision: https://reviews.llvm.org/D36743

llvm-svn: 310949
2017-08-15 19:33:14 +00:00
Sanjay Patel
e7f59bf4cc [InstCombine] sink sext after ashr
Narrow ops are better for bit-tracking, and in the case of vectors,
may enable better codegen.

As the trunc test shows, this can allow follow-on simplifications.

There's a block of code in visitTrunc that deals with shifted ops
with FIXME comments. It may be possible to remove some of that now,
but I want to make sure there are no problems with this step first.

http://rise4fun.com/Alive/Y3a

Name: hoist_ashr_ahead_of_sext_1
  %s = sext i8 %x to i32
  %r = ashr i32 %s, 3  ; shift value is < than source bit width
  =>
  %a = ashr i8 %x, 3
  %r = sext i8 %a to i32
  
Name: hoist_ashr_ahead_of_sext_2
  %s = sext i8 %x to i32
  %r = ashr i32 %s, 8  ; shift value is >= than source bit width
  =>
  %a = ashr i8 %x, 7   ; so clamp this shift value
  %r = sext i8 %a to i32
  
Name: junc_the_trunc  
  %a = sext i16 %v to i32
  %s = ashr i32 %a, 18
  %t = trunc i32 %s to i16
  =>
  %t = ashr i16 %v, 15
llvm-svn: 310942
2017-08-15 18:25:52 +00:00
Jakub Kuderski
e25b7db874 [Dominators] Include infinite loops in PostDominatorTree
Summary:
This patch teaches PostDominatorTree about infinite loops. It is built on top of D29705 by @dberlin which includes a very detailed motivation for this change.

What's new is that the patch also teaches the incremental updater how to deal with reverse-unreachable regions and how to properly maintain and verify tree roots. Before that, the incremental algorithm sometimes ended up preserving reverse-unreachable regions after updates that wouldn't appear in the tree if it was constructed from scratch on the same CFG.

This patch makes the following assumptions:
- A sequence of updates should produce the same tree as a recalculating it.
- Any sequence of the same updates should lead to the same tree.
- Siblings and roots are unordered.

The last two properties are essential to efficiently perform batch updates in the future.
When it comes to the first one, we can decide later that the consistency between freshly built tree and an updated one doesn't matter match, as there are many correct ways to pick roots in infinite loops, and to relax this assumption. That should enable us to recalculate postdominators less frequently.

This patch is pretty conservative when it comes to incremental updates on reverse-unreachable regions and ends up recalculating the whole tree in many cases. It should be possible to improve the performance in many cases, if we decide that it's important enough.
That being said, my experiments showed that reverse-unreachable are very rare in the IR emitted by clang when bootstrapping  clang. Here are the statistics I collected by analyzing IR between passes and after each removePredecessor call:

```
# functions:  52283
# samples:  337609
# reverse unreachable BBs:  216022
# BBs:  247840796
Percent reverse-unreachable:  0.08716159869015269 %
Max(PercRevUnreachable) in a function:  87.58620689655172 %
# > 25 % samples:  471 ( 0.1395104988314885 % samples )
... in 145 ( 0.27733680163724345 % functions )
```

Most of the reverse-unreachable regions come from invalid IR where it wouldn't be possible to construct a PostDomTree anyway.

I would like to commit this patch in the next week in order to be able to complete the work that depends on it before the end of my internship, so please don't wait long to voice your concerns :).

Reviewers: dberlin, sanjoy, grosser, brzycki, davide, chandlerc, hfinkel

Reviewed By: dberlin

Subscribers: nhaehnle, javed.absar, kparzysz, uabelho, jlebar, hiraditya, llvm-commits, dberlin, david2050

Differential Revision: https://reviews.llvm.org/D35851

llvm-svn: 310940
2017-08-15 18:14:57 +00:00
Rui Ueyama
aaf73b4bf2 Fix -Wunused-lambda-capture for Release build.
`I` and `this` are used only in assert or DEBUG, so they are unused
in Release build.

llvm-svn: 310934
2017-08-15 17:39:35 +00:00
Ayal Zaks
43ce19524f [LV] Minor savings to Sink casts to unravel first order recurrence
Two minor savings: avoid copying the SinkAfter map and avoid moving a cast if it
is not needed.

Differential Revision: https://reviews.llvm.org/D36408

llvm-svn: 310910
2017-08-15 08:32:59 +00:00
Dinar Temirbulatov
4f1e6897ea [SLPVectorizer] Replace VL[0] to VL0 with assert, add propagateIRFlags extra parameter VL0,
replace E->Scalars[0] to VL0, NFCI.

llvm-svn: 310904
2017-08-15 00:31:49 +00:00
Dehao Chen
21b59fd2ae Add missing dependency in ICP. (NFC)
llvm-svn: 310896
2017-08-14 23:25:21 +00:00
Reid Kleckner
4947e7accc Remove checks for debug info intrinsics in use lists, NFC
These haven't done anything since debug info intrinsics stopped
appearing in Value use lists in 2014.

llvm-svn: 310892
2017-08-14 22:10:54 +00:00
Craig Topper
f902ca84e9 Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify"
This recommits r310869, with the moved files and no extra changes.

Original commit message:

This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.

I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.

I also had to make decomposeBitTest support vectors since InstSimplify needs that.

As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.

Differential Revision: https://reviews.llvm.org/D36593

llvm-svn: 310889
2017-08-14 21:39:51 +00:00
Andrew Kaylor
76e7a73bbb Add strictfp attribute to prevent unwanted optimizations of libm calls
Differential Revision: https://reviews.llvm.org/D34163

llvm-svn: 310885
2017-08-14 21:15:13 +00:00
Craig Topper
d159c6915c Revert r310869 "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify"
Failed to add the two files that moved. And then added an extra change I didn't mean to while trying to fix that. Reverting everything.

llvm-svn: 310873
2017-08-14 19:09:32 +00:00
Craig Topper
d0eb871eb7 [InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify
This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.

I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.

I also had to make decomposeBitTest support vectors since InstSimplify needs that.

As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.

Differential Revision: https://reviews.llvm.org/D36593

llvm-svn: 310869
2017-08-14 18:49:42 +00:00
Dinar Temirbulatov
17df27f960 [SLPVectorizer] Schedule bundle with different opcodes.
This change let us schedule a bundle with different opcodes in it, for example : [ load, add, add, add ]

Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D36518

llvm-svn: 310847
2017-08-14 15:40:16 +00:00
Sanjay Patel
c34d06472c [BDCE] reduce scope of an assert (PR34179)
The assert was added with r310779 and is usually correct,
but as the test shows, not always. The 'volatile' on the
load is needed to expose the faulty path because without
it, DemandedBits would return that the load is just dead
rather than not demanded, and so we wouldn't hit the
bogus assert.

Also, since the lambda is just a single-line now, get rid 
of it and inline the DB.isAllOnesValue() calls. 

This should fix (prevent execution of a faulty assert):
https://bugs.llvm.org/show_bug.cgi?id=34179

llvm-svn: 310842
2017-08-14 15:13:46 +00:00
Sam Parker
471134db57 [LoopUnroll] Enable option to peel remainder loop
On some targets, the penalty of executing runtime unrolling checks
and then not the unrolled loop can be significantly detrimental to
performance. This results in the need to be more conservative with
the unroll count, keeping a trip count of 2 reduces the overhead as
well as increasing the chance of the unrolled body being executed. But
being conservative leaves performance gains on the table.

This patch enables the unrolling of the remainder loop introduced by
runtime unrolling. This can help reduce the overhead of misunrolled
loops because the cost of non-taken branches is much less than the
cost of the backedge that would normally be executed in the remainder
loop. This allows larger unroll factors to be used without suffering
performance loses with smaller iteration counts.

Differential Revision: https://reviews.llvm.org/D36309

llvm-svn: 310824
2017-08-14 09:25:26 +00:00
Craig Topper
ae34cf0b5a [InstCombine] Simplify and inline FoldOrWithConstants/FoldXorWithConstants
Summary:
These functions were overly complicated. The body of this function was rechecking for an And operation to find the constant, but we already knew we were looking at two Ands ORed together and the pieces are in variables. We already had earlier nearby code that checked for ConstantInts. So just inline the remaining parts into the earlier code.

Next step is to use m_APInt instead of ConstantInt.

Reviewers: spatel, efriedma, davide, majnemer

Reviewed By: spatel

Subscribers: zzheng, llvm-commits

Differential Revision: https://reviews.llvm.org/D36439

llvm-svn: 310806
2017-08-14 00:04:21 +00:00
Sanjay Patel
9a31ece394 [BDCE] clear poison generators after turning a value into zero (PR33695, PR34037)
nsw, nuw, and exact carry implicit assumptions about their operands, so we need
to clear those after trivializing a value. We decided there was no danger for
llvm.assume or metadata, so there's just a comment about that.

This fixes miscompiles as shown in:
https://bugs.llvm.org/show_bug.cgi?id=33695
https://bugs.llvm.org/show_bug.cgi?id=34037

Differential Revision: https://reviews.llvm.org/D36592

llvm-svn: 310779
2017-08-12 16:41:08 +00:00
Eli Friedman
28e7964c1c [OptDiag] Updating Remarks in SampleProfile
Updating remark API to newer OptimizationDiagnosticInfo API. This
allows remarks to show up in diagnostic yaml file, and enables use
of opt-viewer tool.

Hotness information for remarks (L505 and L751) do not display hotness
information, most likely due to profile information not being
propagated yet. Unsure if this is the desired outcome.

Patch by Tarun Rajendran.

Differential Revision: https://reviews.llvm.org/D36127

llvm-svn: 310763
2017-08-11 21:12:04 +00:00
Xinliang David Li
8162f856f5 Fix typo /NFC
llvm-svn: 310737
2017-08-11 17:49:20 +00:00
Craig Topper
f441c2e9e1 [InstCombine] Make (X|C1)^C2 -> X^(C1^C2) iff X&~C1 == 0 work for splat vectors
This also corrects the description to match what was actually implemented. The old comment said X^(C1|C2), but it implemented X^((C1|C2)&~(C1&C2)). I believe ((C1|C2)&~(C1&C2)) is equivalent to (C1^C2).

Differential Revision: https://reviews.llvm.org/D36505

llvm-svn: 310658
2017-08-10 20:35:34 +00:00
Craig Topper
d294d8b0ac [InstCombine] Fix a crash in getSelectCondition if we happen to have two inverse vectors of i1 constants.
We used to try to truncate the constant vector to vXi1, but if it's already i1 this would fail. Instead we now use IRBuilder::getZExtOrTrunc which should check the type and only create a trunc if needed. I believe this should trigger constant folding in the IRBuilder and ultimately do the same thing just with the additional type check.

llvm-svn: 310639
2017-08-10 17:48:14 +00:00
Craig Topper
3a6cf94f85 [InstCombine] Add a DEBUG_COUNTER to InstCombine to limit how many instructions are visited for debug
Sometimes it would be nice to stop InstCombine mid way through its combining to see the current IR. By using a debug counter we can place an upper limit on how many instructions to process.

This will also allow skipping the first X combines, but that has the potential to change later combines since earlier canonicalizations might have been skipped.

Differential Revision: https://reviews.llvm.org/D36553

llvm-svn: 310638
2017-08-10 17:48:12 +00:00
Craig Topper
2d53908029 [DebugCounter] Move the semicolon out of the DEBUG_COUNTER macro and require it to be placed at the end of each use.
This make it consistent with STATISTIC which it will often appears near.

While there move one DEBUG_COUNTER instance out of an anonymous namespace. It's already declaring a static variable so the namespace is unnecessary.

llvm-svn: 310637
2017-08-10 17:48:11 +00:00
Alexander Potapenko
cbad8b88a2 [sanitizer-coverage] Change cmp instrumentation to distinguish const operands
This implementation of SanitizerCoverage instrumentation inserts different
callbacks depending on constantness of operands:

  1. If both operands are non-const, then a usual
     __sanitizer_cov_trace_cmp[1248] call is inserted.
  2. If exactly one operand is const, then a
     __sanitizer_cov_trace_const_cmp[1248] call is inserted. The first
     argument of the call is always the constant one.
  3. If both operands are const, then no callback is inserted.

This separation comes useful in fuzzing when tasks like "find one operand
of the comparison in input arguments and replace it with the other one"
have to be done. The new instrumentation allows us to not waste time on
searching the constant operands in the input.

Patch by Victor Chibotaru.

llvm-svn: 310600
2017-08-10 15:00:13 +00:00
Chad Rosier
cefc8ece4f [NewGVN] Add CL option to control the generation of phi-of-ops (disable by default).
Differential Revision: https://reviews.llvm.org/D36478539

llvm-svn: 310594
2017-08-10 14:12:57 +00:00
Sanjay Patel
4e5284fb6d [InstCombine] narrow rotate left/right patterns to eliminate zext/trunc (PR34046)
I couldn't find any smaller folds to help the cases in:
https://bugs.llvm.org/show_bug.cgi?id=34046
after:
rL310141

The truncated rotate-by-variable patterns elude all of the existing transforms because 
of multiple uses and knowledge about demanded bits and knownbits that doesn't exist 
without the whole pattern. So we need an unfortunately large pattern match. But by 
simplifying this pattern in IR, the backend is already able to generate 
rolb/rolw/rorb/rorw for x86 using its existing rotate matching logic (although
there is a likely extraneous 'and' of the rotate amount). 

Note that rotate-by-constant doesn't have this problem - smaller folds should already 
produce the narrow IR ops.

Differential Revision: https://reviews.llvm.org/D36395

llvm-svn: 310509
2017-08-09 18:37:41 +00:00
Matt Morehouse
8235f0ba3e [asan] Fix instruction emission ordering with dynamic shadow.
Summary:
Instrumentation to copy byval arguments is now correctly inserted
after the dynamic shadow base is loaded.

Reviewers: vitalybuka, eugenis

Reviewed By: vitalybuka

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D36533

llvm-svn: 310503
2017-08-09 17:59:43 +00:00
Jonas Paulsson
54a000e514 [LSR / TTI / SystemZ] Eliminate TargetTransformInfo::isFoldableMemAccess()
isLegalAddressingMode() has recently gained the extra optional Instruction*
parameter, and therefore it can now do the job that previously only
isFoldableMemAccess() could do.

The SystemZ implementation of isLegalAddressingMode() has gained the
functionality of checking for offsets, which used to be done with
isFoldableMemAccess().

The isFoldableMemAccess() hook has been removed everywhere.

Review: Quentin Colombet, Ulrich Weigand
https://reviews.llvm.org/D35933

llvm-svn: 310463
2017-08-09 11:28:01 +00:00
Jonas Paulsson
b5dbc00a27 [LoopStrengthReduce] Don't neglect the Fixup.Offset in isAMCompletelyFolded().
In the recursive call to isAMCompletelyFolded(), the passed offset should be
the sum of F.BaseOffset and Fixup.Offset.

Review: Quentin Colombet.
llvm-svn: 310462
2017-08-09 11:27:46 +00:00
Davide Italiano
0dad68b666 [GlobalOpt] Switch an explicit loop to llvm::all_of(). NFCI.
llvm-svn: 310453
2017-08-09 09:23:29 +00:00
Craig Topper
837c0b5c9e [InstCombine] Use regular dyn_cast instead of a matcher for a simple case. NFC
llvm-svn: 310446
2017-08-09 06:17:48 +00:00
Wei Mi
6484ea55e4 [GVN] Remove stale entries in phitranslate cache when new phi is generated for PRE
When a new phi is generated for scalarpre of an expression, the phiTranslate cache
will become stale: Before PRE, the candidate expression must not be available in a
predecessor block, and phitranslate will cache the information. After PRE, the
expression will become available in all predecessor blocks, so the related entries
in phiTranslate cache becomes stale. The patch will simply remove the stale entries
so phiTranslate can be recomputed next time.

The stale entries in phitranslate cache will not affect correctness but will cause
missing PRE opportunity for later instructions.

Differential Revision: https://reviews.llvm.org/D36124

llvm-svn: 310421
2017-08-08 21:40:14 +00:00
Dehao Chen
a8aee54f27 Make ICP uses PSI to check for hotness.
Summary: Currently, ICP checks the count against a fixed value to see if it is hot enough to be promoted. This does not work for SamplePGO because sampled count may be much smaller. This patch uses PSI to check if the count is hot enough to be promoted.

Reviewers: davidxl, tejohnson, eraman

Reviewed By: davidxl

Subscribers: sanjoy, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D36341

llvm-svn: 310416
2017-08-08 20:57:33 +00:00
Craig Topper
76bcda5719 [InstCombine] Support pulling left shifts through a subtract with constant LHS
We already support pulling through an add with constant RHS. We can do the same for subtract.

Differential Revision: https://reviews.llvm.org/D36443

llvm-svn: 310407
2017-08-08 20:14:11 +00:00
Chad Rosier
98b1b144a6 [NewGVN] Use a cast instead of a dyn_cast.
Differential Revision: https://reviews.llvm.org/D36478

llvm-svn: 310397
2017-08-08 18:41:49 +00:00
Anna Thomas
ce71877c5e [LoopVectorize] Fix assertion failure in Fcmp vectorization
Summary:
When vectorizing fcmps we can trip on incorrect cast assertion when setting the
FastMathFlags after generating the vectorized FCmp.
This can happen if the FCmp can be folded to true or false directly. The fix
here is to set the FastMathFlag using the FastMathFlagBuilder *before* creating
the FCmp Instruction. This is what's done by other optimizations such as
InstCombine.
Added a test case which trips on cast assertion without this patch.

Reviewers: Ayal, mssimpso, mkuper, gilr

Reviewed by: Ayal, mssimpso

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D36244

llvm-svn: 310389
2017-08-08 18:07:44 +00:00
Craig Topper
3fbd17d46d [InstCombine] Cast to BinaryOperator earlier in foldSelectIntoOp to simplify the code.
We no longer need the explicit operand count check or the later dynamic cast.

llvm-svn: 310339
2017-08-08 06:19:24 +00:00
Chandler Carruth
14f567c031 [PM] Fix new LoopUnroll function pass by invalidating loop analysis
results when a loop is completely removed.

This is very hard to manifest as a visible bug. You need to arrange for
there to be a subsequent allocation of a 'Loop' object which gets the
exact same address as the one which the unroll deleted, and you need the
LoopAccessAnalysis results to be significant in the way that they're
stale. And you need a million other things to align.

But when it does, you get a deeply mysterious crash due to actually
finding a stale analysis result. This fixes the issue and tests for it
by directly checking we successfully invalidate things. I have not been
able to get *any* test case to reliably trigger this. Changes to LLVM
itself caused the only test case I ever had to cease to crash.

I've looked pretty extensively at less brittle ways of fixing this and
they are actually very, very hard to do. This is a somewhat strange and
unusual case as we have a pass which is deleting an IR unit, but is not
running within that IR unit's pass framework (which is what handles this
cleanly for the normal loop unroll). And where there isn't a definitive
way to clear *all* of the stale cache entries. And where the pass *is*
updating the core analysis that provides the IR units!

For example, we don't have any of these problems with Function analyses
because it is easy to clear out function analyses when the functions
themselves may have been deleted -- we clear an entire module's worth!
But that is too heavy of a hammer down here in the LoopAnalysisManager
layer.

A better long-term solution IMO is to require that AnalysisManager's
make their keys durable to this kind of thing. Specifically, when
caching an analysis for one IR unit that is conceptually "owned" by
a higher level IR unit, the AnalysisManager should incorporate this into
its data structures so that we can reliably clear these results without
having to teach each and every pass to do so manually as we do here. But
that is a change for another day as it will be a fairly invasive change
to the AnalysisManager infrastructure. Until then, this fortunately
seems to be quite rare.

llvm-svn: 310333
2017-08-08 02:24:20 +00:00
Evgeny Stupachenko
8c60bbb9d2 Reapply fix PR23384 (part 3 of 3) r304824 (was reverted in r305720).
The root cause of reverting was fixed - PR33514.

Summary:
The patch makes instruction count the highest priority for
 LSR solution for X86 (previously registers had highest priority).

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D30562

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>
llvm-svn: 310289
2017-08-07 19:56:34 +00:00
Aaron Ballman
7608dca219 Removing an unused variable that was missed with the refactoring in r310272; NFC.
llvm-svn: 310285
2017-08-07 19:26:17 +00:00
Craig Topper
e851864d61 [InstCombine] Support (X | C1) & C2 --> (X & C2^(C1&C2)) | (C1&C2) for vector splats
Note the original code I deleted incorrectly listed this as (X | C1) & C2 --> (X & C2^(C1&C2)) | C1 Which is only valid if C1 is a subset of C2. This relied on SimplifyDemandedBits to remove any extra bits from C1 before we got to that code.

My new implementation avoids relying on that behavior so that it can be naively verified with alive.

Differential Revision: https://reviews.llvm.org/D36384

llvm-svn: 310272
2017-08-07 18:10:39 +00:00
Alexey Bataev
9f19665894 [SLP] General improvements of SLP vectorization process.
Patch tries to improve two-pass vectorization analysis, existing in SLP vectorizer. What it does:

1. Defines key nodes, that are the vectorization roots. Previously vectorization started if StoreInst or ReturnInst is found. For now, the vectorization started for all Instructions with no users and void types (Terminators, StoreInst) + CallInsts.
2. CmpInsts, InsertElementInsts and InsertValueInsts are stored in the
array. This array is processed only after the vectorization of the
first-after-these instructions key node is finished. Vectorization goes
in reverse order to try to vectorize as much code as possible.

Reviewers: mzolotukhin, Ayal, mkuper, gilr, hfinkel, RKSimon

Subscribers: ashahid, anemet, RKSimon, mssimpso, llvm-commits

Differential Revision: https://reviews.llvm.org/D29826

llvm-svn: 310260
2017-08-07 15:25:49 +00:00
Alexey Bataev
92afaf2479 Revert "[SLP] General improvements of SLP vectorization process."
This reverts commit r310255.

llvm-svn: 310257
2017-08-07 14:51:52 +00:00
Alexey Bataev
ec62fc0fc9 [SLP] General improvements of SLP vectorization process.
Summary:
Patch tries to improve two-pass vectorization analysis, existing in SLP vectorizer. What it does:
1. Defines key nodes, that are the vectorization roots. Previously vectorization started if StoreInst or ReturnInst is found. For now, the vectorization started for all Instructions with no users and void types (Terminators, StoreInst) + CallInsts.
2. CmpInsts, InsertElementInsts and InsertValueInsts are stored in the array. This array is processed only after the vectorization of the first-after-these instructions key node is finished. Vectorization goes in reverse order to try to vectorize as much code as possible.

Reviewers: mzolotukhin, Ayal, mkuper, gilr, hfinkel, RKSimon

Subscribers: ashahid, anemet, RKSimon, mssimpso, llvm-commits

Differential Revision: https://reviews.llvm.org/D29826

llvm-svn: 310255
2017-08-07 14:03:17 +00:00
Vitaly Buka
cb2b0513d2 [asan] Fix asan dynamic shadow check before copyArgsPassedByValToAllocas
llvm-svn: 310242
2017-08-07 07:35:33 +00:00
Vitaly Buka
de5aa1767f [asan] Disable checking of arguments passed by value for --asan-force-dynamic-shadow
Fails with "Instruction does not dominate all uses!"

llvm-svn: 310241
2017-08-07 07:12:34 +00:00
Davide Italiano
1031f1aec8 [Reassociate] Use a range loop for clarity. NFCI.
While here, rename `i` to `Rank` as the latter is more
self-explanatory (and this code also uses `I` two lines below to
identify an Instruction).

llvm-svn: 310238
2017-08-07 01:57:21 +00:00
Davide Italiano
aeb942e819 [Reassociate] Try to bail out early when canonicalizing.
This commit rearranges the checks to avoid calls to getRank()
when not needed (e.g. when RHS == LHS).

llvm-svn: 310237
2017-08-07 01:49:09 +00:00
Craig Topper
7792eb189a [InstCombine] Remove shift handling from OptAndOp.
Summary: This is all handled by SimplifyDemandedBits.

Reviewers: spatel, davide

Reviewed By: davide

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D36382

llvm-svn: 310234
2017-08-06 23:30:49 +00:00
Craig Topper
5ce12bbc0a [InstCombine] Support (X ^ C1) & C2 --> (X & C2) ^ (C1&C2) for vector splats.
llvm-svn: 310233
2017-08-06 23:11:49 +00:00
Craig Topper
926fe5d0ed [InstCombine] Support '(C - X) ^ signmask -> (C + signmask - X)' and '(X + C) ^ signmask -> (X + C + signmask)' for vector splats.
llvm-svn: 310232
2017-08-06 22:17:21 +00:00
Craig Topper
d5cf04da1c [InstCombine] Support ~(c-X) --> X+(-c-1) and ~(X-c) --> (-c-1)-X for splat vectors.
llvm-svn: 310195
2017-08-06 06:28:41 +00:00
Craig Topper
f043cfbb9a [InstCombine] Fold (C - X) ^ signmask -> (C + signmask - X).
llvm-svn: 310186
2017-08-05 20:00:44 +00:00
Craig Topper
c19e9b9071 [InstCombine] Teach the code that pulls logical operators through constant shifts to handle vector splats too.
llvm-svn: 310185
2017-08-05 20:00:42 +00:00
Craig Topper
eb05f8fcb3 [InstCombine] Support vector splats in foldSelectICmpAnd.
Unfortunately, it looks like there's some other missed optimizations in the generated code for some of these cases. I'll try to look at some of those next.

llvm-svn: 310184
2017-08-05 20:00:41 +00:00
Dinar Temirbulatov
a7733cfe8c [SLPVectorizer] Add extra parameter to setInsertPointAfterBundle to handle different opcodes, NFCI.
Differential Revision: https://reviews.llvm.org/D35769

llvm-svn: 310183
2017-08-05 18:43:52 +00:00
Sanjay Patel
2eb16c0389 [InstCombine] refactor trunc(binop) transforms; NFCI
In addition to moving the shift transforms over, we may want to
detect too-wide rotate patterns here (PR34046). 

llvm-svn: 310181
2017-08-05 15:19:18 +00:00
Craig Topper
0f832f8e82 [InstCombine] In foldSelectICmpAnd, if we need to to truncate from the 'and' type to the 'select' type, do it after shifting right instead of just bailing.
Previously we were always trying to emit the zext or truncate before any shift. This meant if the 'and' mask was larger than the size of the truncate we would skip the transformation.

Now we shift the result of the and right first leaving the bit within the range of the truncate.

This matches what we are doing in foldSelectICmpAndOr for the same problem.

llvm-svn: 310159
2017-08-05 01:45:17 +00:00
Sanjay Patel
23bd4a9364 [InstCombine] narrow truncated add/sub/mul with constant
Name: narrow_sub
  %sub = sub i32 C1, %x
  %r = trunc i32 %sub to i8
  =>  
  %xn = trunc i32 %x to i8
  %narrowC = trunc i32 C1 to i8
  %r = sub i8 %narrowC, %xn
 
Name: narrow_add
  %add = add i32 %x, C1
  %r = trunc i32 %add to i8
  =>  
  %xn = trunc i32 %x to i8
  %narrowC = trunc i32 C1 to i8
  %r = add i8 %xn, %narrowC
  
Name: narrow_mul
  %mul = mul i32 %x, C1
  %r = trunc i32 %mul to i8
  =>  
  %xn = trunc i32 %x to i8
  %narrowC = trunc i32 C1 to i8
  %r = mul i8 %xn, %narrowC


http://rise4fun.com/Alive/QpS

This doesn't solve PR34046 (failure to recognize rotate):
https://bugs.llvm.org/show_bug.cgi?id=34046
...but it reduces an extra complication in the description examples 
to a form that we can more easily match.

llvm-svn: 310141
2017-08-04 22:30:34 +00:00
Nico Weber
1ee42ff794 Revert r310055, it caused PR34074.
llvm-svn: 310123
2017-08-04 20:40:38 +00:00
Evgeny Stupachenko
2be9c5c55b Fix PR33514
Summary:
The bug was uncovered after fix of  PR23384 (part 3 of 3).
The patch restricts pointer multiplication in SCEV computaion for ICmpZero.

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D36170

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>
llvm-svn: 310092
2017-08-04 18:46:13 +00:00
Reid Kleckner
e29583d0d2 [ArgPromotion] Preserve alignment of byval argument in new alloca
The frontend may have requested a higher alignment for any reason, and
downstream optimizations may already have taken advantage of it.  We
should keep the same alignment when moving the allocation from the
parameter area to the local variable area.

Fixes PR34038

llvm-svn: 310071
2017-08-04 17:09:11 +00:00
Benjamin Kramer
423f2d0b11 [InstCombine] Fold single-use variable into assert.
Avoids unused variable warnings in Release builds. No functional change.

llvm-svn: 310064
2017-08-04 16:08:41 +00:00
Craig Topper
bcfbcd1b18 [InstCombine] Remove the (not (sext)) case from foldBoolSextMaskToSelect and inline the remaining code to match visitOr
Summary:
The (not (sext)) case is really (xor (sext), -1) which should have been simplified to (sext (xor, 1)) before we got here. So we shouldn't need to handle it.

With that taken care of we only need to two cases so don't need the swap anymore. This makes us in sync with the equivalent code in visitOr so inline this to match.

Reviewers: spatel, eli.friedman, majnemer

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36240

llvm-svn: 310063
2017-08-04 16:07:20 +00:00
Craig Topper
fe1a0c7e67 [InstCombine] Use ConstantInt::getFalse to reduce some code. NFC
llvm-svn: 310062
2017-08-04 16:07:18 +00:00
Sanjay Patel
0e224c7256 [InstCombine] narrow lshr with constant
Name: narrow_shift
Pre: C1 < 8
%zx = zext i8 %x to i32
%l = lshr i32 %zx, C1
  =>  
%narrowC = trunc i32 C1 to i8
%ns = lshr i8 %x, %narrowC
%l = zext i8 %ns to i32

http://rise4fun.com/Alive/jIV

This isn't directly applicable to PR34046 as written, but we
need to have more narrowing folds like this to be sure that
rotate patterns are recognized.

llvm-svn: 310060
2017-08-04 15:42:47 +00:00
Filipe Cabecinhas
4058d9a4ca [DSE] Merge stores when the later store only writes to memory locations the early store also wrote to.
Summary:
This fixes PR31777.

If both stores' values are ConstantInt, we merge the two stores
(shifting the smaller store appropriately) and replace the earlier (and
larger) store with an updated constant.

In the future we should also support vectors of integers. And maybe
float/double if we can.

Reviewers: hfinkel, junbuml, jfb, RKSimon, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30703

llvm-svn: 310055
2017-08-04 12:28:36 +00:00
Nikolai Bozhenov
899aec6301 [InstCombine] Canonicalize clamp of float types to minmax in fast mode.
Summary:
This commit allows matchSelectPattern to recognize clamp of float
arguments in the presence of FMF the same way as already done for
integers.

This case is a little different though. With integers, given the
min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX
"automatically". That is not the case for float, because for them only
full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care
about NaNs. On the other hand, some backends (e.g. X86) have only
FMIN/FMAX nodes that do not care about NaNS and the former NAN/NUM
nodes are illegal thus selection is not happening. So I decided to do
such kind of transformation in IR (InstCombiner) instead of
complicating the logic in the backend.

Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper

Reviewed By: efriedma

Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits

Patch by Andrei Elovikov <andrei.elovikov@intel.com>

Differential Revision: https://reviews.llvm.org/D33186

llvm-svn: 310054
2017-08-04 12:22:17 +00:00
Max Kazantsev
9209a4521a Do not declare a variable which is used only in assert. NFC
llvm-svn: 310034
2017-08-04 07:41:24 +00:00
Max Kazantsev
ffa0782bec [IRCE] Handle loops with step different from 1/-1
This patch generalizes IRCE to handle IV steps that are not equal to 1 or -1.

Differential Revision: https://reviews.llvm.org/D35539

llvm-svn: 310032
2017-08-04 07:01:04 +00:00
Max Kazantsev
1aed4a4dfc [IRCE] Recognize loops with unsigned latch conditions
This patch enables recognition of loops with ult/ugt latch conditions.

Differential Revision: https://reviews.llvm.org/D35302

llvm-svn: 310027
2017-08-04 05:40:20 +00:00
Craig Topper
14179d8998 [InstCombine] Move the call to foldSelectICmpAnd into foldSelectInstWithICmp. NFCI
llvm-svn: 310025
2017-08-04 05:12:37 +00:00
Craig Topper
0f297d09b1 [InstCombine] Remove unnecessary casts. NFC
We're calling an overload of getOpcode that already returns Instruction::CastOps.

llvm-svn: 310024
2017-08-04 05:12:35 +00:00
Victor Leschuk
eabd98601c Un-revert r310014: false revert, it wasn't the cause of build break
llvm-svn: 310021
2017-08-04 04:51:15 +00:00
Victor Leschuk
fe0e5c87b4 Revert r310014 as it breaks build lld-x86_64-darwin13
llvm-svn: 310020
2017-08-04 04:43:54 +00:00
Adrian Prantl
d3acfe5504 Teach GlobalSRA to update the debug info for split-up globals.
This is similar to what we are doing in "regular" SROA and creates
DW_OP_LLVM_fragment operations to describe the resulting variables.

rdar://problem/33654891

llvm-svn: 310014
2017-08-04 01:19:54 +00:00
Teresa Johnson
cde6934bb7 Use profile summary to disable peeling for huge working sets
Summary:
Detect when the working set size of a profiled application is huge,
by comparing the number of counts required to reach the hot percentile
in the profile summary to a large threshold*.

When the working set size is determined to be huge, disable peeling
to avoid bloating the working set further.

*Note that the selected threshold (15K) is significantly larger than the
largest working set value in SPEC cpu2006 (which is gcc at around 11K).

Reviewers: davidxl

Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36288

llvm-svn: 310005
2017-08-03 23:42:58 +00:00
Davide Italiano
6806b06bf7 [NewGVN] Fix the case where we have a phi-of-ops which goes away.
Patch by Daniel Berlin, fixes PR33196 (and probably something else).

llvm-svn: 309988
2017-08-03 21:17:49 +00:00
Teresa Johnson
ffba812867 Disable loop peeling during full unrolling pass.
Summary:
Peeling should not occur during the full unrolling invocation early
in the pipeline, but rather later with partial and runtime loop
unrolling. The later loop unrolling invocation will also eventually
utilize profile summary and branch frequency information, which
we would like to use to control peeling. And for ThinLTO we want
to delay peeling until the backend (post thin link) phase, just as
we do for most types of unrolling.

Ensure peeling doesn't occur during the full unrolling invocation
by adding a parameter to the shared implementation function, similar
to the way partial and runtime loop unrolling are disabled.

Performance results for ThinLTO suggest this has a neutral to positive
effect on some internal benchmarks.

Reviewers: chandlerc, davidxl

Subscribers: mzolotukhin, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D36258

llvm-svn: 309966
2017-08-03 17:52:38 +00:00
Sanjay Patel
ef97ac86ad [NewGVN] fix typos; NFC
llvm-svn: 309946
2017-08-03 15:18:27 +00:00
Ewan Crawford
ae5418b4ef [Cloning] Move distinct GlobalVariable debug info metadata in CloneModule
Duplicating the distinct Subprogram and CU metadata nodes seems like the incorrect thing to do in CloneModule for GlobalVariable debug info. As it results in the scope of the GlobalVariable DI no longer being consistent with the rest of the module, and the new CU is absent from llvm.dbg.cu.

Fixed by adding RF_MoveDistinctMDs to MapMetadata flags for GlobalVariables.

Current unit test IR after clone:
```
@gv = global i32 1, comdat($comdat), !dbg !0, !type !5

define private void @f() comdat($comdat) personality void ()* @persfn !dbg !14 {

!llvm.dbg.cu = !{!10}

!0 = !DIGlobalVariableExpression(var: !1)
!1 = distinct !DIGlobalVariable(name: "gv", linkageName: "gv", scope: !2, file: !3, line: 1, type: !9, isLocal: false, isDefinition: true)
!2 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !3, line: 4, type: !4, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: false, unit: !6, variables: !5)
!3 = !DIFile(filename: "filename.c", directory: "/file/dir/")
!4 = !DISubroutineType(types: !5)
!5 = !{}
!6 = distinct !DICompileUnit(language: DW_LANG_C99, file: !7, producer: "CloneModule", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !5, globals: !8)
!7 = !DIFile(filename: "filename.c", directory: "/file/dir")
!8 = !{!0}
!9 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
!10 = distinct !DICompileUnit(language: DW_LANG_C99, file: !7, producer: "CloneModule", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !5, globals: !11)
!11 = !{!12}
!12 = !DIGlobalVariableExpression(var: !13)
!13 = distinct !DIGlobalVariable(name: "gv", linkageName: "gv", scope: !14, file: !3, line: 1, type: !9, isLocal: false, isDefinition: true)
!14 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !3, line: 4, type: !4, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: false, unit: !10, variables: !5)
```

Patched IR after clone:
```
@gv = global i32 1, comdat($comdat), !dbg !0, !type !5

define private void @f() comdat($comdat) personality void ()* @persfn !dbg !2 {

!llvm.dbg.cu = !{!6}

!0 = !DIGlobalVariableExpression(var: !1)
!1 = distinct !DIGlobalVariable(name: "gv", linkageName: "gv", scope: !2, file: !3, line: 1, type: !9, isLocal: false, isDefinition: true)
!2 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !3, line: 4, type: !4, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: false, unit: !6, variables: !5)
!3 = !DIFile(filename: "filename.c", directory: "/file/dir/")
!4 = !DISubroutineType(types: !5)
!5 = !{}
!6 = distinct !DICompileUnit(language: DW_LANG_C99, file: !7, producer: "CloneModule", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !5, globals: !8)
!7 = !DIFile(filename: "filename.c", directory: "/file/dir")
!8 = !{!0}
!9 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
```

Reviewers: aprantl, probinson, dblaikie, echristo, loladiro
Reviewed By: aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36082

llvm-svn: 309928
2017-08-03 09:23:03 +00:00
Matt Arsenault
3207fbe899 LV: Don't insert runtime ptr checks on divergent targets
llvm-svn: 309890
2017-08-02 21:43:08 +00:00
Craig Topper
e87eb175b2 [InstCombine] Remove unnecessary temporary APInt. NFCI
llvm-svn: 309887
2017-08-02 21:05:40 +00:00
Teresa Johnson
f4af38fa5f [PM] Split LoopUnrollPass and make partial unroller a function pass
Summary:
This is largely NFC*, in preparation for utilizing ProfileSummaryInfo
and BranchFrequencyInfo analyses. In this patch I am only doing the
splitting for the New PM, but I can do the same for the legacy PM as
a follow-on if this looks good.

*Not NFC since for partial unrolling we lose the updates done to the
loop traversal (adding new sibling and child loops) - according to
Chandler this is not very useful for partial unrolling, but it also
means that the debugging flag -unroll-revisit-child-loops no longer
works for partial unrolling.

Reviewers: chandlerc

Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36157

llvm-svn: 309886
2017-08-02 20:35:29 +00:00
Craig Topper
d4646fd843 [InstCombine] Remove explicit code for folding (xor(zext(cmp)), 1) and (xor(sext(cmp)), -1) to ext(!cmp).
As far as I can tell this should be handled by foldCastedBitwiseLogic which is called later in visitXor.

Differential Revision: https://reviews.llvm.org/D36214

llvm-svn: 309882
2017-08-02 20:30:27 +00:00
Craig Topper
b8ffa0a173 [InstCombine] Support sext in foldLogicCastConstant
This adds support for sext in foldLogicCastConstant. This is a prerequisite for D36214.

Differential Revision: https://reviews.llvm.org/D36234

llvm-svn: 309880
2017-08-02 20:25:56 +00:00
Jakub Kuderski
8f78266b9f [Dominators] Teach LoopDeletion to use the new incremental API
Summary:
This patch makes LoopDeletion use the incremental DominatorTree API.

We modify LoopDeletion to perform the deletion in 5 steps:
1. Create a new dummy edge from the preheader to the exit, by adding a conditional branch.
2. Inform the DomTree about the new edge.
3. Remove the conditional branch and replace it with an unconditional edge to the exit. This removes the edge to the loop header, making it unreachable.
4. Inform the DomTree about the deleted edge.
5. Remove the unreachable block from the function.

Creating the dummy conditional branch is necessary to perform incremental DomTree update.
We should consider using the batch updater when it's ready.

Reviewers: dberlin, davide, grosser, sanjoy

Reviewed By: dberlin, grosser

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35391

llvm-svn: 309850
2017-08-02 18:17:52 +00:00
Alexey Bataev
6c386e33ac [SLPVectorizer] Generalize interface of functions, NFC.
llvm-svn: 309816
2017-08-02 14:38:07 +00:00
Alexey Bataev
abac065cef [SLP] Fix for PR31880: shuffle and vectorize repeated scalar ops on extracted elements
Summary:
Currently most of the time vectors of extractelement instructions are
treated as scalars that must be gathered into vectors. But in some
cases, like when we have extractelement instructions from single vector
with different constant indeces or from 2 vectors of the same size, we
can treat this operations as shuffle of a single vector or blending of 2
vectors.
```
define <2 x i8> @g(<2 x i8> %x, <2 x i8> %y) {
  %x0 = extractelement <2 x i8> %x, i32 0
  %y1 = extractelement <2 x i8> %y, i32 1
  %x0x0 = mul i8 %x0, %x0
  %y1y1 = mul i8 %y1, %y1
  %ins1 = insertelement <2 x i8> undef, i8 %x0x0, i32 0
  %ins2 = insertelement <2 x i8> %ins1, i8 %y1y1, i32 1
  ret <2 x i8> %ins2
}
```
can be converted to something like
```
define <2 x i8> @g(<2 x i8> %x, <2 x i8> %y) {
  %1 = shufflevector <2 x i8> %x, <2 x i8> %y, <2 x i32> <i32 0, i32 3>
  %2 = mul <2 x i8> %1, %1
  ret <2 x i8> %2
}
```
Currently this type of conversion is considered as high cost
transformation.

Reviewers: mzolotukhin, delena, mkuper, hfinkel, RKSimon

Subscribers: ashahid, RKSimon, spatel, llvm-commits

Differential Revision: https://reviews.llvm.org/D30200

llvm-svn: 309812
2017-08-02 13:25:26 +00:00
Davide Italiano
205a09d9d6 [NewGVN] Fold single-use variables. NFCI.
llvm-svn: 309790
2017-08-02 04:05:49 +00:00
Davide Italiano
b829a675a1 [NewGVN] Remove a (now stale) comment. NFCI.
llvm-svn: 309789
2017-08-02 03:51:40 +00:00
Craig Topper
1142dd4c6d [SimplifyCFG] Fix typo in comment. NFC
llvm-svn: 309785
2017-08-02 02:34:16 +00:00
Chandler Carruth
eb7a769c09 [PM] Fix a bug where through CGSCC iteration we can get
infinite-inlining across multiple runs of the inliner by keeping a tiny
history of internal-to-SCC inlining decisions.

This is still a bit gross, but I don't yet have any fundamentally better
ideas and numerous people are blocked on this to use new PM and ThinLTO
together.

The core of the idea is to detect when we are about to do an inline that
has a chance of re-splitting an SCC which we have split before with
a similar inlining step. That is a critical component in the inlining
forming a cycle and so far detects all of the various cyclic patterns
I can come up with as well as the original real-world test case (which
comes from a ThinLTO build of libunwind).

I've added some tests that I think really demonstrate what is going on
here. They are essentially state machines that march the inliner through
various steps of a cycle and check that we stop when the cycle is closed
and that we actually did do inlining to form that cycle.

A lot of thanks go to Eric Christopher and Sanjoy Das for the help
understanding this issue and improving the test cases.

The biggest "yuck" here is the layering issue -- the CGSCC pass manager
is providing somewhat magical state to the inliner for it to use to make
itself converge. This isn't great, but I don't honestly have a lot of
better ideas yet and at least seems nicely isolated.

I have tested this patch, and it doesn't block *any* inlining on the
entire LLVM test suite and SPEC, so it seems sufficiently narrowly
targeted to the issue at hand.

We have come up with hypothetical issues that this patch doesn't cover,
but so far none of them are practical and we don't have a viable
solution yet that covers the hypothetical stuff, so proceeding here in
the interim. Definitely an area that we will be back and revisiting in
the future.

Differential Revision: https://reviews.llvm.org/D36188

llvm-svn: 309784
2017-08-02 02:09:22 +00:00
Chad Rosier
e36216c004 [Value Tracking] Default argument to true and rename accordingly. NFC.
IMHO this is a bit more readable.

llvm-svn: 309739
2017-08-01 20:18:54 +00:00
Craig Topper
83316a77e5 [InstCombine] Remove explicit check for impossible condition. Replace with assert
Summary:
As far as I can tell the earlier call getLimitedValue will guaranteed ShiftAmt is saturated to BitWidth-1 preventing it from ever being equal or greater than BitWidth.

At one point in the past the getLimitedValue call was only passed BitWidth not BitWidth - 1. This would have allowed the equality case to get here. And in fact this check was initially added as just BitWidth == ShiftAmt, but was changed shortly after to include > which should have never been possible.

Reviewers: spatel, majnemer, davide

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36123

llvm-svn: 309690
2017-08-01 15:10:25 +00:00
Max Kazantsev
f2bbb1101e [IRCE][NFC] Add another assert that AddRecExpr's step is not zero
One more assertion of this kind. It is a preparation step for generalizing
to the case of stride not equal to +1/-1.

llvm-svn: 309663
2017-08-01 06:49:29 +00:00
Max Kazantsev
3d9db69707 [IRCE][NFC] Add assert that AddRecExpr's step is not zero
We should never return zero steps, ensure this fact by adding
a sanity check when we are analyzing the induction variable.

llvm-svn: 309661
2017-08-01 06:27:51 +00:00
Davide Italiano
da86042768 [MetaRenamer] Leave @main alone.
To the best of my knowledge -metarenamer is used in two cases:
1) obfuscate names, when e.g. they contain informations that
can't be shared.
2) Improve clarity of the textual IR for testcases.

One of the usecases if getting the output of `opt` and passing it
to the lli interpreter to run the test. If metarenamer renames
@main, lli can't find an entry point.

llvm-svn: 309657
2017-08-01 05:14:45 +00:00
Alina Sbirlea
73dde02cf6 Default MemoryLocation passed to getModRefInfo should be None (D35441)
llvm-svn: 309645
2017-08-01 00:47:17 +00:00
Kostya Serebryany
a154c261b9 [sanitizer-coverage] relax an assertion
llvm-svn: 309644
2017-08-01 00:44:05 +00:00
Alina Sbirlea
7b373d280b Allow None as a MemoryLocation to getModRefInfo
Summary:
Adding part of the changes in D30369 (needed to make progress):
Current patch updates AliasAnalysis and MemoryLocation, but does _not_ clean up MemorySSA.

Original summary from D30369, by dberlin:
Currently, we have instructions which affect memory but have no memory
location. If you call, for example, MemoryLocation::get on a fence,
it asserts. This means things specifically have to avoid that. It
also means we end up with a copy of each API, one taking a memory
location, one not.

This starts to fix that.

We add MemoryLocation::getOrNone as a new call, and reimplement the
old asserting version in terms of it.

We make MemoryLocation optional in the (Instruction, MemoryLocation)
version of getModRefInfo, and kill the old one argument version in
favor of passing None (it had one caller). Now both can handle fences
because you can just use MemoryLocation::getOrNone on an instruction
and it will return a correct answer.

We use all this to clean up part of MemorySSA that had to handle this difference.

Note that literally every actual getModRefInfo interface we have could be made private and replaced with:

getModRefInfo(Instruction, Optional<MemoryLocation>)
and
getModRefInfo(Instruction, Optional<MemoryLocation>, Instruction, Optional<MemoryLocation>)

and delegating to the right ones, if we wanted to.

I have not attempted to do this yet.

Reviewers: dberlin, davide, dblaikie

Subscribers: sanjoy, hfinkel, chandlerc, llvm-commits

Differential Revision: https://reviews.llvm.org/D35441

llvm-svn: 309641
2017-08-01 00:28:29 +00:00
Sanjay Patel
9716dfe75b [InstCombine] allow mask hoisting transform for vector types
llvm-svn: 309627
2017-07-31 21:01:53 +00:00
Peter Collingbourne
da82874983 Update phi nodes in LowerTypeTests control flow simplification
D33925 added a control flow simplification for -O2 --lto-O0 builds that
manually splits blocks and reassigns conditional branches but does not
correctly update phi nodes. If the else case being branched to had
incoming phi nodes the control-flow simplification would leave phi nodes
in that BB with an unhandled predecessor.

Patch by Vlad Tsyrklevich!

Differential Revision: https://reviews.llvm.org/D36012

llvm-svn: 309621
2017-07-31 20:43:07 +00:00
Kostya Serebryany
434bcf9183 [sanitizer-coverage] don't instrument available_externally functions
llvm-svn: 309611
2017-07-31 20:00:22 +00:00
Kostya Serebryany
7923f1a0ff [sanitizer-coverage] ensure minimal alignment for coverage counters and guards
llvm-svn: 309610
2017-07-31 19:49:45 +00:00
Davide Italiano
3c9ab0171a [SLPVectorizer] Unbreak the build with -Werror.
GCC was complaining about `&&` within `||` without explicit
parentheses. NFCI.

llvm-svn: 309606
2017-07-31 19:14:19 +00:00
Craig Topper
d5daa59720 [X86][InstCombine] Add some simplifications for BZHI intrinsics
This intrinsic clears the upper bits starting at a specified index. If the index is a constant we can do some simplifications.

This could be in InstSimplify, but we don't handle any target specific intrinsics there today.

Differential Revision: https://reviews.llvm.org/D36069

llvm-svn: 309604
2017-07-31 18:52:15 +00:00
Craig Topper
80686037d0 [X86][InstCombine] Add basic simplification support for BEXTR/BEXTRI intrinsics.
This patch adds simplification support for the BEXTR/BEXTRI intrinsics to match gcc. This only supports cases that fold to 0 or can be fully constant folded. Theoretically we could support converting to AND if the shift part is unused or to only a shift if the mask doesn't modify any bits after an equivalent shl. gcc doesn't do these transformations either.

I put this in InstCombine, but it could be done in InstSimplify. It would be the first target specific intrinsic in InstSimplify.

Differential Revision: https://reviews.llvm.org/D36063

llvm-svn: 309603
2017-07-31 18:52:13 +00:00
David Majnemer
fb412dedc3 [IPSCCP] Guard a user of getInitializer with hasDefinitiveInitializer
We are not allowed to reason about an initializer value without first
consulting hasDefinitiveInitializer.

llvm-svn: 309594
2017-07-31 17:47:07 +00:00
Florian Hahn
f80450c0bb Extend ifdefs to more unused helper functions.
This fixes a buildbot failure with -Werror introduced by r309553

llvm-svn: 309572
2017-07-31 16:11:43 +00:00
Alexey Bataev
2aa430ee58 [SLP] Initial rework for min/max horizontal reduction vectorization, NFC.
Summary: All getReductionCost() functions are renamed to getArithmeticReductionCost() + added basic infrastructure to handle non-binary reduction operations.

Reviewers: spatel, mzolotukhin, Ayal, mkuper, gilr, hfinkel

Subscribers: RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D29402

llvm-svn: 309566
2017-07-31 14:36:05 +00:00
Alexey Bataev
55309303be [Cost] Rename getReductionCost() to getArithmeticReductionCost(), NFC.
llvm-svn: 309563
2017-07-31 14:19:32 +00:00
Ayal Zaks
dca1c68e5c [LV] Avoid redundant operations manipulating masks
The Loop Vectorizer generates redundant operations when manipulating masks:
AND with true, OR with false, compare equal to true. Instead of relying on
a subsequent pass to clean them up, this patch avoids generating them.

Use null (no-mask) to represent all-one full masks, instead of a constant
all-one vector, following the convention of masked gathers and scatters.

Preparing for a follow-up VPlan patch in which these mask manipulating
operations are modeled using recipes.

Differential Revision: https://reviews.llvm.org/D35725

llvm-svn: 309558
2017-07-31 13:21:42 +00:00
Florian Hahn
3d431b22ac Guard print() functions only used by dump() functions.
Summary:
Since  r293359, most dump() function are only defined when
`!defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)` holds. print() functions
only used by dump() functions are now unused in release builds,
generating lots of warnings. This patch only defines some print()
functions if they are used.

Reviewers: MatzeB

Reviewed By: MatzeB

Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D35949

llvm-svn: 309553
2017-07-31 10:07:49 +00:00
Florian Hahn
e48aac838f [LoopInterchange] Do not interchange loops with function calls.
Summary:
Without any information about the called function, we cannot be sure
that it is safe to interchange loops which contain function calls. For
example there could be dependences that prevent interchanging between
accesses in the called function and the loops. Even functions without any
parameters could cause problems, as they could access memory using
global pointers.

For now, I think it is only safe to interchange loops with calls marked
as readnone.

With this patch, the LLVM test suite passes with `-O3 -mllvm
-enable-loopinterchange` and LoopInterchangeProfitability::isProfitable
returning true for all loops. check-llvm and check-clang also pass when
bootstrapped in a similar fashion, although only 3 loops got
interchanged.

Reviewers: karthikthecool, blitz.opensource, hfinkel, mcrosier, mkuper

Reviewed By: mcrosier

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35489

llvm-svn: 309547
2017-07-31 09:00:52 +00:00
Sam Elliott
410ed659bc Migrate PGOMemOptSizeOpt to use new OptimizationRemarkEmitter Pass
Summary:
Fixes PR33790.

This patch still needs a yaml-style test, which I shall write tomorrow

Reviewers: anemet

Reviewed By: anemet

Subscribers: anemet, llvm-commits

Differential Revision: https://reviews.llvm.org/D35981

llvm-svn: 309497
2017-07-30 00:35:33 +00:00
Sumanth Gundapaneni
5ecc6dbfd3 [SimplifyCFG] Make the no-jump-tables attribute also disable switch lookup tables
Differential Revision: https://reviews.llvm.org/D35579

llvm-svn: 309444
2017-07-28 22:25:40 +00:00
Adrian Prantl
c83c29a7b7 Remove the obsolete offset parameter from @llvm.dbg.value
There is no situation where this rarely-used argument cannot be
substituted with a DIExpression and removing it allows us to simplify
the DWARF backend. Note that this patch does not yet remove any of
the newly dead code.

rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D35951

llvm-svn: 309426
2017-07-28 20:21:02 +00:00
Alexey Bataev
078766df94 [SLP] Allow vectorization of the instruction from the same basic blocks only, NFC.
Summary:
After some changes in SLP vectorizer we missed some additional checks to
limit the instructions for vectorization. We should not perform analysis
of the instructions if the parent of instruction is not the same as the
parent of the first instruction in the tree or it was analyzed already.

Subscribers: mzolotukhin

Differential Revision: https://reviews.llvm.org/D34881

llvm-svn: 309425
2017-07-28 20:11:16 +00:00
Wei Mi
d9cf09c389 [GVN] Recommit the patch "Add phi-translate support in scalarpre"
Recommit after workaround the bug PR31652.

Three bugs fixed in previous recommits: The first one is to use CurrentBlock
instead of PREInstr's Parent as param of performScalarPREInsertion because
the Parent of a clone instruction may be uninitialized. The second one is stop
PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst
is defined inside of CurrentBlock. The same value defined inside of loop in last
iteration can not be regarded as available. The third one is an out-of-bound
array access in a flipped if guard.

Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundent because a and b used by the last
"a * b" expr are both defined by phis.

long a[100], b[100], g1, g2, g3;
__attribute__((pure)) long goo();

void foo(long a, long b, long c, long d) {

  g1 = a * b;
  if (__builtin_expect(g2 > 3, 0)) {
    a = c;
    b = d;
    g2 = a * b;
  }
  g3 = a * b;      // fully redundant.

}

The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.

Differential Revision: https://reviews.llvm.org/D32252

llvm-svn: 309397
2017-07-28 15:47:25 +00:00
Davide Italiano
6e5954a153 [JumpThreading] Stop falsely preserving LazyValueInfo.
JumpThreading claims to preserve LVI, but it doesn't preserve
the analyses which LVI holds a reference to (e.g. the Dominator).
In the current pass manager infrastructure, after JT runs, the
PM frees these analyses (including DominatorTree) but preserves
LVI.

CorrelatedValuePropagation runs immediately after and queries
a corrupted domtree, causing weird miscompiles.

This commit disables the preservation of LVI for the time being.
Eventually, we should either move LVI to a proper dependency
tracking mechanism (i.e. an analyses shouldn't hold references
to other analyses and compute them on demand if needed), or
we should teach all the passes preserving LVI to preserve the
analyses LVI depends on.

The new pass manager has a mechanism to invalidate LVI in case
one of the analyses it depends on becomes invalid, so this problem
shouldn't exist (at least not in this immediate form), but handling
of analyses holding references is still a very delicate subject.

Fixes PR33917 (and rustc).

llvm-svn: 309355
2017-07-28 03:10:43 +00:00