1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
Commit Graph

8375 Commits

Author SHA1 Message Date
Teresa Johnson
f4af38fa5f [PM] Split LoopUnrollPass and make partial unroller a function pass
Summary:
This is largely NFC*, in preparation for utilizing ProfileSummaryInfo
and BranchFrequencyInfo analyses. In this patch I am only doing the
splitting for the New PM, but I can do the same for the legacy PM as
a follow-on if this looks good.

*Not NFC since for partial unrolling we lose the updates done to the
loop traversal (adding new sibling and child loops) - according to
Chandler this is not very useful for partial unrolling, but it also
means that the debugging flag -unroll-revisit-child-loops no longer
works for partial unrolling.

Reviewers: chandlerc

Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36157

llvm-svn: 309886
2017-08-02 20:35:29 +00:00
Jakub Kuderski
8f78266b9f [Dominators] Teach LoopDeletion to use the new incremental API
Summary:
This patch makes LoopDeletion use the incremental DominatorTree API.

We modify LoopDeletion to perform the deletion in 5 steps:
1. Create a new dummy edge from the preheader to the exit, by adding a conditional branch.
2. Inform the DomTree about the new edge.
3. Remove the conditional branch and replace it with an unconditional edge to the exit. This removes the edge to the loop header, making it unreachable.
4. Inform the DomTree about the deleted edge.
5. Remove the unreachable block from the function.

Creating the dummy conditional branch is necessary to perform incremental DomTree update.
We should consider using the batch updater when it's ready.

Reviewers: dberlin, davide, grosser, sanjoy

Reviewed By: dberlin, grosser

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35391

llvm-svn: 309850
2017-08-02 18:17:52 +00:00
Davide Italiano
205a09d9d6 [NewGVN] Fold single-use variables. NFCI.
llvm-svn: 309790
2017-08-02 04:05:49 +00:00
Davide Italiano
b829a675a1 [NewGVN] Remove a (now stale) comment. NFCI.
llvm-svn: 309789
2017-08-02 03:51:40 +00:00
Chad Rosier
e36216c004 [Value Tracking] Default argument to true and rename accordingly. NFC.
IMHO this is a bit more readable.

llvm-svn: 309739
2017-08-01 20:18:54 +00:00
Max Kazantsev
f2bbb1101e [IRCE][NFC] Add another assert that AddRecExpr's step is not zero
One more assertion of this kind. It is a preparation step for generalizing
to the case of stride not equal to +1/-1.

llvm-svn: 309663
2017-08-01 06:49:29 +00:00
Max Kazantsev
3d9db69707 [IRCE][NFC] Add assert that AddRecExpr's step is not zero
We should never return zero steps, ensure this fact by adding
a sanity check when we are analyzing the induction variable.

llvm-svn: 309661
2017-08-01 06:27:51 +00:00
Alina Sbirlea
73dde02cf6 Default MemoryLocation passed to getModRefInfo should be None (D35441)
llvm-svn: 309645
2017-08-01 00:47:17 +00:00
Alina Sbirlea
7b373d280b Allow None as a MemoryLocation to getModRefInfo
Summary:
Adding part of the changes in D30369 (needed to make progress):
Current patch updates AliasAnalysis and MemoryLocation, but does _not_ clean up MemorySSA.

Original summary from D30369, by dberlin:
Currently, we have instructions which affect memory but have no memory
location. If you call, for example, MemoryLocation::get on a fence,
it asserts. This means things specifically have to avoid that. It
also means we end up with a copy of each API, one taking a memory
location, one not.

This starts to fix that.

We add MemoryLocation::getOrNone as a new call, and reimplement the
old asserting version in terms of it.

We make MemoryLocation optional in the (Instruction, MemoryLocation)
version of getModRefInfo, and kill the old one argument version in
favor of passing None (it had one caller). Now both can handle fences
because you can just use MemoryLocation::getOrNone on an instruction
and it will return a correct answer.

We use all this to clean up part of MemorySSA that had to handle this difference.

Note that literally every actual getModRefInfo interface we have could be made private and replaced with:

getModRefInfo(Instruction, Optional<MemoryLocation>)
and
getModRefInfo(Instruction, Optional<MemoryLocation>, Instruction, Optional<MemoryLocation>)

and delegating to the right ones, if we wanted to.

I have not attempted to do this yet.

Reviewers: dberlin, davide, dblaikie

Subscribers: sanjoy, hfinkel, chandlerc, llvm-commits

Differential Revision: https://reviews.llvm.org/D35441

llvm-svn: 309641
2017-08-01 00:28:29 +00:00
David Majnemer
fb412dedc3 [IPSCCP] Guard a user of getInitializer with hasDefinitiveInitializer
We are not allowed to reason about an initializer value without first
consulting hasDefinitiveInitializer.

llvm-svn: 309594
2017-07-31 17:47:07 +00:00
Florian Hahn
f80450c0bb Extend ifdefs to more unused helper functions.
This fixes a buildbot failure with -Werror introduced by r309553

llvm-svn: 309572
2017-07-31 16:11:43 +00:00
Florian Hahn
3d431b22ac Guard print() functions only used by dump() functions.
Summary:
Since  r293359, most dump() function are only defined when
`!defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)` holds. print() functions
only used by dump() functions are now unused in release builds,
generating lots of warnings. This patch only defines some print()
functions if they are used.

Reviewers: MatzeB

Reviewed By: MatzeB

Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D35949

llvm-svn: 309553
2017-07-31 10:07:49 +00:00
Florian Hahn
e48aac838f [LoopInterchange] Do not interchange loops with function calls.
Summary:
Without any information about the called function, we cannot be sure
that it is safe to interchange loops which contain function calls. For
example there could be dependences that prevent interchanging between
accesses in the called function and the loops. Even functions without any
parameters could cause problems, as they could access memory using
global pointers.

For now, I think it is only safe to interchange loops with calls marked
as readnone.

With this patch, the LLVM test suite passes with `-O3 -mllvm
-enable-loopinterchange` and LoopInterchangeProfitability::isProfitable
returning true for all loops. check-llvm and check-clang also pass when
bootstrapped in a similar fashion, although only 3 loops got
interchanged.

Reviewers: karthikthecool, blitz.opensource, hfinkel, mcrosier, mkuper

Reviewed By: mcrosier

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35489

llvm-svn: 309547
2017-07-31 09:00:52 +00:00
Wei Mi
d9cf09c389 [GVN] Recommit the patch "Add phi-translate support in scalarpre"
Recommit after workaround the bug PR31652.

Three bugs fixed in previous recommits: The first one is to use CurrentBlock
instead of PREInstr's Parent as param of performScalarPREInsertion because
the Parent of a clone instruction may be uninitialized. The second one is stop
PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst
is defined inside of CurrentBlock. The same value defined inside of loop in last
iteration can not be regarded as available. The third one is an out-of-bound
array access in a flipped if guard.

Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundent because a and b used by the last
"a * b" expr are both defined by phis.

long a[100], b[100], g1, g2, g3;
__attribute__((pure)) long goo();

void foo(long a, long b, long c, long d) {

  g1 = a * b;
  if (__builtin_expect(g2 > 3, 0)) {
    a = c;
    b = d;
    g2 = a * b;
  }
  g3 = a * b;      // fully redundant.

}

The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.

Differential Revision: https://reviews.llvm.org/D32252

llvm-svn: 309397
2017-07-28 15:47:25 +00:00
Davide Italiano
6e5954a153 [JumpThreading] Stop falsely preserving LazyValueInfo.
JumpThreading claims to preserve LVI, but it doesn't preserve
the analyses which LVI holds a reference to (e.g. the Dominator).
In the current pass manager infrastructure, after JT runs, the
PM frees these analyses (including DominatorTree) but preserves
LVI.

CorrelatedValuePropagation runs immediately after and queries
a corrupted domtree, causing weird miscompiles.

This commit disables the preservation of LVI for the time being.
Eventually, we should either move LVI to a proper dependency
tracking mechanism (i.e. an analyses shouldn't hold references
to other analyses and compute them on demand if needed), or
we should teach all the passes preserving LVI to preserve the
analyses LVI depends on.

The new pass manager has a mechanism to invalidate LVI in case
one of the analyses it depends on becomes invalid, so this problem
shouldn't exist (at least not in this immediate form), but handling
of analyses holding references is still a very delicate subject.

Fixes PR33917 (and rustc).

llvm-svn: 309355
2017-07-28 03:10:43 +00:00
Davide Italiano
8c7e46a022 [JumpThreading] Add an option to dump LazyValueInfo after the run.
Differential Revision:  https://reviews.llvm.org/D35973

llvm-svn: 309353
2017-07-28 02:57:43 +00:00
Daniel Neilson
0d6908f2bf All libcalls should be considered to be GC-leaf functions.
Summary:
It is possible for some passes to materialize a call to a libcall (ex: ldexp, exp2, etc),
but these passes will not mark the call as a gc-leaf-function. All libcalls are
actually gc-leaf-functions, so we change llvm::callsGCLeafFunction() to tell us that
available libcalls are equivalent to gc-leaf-function calls.

Reviewers: sanjoy, anna, reames

Reviewed By: anna

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35840

llvm-svn: 309291
2017-07-27 16:49:39 +00:00
Wei Mi
60633f66f7 Disable loop unswitching for some patterns containing equality comparison with undef.
This is a workaround for the bug described in PR31652 and
http://lists.llvm.org/pipermail/llvm-dev/2017-July/115497.html. The temporary
solution is to add a function EqualityPropUnSafe. In EqualityPropUnSafe, for
some simple patterns we can know the equality comparison may contains undef,
so we regard such comparison as unsafe and will not do loop-unswitching for
them. We also need to disable the select simplification when one of select
operand is undef and its result feeds into equality comparison.

The patch cannot clear the safety issue caused by the bug, but it can suppress
the issue from happening to some extent.

Differential Revision: https://reviews.llvm.org/D35811

llvm-svn: 309059
2017-07-25 23:37:17 +00:00
Chandler Carruth
87057f138b [LIR] Teach LIR to avoid extending the BE count prior to adding one to
it when safe.

Very often the BE count is the trip count minus one, and the plus one
here should fold with that minus one. But because the BE count might in
theory be UINT_MAX or some such, adding one before we extend could in
some cases wrap to zero and break when we scale things.

This patch checks to see if it would be safe to add one because the
specific case that would cause this is guarded for prior to entering the
preheader. This should handle essentially all of the common loop idioms
coming out of C/C++ code once canonicalized by LLVM.

Before this patch, both forms of loop in the added test cases ended up
subtracting one from the size, extending it, scaling it up by 8 and then
adding 8 back onto it. This is really silly, and it turns out made it
all the way into generated code very often, so this is a surprisingly
important cleanup to do.

Many thanks to Sanjoy for showing me how to do this with SCEV.

Differential Revision: https://reviews.llvm.org/D35758

llvm-svn: 308968
2017-07-25 10:48:32 +00:00
Florian Hahn
2d7c76cc7a [LoopInterchange] Update code to use range-based for loops (NFC).
Summary:
The remaining non range-based for loops do not iterate over full ranges,
so leave them as they are.

Reviewers: karthikthecool, blitz.opensource, mcrosier, mkuper, aemerson

Reviewed By: aemerson

Subscribers: aemerson, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35777

llvm-svn: 308872
2017-07-24 11:41:30 +00:00
Jonas Paulsson
c38a4eb7d4 [SystemZ, LoopStrengthReduce]
This patch makes LSR generate better code for SystemZ in the cases of memory
intrinsics, Load->Store pairs or comparison of immediate with memory.

In order to achieve this, the following common code changes were made:

 * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if
 LSR should do instruction-based addressing evaluations by calling
 isLegalAddressingMode() with the Instruction pointers.
 * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy
 as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address,
 not just loads or stores.

SystemZ changes:

 * isLSRCostLess() implemented with Insns first, and without ImmCost.
 * New function supportedAddressingMode() that is a helper for TTI methods
 looking at Instructions passed via pointers.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D35262
https://reviews.llvm.org/D35049

llvm-svn: 308729
2017-07-21 11:59:37 +00:00
David Majnemer
00aacae685 [LICM] Make sinkRegion and hoistRegion non-recursive
Large CFGs can cause us to blow up the stack because we would have a
recursive step for each basic block in a region.

Instead, create a worklist and iterate it. This limits the stack usage
to something more manageable.

Differential Revision: https://reviews.llvm.org/D35609

llvm-svn: 308582
2017-07-20 03:27:02 +00:00
Davide Italiano
dc30551029 [TRE] Move to the new OptRemark API.
Fixes PR33788.

Differential Revision:  https://reviews.llvm.org/D35570

llvm-svn: 308524
2017-07-19 21:13:22 +00:00
Balaram Makam
58953554d4 [SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure.
Summary:
When simplifying unconditional branches from empty blocks, we pre-test if the
BB belongs to a set of loop headers and keep the block to prevent passes from
destroying canonical loop structure. However, the current algorithm fails if
the destination of the branch is a loop header. Especially when such a loop's
latch block is folded into loop header it results in additional backedges and
LoopSimplify turns it into a nested loop which prevent later optimizations
from being applied (e.g., loop  unrolling and loop interleaving).

This patch augments the existing algorithm by further checking if the
destination of the branch belongs to a set of loop headers and defer
eliminating it if yes to LateSimplifyCFG.

Fixes PR33605: https://bugs.llvm.org/show_bug.cgi?id=33605

Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl

Reviewed By: efriedma

Subscribers: ashutosh.nema, gberry, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35411

llvm-svn: 308422
2017-07-19 08:53:34 +00:00
Weiming Zhao
33e867b55e Fix DebugLoc propagation for unreachable LoadInst
Summary: Currently, when GVN creates a load and when InstCombine creates a new store for unreachable Load, the DebugLoc info gets lost.

Reviewers: dberlin, davide, aprantl

Reviewed By: aprantl

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D34639

llvm-svn: 308404
2017-07-19 01:27:24 +00:00
Davide Italiano
9e4866a81c [TRE] Simplify canTRE() a bit using all_of(). NFCI.
This has a ~11 years old FIXME, which may not be true today.
We might consider removing this code altogether.

llvm-svn: 308319
2017-07-18 15:42:59 +00:00
Max Kazantsev
1bdc0b6169 [IRCE] Recognize loops with ne/eq latch conditions
In some particular cases eq/ne conditions can be turned into equivalent
slt/sgt conditions. This patch teaches parseLoopStructure to handle some
of these cases.

Differential Revision: https://reviews.llvm.org/D35010

llvm-svn: 308264
2017-07-18 04:53:48 +00:00
Florian Hahn
522d2634b3 [LoopInterchange] Add some optimization remarks.
Reviewers: anemet, karthikthecool, blitz.opensource

Reviewed By: anemet

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35122

llvm-svn: 308094
2017-07-15 13:13:19 +00:00
Geoff Berry
9de6b5b6de [EarlyCSE] Handle calls with no MemorySSA info.
Summary:
When checking for memory dependencies between calls using MemorySSA,
handle cases where the calls have no MemoryAccess associated with them
because the AA analysis being used has determined that the call does not
read/write memory.

Fixes PR33756

Reviewers: dberlin, davide

Subscribers: mcrosier, llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D35317

llvm-svn: 308051
2017-07-14 20:13:21 +00:00
Haicheng Wu
fe94f47749 [JumpThreading] Add a pattern to TryToUnfoldSelectInCurrBB()
Add the following pattern to TryToUnfoldSelectInCurrBB()

bb:
   %p = phi [0, %bb1], [1, %bb2], [0, %bb3], [1, %bb4], ...
   %c = cmp %p, 0
   %s = select %c, trueval, falseval

The Select in the above pattern will be unfolded and then jump-threaded. The
current implementation does not allow CMP in the middle of PHI and Select.

Differential Revision: https://reviews.llvm.org/D34762

llvm-svn: 308050
2017-07-14 19:16:47 +00:00
Max Kazantsev
674af075df [IRCE] Fix corner case with Start = INT_MAX
When iterating through loop

  for (int i = INT_MAX; i > 0; i--)

We fail to generate the pre-loop for it. It happens because we use the
overflown value in a comparison predicate when identifying whether or not
we need it.

In old logic, we used SLE predicate against Greatest value which exceeds all
seen values of the IV and might be overflown. Now we use the GreatestSeen
value of this IV with SLT predicate.

Also added a test that ensures that a pre-loop is generated for such loops.

Differential Revision: https://reviews.llvm.org/D35347

llvm-svn: 308001
2017-07-14 06:35:03 +00:00
Jakub Kuderski
892202cb99 [LoopRotate] Fix DomTree update logic for unreachable nodes. Fix PR33701.
Summary:
LoopRotate manually updates the DoomTree by iterating over all predecessors of a basic block and computing the Nearest Common Dominator.

When a predecessor happens to be unreachable, `DT.findNearestCommonDominator` returns nullptr.

This patch teaches LoopRotate to handle this case and fixes [[ https://bugs.llvm.org/show_bug.cgi?id=33701 | PR33701 ]].

In the future, LoopRotate should be taught to use the new incremental API for updating the DomTree.

Reviewers: dberlin, davide, uabelho, grosser

Subscribers: efriedma, mzolotukhin

Differential Revision: https://reviews.llvm.org/D35074

llvm-svn: 307828
2017-07-12 18:42:16 +00:00
Konstantin Zhuravlyov
d382d6f3fc Enhance synchscope representation
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
  global and local memory. These scopes restrict how synchronization is
  achieved, which can result in improved performance.

  This change extends existing notion of synchronization scopes in LLVM to
  support arbitrary scopes expressed as target-specific strings, in addition to
  the already defined scopes (single thread, system).

  The LLVM IR and MIR syntax for expressing synchronization scopes has changed
  to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
  replaces *singlethread* keyword), or a target-specific name. As before, if
  the scope is not specified, it defaults to CrossThread/System scope.

  Implementation details:
    - Mapping from synchronization scope name/string to synchronization scope id
      is stored in LLVM context;
    - CrossThread/System and SingleThread scopes are pre-defined to efficiently
      check for known scopes without comparing strings;
    - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
      the bitcode.

Differential Revision: https://reviews.llvm.org/D21723

llvm-svn: 307722
2017-07-11 22:23:00 +00:00
Davide Italiano
699e10e5f9 [NewGVN] Check for congruency of memory accesses.
This is fine as nothing in the code relies on leader and memory
leader being the same for a given congruency class. Ack'ed by
Dan.

Fixes PR33720.

llvm-svn: 307699
2017-07-11 19:49:12 +00:00
Davide Italiano
b96151efc8 [NewGVN] Fix an innocent typo I found while debugging PR33720.
llvm-svn: 307694
2017-07-11 19:19:45 +00:00
Davide Italiano
1312c89b9a [NewGVN] Clarify the function invariants formatting them properly.
llvm-svn: 307692
2017-07-11 19:15:36 +00:00
Leo Li
11ba81c3df [ConstantHoisting] Remove dupliate logic in constant hoisting
Summary:
As metioned in https://reviews.llvm.org/D34576, checkings in
`collectConstantCandidates` can be replaced by using
`llvm::canReplaceOperandWithVariable`.

The only special case is that `collectConstantCandidates` return false for
all `IntrinsicInst` but it is safe for us to collect constant candidates from
`IntrinsicInst`.

Reviewers: pirama, efriedma, srhines

Reviewed By: efriedma

Subscribers: llvm-commits, javed.absar

Differential Revision: https://reviews.llvm.org/D34921

llvm-svn: 307587
2017-07-10 20:45:34 +00:00
Davide Italiano
540d27f972 [NewGVN] Simplify a lambda a little bit. NFCI.
llvm-svn: 307586
2017-07-10 20:45:00 +00:00
Craig Topper
86739c18e2 [IR] Make use of Type::isPtrOrPtrVectorTy/isIntOrIntVectorTy/isFPOrFPVectorTy to shorten code. NFC
llvm-svn: 307491
2017-07-09 07:04:00 +00:00
Hiroshi Inoue
1d578b7e75 fix trivial typos; NFC
sucessor -> successor 

llvm-svn: 307488
2017-07-09 05:54:44 +00:00
Yaxun Liu
1a5878b840 [InferAddressSpaces] Fix assertion about null pointer
InferAddressSpaces does not check address space in collectFlatAddressExpressions,
which causes values with non flat address space put into Postorder and causes
assertion in cloneValueWithNewAddressSpace.

This patch fixes assertion in OpenCL 2.0 conformance test generic_address_space
subtest for amdgcn target.

Differential Revision: https://reviews.llvm.org/D34991

llvm-svn: 307349
2017-07-07 02:40:13 +00:00
Wei Mi
a3256711ec [ConstHoisting] Turn on consthoist-with-block-frequency by default.
Using profile information to guide consthoisting is generally helpful for
performance, so the patch turns it on by default. No compile time or perf
regression were found using spec2000 and spec2006 on x86.  Some significant
improvement (>20%) was seen on internal benchmarks.

Differential Revision: https://reviews.llvm.org/D35063

llvm-svn: 307338
2017-07-07 00:11:05 +00:00
Wei Mi
538215e467 [ConstHoisting] choose to hoist when frequency is the same.
The patch is to adjust the strategy of frequency based consthoisting:
Previously when the candidate block has the same frequency with the existing
blocks containing a const, it will not hoist the const to the candidate block.
For that case, now we change the strategy to hoist the const if only existing
blocks have more than one block member. This is helpful for reducing code size.

Differential Revision: https://reviews.llvm.org/D35084

llvm-svn: 307328
2017-07-06 22:32:27 +00:00
Craig Topper
d8ebaac997 [Constants] If we already have a ConstantInt*, prefer to use isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI
Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne.

llvm-svn: 307292
2017-07-06 18:39:47 +00:00
Wei Mi
279c30993a [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale.
When the formulae search space is huge, LSR uses a series of heuristic to keep
pruning the search space until the number of possible solutions are within
certain limit.

The big hammer of the series of heuristics is NarrowSearchSpaceByPickingWinnerRegs,
which picks the register which is used by the most LSRUses and deletes the other
formulae which don't use the register. This is a effective way to prune the search
space, but quite often not a good way to keep the best solution. We saw cases before
that the heuristic pruned the best formula candidate out of search space.

To relieve the problem, we introduce a new heuristic called
NarrowSearchSpaceByFilterFormulaWithSameScaledReg. The basic idea is in order to
reduce the search space while keeping the best formula, we want to keep as many
formulae with different Scale and ScaledReg as possible. That is because the central
idea of LSR is to choose a group of loop induction variables and use those induction
variables to represent LSRUses. An induction variable candidate is often represented
by the Scale and ScaledReg in a formula. If we have more formulae with different
ScaledReg and Scale to choose, we have better opportunity to find the best solution.
That is why we believe pruning search space by only keeping the best formula with the
same Scale and ScaledReg should be more effective than PickingWinnerReg. And we use
two criteria to choose the best formula with the same Scale and ScaledReg. The first
criteria is to select the formula using less non shared registers, and the second
criteria is to select the formula with less cost got from RateFormula. The patch
implements the heuristic before NarrowSearchSpaceByPickingWinnerRegs, which is the
last resort.

Testing shows we get 1.8% and 2% on two internal benchmarks on x86. llvm nightly
testsuite performance is neutral. We also tried lsr-exp-narrow and it didn't help
on the two improved internal cases we saw.

Differential Revision: https://reviews.llvm.org/D34583

llvm-svn: 307269
2017-07-06 15:52:14 +00:00
Anna Thomas
289d0ebad4 [LoopDeletion] NFC: Add loop being analyzed debug statement
llvm-svn: 307096
2017-07-04 17:00:03 +00:00
Anna Thomas
690b0b954d [LoopDeletion] NFC: Add debug statements to the optimization
We have a DEBUG option for loop deletion, but no related debug messages.
Added some debug messages to state why loop deletion failed.

llvm-svn: 307078
2017-07-04 14:05:19 +00:00
Florian Hahn
80a9b3cbaf [LoopInterchange] Add more debug messages to currentLimitations().
Summary: This makes it easier to find out which limitation prevented this pass from doing its work.

Reviewers: karthikthecool, mzolotukhin, efriedma, mcrosier

Reviewed By: mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D34940

llvm-svn: 307035
2017-07-03 15:32:00 +00:00
Benjamin Kramer
3a7e958ff3 Revert "[GVN] Recommit the patch "Add phi-translate support in scalarpre"."
This reverts commit r306313. This breaks selfhost at -O3 and PR33652.
Let me know if you need additional information on reproducing the issue.

llvm-svn: 307021
2017-07-03 12:23:10 +00:00
Hiroshi Inoue
cae5418697 fix trivial typos; NFC
suport -> support

llvm-svn: 306968
2017-07-02 03:24:54 +00:00