1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 14:02:52 +02:00
Commit Graph

5813 Commits

Author SHA1 Message Date
Sanjoy Das
367ed95f05 [SCEV] Don't create SCEV expressions that break LCSSA
Prevent `createNodeFromSelectLikePHI` from creating SCEV expressions
that break LCSSA.

A better fix for the same issue is to teach SCEVExpander to not break
LCSSA by inserting PHI nodes at appropriate places.  That's planned for
the future.

Fixes PR25360.

llvm-svn: 251756
2015-10-31 23:21:40 +00:00
Diego Novillo
b170ccb76f SamplePGO - Count sample records in embedded profiles when computing coverage.
The initial coverage checking code for sample records failed to count
records inside inlined profiles. This change fixes the oversight.

llvm-svn: 251752
2015-10-31 21:53:58 +00:00
Davide Italiano
572717843f [SimplifyLibCalls] Add test to ensure transform is not executed if fast-math
attribute is not present.

During my refactor in r251595 I changed the behavior of optimizeSqrt(),
skipping the transformation if the function wasn't marked with unsafe-fp-math
attribute. This fixed a bug, as confirmed by Sanjay (before the optimization
was silently executed anyway), although it wasn't my primary aim.
This commit adds a test to ensure the code doesn't break again.

Reported by: Marcello Maggioni
Discussed with: Sanjay Patel

llvm-svn: 251747
2015-10-31 20:59:32 +00:00
Justin Bogner
4a77d84be2 [PM] Port StripDeadPrototypes to the new pass manager
This is a really straightforward port. Also adds a test for the pass,
since it only seemed to be tested tangentially before.

llvm-svn: 251726
2015-10-30 23:28:12 +00:00
Justin Bogner
cc016f8933 [PM] Port ADCE to the new pass manager
llvm-svn: 251725
2015-10-30 23:13:18 +00:00
Justin Bogner
7d746e1b5b Whitespace. NFC
llvm-svn: 251724
2015-10-30 23:02:38 +00:00
Dehao Chen
05a0184887 Recommit r251680 (also need to update clang test)
Update the discriminator assignment algorithm

* If a scope has already been assigned a discriminator, do not reassign a nested discriminator for it.
* If the file and line both match, even if the column does not match, we should assign a new discriminator for the stmt.

original code:
; #1 int foo(int i) {
; #2 if (i == 3 || i == 5) return 100; else return 99;
; #3 }

; i == 3: discriminator 0
; i == 5: discriminator 2
; return 100: discriminator 1
; return 99: discriminator 3

llvm-svn: 251689
2015-10-30 05:07:15 +00:00
Dehao Chen
35437a14da Remove oneline.ll test.
llvm-svn: 251688
2015-10-30 04:48:24 +00:00
Dehao Chen
fb972e9c09 Revert r251680:
Update the discriminator assignment algorithm

* If a scope has already been assigned a discriminator, do not reassign a nested discriminator for it.
* If the file and line both match, even if the column does not match, we should assign a new discriminator for the stmt.

original code:
; #1 int foo(int i) {
; #2 if (i == 3 || i == 5) return 100; else return 99;
; #3 }

; i == 3: discriminator 0
; i == 5: discriminator 2
; return 100: discriminator 1
; return 99: discriminator 3

llvm-svn: 251685
2015-10-30 04:29:05 +00:00
Dehao Chen
2d6739bc77 Update the discriminator assignment algorithm
* If a scope has already been assigned a discriminator, do not reassign a nested discriminator for it.
* If the file and line both match, even if the column does not match, we should assign a new discriminator for the stmt.

original code:
; #1 int foo(int i) {
; #2 if (i == 3 || i == 5) return 100; else return 99;
; #3 }

; i == 3: discriminator 0
; i == 5: discriminator 2
; return 100: discriminator 1
; return 99: discriminator 3

llvm-svn: 251680
2015-10-30 02:38:29 +00:00
Diego Novillo
23fc3dd779 Revert r251593.
The patch in r251593 was only papering over the problem. The actual fix
was committed in r251623.

llvm-svn: 251635
2015-10-29 16:20:38 +00:00
Cong Hou
cb16a9a0d5 Revert the revision 251592 as it fails a test on some platforms.
llvm-svn: 251617
2015-10-29 05:35:22 +00:00
Philip Reames
0e15fadf4d [LVI/CVP] Teach LVI about range metadata
Somewhat shockingly for an analysis pass which is computing constant ranges, LVI did not understand the ranges provided by range metadata.

As part of this change, I included a change to CVP primarily because doing so made it much easier to write small self contained test cases. CVP was previously only handling the non-local operand case, but given that LVI can sometimes figure out information about instructions standalone, I don't see any reason to restrict this.  There could possibly be a compile time impact from this, but I suspect it should be minimal.  If anyone has an example which substaintially regresses, please let me know.  I could restrict the block local handling to ICmps feeding Terminator instructions if needed.  

Note that this patch continues a somewhat bad practice in LVI. In many cases, we know facts about values, and separate context sensitive facts about values. LVI makes no effort to distinguish and will frequently cache the same value fact repeatedly for different contexts. I would like to change this, but that's a large enough change that I want it to go in separately with clear documentation of what's changing. Other examples of this include the non-null handling, and arguments.

As a meta comment: the entire motivation of this change was being able to write smaller (aka reasonable sized) test cases for a future patch teaching LVI about select instructions.

Differential Revision: http://reviews.llvm.org/D13543

llvm-svn: 251606
2015-10-29 03:57:17 +00:00
Philip Reames
0d08416c1f [InstSimplify] sgt on i1s also encodes implication
Follow on to http://reviews.llvm.org/D13074, implementing something pointed out by Sanjoy. His truth table from his comment on that bug summarizes things well:
LHS | RHS | LHS >=s RHS | LHS implies RHS
0 | 0 | 1 (0 >= 0) | 1
0 | 1 | 1 (0 >= -1) | 1
1 | 0 | 0 (-1 >= 0) | 0
1 | 1 | 1 (-1 >= -1) | 1

The key point is that an "i1 1" is the value "-1", not "1".

Differential Revision: http://reviews.llvm.org/D13756

llvm-svn: 251597
2015-10-29 03:19:10 +00:00
Philip Reames
526a418eb7 [SimplifyCFG] Constant fold a branch implied by it's incoming edge
The most common use case is when eliminating redundant range checks in an example like the following:
c = a[i+1] + a[i];

Note that all the smarts of the transform (the implication engine) is already in ValueTracking and is tested directly through InstructionSimplify.

Differential Revision: http://reviews.llvm.org/D13040

llvm-svn: 251596
2015-10-29 03:11:49 +00:00
Diego Novillo
a973516d22 Tweak test check pattern to fix bot failure.
llvm-svn: 251593
2015-10-29 02:15:02 +00:00
Cong Hou
d3f40238e0 Add a flag vectorizer-maximize-bandwidth in loop vectorizer to enable using larger vectorization factor.
To be able to maximize the bandwidth during vectorization, this patch provides a new flag vectorizer-maximize-bandwidth. When it is turned on, the vectorizer will determine the vectorization factor (VF) using the smallest instead of widest type in the loop. To avoid increasing register pressure too much, estimates of the register usage for different VFs are calculated so that we only choose a VF when its register usage doesn't exceed the number of available registers.

llvm-svn: 251592
2015-10-29 01:28:44 +00:00
Diego Novillo
9692d53ba5 SamplePGO - Add flag to check sampling coverage.
This adds the flag -mllvm -sample-profile-check-coverage=N to the
SampleProfile pass. N is the percent of input sample records that the
user expects to apply.  If the pass does not use N% (or more) of the
sample records in the input, it emits a warning.

This is useful to detect some forms of stale profiles. If the code has
drifted enough from the original profile, there will be records that do
not match the IR anymore.

This will not detect cases where a sample profile record for line L is
referring to some other instructions that also used to be at line L.

llvm-svn: 251568
2015-10-28 22:30:25 +00:00
Hal Finkel
6916956d7a Revert "r251451 - [AliasSetTracker] Use mod/ref information for UnknownInstr"
It looks like this broke the stage 2 builder:
  http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto/6989/

Original commit message:

AliasSetTracker does not need to convert the access mode to ModRefAccess if the
new visited UnknownInst has only 'REF' modrefinfo to existing pointers in the
sets.

Patch by Andrew Zhogin!

llvm-svn: 251562
2015-10-28 22:13:41 +00:00
Sanjoy Das
83c1a62e0a [JumpThreading] Use dominating conditions to prove implications
Summary:
If P branches to Q conditional on C and Q branches to R conditional on
C' and C => C' then the branch conditional on C' can be folded to an
unconditional branch.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13972

llvm-svn: 251557
2015-10-28 21:27:08 +00:00
Chad Rosier
e43db3f835 Reapply: [LIR] Add support for creating memsets from loops with a negative stride.
The simple fix is to prevent forming memcpy from loops with a negative stride.

llvm-svn: 251518
2015-10-28 14:38:49 +00:00
Chad Rosier
09a83ae386 Revert "[LIR] Add support for creating memsets from loops with a negative stride."
This reverts commit r251512.  This is causing LNT/chomp to fail.

llvm-svn: 251513
2015-10-28 13:54:09 +00:00
Chad Rosier
de1c0bfef9 [LIR] Add support for creating memsets from loops with a negative stride.
http://reviews.llvm.org/D14125

llvm-svn: 251512
2015-10-28 12:55:34 +00:00
Chen Li
552163ae71 Revert r251492 "[IndVarSimplify] Rewrite loop exit values with their
initial values from loop preheader", because it broke some bots.

llvm-svn: 251498
2015-10-28 05:15:51 +00:00
Chen Li
f5f6088676 [IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader
Summary:
This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with
its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch.


Reviewers: sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13974

llvm-svn: 251492
2015-10-28 04:45:47 +00:00
Sanjoy Das
bb484e2b4d [GVN] Make a test case more robust
The singleton !range metadata gets simplified more aggressively after a
later change, so change the !range metadata to contain more than one
element.

While at it, turn some `; CHECK` s to `; CHECK-LABEL:` s.

llvm-svn: 251485
2015-10-28 03:20:05 +00:00
David Majnemer
067f270fea [SimplifyCFG] Don't DCE catchret because the successor is unreachable
CatchReturnInst has side-effects: it runs a destructor.  This destructor
could conceivably run forever/call exit/etc. and should not be removed.

llvm-svn: 251461
2015-10-27 22:43:56 +00:00
Hal Finkel
9cc3c36523 [AliasSetTracker] Use mod/ref information for UnknownInstr
AliasSetTracker does not need to convert the access mode to ModRefAccess if the
new visited UnknownInst has only 'REF' modrefinfo to existing pointers in the
sets.

Patch by Andrew Zhogin!

llvm-svn: 251451
2015-10-27 20:37:04 +00:00
David Majnemer
7b9b63574c [ScalarEvolutionExpander] PHI on a catchpad can be used on both edges
A PHI on a catchpad might be used by both edges out of the catchpad,
feeding back into a loop.  In this case, just use the insertion point.
Anything more clever would require new basic blocks or PHI placement.

llvm-svn: 251442
2015-10-27 19:48:28 +00:00
NAKAMURA Takumi
53299c896e Revert r251291, "Loop Vectorizer - skipping "bitcast" before GEP"
It causes miscompilation of llvm/lib/ExecutionEngine/Interpreter/Execution.cpp.
See also PR25324.

llvm-svn: 251436
2015-10-27 19:02:36 +00:00
Charlie Turner
a9f1af80ab [SLP] Be more aggressive about reduction width selection.
Summary:
This change could be way off-piste, I'm looking for any feedback on whether it's an acceptable approach.

It never seems to be a problem to gobble up as many reduction values as can be found, and then to attempt to reduce the resulting tree. Some of the workloads I'm looking at have been aggressively unrolled by hand, and by selecting reduction widths that are not constrained by a vector register size, it becomes possible to profitably vectorize. My test case shows such an unrolling which SLP was not vectorizing (on neither ARM nor X86) before this patch, but with it does vectorize.

I measure no significant compile time impact of this change when combined with D13949 and D14063. There are also no significant performance regressions on ARM/AArch64 in SPEC or LNT.

The more principled approach I thought of was to generate several candidate tree's and use the cost model to pick the cheapest one. That seemed like quite a big design change (the algorithms seem very much one-shot), and would likely be a costly thing for compile time. This seemed to do the job at very little cost, but I'm worried I've misunderstood something!

Reviewers: nadav, jmolloy

Subscribers: mssimpso, llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D14116

llvm-svn: 251428
2015-10-27 17:59:03 +00:00
Charlie Turner
32140fced6 [SLP] Try a bit harder to find reduction PHIs
Summary:
Currently, when the SLP vectorizer considers whether a phi is part of a reduction, it dismisses phi's whose incoming blocks are not the same as the block containing the phi. For the patterns I'm looking at, extending this rule to allow phis whose incoming block is a containing loop latch allows me to vectorize certain workloads.

There is no significant compile-time impact, and combined with D13949, no performance improvement measured in ARM/AArch64 in any of SPEC2000, SPEC2006 or LNT.

Reviewers: jmolloy, mcrosier, nadav

Subscribers: mssimpso, nadav, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D14063

llvm-svn: 251425
2015-10-27 17:54:16 +00:00
Charlie Turner
18cdd84f54 [SLP] Treat SelectInsts as reduction values.
Summary:
Certain workloads, in particular sum-of-absdiff loops, can be vectorized using SLP if it can treat select instructions as reduction values.

The test case is a bit awkward. The AArch64 cost model needs some tuning to not be so pessimistic about selects. I've had to tweak the SLP threshold here.

Reviewers: jmolloy, mzolotukhin, spatel, nadav

Subscribers: nadav, mssimpso, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D13949

llvm-svn: 251424
2015-10-27 17:49:11 +00:00
Diego Novillo
13365b7f1f Fix SamplePGO segfault when debug info is missing.
When emitting a remark for a conditional branch annotation, the remark
uses the line location information of the conditional branch in the
message.  In some cases, that information is unavailable and the
optimization would segfaul. I'm still not sure whether this is a bug or
WAI, but the optimizer should not die because of this.

llvm-svn: 251420
2015-10-27 17:37:00 +00:00
David Majnemer
e6ee317fc1 [ScalarEvolutionExpander] Properly insert no-op casts + EH Pads
We want to insert no-op casts as close as possible to the def.  This is
tricky when the cast is of a PHI node and the BasicBlocks between the
def and the use cannot hold any instructions.  Iteratively walk EH pads
until we hit a non-EH pad.

This fixes PR25326.

llvm-svn: 251393
2015-10-27 07:36:42 +00:00
Igor Laevsky
b3b79f34cd [RS4GC] Strip noalias attribute after statepoint rewrite
We should remove noalias along with dereference and dereference_or_null attributes 
because statepoint could potentially touch the entire heap including noalias objects.

Differential Revision: http://reviews.llvm.org/D14032

llvm-svn: 251333
2015-10-26 19:06:01 +00:00
Diego Novillo
73fa815dcb SamplePGO - Add optimization reports.
This adds a couple of optimization remarks to the SamplePGO
transformation. When it decides to inline a hot function (to mimic the
inline stack and repeat useful inline decisions in the original build).

It will also report branch destinations. For instance, given the code
fragment:

     6      if (i < 1000)
     7        sum -= i;
     8      else
     9        sum += -i * rand();

If the 'else' branch is taken most of the time, building this code with
-Rpass=sample-profile will produce:

a.cc:9:14: remark: most popular destination for conditional branches at small.cc:6:9 [-Rpass=sample-profile]
      sum += -i * rand();
             ^

llvm-svn: 251330
2015-10-26 18:52:53 +00:00
Evgeniy Stepanov
6d86bc6f79 [safestack] Fast access to the unsafe stack pointer on AArch64/Android.
Android libc provides a fixed TLS slot for the unsafe stack pointer,
and this change implements direct access to that slot on AArch64 via
__builtin_thread_pointer() + offset.

This change also moves more code into TargetLowering and its
target-specific subclasses to get rid of target-specific codegen
in SafeStackPass.

This change does not touch the ARM backend because ARM lowers
builting_thread_pointer as aeabi_read_tp, which is not available
on Android.

The previous iteration of this change was reverted in r250461. This
version leaves the generic, compiler-rt based implementation in
SafeStack.cpp instead of moving it to TargetLoweringBase in order to
allow testing without a TargetMachine.

llvm-svn: 251324
2015-10-26 18:28:25 +00:00
Diego Novillo
7c45627db9 Cleanup test case debug info. NFC.
llvm-svn: 251320
2015-10-26 18:14:44 +00:00
Elena Demikhovsky
5e7ce7f64e Loop Vectorizer - skipping "bitcast" before GEP
Vectorization of memory instruction (Load/Store) is possible when the pointer is coming from GEP. The GEP analysis allows to estimate the profit.
In some cases we have a "bitcast" between GEP and memory instruction.
I added code that skips the "bitcast".

http://reviews.llvm.org/D13886

llvm-svn: 251291
2015-10-26 13:42:41 +00:00
Silviu Baranga
8fa79f5d2b [InstCombine] Teach instcombine not to create extra PHI nodes when folding GEPs
Summary:
InstCombine tries to transform GEP(PHI(GEP1, GEP2, ..)) into GEP(GEP(PHI(...))
when possible. However, this may leave the old PHI node around. Even if we
do end up folding the GEPs, having an extra PHI node might not be beneficial.

This change makes the transformation more conservative. We now only do this if
the PHI has only one use, and can therefore be removed after the transformation.

Reviewers: jmolloy, majnemer

Subscribers: mcrosier, mssimpso, llvm-commits

Differential Revision: http://reviews.llvm.org/D13887

llvm-svn: 251281
2015-10-26 10:25:05 +00:00
Chen Li
c9e8b188d2 Revert rL251061 [SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad.
llvm-svn: 251149
2015-10-23 21:13:01 +00:00
Hal Finkel
1a64f66683 Handle non-constant shifts in computeKnownBits, and use computeKnownBits for constant folding in InstCombine/Simplify
First, the motivation: LLVM currently does not realize that:

  ((2072 >> (L == 0)) >> 7) & 1 == 0

where L is some arbitrary value. Whether you right-shift 2072 by 7 or by 8, the
lowest-order bit is always zero. There are obviously several ways to go about
fixing this, but the generic solution pursued in this patch is to teach
computeKnownBits something about shifts by a non-constant amount. Previously,
we would give up completely on these. Instead, in cases where we know something
about the low-order bits of the shift-amount operand, we can combine (and
together) the associated restrictions for all shift amounts consistent with
that knowledge. As a further generalization, I refactored all of the logic for
all three kinds of shifts to have this capability. This works well in the above
case, for example, because the dynamic shift amount can only be 0 or 1, and
thus we can say a lot about the known bits of the result.

This brings us to the second part of this change: Even when we know all of the
bits of a value via computeKnownBits, nothing used to constant-fold the result.
This introduces the necessary code into InstCombine and InstSimplify. I've
added it into both because:

  1. InstCombine won't automatically pick up the associated logic in
     InstSimplify (InstCombine uses InstSimplify, but not via the API that
     passes in the original instruction).

  2. Putting the logic in InstCombine allows the resulting simplifications to become
     part of the iterative worklist

  3. Putting the logic in InstSimplify allows the resulting simplifications to be
     used by everywhere else that calls SimplifyInstruction (inlining, unrolling,
     and many others).

And this requires a small change to our definition of an ephemeral value so
that we don't break the rest case from r246696 (where the icmp feeding the
@llvm.assume, is also feeding a br). Under the old definition, the icmp would
not be considered ephemeral (because it is used by the br), but this causes the
assume to remove itself (in addition to simplifying the branch structure), and
it seems more-useful to prevent that from happening.

llvm-svn: 251146
2015-10-23 20:37:08 +00:00
Tim Northover
e2a148b97f GVN: don't try to replace instruction with itself.
After some look-ahead PRE was added for GEPs, an instruction could end
up in the table of candidates before it was actually inspected. When
this happened the pass might decide it was the best candidate to
replace itself. This didn't go well.

Should fix PR25291

llvm-svn: 251145
2015-10-23 20:30:02 +00:00
Chen Li
87928bb2ce [SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad.
Summary: Currently SimplifyResume can convert an invoke instruction to a call instruction if its landing pad is trivial. In practice we could have several invoke instructions with trivial landing pads and share a common rethrow block, and in the common rethrow block, all the landing pads join to a phi node. The patch extends SimplifyResume to check the phi of landing pad and their incoming blocks. If any of them is trivial, remove it from the phi node and convert the invoke instruction to a call instruction.  

Reviewers: hfinkel, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13718

llvm-svn: 251061
2015-10-22 20:48:38 +00:00
Sanjoy Das
6701b8778f [SCEV] Remove a test case added in r249168
The test case wasn't testing what it was commented to be testing; and
when I tried to fix the test I noticed that SCEV does not support the
simplification that the test was supposed to test.

This change removes the test case to avoid confusion.

llvm-svn: 251053
2015-10-22 19:57:41 +00:00
Sanjoy Das
592ded635e [SCEV] Opportunistically interpret unsigned constraints as signed
Summary:
An unsigned comparision is equivalent to is corresponding signed version
if both the operands being compared are positive.  Teach SCEV to use
this fact when profitable.

Reviewers: atrick, hfinkel, reames, nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13687

llvm-svn: 251051
2015-10-22 19:57:34 +00:00
Sanjoy Das
7eba4a995b [SCEV] Teach SCEV some axioms about non-wrapping arithmetic
Summary:
 - A s<  (A + C)<nsw> if C >  0
 - A s<= (A + C)<nsw> if C >= 0
 - (A + C)<nsw> s<  A if C <  0
 - (A + C)<nsw> s<= A if C <= 0

Right now `C` needs to be a constant, but we can later generalize it to
be a non-constant if needed.

Reviewers: atrick, hfinkel, reames, nlewycky

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D13686

llvm-svn: 251050
2015-10-22 19:57:29 +00:00
Sanjoy Das
fd03ccdaff [SCEV] Mark AddExprs as nsw or nuw if legal
Summary:
This uses `ScalarEvolution::getRange` and not potentially control
dependent `nsw` and `nuw` bits on the arithmetic instruction.

Reviewers: atrick, hfinkel, nlewycky

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D13613

llvm-svn: 251048
2015-10-22 19:57:19 +00:00
Michael Liao
9a38e95740 [InstCombine] Revise the test case to match full sequene
llvm-svn: 250950
2015-10-21 21:50:58 +00:00