1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00
Commit Graph

6103 Commits

Author SHA1 Message Date
Sanjoy Das
b0b3d4c99d Add a "gc-transition" operand bundle
Summary:
This adds a new kind of operand bundle to LLVM denoted by the
`"gc-transition"` tag.  Inputs to `"gc-transition"` operand bundle are
lowered into the "transition args" section of `gc.statepoint` by
`RewriteStatepointsForGC`.

This removes the last bit of functionality that was unsupported in the
deopt bundle based code path in `RewriteStatepointsForGC`.

Reviewers: pgavlin, JosephTremoulet, reames

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16342

llvm-svn: 258338
2016-01-20 19:50:25 +00:00
Sanjay Patel
3635b71b45 [LibCallSimplifier] don't get fooled by a fake sqrt()
The test case will crash without this patch because the subsequent call to
hasUnsafeAlgebra() assumes that the call instruction is an FPMathOperator
(ie, returns an FP type).

This part of the function signature check was omitted for the sqrt() case, 
but seems to be in place for all other transforms.

Before:
http://reviews.llvm.org/rL257400
...we would have needlessly continued execution in optimizeSqrt(), but the
bug was harmless because we'd eventually fail some other check and return
without damage.

This should fix:
https://llvm.org/bugs/show_bug.cgi?id=26211

Differential Revision: http://reviews.llvm.org/D16198

llvm-svn: 258325
2016-01-20 17:41:14 +00:00
Joseph Tremoulet
de5c9a8723 [Inliner/WinEH] Honor implicit nounwinds
Summary:
Funclet EH tables require that a given funclet have only one unwind
destination for exceptional exits.  The verifier will therefore reject
e.g. two cleanuprets with different unwind dests for the same cleanup, or
two invokes exiting the same funclet but to different unwind dests.
Because catchswitch has no 'nounwind' variant, and because IR producers
are not *required* to annotate calls which will not unwind as 'nounwind',
it is legal to nest a call or an "unwind to caller" catchswitch within a
funclet pad that has an unwind destination other than caller; it is
undefined behavior for such a call or catchswitch to unwind.

Normally when inlining an invoke, calls in the inlined sequence are
rewritten to invokes that unwind to the callsite invoke's unwind
destination, and "unwind to caller" catchswitches in the inlined sequence
are rewritten to unwind to the callsite invoke's unwind destination.
However, if such a call or "unwind to caller" catchswitch is located in a
callee funclet that has another exceptional exit with an unwind
destination within the callee, applying the normal transformation would
give that callee funclet multiple unwind destinations for its exceptional
exits.  There would be no way for EH table generation to determine which
is the "true" exit, and the verifier would reject the function
accordingly.

Add logic to the inliner to detect these cases and leave such calls and
"unwind to caller" catchswitches as calls and "unwind to caller"
catchswitches in the inlined sequence.

This fixes PR26147.


Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: alexcrichton, llvm-commits

Differential Revision: http://reviews.llvm.org/D16319

llvm-svn: 258273
2016-01-20 02:15:15 +00:00
Sanjay Patel
691c821001 add tests to show missing memset/malloc optimizations (PR25892)
llvm-svn: 258218
2016-01-19 23:07:10 +00:00
Sanjoy Das
c6887c7e27 [SCEV] Fix PR26207
In some cases, the max backedge taken count can be more conservative
than the exact backedge taken count (for instance, because
ScalarEvolution::getRange is not control-flow sensitive whereas
computeExitLimitFromICmp can be).  In these cases,
computeExitLimitFromCond (specifically the bit that deals with `and` and
`or` instructions) can create an ExitLimit instance with a
`SCEVCouldNotCompute` max backedge count expression, but a computable
exact backedge count expression.  This violates an implicit SCEV
assumption: a computable exact BE count should imply a computable max BE
count.

This change

 - Makes the above implicit invariant explicit by adding an assert to
   ExitLimit's constructor

 - Changes `computeExitLimitFromCond` to be more robust around
   conservative max backedge counts

llvm-svn: 258184
2016-01-19 20:53:51 +00:00
Sanjay Patel
a2ab3d6165 [LibCallSimplifier] use instruction-level fast-math-flags to shrink calls
This is a continuation of adding FMF to call instructions:
http://reviews.llvm.org/rL255555

llvm-svn: 258158
2016-01-19 18:38:52 +00:00
Sanjay Patel
a46637dede [LibCallSimplifier] use instruction-level fast-math-flags to transform pow(x, [small integer]) calls
This is a continuation of adding FMF to call instructions:
http://reviews.llvm.org/rL255555

As with D15937, the intent of the patch is to preserve the current behavior of the transform
except that we use the pow call's 'fast' attribute as a trigger rather than a function-level
attribute.

The TODO comment notes a potential follow-on patch that would propagate FMF to the new
instructions.

Differential Revision: http://reviews.llvm.org/D16122

llvm-svn: 258153
2016-01-19 18:15:12 +00:00
Sanjoy Das
aa011535f3 [IndVars] Fix PR25576
`LCSSASafePhiForRAUW` as computed was incorrect -- in cases like
these (this exact example does not actually trigger the bug):

define i32 @f(i32 %n, i1* %c) {
entry:
  br label %outer.loop

outer.loop:
  br label %inner.loop

inner.loop:
  %iv = phi i32 [ 0, %outer.loop ], [ %iv.inc, %inner.loop ]
  %iv.inc = add nuw nsw i32 %iv, 1
  %tc = udiv i32 %n, 13
  %be.cond = icmp ult i32 %iv, %tc
  br i1 %be.cond, label %inner.loop, label %inner.exit

inner.exit:
  %iv.lcssa = phi i32 [ %iv, %inner.loop ]
  %outer.be.cond = load volatile i1, i1* %c
  br i1 %outer.be.cond, label %outer.loop, label %leave

leave:
  %iv.lcssa.lcssa = phi i32 [ %iv.lcssa, %inner.exit ]
  ret i32 %iv.lcssa.lcssa
}

`LCSSASafePhiForRAUW` is true for `%iv.lcssa` when re-rewriting the exit
value of `%iv` for `%inner.loop` to `%tc` (this can happen due to
`SCEVExpander::findExistingExpansion`), but the RAUW breaks LCSSA.

To fix this, instead of computing `SafePhi` with special logic, decide
the safety of RAUW directly via `replacementPreservesLCSSAForm`.

llvm-svn: 258016
2016-01-17 18:12:52 +00:00
Artur Pilipenko
bb5abf9eb3 Push isDereferenceableAndAlignedPointer down into isSafeToLoadUnconditionally
Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D16226

llvm-svn: 258010
2016-01-17 12:35:29 +00:00
Igor Laevsky
1f8ac9245d [BasicAliasAnalysis] Take into account operand bundles in the getModRefInfo function
Differential Revision: http://reviews.llvm.org/D16225

llvm-svn: 257991
2016-01-16 12:15:53 +00:00
Matthew Simpson
bcc32afd72 Reapply r257800 with fix
The fix uniques the bundle of getelementptr indices we are about to vectorize
since it's possible for the same index to be used by multiple instructions.
The original commit message is below.

[SLP] Vectorize the index computations of getelementptr instructions.

This patch seeds the SLP vectorizer with getelementptr indices. The primary
motivation in doing so is to vectorize gather-like idioms beginning with
consecutive loads (e.g., g[a[0] - b[0]] + g[a[1] - b[1]] + ...). While these
cases could be vectorized with a top-down phase, seeding the existing bottom-up
phase with the index computations avoids the complexity, compile-time, and
phase ordering issues associated with a full top-down pass. Only bundles of
single-index getelementptrs with non-constant differences are considered for
vectorization.

llvm-svn: 257918
2016-01-15 18:51:51 +00:00
Silviu Baranga
777f975cab Re-commit r257064, after it was reverted in r257340.
This contains a fix for the issue that caused the revert:
we no longer assume that we can insert instructions after the
instruction that produces the base pointer. We previously
assumed that this would be ok, because the instruction produces
a value and therefore is not a terminator. This is false for invoke
instructions. We will now insert these new instruction directly
at the location of the users.

Original commit message:

[InstCombine] Look through PHIs, GEPs, IntToPtrs and PtrToInts to expose more constants when comparing GEPs

Summary:
When comparing two GEP instructions which have the same base pointer
and one of them has a constant index, it is possible to only compare
indices, transforming it to a compare with a constant. This removes
one use for the GEP instruction with the constant index, can reduce
register pressure and can sometimes lead to removing the comparisson
entirely.

InstCombine was already doing this when comparing two GEPs if the base
pointers were the same. However, in the case where we have complex
pointer arithmetic (GEPs applied to GEPs, PHIs of GEPs, conversions to
or from integers, etc) the value of the original base pointer will be
hidden to the optimizer and this transformation will be disabled.

This change detects when the two sides of the comparison can be
expressed as GEPs with the same base pointer, even if they don't
appear as such in the IR. The transformation will convert all the
pointer arithmetic to arithmetic done on indices and all the relevant
uses of GEPs to GEPs with a common base pointer. The GEP comparison
will be converted to a comparison done on indices.

Reviewers: majnemer, jmolloy

Subscribers: hfinkel, jevinskie, jmolloy, aadg, llvm-commits

Differential Revision: http://reviews.llvm.org/D15146

llvm-svn: 257897
2016-01-15 15:52:05 +00:00
Matthew Simpson
676ccfcd0a Revert "[SLP] Vectorize the index computations of getelementptr instructions."
This reverts commit r257800.

llvm-svn: 257888
2016-01-15 13:10:46 +00:00
James Molloy
7697faf6db [InstCombine] Rewrite bswap/bitreverse handling completely.
There are several requirements that ended up with this design;
  1. Matching bitreversals is too heavyweight for InstCombine and doesn't really need to be done so early.
  2. Bitreversals and byteswaps are very related in their matching logic.
  3. We want to implement support for matching more advanced bswap/bitreverse patterns like partial bswaps/bitreverses.
  4. Bswaps are best matched early in InstCombine.

The result of these is that a new utility function is created in Transforms/Utils/Local.h that can be configured to search for bswaps, bitreverses or both. InstCombine uses it to find only bswaps, CGP uses it to find only bitreversals.

We can then extend the matching logic in one place only.

llvm-svn: 257875
2016-01-15 09:20:19 +00:00
Keno Fischer
927308d763 Reapply r257105 "[Verifier] Check that debug values have proper size"
I originally reapplied this in 257550, but had to revert again due to bot
breakage. The only change in this version is to allow either the TypeSize
or the TypeAllocSize of the variable to be the one represented in debug info
(hopefully in the future we can figure out how to encode the difference).
Additionally, several bot failures following r257550, were due to
optimizer bugs now fixed in r257787 and r257795.

r257550 commit message was:

```
The follow extra changes were made to test cases:

Manually making the variable be the actual type instead of a pointer
to avoid pointer-size differences in generic code:

    LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll
    LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll
    LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll
    LLVM :: DebugInfo/Generic/varargs.ll

Delete sizing information from debug info for the same reason
(but the presence of the pointer was important to the test case):

    LLVM :: DebugInfo/Generic/restrict.ll
    LLVM :: DebugInfo/Generic/tu-composite.ll
    LLVM :: Linker/type-unique-type-array-a.ll
    LLVM :: Linker/type-unique-simple2.ll

Fixing an incorrect DW_OP_deref

    LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll

Fixing a missing DW_OP_deref

    LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll

Additionally, clang should no longer complain during bootstrap should no
longer happen after r257534.

The original commit message was:
``
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref
``

```

llvm-svn: 257850
2016-01-15 00:46:17 +00:00
Matthew Simpson
b2378417a2 [SLP] Vectorize the index computations of getelementptr instructions.
This patch seeds the SLP vectorizer with getelementptr indices. The primary
motivation in doing so is to vectorize gather-like idioms beginning with
consecutive loads (e.g., g[a[0] - b[0]] + g[a[1] - b[1]] + ...). While these
cases could be vectorized with a top-down phase, seeding the existing bottom-up
phase with the index computations avoids the complexity, compile-time, and
phase ordering issues associated with a full top-down pass. Only bundles of
single-index getelementptrs with non-constant differences are considered for
vectorization.

Differential Revision: http://reviews.llvm.org/D14829

llvm-svn: 257800
2016-01-14 20:46:27 +00:00
Keno Fischer
33808751c1 [SROA] Also insert a bit piece expression if only one piece is needed
Summary: If SROA creates only one piece (e.g. because the other is not needed),
it still needs to create a bit_piece expression if that bit piece is smaller
than the original size of the alloca.

Reviewers: aprantl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16187

llvm-svn: 257795
2016-01-14 20:06:34 +00:00
Keno Fischer
939b9c069c [Utils] Fix incorrect dbg.declare store conversion
Summary: The dbg.declare -> dbg.value conversion did not check which operand of
the store instruction the alloca was passed to. As a result code that stored the
address of an alloca, rather than storing to the alloca, would still trigger
the conversion routine, leading to the insertion of an incorrect dbg.value
intrinsic.

Reviewers: aprantl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16169

llvm-svn: 257787
2016-01-14 19:12:27 +00:00
Akira Hatanaka
3525b3a30f [Inliner] Merge the attributes of the caller and callee functions
This patch turns off the fast-math optimization attribute on the caller
if the callee's fast-math attribute is not turned on.

For example,

- before inlining
 caller: "less-precise-fpmad"="true"
 callee: "less-precise-fpmad"="false"

- after inlining
 caller: "less-precise-fpmad"="false"

Alternatively, it's possible to block inlining if the caller's and
callee's attributes don't match. If this approach is preferable to the
one in this patch, we can discuss post-commit.

rdar://problem/19836465

Differential Revision: http://reviews.llvm.org/D7802

llvm-svn: 257575
2016-01-13 06:02:45 +00:00
Keno Fischer
97a5fb3666 Re-Revert r257105 (Verifier debug info changes)
While I investigate some new buildbot failures. This was originally reapplied
as r257550 and r257558.

llvm-svn: 257563
2016-01-13 02:31:14 +00:00
Keno Fischer
d8825d5008 Reapply r257105 "[Verifier] Check that debug values have proper size"
The follow extra changes were made to test cases:

Manually making the variable be the actual type instead of a pointer
to avoid pointer-size differences in generic code:

    LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll
    LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll
    LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll
    LLVM :: DebugInfo/Generic/varargs.ll

Delete sizing information from debug info for the same reason
(but the presence of the pointer was important to the test case):

    LLVM :: DebugInfo/Generic/restrict.ll
    LLVM :: DebugInfo/Generic/tu-composite.ll
    LLVM :: Linker/type-unique-type-array-a.ll
    LLVM :: Linker/type-unique-simple2.ll

Fixing an incorrect DW_OP_deref

    LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll

Fixing a missing DW_OP_deref

    LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll

Additionally, clang should no longer complain during bootstrap should no
longer happen after r257534.

The original commit message was:
```
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref
```

llvm-svn: 257550
2016-01-13 00:31:44 +00:00
Fiona Glaser
3959ead5a7 CannotBeOrderedLessThanZero: add some missing cases
llvm-svn: 257542
2016-01-12 23:37:30 +00:00
Keno Fischer
a2e765d377 [Utils] Insert DW_OP_bit_piece when only describing part of the variable
Summary: The dbg.declare -> dbg.value conversion looks through any zext/sext
to find a value to describe the variable (in the expectation that those
zext/sext instruction will go away later). However, those values do not
cover the entire variable and thus need a DW_OP_bit_piece.

Reviewers: aprantl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16061

llvm-svn: 257534
2016-01-12 22:46:09 +00:00
Sanjay Patel
489a46e98d [LibCallSimplifier] use instruction-level fast-math-flags to transform pow(x, 0.5) calls
Also, propagate the FMF to the newly created sqrt() call.

llvm-svn: 257503
2016-01-12 19:06:35 +00:00
Teresa Johnson
772d13fff8 Fix bot failure from r257493: remove extraneous temp file read
This was left from an earlier version of the test.

llvm-svn: 257494
2016-01-12 17:53:59 +00:00
Teresa Johnson
10e78a41c7 [ThinLTO] Handle an external call from an import to an alias in dest
The findExternalCalls routine ignores calls to functions already
defined in the dest module. This was not handling the case where
the definition in the current module is actually an alias to a
function call.

llvm-svn: 257493
2016-01-12 17:48:44 +00:00
Sanjay Patel
91e6a8ee15 [LibCallSimplifier] use instruction-level fast-math-flags to transform pow(exp(x)) calls
See also:
http://reviews.llvm.org/rL255555
http://reviews.llvm.org/rL256871
http://reviews.llvm.org/rL256964
http://reviews.llvm.org/rL257400
http://reviews.llvm.org/rL257404
http://reviews.llvm.org/rL257414

llvm-svn: 257491
2016-01-12 17:30:37 +00:00
Sanjay Patel
930be29a19 consolidate exp/exp2 tests
The transform is identical, so keep the tests together and save some overhead.

llvm-svn: 257484
2016-01-12 17:00:38 +00:00
Sanjay Patel
c16495bc10 Add/edit tests to include instruction-level FMF on calls
Prepatory patch before changing LibCallSimplifier to use the FMF.
Also, tighten the CHECK lines and give the tests more meaningful names.
Similar changes to:
http://reviews.llvm.org/rL257414

llvm-svn: 257481
2016-01-12 16:50:17 +00:00
Teresa Johnson
d8e00ee37a [IRMover] Don't copy personality, etc unless creating def
Function::copyAttributesFrom will copy the personality function, prefix
data and prolog data from the source function to the new function, and
is invoked when the IRMover copies the function prototype. This puts a
reference to a constant in the source module on a function in the dest
module, which causes an error when deleting the source module after
importing, since the personality function in the source module still has
uses (this would presumably also be an issue for the prologue and prefix
data). Remove the copies added to the dest copy when creating the new
prototype, as they are mapped properly when/if we link the function body.

llvm-svn: 257420
2016-01-12 00:24:24 +00:00
Sanjay Patel
42e7daf81c [LibCallSimplifier] use instruction-level fast-math-flags to transform log calls
Also, add tests to verify that we're checking 'fast' on both calls of each transform pair,
tighten the CHECK lines, and give the tests more meaningful names.

This is a continuation of:
http://reviews.llvm.org/rL255555
http://reviews.llvm.org/rL256871
http://reviews.llvm.org/rL256964
http://reviews.llvm.org/rL257400
http://reviews.llvm.org/rL257404

llvm-svn: 257414
2016-01-11 23:31:48 +00:00
Sanjay Patel
dfd0791d6d [LibCallSimplifier] don't allow sqrt transform unless all ops are unsafe
Fix the FIXME added with:
http://reviews.llvm.org/rL257400

llvm-svn: 257404
2016-01-11 22:50:36 +00:00
Justin Bogner
98deb31a78 LoopUnroll: Use the optsize threshold for minsize as well
Currently we're unrolling loops more in minsize than in optsize, which
means -Oz will have a larger code size than -Os. That doesn't make any
sense.

This resolves the FIXME about this in LoopUnrollPass and extends the
optsize test to make sure we use the smaller threshold for minsize as
well.

llvm-svn: 257402
2016-01-11 22:39:43 +00:00
Sanjay Patel
9ac7e74796 [LibCallSimplifier] use instruction-level fast-math-flags to transform sqrt calls
This is a continuation of adding FMF to call instructions:
http://reviews.llvm.org/rL255555

The intent of the patch is to preserve the current behavior of the transform except
that we use the sqrt instruction's 'fast' attribute as a trigger rather than the
function-level attribute.

But this raises a bug noted by the new FIXME comment.

In order to do this transform:
sqrt((x * x) * y) ---> fabs(x) * sqrt(y)

...we need all of the sqrt, the first fmul, and the second fmul to be 'fast'. 
If any of those ops is strict, we should bail out.

Differential Revision: http://reviews.llvm.org/D15937

llvm-svn: 257400
2016-01-11 22:34:19 +00:00
Silviu Baranga
90360019af Revert r257164 - it has caused spec2k6 failures in LTO mode
llvm-svn: 257340
2016-01-11 16:19:38 +00:00
David Majnemer
cc00affa03 Add test for r257279.
llvm-svn: 257280
2016-01-10 07:13:33 +00:00
Chen Li
e4ebcc71ab [SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad.
Summary:
This is a fix of D13718. D13718 was committed but then reverted because of the following bug:
https://llvm.org/bugs/show_bug.cgi?id=25299

This patch fixes the issue shown in the bug.

Reviewers: majnemer, reames

Subscribers: jevinskie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14308

llvm-svn: 257277
2016-01-10 05:48:01 +00:00
Manuel Jacob
3215e2f49f [RS4GC] Update and simplify handling of Constants in findBaseDefiningValueOfVector().
Summary:
This is analogous to r256079, which removed an overly strong assertion, and
r256812, which simplified the code by replacing three conditionals by one.

Reviewers: reames

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D16019

llvm-svn: 257250
2016-01-09 04:02:16 +00:00
Philip Reames
6204508de8 [rs4gc] Optionally directly relocated vector of pointers
This patch teaches rewrite-statepoints-for-gc to relocate vector-of-pointers directly rather than trying to split them. This builds on the recent lowering/IR changes to allow vector typed gc.relocates.

The motivation for this is that we recently found a bug in the vector splitting code where depending on visit order, a vector might not be relocated at some safepoint. Specifically, the bug is that the splitting code wasn't updating the side tables (live vector) of other safepoints. As a result, a vector which was live at two safepoints might not be updated at one of them. However, if you happened to visit safepoints in post order over the dominator tree, everything worked correctly. Weirdly, it turns out that post order is actually an incredibly common order to visit instructions in in practice. Frustratingly, I have not managed to write a test case which actually hits this. I can only reproduce it in large IR files produced by actual applications.

Rather than continue to make this code more complicated, we can remove all of the complexity by just representing the relocation of the entire vector natively in the IR.

At the moment, the new functionality is hidden behind a flag. To use this code, you need to pass "-rs4gc-split-vector-values=0". Once I have a chance to stress test with this option and get feedback from other users, my plan is to flip the default and remove the original splitting code. I would just remove it now, but given the rareness of the bug, I figured it was better to leave it in place until the new approach has been stress tested.

Differential Revision: http://reviews.llvm.org/D15982

llvm-svn: 257244
2016-01-09 01:31:13 +00:00
Haicheng Wu
d23d8b325b [JumpThreading] Split select that has constant conditions coming from the PHI node
Look for PHI/Select in the same BB of the form

bb:
  %p = phi [false, %bb1], [true, %bb2], [false, %bb3], [true, %bb4], ...
  %s = select p, trueval, falseval

And expand the select into a branch structure. This later enables
jump-threading over bb in this pass.

Using the similar approach of SimplifyCFG::FoldCondBranchOnPHI(), unfold
select if the associated PHI has at least one constant.  If the unfolded
select is not jump-threaded, it will be folded again in the later
optimizations.

llvm-svn: 257198
2016-01-08 19:39:39 +00:00
Justin Bogner
879f86bb78 LoopInfo: Simplify ownership of Loop objects
It's strange that LoopInfo mostly owns the Loop objects, but that it
defers deleting them to the loop pass manager. Instead, change the
oddly named "updateUnloop" to "markAsRemoved" and have it queue the
Loop object for deletion. We can't delete the Loop immediately when we
remove it, since we need its pointer identity still, so we'll mark the
object as "invalid" so that clients can see what's going on.

llvm-svn: 257191
2016-01-08 19:08:53 +00:00
Teresa Johnson
ce9de68594 [ThinLTO] Delay metadata materializtion in function importer
The function importer was still materializing metadata when modules were
loaded for function importing. We only want to materialize it when we
are going to invoke the metadata linking postpass. Materializing it
before function importing is not only unnecessary, but also causes
metadata referenced by imported functions to be mapped in early, and
then not connected to the rest of the module level metadata when it is
ultimately linked in.

Augmented the test case to specifically check for the metadata being
properly connected, which it wasn't before this fix.

llvm-svn: 257171
2016-01-08 14:17:41 +00:00
Silviu Baranga
93f7373429 Re-commit r257064, this time with a fixed assert
In setInsertionPoint if the value is not a PHI, Instruction or
Argument it should be a Constant, not a ConstantExpr.

Original commit message:

[InstCombine] Look through PHIs, GEPs, IntToPtrs and PtrToInts to expose more constants when comparing GEPs

Summary:
When comparing two GEP instructions which have the same base pointer
and one of them has a constant index, it is possible to only compare
indices, transforming it to a compare with a constant. This removes
one use for the GEP instruction with the constant index, can reduce
register pressure and can sometimes lead to removing the comparisson
entirely.

InstCombine was already doing this when comparing two GEPs if the base
pointers were the same. However, in the case where we have complex
pointer arithmetic (GEPs applied to GEPs, PHIs of GEPs, conversions to
or from integers, etc) the value of the original base pointer will be
hidden to the optimizer and this transformation will be disabled.

This change detects when the two sides of the comparison can be
expressed as GEPs with the same base pointer, even if they don't
appear as such in the IR. The transformation will convert all the
pointer arithmetic to arithmetic done on indices and all the relevant
uses of GEPs to GEPs with a common base pointer. The GEP comparison
will be converted to a comparison done on indices.

Reviewers: majnemer, jmolloy

Subscribers: hfinkel, jevinskie, jmolloy, aadg, llvm-commits

Differential Revision: http://reviews.llvm.org/D15146

llvm-svn: 257164
2016-01-08 11:11:04 +00:00
Chandler Carruth
1b5532dd29 [attrs] Split the late-revisit pattern for deducing norecurse in
a top-down manner into a true top-down or RPO pass over the call graph.

There are specific patterns of function attributes, notably the
norecurse attribute, which are most effectively propagated top-down
because all they us caller information.

Walk in RPO over the call graph SCCs takes the form of a module pass run
immediately after the CGSCC pass managers postorder walk of the SCCs,
trying again to deduce norerucrse for each singular SCC in the call
graph.

This removes a very legacy pass manager specific trick of using a lazy
revisit list traversed during finalization of the CGSCC pass. There is
no analogous finalization step in the new pass manager, and a lazy
revisit list is just trying to produce an RPO iteration of the call
graph. We can do that more directly if more expensively. It seems
unlikely that this will be the expensive part of any compilation though
as we never examine the function bodies here. Even in an LTO run over
a very large module, this should be a reasonable fast set of operations
over a reasonably small working set -- the function call graph itself.

In the future, if this really is a compile time performance issue, we
can look at building support for both post order and RPO traversals
directly into a pass manager that builds and maintains the PO list of
SCCs.

Differential Revision: http://reviews.llvm.org/D15785

llvm-svn: 257163
2016-01-08 10:55:52 +00:00
Sanjay Patel
898b29bc66 [InstCombine] insert a new shuffle in a safe place (PR25999)
Limit this transform to a basic block and guard against PHIs.
Hopefully, this fixes the remaining failures in PR25999:
https://llvm.org/bugs/show_bug.cgi?id=25999

llvm-svn: 257133
2016-01-08 01:39:16 +00:00
Aditya Nandakumar
f2609534e3 Instructions to be redone only if from the same BB
While adding instructions(possible roots) to be redone, make sure they
are from the same basic block.

llvm-svn: 257112
2016-01-07 23:22:55 +00:00
Keno Fischer
37415bceb0 Temporarily revert r257105 "[Verifier] Check that debug values have proper size"
Looks like there's a case where clang generates debug info that triggers
the new verifier check. Reverting while investigating.

llvm-svn: 257107
2016-01-07 22:39:11 +00:00
Keno Fischer
c41229c60a [Verifier] Check that debug values have proper size
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref

Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D14276

llvm-svn: 257105
2016-01-07 22:18:37 +00:00
David Majnemer
2296864d81 [SCCP] Don't violate the lattice invariants
We marked values which are 'undef' as constant instead of undefined
which violates SCCP's invariants.  If we can figure out that a
computation results in 'undef', leave it in the undefined state.

This fixes PR16052.

llvm-svn: 257102
2016-01-07 21:36:16 +00:00
David Majnemer
bda025cd91 Add test for r256912
I forgot to add this with the rest of r256912.

llvm-svn: 257088
2016-01-07 19:27:16 +00:00