1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 22:12:57 +02:00
Commit Graph

13263 Commits

Author SHA1 Message Date
Philip Reames
babfa0e3a4 [JumpThreading] Simplify comparisons when simplifying branches
If we have recognized that a conditional is constant at a particular location in the code (while trying to decide if we can simplify a conditional branch), we can eagerly replace that condition with a constant if it's definition is post dominated by the branch in question.

In practice, this ends up being a compile time savings at most. JumpThreading would have visited each using branch anyways. CVP would have visited the cmp itself again. Unless LVI gives up early, we shouldn't gain any addition power by doing this transformation early. What we do gain is simplicity and compile time.

Differential Revision: http://reviews.llvm.org/D9312

llvm-svn: 236684
2015-05-07 00:19:14 +00:00
David Blaikie
78bd95f7d4 Revert "[opaque pointer type] Pass explicit pointer type through GEP constant folding"
Causes regressions in Clang. Reverting while I investigate.

This reverts commit r236670.

llvm-svn: 236678
2015-05-06 23:56:21 +00:00
Sanjoy Das
a21256f2f4 [Statepoints] Clean up PlaceSafepoints.cpp: de-duplicate code.
Common duplicated code and remove unnecessary code.

llvm-svn: 236674
2015-05-06 23:53:21 +00:00
Sanjoy Das
3b649a4b42 [Statepoints] Clean up PlaceSafepoints.cpp: variable naming.
Use CamelCase.  NFC.

llvm-svn: 236673
2015-05-06 23:53:19 +00:00
Sanjoy Das
9dd1354b02 [IRBuilder] Add a CreateGCStatepointInvoke.
Renames the original CreateGCStatepoint to CreateGCStatepointCall, and
moves invoke creating functionality from PlaceSafepoints.cpp to
IRBuilder.cpp.

This changes the labels generated for PlaceSafepoints/invokes.ll so use
a regex there to make the basic block labels more resilient.

llvm-svn: 236672
2015-05-06 23:53:09 +00:00
David Blaikie
70e0090e88 [opaque pointer type] Pass explicit pointer type through GEP constant folding
llvm-svn: 236670
2015-05-06 23:49:14 +00:00
Pete Cooper
762f70e897 Change typeIncompatible to return an AttrBuilder instead of new-ing an AttributeSet.
This makes use of the new API which can remove attributes from a set given a builder.

This is much faster than creating a temporary set and reduces llc time by about 0.3% which was all spent creating temporary attributes sets on the context.

llvm-svn: 236668
2015-05-06 23:19:56 +00:00
Alexey Samsonov
e7bde408aa [SanitizerCoverage] Fix a couple of typos. NFC.
llvm-svn: 236643
2015-05-06 21:35:25 +00:00
Ismail Pazarbasi
30cb89fe4a Implement createSanitizerCtor, common helper function for all sanitizers
Summary:
This helper function creates a ctor function, which calls sanitizer's
init function with given arguments. This constructor is then expected
to be added to module's ctors. The patch helps unifying how sanitizer
constructor functions are created, and how init functions are called
across all sanitizers.

Reviewers: kcc, samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8777

llvm-svn: 236627
2015-05-06 18:48:22 +00:00
Wei Mi
338b822ab9 [X86] Disable loop unrolling in loop vectorization pass when VF is 1.
The patch disabled unrolling in loop vectorization pass when VF==1 on x86 architecture,
by setting MaxInterleaveFactor to 1. Unrolling in loop vectorization pass may introduce
the cost of overflow check, memory boundary check and extra prologue/epilogue code when
regular unroller will unroll the loop another time. Disable it when VF==1 remove the
unnecessary cost on x86. The same can be done for other platforms after verifying
interleaving/memory bound checking to be not perf critical on those platforms.

Differential Revision: http://reviews.llvm.org/D9515

llvm-svn: 236613
2015-05-06 17:12:25 +00:00
Adam Nemet
c1a1c6bbed [DomTree] verifyDomTree to unconditionally perform DT verification
I folded the check for the flag -verify-dom-info into the only caller
where I think it is supposed to be checked: verifyAnalysis.  (The idea
of the flag is to enable this expensive verification in
verifyPreservedAnalysis.)

I'm assuming that when manually scheduling the verification pass
with -passes=verify<domtree>, we do want to perform the verification.

llvm-svn: 236575
2015-05-06 08:18:41 +00:00
Sanjoy Das
197092fa7d [Statepoint] Clean up Statepoint.h: accessor names.
Use getFoo() as accessors consistently and some other naming changes.

llvm-svn: 236564
2015-05-06 02:36:26 +00:00
David Majnemer
06d0bf99d8 [Inliner] Discard empty COMDAT groups
COMDAT groups which have become rendered unused because of inline are
discardable if we can prove that we've made the group empty.

This fixes PR22285.

llvm-svn: 236539
2015-05-05 20:14:22 +00:00
David Blaikie
b888632efe [opaque pointer type] Track explicit GEP pointee type through in-memory IR
llvm-svn: 236510
2015-05-05 18:03:48 +00:00
Justin Bogner
6f44b3969b InstrProf: Instrumenter support for setting profile output from command line
This change is the second of 3 patches to add support for specifying
the profile output from the command line via -fprofile-instr-generate=<path>,
where the specified output path/file will be overridden by the
LLVM_PROFILE_FILE environment variable.

This patch adds the necessary support to the llvm instrumenter, specifically
a new member of GCOVOptions for clang to save the specified filename, and
support for calling the new compiler-rt interface from __llvm_profile_init.

Patch by Teresa Johnson. Thanks!

llvm-svn: 236288
2015-04-30 23:49:23 +00:00
Matthias Braun
bfbcc9d6b2 InstCombineSimplifyDemanded: Remove nsw/nuw flags when optimizing demanded bits
When optimizing demanded bits of the operands of an Add we have to
remove the nsw/nuw flags as we have no guarantee anymore that we don't
wrap.  This is legal here because the top bit is not demanded.  In fact
this operaion was already performed but missed in the case of an Add
with a constant on the right side.  To fix this this patch refactors the
code to unify the code paths in SimplifyDemandedUseBits() handling of
Add/Sub:

- The transformation of Add->Or is removed from the simplify demand
  code because the equivalent transformation exists in
  InstCombiner::visitAdd()
- KnownOnes/KnownZero are not adjusted for Add x, C anymore as
  computeKnownBits() already performs these computations.
- The simplification of the operands is unified. In this new version
  constant on the right side of a Sub are shrunk now as I could not find
  a reason why not to do so.
- The special case for clearing nsw/nuw in ShrinkDemandedConstant() is
  not necessary anymore as the caller does that already.

Differential Revision: http://reviews.llvm.org/D9415

llvm-svn: 236269
2015-04-30 22:05:30 +00:00
Matthias Braun
dd934add12 InstCombine: Move Sub->Xor rule from SimplifyDemanded to InstCombine
The rule that turns a sub to xor if the LHS is 2^n-1 and the remaining bits
are known zero, does not use the demanded bits at all: Move it to the
normal InstCombine code path.

Differential Revision: http://reviews.llvm.org/D9417

llvm-svn: 236268
2015-04-30 22:04:26 +00:00
Sanjoy Das
dd8fba973b [InstCombine] Add new rule for MIN(MAX(~A, ~B), ~C) et. al.
Summary:
Optimizing these well are especially interesting for IRCE since it
"clamps" values by generating this sort of pattern through SCEV
expressions.

Depends on D9352.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9353

llvm-svn: 236203
2015-04-30 04:56:04 +00:00
Sanjoy Das
668a7c9deb [InstCombine] Add a new formula for SMIN.
Summary:
After this change `MatchSelectPattern` recognizes the following form
of SMIN:

  Y >s C ? ~Y : ~C == ~Y <s ~C ? ~Y : ~C = SMIN(~Y, ~C)

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9352

llvm-svn: 236202
2015-04-30 04:56:00 +00:00
David Blaikie
ba3e12d3c8 [opaque pointer type] Store the value type of an alloca
llvm-svn: 236175
2015-04-29 23:00:35 +00:00
David Blaikie
0f91b70796 [opaque pointer type] Pass GlobalAlias the actual pointer type rather than decomposing it into pointee type + address space
Many of the callers already have the pointer type anyway, and for the
couple of callers that don't it's pretty easy to call PointerType::get
on the pointee type and address space.

This avoids LLParser from using PointerType::getElementType when parsing
GlobalAliases from IR.

llvm-svn: 236160
2015-04-29 21:22:39 +00:00
Duncan P. N. Exon Smith
09b5c9c24d IR: Give 'DI' prefix to debug info metadata
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`.  The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.

Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one.  It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs.  YMMV of
course.

Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py.  I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three.  It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).

Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.

llvm-svn: 236120
2015-04-29 16:38:44 +00:00
Philip Reames
ddbf5c48a9 [RewriteStatepointsForGC] Exclude constant values from being considered live at a safepoint
There can be various constant pointers in the IR which do not get relocated at a safepoint. One example is the address of a global variable. Another example is a pointer created via inttoptr. Note that the optimizer itself likes to create such inttoptrs when locally propagating constants through dynamically dead code.

To deal with this, we need to exclude uses of constants from contributing to the liveness of a safepoint which might reach that use. At some later date, it might be worth exploring what could be done to support the relocation of various special types of "constants", but that's future work.

Differential Revision: http://reviews.llvm.org/D9236

llvm-svn: 235821
2015-04-26 19:48:03 +00:00
Philip Reames
95df870e77 Don't Place Entry Safepoints Before the llvm.frameescape() Intrinsic
llvm.frameescape() intrinsic is not a real call. The intrinsic can only exist in the entry block. Inserting a gc.statepoint() before llvm.frameescape() may split the entry block, and push the intrinsic out of the entry block.

Patch by: Swaroop.Sridhar@microsoft.com
Differential Revision: http://reviews.llvm.org/D8910

llvm-svn: 235820
2015-04-26 19:41:23 +00:00
Sanjay Patel
6e05431540 [x86] instcombine more cases of insertps into a shufflevector
This is a follow-on to D8833 (insertps optimization when the zero mask is not used).

In this patch, we check for the case where the zmask is used, but both input vectors
to the insertps intrinsic are the same operand or the zmask overrides the destination
lane. This lets us replace the 2nd shuffle input operand with the zero vector.

Differential Revision: http://reviews.llvm.org/D9257

llvm-svn: 235810
2015-04-25 20:55:25 +00:00
Hans Wennborg
0a37a86ca9 SimplifyCFG: Correctly handle switch lookup tables which fully cover the input type and use bit tests to check for holes
When using bit tests for hole checks, we call AddPredecessorToBlock to give the
phi node a value from the bit test block. This would break if we've
previously called removePredecessor on the default destination because the
switch is fully covered.

Test case by Mark Lacey.

llvm-svn: 235771
2015-04-24 20:57:56 +00:00
Andrew Kaylor
2ac2b48422 Fix LoopInterchange/reductions.ll test for debug builds
llvm-svn: 235734
2015-04-24 17:39:16 +00:00
Aaron Ballman
3bccb51b7e Removing dead code; NFC. This code was triggering a C4718 warning (recursive call has no side effects, deleting) with MSVC.
llvm-svn: 235717
2015-04-24 12:51:45 +00:00
Jingyue Wu
f2297fbf03 Resurrect r235688
We should skip vector types which are not SCEVable.

test/CodeGen/NVPTX/sched2.ll passes

llvm-svn: 235695
2015-04-24 04:22:39 +00:00
Michael Zolotukhin
a881230f49 Fix a couple of typos in comments.
llvm-svn: 235674
2015-04-24 00:10:27 +00:00
Michael Zolotukhin
4742917459 Fix comment for NoCommonBits.
Maybe there is a better wording, but at least it should be technically
correct now.

llvm-svn: 235660
2015-04-23 22:55:48 +00:00
David Blaikie
62279d8d0a Recommit r235458: [opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst
(reverted in r235533)

Original commit message:

"Calls to llvm::Value::mutateType are becoming extra-sensitive now that
instructions have extra type information that will not be derived from
operands or result type (alloca, gep, load, call/invoke, etc... ). The
special-handling for mutateType will get more complicated as this work
continues - it might be worth making mutateType virtual & pushing the
complexity down into the classes that need special handling. But with
only two significant uses of mutateType (vectorization and linking) this
seems OK for now.

Totally open to ideas/suggestions/improvements, of course.

With this, and a bunch of exceptions, we can roundtrip an indirect call
site through bitcode and IR. (a direct call site is actually trickier...
I haven't figured out how to deal with the IR deserializer's lazy
construction of Function/GlobalVariable decl's based on the type of the
entity which means looking through the "pointer to T" type referring to
the global)"

The remapping done in ValueMapper for LTO was insufficient as the types
weren't correctly mapped (though I was using the post-mapped operands,
some of those operands might not have been mapped yet so the type
wouldn't be post-mapped yet). Instead use the pre-mapped type and
explicitly map all the types.

llvm-svn: 235651
2015-04-23 21:36:23 +00:00
Philip Reames
3f4453de96 Move Value.isDereferenceablePointer to ValueTracking [NFC]
Move isDereferenceablePointer function to Analysis. This function recursively tracks dereferencability over a chain of values like other functions in ValueTracking.

This refactoring is motivated by further changes to support dereferenceable_or_null attribute (http://reviews.llvm.org/D8650). isDereferenceablePointer will be extended to perform context-sensitive analysis and IR is not a good place to have such functionality.

Patch by: Artur Pilipenko <apilipenko@azulsystems.com>
Differential Revision: reviews.llvm.org/D9075

llvm-svn: 235611
2015-04-23 17:36:48 +00:00
Karthik Bhat
c8423bd788 Move common loop utility function isInductionPHI into LoopUtils.cpp
This patch refactors the definition of common utility function "isInductionPHI" to LoopUtils.cpp.
This fixes compilation error when configured with -DBUILD_SHARED_LIBS=ON

llvm-svn: 235577
2015-04-23 08:29:20 +00:00
Karthik Bhat
715aca2f92 Add support to interchange loops with reductions.
This patch enables interchanging of tightly nested loops with reductions.
Differential Revision: http://reviews.llvm.org/D8314

llvm-svn: 235571
2015-04-23 04:51:44 +00:00
David Majnemer
c54f703dad [InstCombine] Use a more targeted fix instead of r235544
Only clear out the NSW/NUW flags if we are optimizing 'add'/'sub' while
taking advantage that the sign bit is not set.  We do this optimization
to further shrink the mask but shrinking the mask isn't NSW/NUW
preserving in this case.

llvm-svn: 235558
2015-04-22 22:42:05 +00:00
David Majnemer
61d5d8dd8e [InstCombine] Clear out nsw/nuw if we modify computation in the chain
An nsw/nuw operation relies on the values feeding into it to not
overflow if 'poison' is not to be produced.  This means that
optimizations which make modifications to the bottom of a chain (like
SimplifyDemandedBits) must strip out nsw/nuw if they cannot ensure that
they will be preserved.

This fixes PR23309.

llvm-svn: 235544
2015-04-22 20:59:28 +00:00
David Blaikie
ec41387ad6 Revert "[opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst"
This reverts commit r235458.

It looks like this might be breaking something LTO-ish. Looking into it
& will recommit with a fix/test case/etc once I've got more to go on.

llvm-svn: 235533
2015-04-22 18:16:49 +00:00
Sanjay Patel
4fe485083f don't repeat function names in comments; NFC
llvm-svn: 235531
2015-04-22 18:04:46 +00:00
David Blaikie
0477b5459c [opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst
Calls to llvm::Value::mutateType are becoming extra-sensitive now that
instructions have extra type information that will not be derived from
operands or result type (alloca, gep, load, call/invoke, etc... ). The
special-handling for mutateType will get more complicated as this work
continues - it might be worth making mutateType virtual & pushing the
complexity down into the classes that need special handling. But with
only two significant uses of mutateType (vectorization and linking) this
seems OK for now.

Totally open to ideas/suggestions/improvements, of course.

With this, and a bunch of exceptions, we can roundtrip an indirect call
site through bitcode and IR. (a direct call site is actually trickier...
I haven't figured out how to deal with the IR deserializer's lazy
construction of Function/GlobalVariable decl's based on the type of the
entity which means looking through the "pointer to T" type referring to
the global)

llvm-svn: 235458
2015-04-21 23:26:57 +00:00
Wei Mi
24e5246dd6 Limiting gep merging to fix the performance problem described in
https://llvm.org/bugs/show_bug.cgi?id=23163.

Gep merging sometimes behaves like a reverse CSE/LICM optimization,
which has negative impact on performance. In this patch we restrict
gep merging to happen only when the indexes to be merged are both consts,
which ensures such merge is always beneficial.

The patch makes gep merging only happen in very restrictive cases.
It is possible that some analysis/optimization passes rely on the merged
geps to get better result, and we havn't notice them yet. We will be ready
to further improve it once we see the cases.

Differential Revision: http://reviews.llvm.org/D8911

llvm-svn: 235455
2015-04-21 23:02:15 +00:00
Wei Mi
6a1df2cb86 Revert r235451 since it is attached to a wrong Differential Revision. Sorry.
llvm-svn: 235453
2015-04-21 22:56:09 +00:00
Wei Mi
9c45f921b6 Limiting gep merging to fix the performance problem described in
https://llvm.org/bugs/show_bug.cgi?id=23163.

Gep merging sometimes behaves like a reverse CSE/LICM optimizations,
which has negative impact on performance. In this patch we restrict
gep merging to happen only when the indexes to be merged are both consts,
which ensures such merge is always beneficial.

The patch makes gep merging only happen in very restrictive cases.
It is possible that some analysis/optimization passes rely on the merged
geps to get better result, and we havn't notice them yet. We will be ready
to further improve it once we see the cases.

Differential Revision: http://reviews.llvm.org/D9007

llvm-svn: 235451
2015-04-21 22:37:09 +00:00
Ahmed Bougacha
1da6ba7a99 [MemCpyOpt] Use the raw i8* dest when optimizing memset+memcpy.
MemIntrinsic::getDest() looks through pointer casts, and using it
directly when building the new GEP+memset results in stuff like:

  %0 = getelementptr i64* %p, i32 16
  %1 = bitcast i64* %0 to i8*
  call ..memset(i8* %1, ...)

instead of the correct:

  %0 = bitcast i64* %p to i8*
  %1 = getelementptr i8* %0, i32 16
  call ..memset(i8* %1, ...)

Instead, use getRawDest, which just gives you the i8* value.
While there, use the memcpy's dest, as it's live anyway.

In most cases, when the optimization triggers, the memset and memcpy
sizes are the same, so the built memset is 0-sized and eliminated.
The problem occurs when they're different.

Fixes a regression caused by r235232: PR23300.

llvm-svn: 235419
2015-04-21 21:28:33 +00:00
Daniel Berlin
3991a891cf Revamp PredIteratorCache interface to be cleaner.
Summary:
This lets us use range based for loops.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9169

llvm-svn: 235416
2015-04-21 21:11:50 +00:00
Sanjoy Das
bcc5cadeec [LSR][NFC] Remove a stale comment.
The comment was made stale in r171735.

llvm-svn: 235414
2015-04-21 20:42:50 +00:00
Jingyue Wu
1d8abb9985 [SLSR] garbage-collect unused instructions
Summary:
After we rewrite a candidate, the instructions used by the old form may
become unused. This patch cleans up these unused instructions so that we
needn't run DCE after SLSR.

Test Plan: removed -dce in all the SLSR tests

Reviewers: broune, meheff

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9101

llvm-svn: 235410
2015-04-21 19:56:18 +00:00
Jingyue Wu
51502d56c6 [SeparateConstOffsetFromGEP] garbage-collect intermediate instructions
Summary: so that we needn't run DCE after this pass.

Test Plan: removed -dce from the commandline in split-gep.ll and split-gep-and-gvn.ll

Reviewers: meheff

Subscribers: llvm-commits, HaoLiu, hfinkel, jholewinski

Differential Revision: http://reviews.llvm.org/D9096

llvm-svn: 235409
2015-04-21 19:53:18 +00:00
Daniel Berlin
5fd7b29133 Move IDF Calculation to a separate file, expose an interface to it.
Summary:
MemorySSA uses this algorithm as well, and this enables us to reuse the code in both places.

There are no actual algorithm or datastructure changes in here, just code movement.

Reviewers: qcolombet, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9118

llvm-svn: 235406
2015-04-21 19:13:02 +00:00
Duncan P. N. Exon Smith
8911bb4284 DebugInfo: Drop rest of DIDescriptor subclasses
Delete the remaining subclasses of (the already deleted) `DIDescriptor`.
Part of PR23080.

llvm-svn: 235404
2015-04-21 18:44:06 +00:00
Duncan P. N. Exon Smith
8c237b09f4 DebugInfo: Assert dbg.declare/value insts are valid
Remove early returns for when `getVariable()` is null, and just assert
that it never happens.  The Verifier already confirms that there's a
valid variable on these intrinsics, so we should assume the debug info
isn't broken.  I also updated a check for a `!dbg` attachment, which the
Verifier similarly guarantees.

llvm-svn: 235400
2015-04-21 18:24:23 +00:00
Duncan P. N. Exon Smith
dc9077df15 DebugInfo: Delete subclasses of DIScope
Delete subclasses of (the already defunct) `DIScope`, updating users to
use the raw pointers from the `Metadata` hierarchy directly.

llvm-svn: 235356
2015-04-20 22:10:08 +00:00
Akira Hatanaka
3dec4c0844 [InlineFunction] Don't add lifetime markers for zero-sized allocas.
This commit fixes the code which adds lifetime markers in InlineFunction to skip
zero-sized allocas instead of asserting on them.

rdar://problem/20531155

llvm-svn: 235312
2015-04-20 16:11:05 +00:00
Karthik Bhat
e3028c8087 [NFC] Refactor identification of reductions as common utility function.
This patch refactors reduction identification code out of LoopVectorizer and
exposes them as common utilities.
No functional change.
Review: http://reviews.llvm.org/D9046

llvm-svn: 235284
2015-04-20 04:38:33 +00:00
Ahmed Bougacha
16c0cd0047 [MemCpyOpt] Don't force i64 when promoting memset/memcpy sizes.
Harden r235258 to support any integer bitwidth.  The quick glance at
the reference made me think only i32 and i64 were valid types, but
they're not special, so any overload is legal.

Thanks to David Majnemer for noticing!

llvm-svn: 235261
2015-04-18 23:06:04 +00:00
Ahmed Bougacha
b3e63fba19 [MemCpyOpt] Promote both memset/memcpy sizes if differently typed.
Followup to r235232, which caused PR23278.

We can't assume the memset and memcpy sizes have the same type, as
nothing in the language reference prevents that.
Instead, zext both to i64 if they disagree.

While there, robustify tests by using i8 %c rather than i8 0 for the
memset character.

llvm-svn: 235258
2015-04-18 17:57:41 +00:00
Benjamin Kramer
18e9411446 [InstCombine] Create zero constants on demand.
No functional change intended.

llvm-svn: 235257
2015-04-18 16:52:08 +00:00
David Majnemer
fa43478f01 [InstCombine] (mul nsw 1, INT_MIN) != (shl nsw 1, 31)
Multiplying INT_MIN by 1 doesn't trigger nsw.  However, shifting 1 into
the sign bit *does* trigger nsw.

llvm-svn: 235250
2015-04-18 04:41:30 +00:00
Duncan P. N. Exon Smith
c7f08b339f DebugInfo: Remove DIDescriptor from the DebugInfo API
Stop using `DIDescriptor` and its subclasses in the `DebugInfoFinder`
API, as well as the rest of the API hanging around in `DebugInfo.h`.

llvm-svn: 235240
2015-04-17 23:20:10 +00:00
Ahmed Bougacha
027739a2e4 [MemCpyOpt] Optimize double-storing by memset+memcpy.
A common idiom in some code is to do the following:

  memset(dst, 0, dst_size);
  memcpy(dst, src, src_size);

Some of the memset is redundant; instead, we can do:

  memcpy(dst, src, src_size);
  memset(dst + src_size, 0,
         dst_size <= src_size ? 0 : dst_size - src_size);

Original patch by: Joel Jones
Differential Revision: http://reviews.llvm.org/D498

llvm-svn: 235232
2015-04-17 22:20:57 +00:00
Jingyue Wu
80d7caa6b3 [NaryReassociate] run NaryReassociate iteratively
Summary:
An alternative is to use a worklist approach. However, that approach
would break the traversing order so that we couldn't lookup SeenExprs
efficiently. I don't see a clear winner here, so I picked the easier approach.

Along with two minor improvements:
1. preserves ScalarEvolution by forgetting instructions replaced
2. removes dead code locally avoiding the need of running DCE afterwards

Test Plan: add to slsr-add.ll a test that requires multiple iterations

Reviewers: broune, dberlin, atrick, meheff

Reviewed By: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9058

llvm-svn: 235151
2015-04-17 00:25:10 +00:00
Jingyue Wu
8020175ee7 [NaryReassociate] speeds up candidate searching
Summary:
This fixes a left-over efficiency issue in D8950.

As Andrew and Daniel suggested, we can store the candidates in a stack
and pop the top element when it does not dominate the current
instruction. This reduces the worst-case time complexity to O(n).

Test Plan: a new test in nary-add.ll that exercises this optimization.

Reviewers: broune, dberlin, meheff, atrick

Reviewed By: atrick

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D9055

llvm-svn: 235129
2015-04-16 18:42:31 +00:00
Sanjay Patel
f808782564 [X86, SSE] instcombine common cases of insertps intrinsics into shuffles
This is very similar to D8486 / r232852 (vperm2). If we treat insertps intrinsics
as shufflevectors, we can optimize them better.

I've left all but the full zero case of the zero mask variants out of this patch. 
I don't think those can be converted into a single shuffle in all cases, but I'd
be happy to be proven wrong as I was for vperm2f128.

Either way, we'd need to support whatever sequence we come up with for those cases
in the backend before converting them here.

Differential Revision: http://reviews.llvm.org/D8833

llvm-svn: 235124
2015-04-16 17:52:13 +00:00
Aaron Ballman
98ba163915 Silencing a -Wunused-but-set-variable warning; NFC.
llvm-svn: 235094
2015-04-16 13:29:36 +00:00
Duncan P. N. Exon Smith
963aaa7c5d DebugInfo: Gut DIScope, DIEnumerator and DISubrange
The only class the still has API left is `DIDescriptor` itself.

llvm-svn: 235067
2015-04-16 01:37:00 +00:00
Duncan P. N. Exon Smith
d7810b7970 DebugInfo: Gut DICompileUnit and DIFile
Continuing gutting `DIDescriptor` subclasses; this edition,
`DICompileUnit` and `DIFile`.  In the name of PR23080.

llvm-svn: 235055
2015-04-15 23:19:27 +00:00
Duncan P. N. Exon Smith
380b5bd2b0 DebugInfo: Remove 'inlinedAt:' field from MDLocalVariable
Remove 'inlinedAt:' from MDLocalVariable.  Besides saving some memory
(variables with it seem to be single largest `Metadata` contributer to
memory usage right now in -g -flto builds), this stops optimization and
backend passes from having to change local variables.

The 'inlinedAt:' field was used by the backend in two ways:

 1. To tell the backend whether and into what a variable was inlined.
 2. To create a unique id for each inlined variable.

Instead, rely on the 'inlinedAt:' field of the intrinsic's `!dbg`
attachment, and change the DWARF backend to use a typedef called
`InlinedVariable` which is `std::pair<MDLocalVariable*, MDLocation*>`.
This `DebugLoc` is already passed reliably through the backend (as
verified by r234021).

This commit removes the check from r234021, but I added a new check
(that will survive) in r235048, and changed the `DIBuilder` API in
r235041 to require a `!dbg` attachment whose 'scope:` is in the same
`MDSubprogram` as the variable's.

If this breaks your out-of-tree testcases, perhaps the script I used
(mdlocalvariable-drop-inlinedat.sh) will help; I'll attach it to PR22778
in a moment.

llvm-svn: 235050
2015-04-15 22:29:27 +00:00
Duncan P. N. Exon Smith
15acdd2891 DebugInfo: Require a DebugLoc in DIBuilder::insertDeclare()
Change `DIBuilder::insertDeclare()` and `insertDbgValueIntrinsic()` to
take an `MDLocation*`/`DebugLoc` parameter which it attaches to the
created intrinsic.  Assert at creation time that the `scope:` field's
subprogram matches the variable's.  There's a matching `clang` commit to
use the API.

The context for this is PR22778, which is removing the `inlinedAt:`
field from `MDLocalVariable`, instead deferring to the `!dbg` location
attached to the debug info intrinsic.  The best way to ensure we always
have a `!dbg` attachment is to require one at creation time.  I'll be
adding verifier checks next, but this API change is the best way to
shake out frontend bugs.

Note: I added an `llvm_unreachable()` in `bindings/go` and passed in
`nullptr` for the `DebugLoc`.  The `llgo` folks will eventually need to
pass a valid `DebugLoc` here.

llvm-svn: 235041
2015-04-15 21:18:07 +00:00
Daniel Berlin
8880b4b1c6 Add range iterators for post order and inverse post order. Use them
llvm-svn: 235026
2015-04-15 17:41:42 +00:00
Jingyue Wu
1c64515420 [SLSR] handle candidate form (B + i * S)
Summary:
With this patch, SLSR may rewrite

S1: X = B + i * S
S2: Y = B + i' * S

to

S2: Y = X + (i' - i) * S

A secondary improvement: if (i' - i) is a power of 2, emit Y as X + (S << log(i' - i)). (S << log(i' -i)) is in a canonical form and thus more likely GVN'ed than (i' - i) * S.

Test Plan: slsr-add.ll

Reviewers: hfinkel, sanjoy, meheff, broune, eliben

Reviewed By: eliben

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8983

llvm-svn: 235019
2015-04-15 16:46:13 +00:00
Richard Trieu
5eedf6b231 Change range-based for-loops to be -Wrange-loop-analysis clean.
No functionality change.

llvm-svn: 234963
2015-04-15 01:21:15 +00:00
Jingyue Wu
5bb1ce8bf8 Simplify n-ary adds by reassociation
Summary:
This transformation reassociates a n-ary add so that the add can partially reuse
existing instructions. For example, this pass can simplify

  void foo(int a, int b) {
    bar(a + b);
    bar((a + 2) + b);
  }

to

  void foo(int a, int b) {
    int t = a + b;
    bar(t);
    bar(t + 2);
  }

saving one add instruction.

Fixes PR22357 (https://llvm.org/bugs/show_bug.cgi?id=22357).

Test Plan: nary-add.ll

Reviewers: broune, dberlin, hfinkel, meheff, sanjoy, atrick

Reviewed By: sanjoy, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8950

llvm-svn: 234855
2015-04-14 04:59:22 +00:00
Duncan P. N. Exon Smith
b45f18bf7d DebugInfo: Update signature of DICompileUnit::replace*()
Change `DICompileUnit::replaceSubprograms()` and
`DICompileUnit::replaceGlobalVariables()` to match the `MDCompileUnit`
equivalents that they're wrapping.

llvm-svn: 234852
2015-04-14 03:51:36 +00:00
Duncan P. N. Exon Smith
3b5fae31ff DebugInfo: Gut DISubprogram and DILexicalBlock*
Gut the `DIDescriptor` wrappers around `MDLocalScope` subclasses.  Note
that `DILexicalBlock` wraps `MDLexicalBlockBase`, not `MDLexicalBlock`.

llvm-svn: 234850
2015-04-14 03:40:37 +00:00
Sanjoy Das
b9907c45f6 [LoopUnrollRuntime] Avoid high-cost trip count computation.
Summary:
Runtime unrolling of loops needs to emit an expression to compute the
loop's runtime trip-count.  Avoid runtime unrolling if this computation
will be expensive.

Depends on D8993.

Reviewers: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8994

llvm-svn: 234846
2015-04-14 03:20:38 +00:00
Sanjoy Das
30ca3be9e4 [SCEV] Refactor out isHighCostExpansion. NFCI.
Summary:
Move isHighCostExpansion from IndVarSimplify to SCEVExpander.  This
exposed function will be used in a subsequent change.

Reviewers: bogner, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8995

llvm-svn: 234844
2015-04-14 03:20:28 +00:00
Duncan P. N. Exon Smith
ac19c888b2 DebugInfo: Gut DIVariable and DIGlobalVariable
Gut all the non-pointer API from the variable wrappers, except an
implicit conversion from `DIGlobalVariable` to `DIDescriptor`.  Note
that if you're updating out-of-tree code, `DIVariable` wraps
`MDLocalVariable` (`MDVariable` is a common base class shared with
`MDGlobalVariable`).

llvm-svn: 234840
2015-04-14 02:22:36 +00:00
Duncan P. N. Exon Smith
cbde5503ef DebugInfo: Gut DILocation
This is along the same lines as r234832, but for `DILocation`.  Clean
out all accessors from `DILocation`.  Any callers should be using
`MDLocation` directly (e.g., via `operator->()`).

llvm-svn: 234835
2015-04-14 01:35:55 +00:00
Duncan P. N. Exon Smith
2e9424982a DebugInfo: Gut DIExpression
Completely gut `DIExpression`, turning it into a simple wrapper around
`MDExpression *`.  There are two bits of magic left:

  - It's constructed from `const MDExpression*` but convertible to
    `MDExpression*`.
  - It's default-constructed to `nullptr`.

Otherwise, it should behave quite like a raw pointer.  Once I've done
the same to the rest of the `DIDescriptor` subclasses, I'll come back to
delete them entirely (and update call sites as necessary to deal with
the missing magic).

llvm-svn: 234832
2015-04-14 01:12:42 +00:00
Philip Reames
e3d188364a [RewriteStatepointsForGC] Delete dead code [NFC]
Before we had real liveness, we needed to track every value that base pointer
insertion code created because these now might be live.  We now just rerun 
the data flow liveness algorithm (which is actually faster!) and no longer 
need the associated code.

llvm-svn: 234827
2015-04-14 00:41:34 +00:00
Duncan P. N. Exon Smith
d8e6e70445 DebugInfo: Move DILocation::computeNewDiscriminators()
As documented in PR23200 (and the FIXMEs I've added to the code here),
this logic is fairly broken: it modifies the `LLVMContext` in a way that
affects other modules and cannot be serialized to assembly/bitcode.  For
now, move it over to `MDLocation::computeNewDiscriminators()` anyway.

llvm-svn: 234825
2015-04-14 00:35:42 +00:00
Duncan P. N. Exon Smith
68bf43ebd7 AddDiscriminators: Create new MDLocation directly
I don't see a reason to add the `copyWithNewScope()` API over to
`MDLocation` -- it seems to be a holdover from when creating locations
required knowing details of operand layout -- so change
`AddDiscriminators` to call `MDLocation::get()` directly.  Should be no
functionality change here.

llvm-svn: 234824
2015-04-14 00:34:30 +00:00
Duncan P. N. Exon Smith
d72f4d08c6 StripSymbols: Use DIGlobalVariable::getConstant() instead of getGlobal()
The only difference between the two is a `dyn_cast<>` to
`GlobalVariable`.  If optimizations have left anything behind when a
global gets replaced, then it doesn't seem like the debug info is dead.

I can't seem to find an optimization that would leave behind a
non-`GlobalVariable` without nulling the reference entirely, so I
haven't added a testcase (but I'll be deleting `getGlobal()` in a future
commit).

llvm-svn: 234792
2015-04-13 20:13:30 +00:00
Nick Lewycky
faeb8d6bf0 GCC complains thusly: "attributes at the beginning of statement are ignored [-Werror=attributes]". Very well then! NFC
llvm-svn: 234788
2015-04-13 20:03:08 +00:00
Philip Reames
df2ddd44c8 [RwriteStatepointsForGC] Minor indentation and naming [NFC]
Use early-return style that's preferred in LLVM and updating the naming in places I touched with other changes in the last few days.  Hopefully, NFC.

llvm-svn: 234785
2015-04-13 20:00:30 +00:00
Nick Lewycky
e8c4d44b5f Subtraction is not commutative. Fixes PR23212!
llvm-svn: 234780
2015-04-13 19:17:37 +00:00
Philip Reames
e1fc0fc6fc [RewriteStatepointsForGC] Avoid inserting empty holder
We use dummy calls to adjust the liveness of values over statepoints in the midst of the insertion.  If there are no values which need held live, there's no point in actually inserting the holder.  

llvm-svn: 234779
2015-04-13 19:07:47 +00:00
Philip Reames
e39c40ef6f [RewriteStatepointsForGC] Fix a latent bug in normalization for invoke statepoint [NFC]
Since we're restructuring the CFG, we also need to make sure to update the analsis passes. While I'm touching the code, I dedicided to restructure it a bit.  The code involved here was very confusing.  This change moves the normalization to essentially being a pre-pass before the main insertion work and updates a few comments to actually say what is happening and *why*.

The restructuring should be covered by existing tests.  I couldn't easily see how to create a test for the invalidation bug.  Suggestions welcome.

llvm-svn: 234769
2015-04-13 18:07:21 +00:00
Philip Reames
9877caf968 [RewriteStatepointsForGC] Strengthen assertions around liveness
This is related to the issues addressed in 234651.  These assertions check the properties ensured by that change at the place of use.  Note that a similiar property is checked in checkBasicSSA, but without the reachability constraint.  Technically, the liveness would be correct to include unreachable values, but this would be problematic for actual relocation.

llvm-svn: 234766
2015-04-13 17:35:55 +00:00
Philip Reames
9edd4ded24 [RewriteStatepointsForGC] Move an expensive debugging check to XDEBUG
The check in question is attempting to help find cases where we haven't relocated a pointer at a safepoint we should have.  It does this by coercing the value to null at any safepoint which doesn't relocate it.  

Unfortunately, this turns out to be rather expensive in terms of memory usage and time.  The number of stores inserted can grow with O(number of values x number of statepoints).  On at least one example I looked at, over half of peak memory usage was coming from this check.  

With this change, the check is no longer enabled by default in Asserts builds.  It is enabled for expensive asserts builds and has a command line option to enable it in both Asserts and non-Asserts builds.  

llvm-svn: 234761
2015-04-13 16:41:32 +00:00
Mark Lacey
149e2c7419 Fix typo.
llvm-svn: 234706
2015-04-12 18:18:51 +00:00
Sanjoy Das
880c33e480 [LoopUnrollRuntime] Clean up a predicate.
Clean up a predicate I added in r229731, fix the relevant comment and
add a test case.  The earlier version is confusing to read and was also
buggy (probably not a coincidence) till Alexey fixed it in r233881.

llvm-svn: 234701
2015-04-12 01:24:01 +00:00
Benjamin Kramer
a304c427ee Mark empty default constructors as =default if it makes the type POD
NFC

llvm-svn: 234694
2015-04-11 18:57:14 +00:00
Benjamin Kramer
70b4ac9a5e Remove empty non-virtual destructors or mark them =default when non-public
These add no value but can make a class non-trivially copyable. NFC.

llvm-svn: 234688
2015-04-11 15:32:26 +00:00
Alexander Kornienko
71412ece39 Use 'override/final' instead of 'virtual' for overridden methods
The patch is generated using clang-tidy misc-use-override check.

This command was used:

  tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py \
    -checks='-*,misc-use-override' -header-filter='llvm|clang' \
    -j=32 -fix -format

http://reviews.llvm.org/D8925

llvm-svn: 234679
2015-04-11 02:11:45 +00:00
Duncan P. N. Exon Smith
c28a37b367 DebugInfo: Rewrite atSameLineAs() as MDLocation::canDiscriminate()
Rewrite `DILocation::atSameLineAs()` as `MDLocation::canDiscriminate()`
with a doxygen comment explaining its purpose.  I've added a few FIXMEs
where I think this check is too weak; fixing that is tracked by PR23199.

llvm-svn: 234674
2015-04-11 01:00:47 +00:00
Philip Reames
ea5bc5309c [Statepoints] Fix a release only build failure
A function which is used only in Asserts builds needs to be defined only in Asserts builds.

llvm-svn: 234667
2015-04-11 00:06:47 +00:00
Philip Reames
305873b9c0 [RewriteStatepointsForGC] Use a SetVector for a worklist [NFC]
Using a SetVector to replace equivelent but more verbose functionality.

llvm-svn: 234662
2015-04-10 23:11:26 +00:00
Philip Reames
7a73afff41 [RewriteStatepointsForGC] Use an actual liveness algorithm
When rewriting statepoints to make relocations explicit, we need to have a conservative but consistent notion of where a particular pointer is live at a particular site. The old code just used dominance, which is correct, but decidedly more conservative then it needed to be. This patch implements a simple dataflow algorithm that's run one per function (well, twice counting fixup after base pointer insertion). There's still lots of room to make this faster, but it's fast enough for all practical purposes today.

Differential Revision: http://reviews.llvm.org/D8674

llvm-svn: 234657
2015-04-10 22:53:14 +00:00
Philip Reames
3f3061f6c4 [RewriteStatepointsForGC] clang-format file
Format the entire file to reduce diff of change to follow.

llvm-svn: 234656
2015-04-10 22:34:56 +00:00
Philip Reames
41ab93dc90 [RewriteStatepointsForGC] Missed review comment from 234651 & build fix
After submitting 234651, I noticed I hadn't responded to a review comment by mjacob.  This patch addresses that comment and fixes a Release only build problem due to an unused variable.  

llvm-svn: 234653
2015-04-10 22:16:58 +00:00
Philip Reames
7e15b77030 [RewriteStatepointsForGC] Preprocess the IR to remove unreachable blocks and single entry phis
Two related small changes:

    Various dominance based queries about liveness can get confused if we're talking about unreachable blocks. To avoid reasoning about such cases, just remove them before rewriting statepoints.
    Remove single entry phis (likely left behind by LCSSA) to reduce the number of live values.

Both of these are motivated by http://reviews.llvm.org/D8674 which will be submitted shortly.

Differential Revision: http://reviews.llvm.org/D8675

llvm-svn: 234651
2015-04-10 22:07:04 +00:00
Philip Reames
f5a62ffd7f [RewriteStatepointsForGC] Limited support for vectors of pointers
This patch adds limited support for inserting explicit relocations when there's a vector of pointers live over the statepoint. This doesn't handle the case where the vector contains a mix of base and non-base pointers; that's future work.

The current implementation just scalarizes the vector over the gc.statepoint before doing the explicit rewrite. An alternate approach would be to plumb the vector all the way though the backend lowering, but doing that appears challenging. In particular, the size of the indirect spill slot is currently assumed to be sizeof(pointer) throughout the backend.

In practice, this is enough to allow running the SLP and Loop vectorizers before RewriteStatepointsForGC.

Differential Revision: http://reviews.llvm.org/D8671

llvm-svn: 234647
2015-04-10 21:48:25 +00:00
Sanjoy Das
68c711fb56 [InstCombine][CodeGenPrep] Create llvm.uadd.with.overflow in CGP.
Summary:
This change moves creating calls to `llvm.uadd.with.overflow` from
InstCombine to CodeGenPrep.  Combining overflow check patterns into
calls to the said intrinsic in InstCombine inhibits optimization because
it introduces an intrinsic call that not all other transforms and
analyses understand.

Depends on D8888.

Reviewers: majnemer, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8889

llvm-svn: 234638
2015-04-10 21:07:09 +00:00
Reid Kleckner
4e07d71509 [WinEH] Try to make outlining invokes work a little better
WinEH currently turns invokes into calls. Long term, we will reconsider
this, but for now, make sure we remap the operands and clone the
successors of the new terminator.

llvm-svn: 234608
2015-04-10 16:26:42 +00:00
Benjamin Kramer
024e0f290e [CallSite] Make construction from Value* (or Instruction*) explicit.
CallSite roughly behaves as a common base CallInst and InvokeInst. Bring
the behavior closer to that model by making upcasts explicit. Downcasts
remain implicit and work as before.

Following dyn_cast as a mental model checking whether a Value *V isa
CallSite now looks like this: 
  if (auto CS = CallSite(V)) // think dyn_cast
instead of:
  if (CallSite CS = V)

This is an extra token but I think it is slightly clearer. Making the
ctor explicit has the advantage of not accidentally creating nullptr
CallSites, e.g. when you pass a Value * to a function taking a CallSite
argument.

llvm-svn: 234601
2015-04-10 14:50:08 +00:00
Benjamin Kramer
f6149322d4 Reduce dyn_cast<> to isa<> or cast<> where possible.
No functional change intended.

llvm-svn: 234586
2015-04-10 11:24:51 +00:00
Cameron Zwarich
f48f486c1d Eliminate O(n^2) worst-case behavior in SSA construction
The code uses a priority queue and a worklist, which share the same
visited set, but the visited set is only updated when inserting into
the priority queue. Instead, switch to using separate visited sets
for the priority queue and worklist.

llvm-svn: 234425
2015-04-08 18:26:20 +00:00
Adam Nemet
23b1ef3354 [LoopAccesses] Allow analysis to complete in the presence of uniform stores
(Re-apply r234361 with a fix and a testcase for PR23157)

Both run-time pointer checking and the dependence analysis are capable
of dealing with uniform addresses. I.e. it's really just an orthogonal
property of the loop that the analysis computes.

Run-time pointer checking will only try to reason about SCEVAddRec
pointers or else gives up. If the uniform pointer turns out the be a
SCEVAddRec in an outer loop, the run-time checks generated will be
correct (start and end bounds would be equal).

In case of the dependence analysis, we work again with SCEVs. When
compared against a loop-dependent address of the same underlying object,
the difference of the two SCEVs won't be constant. This will result in
returning an Unknown dependence for the pair.

When compared against another uniform access, the difference would be
constant and we should return the right type of dependence
(forward/backward/etc).

The changes also adds support to query this property of the loop and
modify the vectorizer to use this.

Patch by Ashutosh Nema!

llvm-svn: 234424
2015-04-08 17:48:40 +00:00
Sanjoy Das
4afe7378c6 [InstCombine] Refactor out OptimizeOverflowCheck. NFCI.
Summary:
This patch adds an enum `OverflowCheckFlavor` and a function
`OptimizeOverflowCheck`.  This will allow InstCombine to optimize
overflow checks without directly introducing an intermediate call to the
`llvm.$op.with.overflow` instrinsics.

This specific change is a refactoring and does not intend to change
behavior.

Reviewers: majnemer, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8888

llvm-svn: 234388
2015-04-08 04:27:22 +00:00
Adam Nemet
8c21d63b1f Revert "[LoopAccesses] Allow analysis to complete in the presence of uniform stores"
This reverts commit r234361.

It caused PR23157.

llvm-svn: 234387
2015-04-08 04:16:55 +00:00
Adam Nemet
2d24bce667 [LoopAccesses] Allow analysis to complete in the presence of uniform stores
Both run-time pointer checking and the dependence analysis are capable
of dealing with uniform addresses. I.e. it's really just an orthogonal
property of the loop that the analysis computes.

Run-time pointer checking will only try to reason about SCEVAddRec
pointers or else gives up. If the uniform pointer turns out the be a
SCEVAddRec in an outer loop, the run-time checks generated will be
correct (start and end bounds would be equal).

In case of the dependence analysis, we work again with SCEVs. When
compared against a loop-dependent address of the same underlying object,
the difference of the two SCEVs won't be constant. This will result in
returning an Unknown dependence for the pair.

When compared against another uniform access, the difference would be
constant and we should return the right type of dependence
(forward/backward/etc).

The changes also adds support to query this property of the loop and
modify the vectorizer to use this.

Patch by Ashutosh Nema!

llvm-svn: 234361
2015-04-07 21:46:16 +00:00
Duncan P. N. Exon Smith
2f210ef9ab DebugInfo: Remove DITypedArray<>, replace with typedefs
Replace all uses of `DITypedArray<>` with `MDTupleTypedArrayWrapper<>`
and `MDTypeRefArray`.  The APIs are completely different, but the
provided functionality is the same: treat an `MDTuple` as if it's an
array of a particular element type.

To simplify this patch a bit, I've temporarily typedef'ed
`DebugNodeArray` to `DIArray` and `MDTypeRefArray` to `DITypeArray`.
I've also temporarily conditionalized the accessors to check for null --
eventually these should be changed to asserts and the callers should
check for null themselves.

There's a tiny accompanying patch to clang.

llvm-svn: 234290
2015-04-07 04:14:33 +00:00
Duncan P. N. Exon Smith
60136dee1f Transforms: Stop using DIDescriptor::is*() and auto-casting
Same as r234255, but for lib/Analysis and lib/Transforms.

llvm-svn: 234257
2015-04-06 23:27:00 +00:00
David Blaikie
95456c7ea7 ArgPromo: Bail out earlier for varargs functions
llvm-svn: 234224
2015-04-06 21:27:31 +00:00
Ismail Pazarbasi
7fbc7a6a77 Move checkInterfaceFunction to ModuleUtils
Summary:
Instead of making a local copy of `checkInterfaceFunction` for each
sanitizer, move the function in a common place.

Reviewers: kcc, samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8775

llvm-svn: 234220
2015-04-06 21:09:08 +00:00
Duncan P. N. Exon Smith
ea3fb29a5a DebugInfo: Remove DIDescriptor::Verify()
Remove `DIDescriptor::Verify()` and the `Verify()`s from subclasses.
They had already been gutted, and just did an `isa<>` check.

In a couple of cases I've temporarily dropped the check entirely, but
subsequent commits are going to disallow conversions to the
`DIDescriptor`s directly from `MDNode`, so the checks will come back in
another form soon enough.

llvm-svn: 234201
2015-04-06 19:49:39 +00:00
Jingyue Wu
cee19c8e7e [SLSR] consider &B[S << i] as &B[(1 << i) * S]
Summary: This reduces handling &B[(1 << i) * s] to handling &B[i * S].

Test Plan: slsr-gep.ll

Reviewers: meheff

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D8837

llvm-svn: 234180
2015-04-06 17:15:48 +00:00
David Blaikie
924235accc clang-format my last commit
llvm-svn: 234127
2015-04-05 22:44:57 +00:00
David Blaikie
80477b5659 [opaque pointer type] The last of the GEP IRBuilder API migrations
There's still lots of callers passing nullptr, of course - some because
they'll never be migrated (InstCombines for bitcasts - well they don't
make any sense when the pointer type is opaque anyway, for example) and
others that will need more engineering to pass Types around.

llvm-svn: 234126
2015-04-05 22:41:44 +00:00
David Blaikie
2e42346c23 [opaque pointer type] More GEP API migrations
llvm-svn: 234108
2015-04-04 21:07:10 +00:00
David Blaikie
685d00cae6 [opaque pointer type] More GEP IRBuilder API migrations
llvm-svn: 234064
2015-04-03 23:03:54 +00:00
David Blaikie
968a03f8cd [opaque pointer type] More GEP IRBuilder API migrations...
llvm-svn: 234058
2015-04-03 21:33:42 +00:00
David Blaikie
061faafe3e Use early returns to reduce indentation.
llvm-svn: 234057
2015-04-03 21:32:06 +00:00
David Majnemer
2243ffe08d [InstCombine] Use DataLayout to determine vector element width
InstCombine didn't realize that it needs to use DataLayout to determine
how wide pointers are.  This lead to assertion failures.

This fixes PR23113.

llvm-svn: 234046
2015-04-03 20:18:40 +00:00
David Blaikie
69c6682f8d [opaque pointer type] More GEP API migrations in IRBuilder uses
The plan here is to push the API changes out from the common components
(like Constant::getGetElementPtr and IRBuilder::CreateGEP related
functions) and just update callers to either pass the type if it's
obvious, or pass null.

Do this with LoadInst as well and anything else that comes up, then to
start porting specific uses to not pass null anymore - this may require
some refactoring in each case.

llvm-svn: 234042
2015-04-03 19:41:44 +00:00
Reid Kleckner
012454b680 [ASan] Don't use stack malloc for 32-bit functions using inline asm
This prevents us from running out of registers in the backend.

Introducing stack malloc calls prevents the backend from recognizing the
inline asm operands as stack objects. When the backend recognizes a
stack object, it doesn't need to materialize the address of the memory
in a physical register. Instead it generates a simple SP-based memory
operand. Introducing a stack malloc forces the backend to find a free
register for every memory operand. 32-bit x86 simply doesn't have enough
registers for this to succeed in most cases.

Reviewers: kcc, samsonov

Differential Revision: http://reviews.llvm.org/D8790

llvm-svn: 233979
2015-04-02 21:44:55 +00:00
Jingyue Wu
f16e3eebfb [SLSR] handles off bounds GEPs
Summary:
The old requirement on GEP candidates being in bounds is unnecessary.
For off-bound GEPs, we still have

  &B[i * S] = B + (i * S) * e = B + (i * e) * S

Test Plan: slsr_offbound_gep in slsr-gep.ll

Reviewers: meheff

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8809

llvm-svn: 233949
2015-04-02 21:18:32 +00:00
David Blaikie
f300e9b75c [opaque pointer type] API migration for GEP constant factories
Require the pointee type to be passed explicitly and assert that it is
correct. For now it's possible to pass nullptr here (and I've done so in
a few places in this patch) but eventually that will be disallowed once
all clients have been updated or removed. It'll be a long road to get
all the way there... but if you have the cahnce to update your callers
to pass the type explicitly without depending on a pointer's element
type, that would be a good thing to do soon and a necessary thing to do
eventually.

llvm-svn: 233938
2015-04-02 18:55:32 +00:00
Alexey Samsonov
03b2851bcd Fix a bug indicated by -fsanitize=shift-exponent.
llvm-svn: 233881
2015-04-02 01:30:10 +00:00
Ahmed Bougacha
8544aa4b2b [SimplifyLibCalls] Ignore nobuiltin/unavailable fortified libcalls.
We used to do this before refactorings around r225640.
Some clang users checked for _chk libcall availability using:
  __has_builtin(__builtin___memcpy_chk)
When compiling with -fno-builtin, this is always true.
When passing -ffreestanding/-mkernel, which both imply -fno-builtin, we
end up with fortified libcalls, which isn't acceptable in a freestanding
environment which only provides their non-fortified counterparts.

Until we change clang and/or teach external users to check for availability
differently, disregard the "nobuiltin" attribute and TLI::has.

Workaround for PR23093.

llvm-svn: 233776
2015-04-01 00:45:09 +00:00
David Blaikie
39f56d0f3c [opaque pointer type] Change GetElementPtrInst::getIndexedType to take the pointee type
This pushes the use of PointerType::getElementType up into several
callers - I'll essentially just have to keep pushing that up the stack
until I can eliminate every call to it...

llvm-svn: 233604
2015-03-30 21:41:43 +00:00
David Blaikie
ad75ffb137 [opaque pointer type] More IRBuilder::createGEP (non-inbounds) migrations: CodeGenPrepare and SimplifyLibCalls
llvm-svn: 233596
2015-03-30 20:42:56 +00:00
Duncan P. N. Exon Smith
7f2c33ec62 Transforms: Use the new DebugLoc API, NFC
Update lib/Analysis and lib/Transforms to use the new `DebugLoc` API.

llvm-svn: 233587
2015-03-30 19:49:49 +00:00
David Blaikie
614465c9a9 Constrain the type of a parameter now that callers without this constraint have been removed.
llvm-svn: 233419
2015-03-27 20:56:11 +00:00
David Blaikie
b57edabf0c Recommit r233116 better: Remove a redundant instcombine involving bitcasts of geps of bitcasts
This just didn't need to be here at all, but the assertion I tried to
add wasn't appropriate either - the circumstance isn't impossible, it's
just not important to deal with it here - the gep-rooted version of this
instcombine will handle this case, we don't need to duplicate it for the
case where the gep happens to be used in a bitcast.

llvm-svn: 233404
2015-03-27 20:13:55 +00:00
Anna Zaks
bb3446e865 [asan] Speed up isInterestingAlloca check
We make many redundant calls to isInterestingAlloca in the AddressSanitzier
pass. This is especially inefficient for allocas that have many uses. Let's
cache the results to speed up compilation.

The compile time improvements depend on the input. I did not see much
difference on benchmarks; however, I have a test case where compile time
goes from minutes to under a second.

llvm-svn: 233397
2015-03-27 18:52:01 +00:00
Yaron Keren
3856893d6d Remove superfluous .str() and replace std::string concatenation with Twine.
llvm-svn: 233392
2015-03-27 17:51:30 +00:00
James Molloy
844dff224e Reapply r233175 and r233183: float2int.
This re-adds float2int to the tree, after fixing PR23038. It turns
out the argument to APSInt() is true-if-unsigned, rather than
true-if-signed :(. Added testcase and explanatory comment.

llvm-svn: 233370
2015-03-27 10:36:57 +00:00
Sanjoy Das
a1908cd835 [NFC] Fix typo in comment.
llvm-svn: 233363
2015-03-27 06:01:56 +00:00
Philip Reames
77522a75bc Code cleanup [NFC]
The assertion here was more expensive then it needed to be.  We're only inserting allocas in the entry block, so we only need to consider ones in the entry block.

llvm-svn: 233362
2015-03-27 05:53:16 +00:00
Philip Reames
92b4844b30 More code cleanup [NFC]
llvm-svn: 233361
2015-03-27 05:47:00 +00:00
Philip Reames
96d87ad71c More code cleanup [NFC]
Minor naming, one potentially unsafe cast

llvm-svn: 233359
2015-03-27 05:39:32 +00:00
Philip Reames
f28af3049b Code simplification and style cleanup
All the removed assertions are either implied locally by the assert at the top of the function or properties of the verifier.

llvm-svn: 233358
2015-03-27 05:34:44 +00:00
Karthik Bhat
2c8e43726d Refactor Code inside LoopVectorizer's function isInductionVariable.
This patch exposes LoopVectorizer's isInductionVariable function as common
a functionality.
http://reviews.llvm.org/D8608

llvm-svn: 233352
2015-03-27 03:44:15 +00:00
Nick Lewycky
17b59ec505 Revert r233175 and r233183 with it. This pulls float2int back out of the tree, due to PR23038.
llvm-svn: 233350
2015-03-27 02:00:11 +00:00
Benjamin Kramer
ab138a6078 InstCombine: fold (A << C) == (B << C) --> ((A^B) & (~0U >> C)) == 0
Anding and comparing with zero can be done in a single instruction on
most archs so this is a bit cheaper.

llvm-svn: 233291
2015-03-26 17:12:06 +00:00
Jingyue Wu
537669aad8 [SLSR] handle candidate form &B[i * S]
Summary:
This patch enhances SLSR to handle another candidate form &B[i * S]. If
we found two candidates

S1: X = &B[i * S]
S2: Y = &B[i' * S]

and S1 dominates S2, we can replace S2 with

Y = &X[(i' - i) * S]

Test Plan:
slsr-gep.ll
X86/no-slsr.ll: verify that we do not run SLSR on GEPs that already fit into
an addressing mode

Reviewers: eliben, atrick, meheff, hfinkel

Reviewed By: hfinkel

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D7459

llvm-svn: 233286
2015-03-26 16:49:24 +00:00
Andrea Di Biagio
713ffecfc7 [optnone] Skip pass Float2Int on optnone functions.
Added test Float2Int/float2int-optnone.ll to verify that pass Float2Int
is not run on optnone functions.

llvm-svn: 233183
2015-03-25 12:22:37 +00:00
James Molloy
17b6105997 Reapply r233062: "float2int": Add a new pass to demote from float to int where possible.
Now with a fix for PR23008 and extra regression test.

llvm-svn: 233175
2015-03-25 10:03:42 +00:00
David Blaikie
27c176e356 Opaque Pointer Types: GEP API migrations to specify the gep type explicitly
The changes to InstCombine (& SCEV) do seem a bit silly - it doesn't make
anything obviously better to have the caller access the pointers element
type (the thing I'm trying to remove) than the GEP itself, but it's a
helpful migration step. This will allow me to more obviously lock down
GEP (& Load, etc) API usage, then fix all the code that accesses pointer
element types except the places that need to be removed (most of the
InstCombines) anyway - at which point I'll need to just remove all that
code because it won't be meaningful anymore (there will be no pointer
types, so no bitcasts to combine)

SCEV looks like it'll need some restructuring - we'll have to do a bit
more work for GEP canonicalization, since it'll depend on how it's used
if we can even manage to canonicalize it to a non-ugly GEP. I guess we
can do some fun stuff like voting (do 2 out of 3 load from the GEP with
a certain type that gives a pretty GEP? Does every typed use of the GEP
use either a specific type or a generic type (i8*, etc)?)

llvm-svn: 233131
2015-03-24 23:34:31 +00:00
Sanjay Patel
1a17022bda optimize the AVX2 (integer) version of vperm2 into a shuffle
...because this is what happens when an instruction
set puts its underwear on after its pants.

This is an extension of r232852, r233100, and 233110:
http://llvm.org/viewvc/llvm-project?view=revision&revision=232852
http://llvm.org/viewvc/llvm-project?view=revision&revision=233100
http://llvm.org/viewvc/llvm-project?view=revision&revision=233110

llvm-svn: 233127
2015-03-24 22:39:29 +00:00
David Blaikie
0dc1532dd9 Opaque Pointer Types: GEP API migrations to specify the gep type explicitly
The changes to InstCombine do seem a bit silly - it doesn't make
anything obviously better to have the caller access the pointers element
type (the thing I'm trying to remove) than the GEP itself, but it's a
helpful migration step. This will allow me to more obviously lock down
GEP (& Load, etc) API usage, then fix all the code that accesses pointer
element types except the places that need to be removed (most of the
InstCombines) anyway - at which point I'll need to just remove all that
code because it won't be meaningful anymore (there will be no pointer
types, so no bitcasts to combine)

llvm-svn: 233126
2015-03-24 22:38:16 +00:00
Philip Reames
d57b6c2219 Merge empty landing pads in SimplifyCFG
This patch tries to merge duplicate landing pads when they branch to a common shared target.

Given IR that looks like this:
lpad1:
  %exn = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
         cleanup
  br label %shared_resume
lpad2:
  %exn2 = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
          cleanup
  br label %shared_resume
shared_resume:
  call void @fn()
  ret void
}

We can rewrite the users of both landing pad blocks to use one of them. This will generally allow the shared_resume block to be merged with the common landing pad as well.

Without this change, tail duplication would likely kick in - creating N (2 in this case) copies of the shared_resume basic block.

Differential Revision: http://reviews.llvm.org/D8297

llvm-svn: 233125
2015-03-24 22:28:45 +00:00
David Blaikie
8b46a1bcb2 Revert "Remove an InstCombine that seems to have become redundant."
Assertion fires in compiler-rt. Guess it does fire..

This reverts commit r233116.

llvm-svn: 233121
2015-03-24 21:50:35 +00:00
David Blaikie
0914ef9c6b Remove an InstCombine that seems to have become redundant.
Assert that this doesn't fire - I'll remove all of this later, but just
leaving it in for a while in case this is firing & we just don't have
test coverage.

llvm-svn: 233116
2015-03-24 21:31:31 +00:00
Sanjay Patel
c077161b46 [X86, AVX] instcombine vperm2 intrinsics with zero inputs into shuffles
This is the IR optimizer follow-on patch for D8563: the x86 backend patch
that converts this kind of shuffle back into a vperm2.

This is also a continuation of the transform that started in D8486. 
In that patch, Andrea suggested that we could convert vperm2 intrinsics that
use zero masks into a single shuffle. 

This is an implementation of that suggestion.

Differential Revision: http://reviews.llvm.org/D8567

llvm-svn: 233110
2015-03-24 20:36:42 +00:00
Hans Wennborg
fbd19841f0 Revert r233062 ""float2int": Add a new pass to demote from float to int where possible."
This caused PR23008, compiles failing with: "Use still stuck around after Def is
destroyed: %.sroa.speculated"

Also reverting follow-up r233064.

llvm-svn: 233105
2015-03-24 20:07:08 +00:00
Sanjoy Das
5e0d47a08c [IRCE] Fix how IRCE checks for no-sign-overflow.
IRCE requires the induction variables it handles to not sign-overflow.
The current scheme of checking if sext({X,+,S}) == {sext(X),+,sext(S)}
fails when SCEV simplifies sext(X) too.  After this change we //also//
check no-signed-wrap by looking at the flags set on the SCEVAddRecExpr.

llvm-svn: 233102
2015-03-24 19:29:22 +00:00
Sanjoy Das
867451d1ec [IRCE] Fix a regression introduced in r232444.
IRCE should not try to eliminate range checks that check an induction
variable against a loop-varying length.

llvm-svn: 233101
2015-03-24 19:29:18 +00:00
Benjamin Kramer
99a748be6f [float2int] Sort includes and add missing raw_ostream include.
llvm-svn: 233064
2015-03-24 11:28:47 +00:00
James Molloy
32530d77c8 "float2int": Add a new pass to demote from float to int where possible.
It is possible to have code that converts from integer to float, performs operations then converts back, and the result is provably the same as if integers were used.

This can come from different sources, but the most obvious is a helper function that uses floats but the arguments given at an inlined callsites are integers.

This pass considers all integers requiring a bitwidth less than or equal to the bitwidth of the mantissa of a floating point type (23 for floats, 52 for doubles) as exactly representable in floating point.

To reduce the risk of harming efficient code, the pass only attempts to perform complete removal of inttofp/fptoint operations, not just move them around.

llvm-svn: 233062
2015-03-24 11:15:23 +00:00
Benjamin Kramer
6a9aa608f1 Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used.
llvm-svn: 232998
2015-03-23 19:32:43 +00:00
Benjamin Kramer
36883e98af [ctorutils] Update and sort includes. NFC.
llvm-svn: 232995
2015-03-23 19:06:17 +00:00
Benjamin Kramer
e51698aa6c Another set of missing raw_ostream.h. Still no functional change.
llvm-svn: 232993
2015-03-23 18:45:56 +00:00
Benjamin Kramer
45a545b9c6 Purge unused includes throughout libSupport.
NFC.

llvm-svn: 232976
2015-03-23 18:07:13 +00:00
Benjamin Kramer
d52ec1c0ec Move private classes into anonymous namespaces
NFC.

llvm-svn: 232944
2015-03-23 12:30:58 +00:00
Benjamin Kramer
b1b20b3003 [SimplifyLibCalls] Fix negative shifts being produced by the memchr -> bitfield transform.
llvm-svn: 232903
2015-03-21 22:04:26 +00:00
Benjamin Kramer
facc0a564b [SimplifyLibCalls] Turn memchr(const, C, const) into a bitfield check.
strchr("123!", C) != nullptr is a common pattern to check if C is one
of 1, 2, 3 or !. If the largest element of the string is smaller than
the target's register size we can easily create a bitfield and just
do a simple test for set membership.

int foo(char C) { return strchr("123!", C) != nullptr; } now becomes

	cmpl	$64, %edi ## range check
	sbbb	%al, %al
	movabsq	$0xE000200000001, %rcx
	btq	%rdi, %rcx ## bit test
	sbbb	%cl, %cl
	andb	%al, %cl ## and the two conditions
	andb	$1, %cl
	movzbl	%cl, %eax ## returning an int
	ret

(imho the backend should expand this into a series of branches, but
that's a different story)

The code is currently limited to bit fields that fit in a register, so
usually 64 or 32 bits. Sadly, this misses anything using alpha chars
or {}. This could be fixed by just emitting a i128 bit field, but that
can generate really ugly code so we have to find a better way. To some
degree this is also recreating switch lowering logic, but we can't
simply emit a switch instruction and thus change the CFG within
instcombine.

llvm-svn: 232902
2015-03-21 21:09:33 +00:00
Benjamin Kramer
6c7bd16fc0 SimplifyLibCalls: Add basic optimization of memchr calls.
This is just memchr(x, y, 0) -> nullptr and constant folding.

llvm-svn: 232896
2015-03-21 15:36:21 +00:00
Kostya Serebryany
55ec403858 [sanitizer] experimental tracing for cmp instructions
llvm-svn: 232873
2015-03-21 01:29:36 +00:00
Sanjay Patel
cf8a53f502 [X86, AVX] instcombine common cases of vperm2* intrinsics into shuffles
vperm2* intrinsics are just shuffles. 
In a few special cases, they're not even shuffles.

Optimizing intrinsics in InstCombine is better than
handling this in the front-end for at least two reasons:

1. Optimizing custom-written SSE intrinsic code at -O0 makes vector coders
   really angry (and so I have regrets about some patches from last week).

2. Doing mask conversion logic in header files is hard to write and 
   subsequently read.

There are a couple of TODOs in this patch to complete this optimization.

Differential Revision: http://reviews.llvm.org/D8486

llvm-svn: 232852
2015-03-20 21:47:56 +00:00
Andrew Kaylor
7b78ee54b5 Fixing a bug with WinEH PHI handling
llvm-svn: 232851
2015-03-20 21:42:54 +00:00
Duncan P. N. Exon Smith
c31f1b0224 SanitizerCoverage: Check for null DebugLocs
After a WIP patch to make `DIDescriptor` accessors more strict, this
started asserting.

llvm-svn: 232832
2015-03-20 18:48:45 +00:00
Duncan P. N. Exon Smith
34c48c3fe8 SampleProfile: Check for missing debug locations
Don't use `DebugLoc` accessors if we're pointing at null, which will be
a problem after a WIP patch to make the `DIDescriptor` accessors more
strict.  Caught by Frontend/profile-sample-use-loc-tracking.c (in
clang).

llvm-svn: 232792
2015-03-20 00:56:55 +00:00
Duncan P. N. Exon Smith
57ccff4630 Verifier: Remove the separate -verify-di pass
Remove `DebugInfoVerifierLegacyPass` and the `-verify-di` pass.
Instead, call into the `DebugInfoVerifier` from inside
`VerifierLegacyPass::finalizeModule()`.  This better matches the logic
in `verifyModule()` (used by the new PassManager), avoids requiring two
separate passes to verify the IR, and makes the API for "add a pass to
verify the IR" simple.

Note: the `-verify-debug-info` flag still works (for now, at least;
eventually it might make sense to just remove it).

llvm-svn: 232772
2015-03-19 22:24:17 +00:00
Peter Collingbourne
1254323d85 LowerBitSets: Avoid reusing byte set addresses.
Each use of the byte array uses a different alias. This makes the
backend less likely to reuse previously computed byte array addresses,
improving the security of the CFI mechanism based on this pass.

Differential Revision: http://reviews.llvm.org/D8455

llvm-svn: 232770
2015-03-19 22:02:10 +00:00
Peter Collingbourne
3bdbf14413 libLTO, llvm-lto, gold: Introduce flag for controlling optimization level.
This change also introduces a link-time optimization level of 1. This
optimization level runs only the globaldce pass as well as cleanup passes for
passes that run at -O0, specifically simplifycfg which cleans up lowerbitsets.

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150316/266951.html

llvm-svn: 232769
2015-03-19 22:01:00 +00:00
Duncan P. N. Exon Smith
e8a4b7fb5f PassManagerBuilder: Remove effectively dead 'StripDebug' option
`StripDebug` was only used by tools/opt/opt.cpp in
`AddStandardLinkPasses()`, but opt.cpp adds the same pass based on its
command-line flag before it calls `AddStandardLinkPasses()`.  Stripping
debug info twice isn't very useful.

llvm-svn: 232765
2015-03-19 21:37:17 +00:00
Peter Collingbourne
c3999ab5f3 GlobalDCE: Improve performance for large modules containing comdats.
When we encounter a global with a comdat, rather than iterating over
every global in the module to find globals in the same comdat, store the
members in a multimap. This effectively lowers the complexity to O(N log N),
improving performance significantly for large modules such as might be
encountered during LTO.

It looks like we used to do something like this until r219191.

No functional change.

Differential Revision: http://reviews.llvm.org/D8431

llvm-svn: 232743
2015-03-19 18:23:29 +00:00
Daniel Jasper
7e5f7305cf [InstCombine] Don't fold a GEP into itself through a PHI node
This can only occur (I think) through the back-edge of the loop.

However, folding a GEP into itself means that the value of the previous
iteration needs to be stored in the meantime, thus requiring an
additional register variable to be live, but not actually achieving
anything (the gep still needs to be executed once per loop iteration).

The attached test case is derived from:
  typedef unsigned uint32;
  typedef unsigned char uint8;
  inline uint8 *f(uint32 value, uint8 *target) {
    while (value >= 0x80) {
      value >>= 7;
      ++target;
    }
    ++target;
    return target;
  }
  uint8 *g(uint32 b, uint8 *target) {
    target = f(b, f(42, target));
    return target;
  }

What happens is that the GEP stored in incptr2 is folded into itself
through the loop's back-edge and the phi-node stored in loopptr,
effectively incrementing the ptr by "2" in each iteration instead of "1".

In this case, it is actually increasing the number of GEPs required as
the GEP before the loop can't be folded away anymore. For comparison:

With this patch:
  define i8* @test4(i32 %value, i8* %buffer) {
  entry:
    %cmp = icmp ugt i32 %value, 127
    br i1 %cmp, label %loop.header, label %exit

  loop.header:                                      ; preds = %entry
    br label %loop.body

  loop.body:                                        ; preds = %loop.body, %loop.header
    %buffer.pn = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ]
    %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ]
    %loopptr = getelementptr inbounds i8, i8* %buffer.pn, i64 1
    %shr = lshr i32 %newval, 7
    %cmp2 = icmp ugt i32 %newval, 16383
    br i1 %cmp2, label %loop.body, label %loop.exit

  loop.exit:                                        ; preds = %loop.body
    br label %exit

  exit:                                             ; preds = %loop.exit, %entry
    %0 = phi i8* [ %loopptr, %loop.exit ], [ %buffer, %entry ]
    %incptr3 = getelementptr inbounds i8, i8* %0, i64 2
    ret i8* %incptr3
  }

Without this patch:
  define i8* @test4(i32 %value, i8* %buffer) {
  entry:
    %incptr = getelementptr inbounds i8, i8* %buffer, i64 1
    %cmp = icmp ugt i32 %value, 127
    br i1 %cmp, label %loop.header, label %exit

  loop.header:                                      ; preds = %entry
    br label %loop.body

  loop.body:                                        ; preds = %loop.body, %loop.header
    %0 = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ]
    %loopptr = phi i8* [ %incptr, %loop.header ], [ %incptr2, %loop.body ]
    %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ]
    %shr = lshr i32 %newval, 7
    %incptr2 = getelementptr inbounds i8, i8* %0, i64 2
    %cmp2 = icmp ugt i32 %newval, 16383
    br i1 %cmp2, label %loop.body, label %loop.exit

  loop.exit:                                        ; preds = %loop.body
    br label %exit

  exit:                                             ; preds = %loop.exit, %entry
    %ptr2 = phi i8* [ %incptr2, %loop.exit ], [ %incptr, %entry ]
    %incptr3 = getelementptr inbounds i8, i8* %ptr2, i64 1
    ret i8* %incptr3
  }

Review: http://reviews.llvm.org/D8245
llvm-svn: 232718
2015-03-19 11:05:08 +00:00
Sanjoy Das
1f60e8293a [ConstantRange] Split makeICmpRegion in two.
Summary:
This change splits `makeICmpRegion` into `makeAllowedICmpRegion` and
`makeSatisfyingICmpRegion` with slightly different contracts.  The first
one is useful for determining what values some expression //may// take,
given that a certain `icmp` evaluates to true.  The second one is useful
for determining what values are guaranteed to //satisfy// a given
`icmp`.

Reviewers: nlewycky

Reviewed By: nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8345

llvm-svn: 232575
2015-03-18 00:41:24 +00:00
Michael Zolotukhin
4b02a3bee3 Try to fix a test broken by one of my previous commits.
llvm-svn: 232536
2015-03-17 20:31:56 +00:00
Michael Zolotukhin
4658f05da2 LoopVectorize: teach loop vectorizer to vectorize calls.
The tests would be committed in a commit for http://reviews.llvm.org/D8131

Review: http://reviews.llvm.org/D8095
llvm-svn: 232530
2015-03-17 19:46:50 +00:00
Michael Zolotukhin
65fd2f0883 LoopVectorizer: Add TargetTransformInfo.
Review: http://reviews.llvm.org/D8092
llvm-svn: 232522
2015-03-17 19:17:18 +00:00
Kostya Serebryany
a916c4e25e [asan] remove redundant ifndefs. NFC
llvm-svn: 232521
2015-03-17 19:13:23 +00:00
Michael Liao
a26d21b8dd [SwitchLowering] Remove incoming values in the reverse order
- To prevent invalidating *successive* indices.
 

llvm-svn: 232510
2015-03-17 18:03:10 +00:00
David Blaikie
eaa0813800 Fix GCC -Wparentheses warning (& reformat now that the precedence is fixed)
Benign warning (clang deliberately suppresses this case) but does
regularly produce bad formatting, so it's nice to fix/reformat.

llvm-svn: 232508
2015-03-17 17:48:24 +00:00
Dmitry Vyukov
31e6082bb3 asan: optimization experiments
The experiments can be used to evaluate potential optimizations that remove
instrumentation (assess false negatives). Instead of completely removing
some instrumentation, you set Exp to a non-zero value (mask of optimization
experiments that want to remove instrumentation of this instruction).
If Exp is non-zero, this pass will emit special calls into runtime
(e.g. __asan_report_exp_load1 instead of __asan_report_load1). These calls
make runtime terminate the program in a special way (with a different
exit status). Then you run the new compiler on a buggy corpus, collect
the special terminations (ideally, you don't see them at all -- no false
negatives) and make the decision on the optimization.

The exact reaction to experiments in runtime is not implemented in this patch.
It will be defined and implemented in a subsequent patch.

http://reviews.llvm.org/D8198

llvm-svn: 232502
2015-03-17 16:59:19 +00:00
Reid Kleckner
edfb61fbc6 Use an underlying enum type of unsigned to silence a -Wmicrosoft warning about being unable to put (unsigned)-1 into the default underyling type of int
llvm-svn: 232498
2015-03-17 16:50:20 +00:00
Sanjoy Das
56a1924548 [IRCE] Add a -irce-print-range-checks option.
-irce-print-range-checks prints out the set of range checks recognized
by IRCE.

llvm-svn: 232451
2015-03-17 01:40:22 +00:00
Duncan P. N. Exon Smith
5637a3e3c5 MapMetadata: Allow unresolved metadata if it won't change
Allow unresolved nodes through the `MapMetadata()` if
`RF_NoModuleLevelChanges`, since there's no remapping to do anyway.

This fixes PR22929.  I'll add a clang test as a follow-up.

llvm-svn: 232449
2015-03-17 01:14:40 +00:00
Sanjoy Das
62f7ca531c [IRCE] Add comments, NFC.
This change adds some comments that justify why a potentially
overflowing operation is safe.

llvm-svn: 232445
2015-03-17 00:42:16 +00:00
Sanjoy Das
59e77f5c4e [IRCE] Support half-range checks.
This change to IRCE gets it to recognize "half" range checks.  Half
range checks are range checks that only either check if the index is
`slt` some positive integer ("length") or if the index is `sge` `0`.

The range solver does not try to be clever / aggressive about solving
half-range checks -- it transforms "I < L" to "0 <= I < L" and "0 <= I"
to "0 <= I < INT_SMAX".  This is safe, but not always optimal.

llvm-svn: 232444
2015-03-17 00:42:13 +00:00
Justin Bogner
99971234a2 GCOV: Make the exit block placement from r223193 optional
By default we want our gcov emission to stay 4.2 compatible, which
means we need to continue emit the exit block last by default. We add
an option to emit it before the body for users that need it.

llvm-svn: 232438
2015-03-16 23:52:03 +00:00
Peter Collingbourne
cf293cd018 LowerBitSets: do not use private aliases at all on Darwin.
LLVM currently turns these into linker-private symbols, which can be dead
stripped by the Darwin linker.

llvm-svn: 232435
2015-03-16 23:36:24 +00:00
Gabor Horvath
e3a1ff29d9 [llvm] Replacing asserts with static_asserts where appropriate
Summary:
This patch consists of the suggestions of clang-tidy/misc-static-assert check.


Reviewers: alexfh

Reviewed By: alexfh

Subscribers: xazax.hun, llvm-commits

Differential Revision: http://reviews.llvm.org/D8343

llvm-svn: 232366
2015-03-16 09:53:42 +00:00
Dmitry Vyukov
0cd1eb5d21 asan: fix overflows in isSafeAccess
As pointed out in http://reviews.llvm.org/D7583
The current checks can cause overflows when object size/access offset cross Quintillion bytes.

http://reviews.llvm.org/D8193

llvm-svn: 232358
2015-03-16 08:04:26 +00:00
Michael Gottesman
218e4e613a One more try with unused.
llvm-svn: 232357
2015-03-16 08:00:27 +00:00
Michael Gottesman
e18ce984ec Add in an unreachable after a covered switch to appease certain bots.
llvm-svn: 232356
2015-03-16 07:46:34 +00:00
Michael Gottesman
7c3e8a7b1d Remove a used that snuck in that seems to be triggering the MSVC buildbots.
llvm-svn: 232355
2015-03-16 07:34:17 +00:00
Michael Gottesman
41c4b0ad29 [objc-arc] Fix indentation of debug logging so it is easy to read the output.
llvm-svn: 232352
2015-03-16 07:02:39 +00:00
Michael Gottesman
1ce7630734 [objc-arc] Make the ARC optimizer more conservative by forcing it to be non-safe in both direction, but mitigate the problem by noting that we just care if there was a further use.
The problem here is the infamous one direction known safe. I was
hesitant to turn it off before b/c of the potential for regressions
without an actual bug from users hitting the problem. This is that bug ;
).

The main performance impact of having known safe in both directions is
that often times it is very difficult to find two releases without a use
in-between them since we are so conservative with determining potential
uses. The one direction known safe gets around that problem by taking
advantage of many situations where we have two retains in a row,
allowing us to avoid that problem. That being said, the one direction
known safe is unsafe. Consider the following situation:

retain(x)
retain(x)
call(x)
call(x)
release(x)

Then we know the following about the reference count of x:

// rc(x) == N (for some N).
retain(x)
// rc(x) == N+1
retain(x)
// rc(x) == N+2
call A(x)
call B(x)
// rc(x) >= 1 (since we can not release a deallocated pointer).
release(x)
// rc(x) >= 0

That is all the information that we can know statically. That means that
we know that A(x), B(x) together can release (x) at most N+1 times. Lets
say that we remove the inner retain, release pair.

// rc(x) == N (for some N).
retain(x)
// rc(x) == N+1
call A(x)
call B(x)
// rc(x) >= 1
release(x)
// rc(x) >= 0

We knew before that A(x), B(x) could release x up to N+1 times meaning
that rc(x) may be zero at the release(x). That is not safe. On the other
hand, consider the following situation where we have a must use of
release(x) that x must be kept alive for after the release(x)**. Then we
know that:

// rc(x) == N (for some N).
retain(x)
// rc(x) == N+1
retain(x)
// rc(x) == N+2
call A(x)
call B(x)
// rc(x) >= 2 (since we know that we are going to release x and that that release can not be the last use of x).
release(x)
// rc(x) >= 1 (since we can not deallocate the pointer since we have a must use after x).
…
// rc(x) >= 1
use(x)

Thus we know that statically the calls to A(x), B(x) can together only
release rc(x) N times. Thus if we remove the inner retain, release pair:

// rc(x) == N (for some N).
retain(x)
// rc(x) == N+1
call A(x)
call B(x)
// rc(x) >= 1
…
// rc(x) >= 1
use(x)

We are still safe unless in the final … there are unbalanced retains,
releases which would have caused the program to blow up anyways even
before optimization occurred. The simplest form of must use is an
additional release that has not been paired up with any retain (if we
had paired the release with a retain and removed it we would not have
the additional use). This fits nicely into the ARC framework since
basically what you do is say that given any nested releases regardless
of what is in between, the inner release is known safe. This enables us to get
back the lost performance.

<rdar://problem/19023795>

llvm-svn: 232351
2015-03-16 07:02:36 +00:00
Michael Gottesman
ba44aaa2d6 [objc-arc] Treat memcpy, memove, memset as just using pointers, not decrementing them.
This will be tested in the next commit (which required it). The commit
is going to update a bunch of tests at the same time.

llvm-svn: 232350
2015-03-16 07:02:32 +00:00
Michael Gottesman
c8fabe48ac [objc-arc] Rename ConnectTDBUTraversals => PairUpRetainsReleases.
This is a name that is more descriptive of what the method really does. NFC.

llvm-svn: 232349
2015-03-16 07:02:30 +00:00
Michael Gottesman
4cfd36d50b [objc-arc] Move initialization of ARCMDKindCache into the class itself. I also made it lazy.
llvm-svn: 232348
2015-03-16 07:02:27 +00:00
Michael Gottesman
de8fcd1699 [objc-arc] Change EntryPointType to an enum class outside of ARCRuntimeEntryPoints called ARCRuntimeEntryPointKind.
llvm-svn: 232347
2015-03-16 07:02:24 +00:00
David Blaikie
441e94fa44 [opaque pointer type] IRBuilder gep migration progress
llvm-svn: 232294
2015-03-15 01:03:19 +00:00
Mehdi Amini
459bdb4f65 Update InstCombine to transform aggregate stores into scalar stores.
Summary: This is a first step toward getting proper support for aggregate loads and stores.

Test Plan: Added unittests

Reviewers: reames, chandlerc

Reviewed By: chandlerc

Subscribers: majnemer, joker.eph, chandlerc, llvm-commits

Differential Revision: http://reviews.llvm.org/D7780

Patch by Amaury Sechet

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 232284
2015-03-14 22:19:33 +00:00
David Blaikie
ed36e98a6c Add some missed formatting
llvm-svn: 232281
2015-03-14 21:40:12 +00:00
David Blaikie
4b49042238 [opaque pointer type] gep API migration, ArgPromo
This involved threading the type-to-gep through a data structure, since
the code was relying on the pointer type to carry this information. I
imagine there will be a lot of this work across the project... slow
work chasing each use case, but the assertions will help keep me honest.

llvm-svn: 232277
2015-03-14 21:11:26 +00:00
David Blaikie
6b8c7476b9 [opaque pointer type] more gep API migration
llvm-svn: 232274
2015-03-14 19:53:33 +00:00
David Blaikie
7279c7f872 [opaque pointer type] more gep API migrations
Adding nullptr to all the IRBuilder stuff because it's the first thing
that fails to build when testing without the back-compat functions, so
I'll keep having to re-add these locally for each chunk of migration I
do. Might as well check them in to save me the churn. Eventually I'll
have to migrate these too, but I'm going breadth-first.

llvm-svn: 232270
2015-03-14 19:24:04 +00:00
David Blaikie
2db52814fd [opaque pointer type] Start migrating GEP creation to explicitly specify the pointee type
I'm just going to migrate these in a pretty ad-hoc & incremental way -
providing the backwards compatible API for now, then locally removing
it, fixing a few callers, adding it back in and commiting those callers.
Rinse, repeat.

The assertions should ensure that if I get this wrong we'll find out
about it and not just have one giant patch to revert, recommit, revert,
recommit, etc.

llvm-svn: 232240
2015-03-14 01:53:18 +00:00
Peter Collingbourne
03b0d472bb LowerBitSets: Do not export symbols for bit set referenced globals on Darwin.
The linker on that platform may re-order symbols or strip dead symbols, which
will break bit set checks. Avoid this by hiding the symbols from the linker.

llvm-svn: 232235
2015-03-14 00:00:49 +00:00
Robert Lougher
103950c1eb Reapply "[Reassociate] Add initial support for vector instructions."
This reapplies the patch previously committed at revision 232190.  This was
reverted at revision 232196 as it caused test failures in tests that did not
expect operands to be commuted.  I have made the tests more resilient to
reassociation in revision 232206.

llvm-svn: 232209
2015-03-13 20:53:01 +00:00
Duncan P. N. Exon Smith
e5ae5021db instcombine: alloca: Canonicalize scalar allocation array size
As a follow-up to r232200, add an `-instcombine` to canonicalize scalar
allocations to `i32 1`.  Since r232200, `iX 1` (for X != 32) are only
created by RAUWs, so this shouldn't fire too often.  Nevertheless, it's
a cheap check and a nice cleanup.

llvm-svn: 232202
2015-03-13 19:42:09 +00:00
Duncan P. N. Exon Smith
a5cb5caa92 instcombine: alloca: Limit array size type promotion
Move type promotion of the size of the array allocation to the end of
`simplifyAllocaArraySize()`.  This avoids promoting the type of the
array size if it's a `ConstantInt`, since the next -instcombine
iteration will drop it to a scalar allocation anyway.  Similarly, this
avoids promoting the type if it's an `UndefValue`, in which case the
alloca gets RAUW'ed.

This is NFC when considered over the lifetime of -instcombine, since
it's just reducing the number of iterations needed to reach fixed point.

llvm-svn: 232201
2015-03-13 19:34:55 +00:00
Duncan P. N. Exon Smith
4326380356 AsmWriter: Write alloca array size explicitly (and -instcombine fixup)
Write the `alloca` array size explicitly when it's non-canonical.
Previously, if the array size was `iX 1` (where X is not 32), the type
would mutate to `i32` when round-tripping through assembly.

The testcase I added fails in `verify-uselistorder` (as well as
`FileCheck`), since the use-lists for `i32 1` and `i64 1` change.
(Manman Ren came across this when running `verify-uselistorder` on some
non-trivial, optimized code as part of PR5680.)

The type mutation started with r104911, which allowed array sizes to be
something other than an `i32`.  Starting with r204945, we
"canonicalized" to `i64` on 64-bit platforms -- and then on every
round-trip through assembly, mutated back to `i32`.

I bundled a fixup for `-instcombine` to avoid r204945 on scalar
allocations.  (There wasn't a clean way to sequence this into two
commits, since the assembly change on its own caused testcase churn, and
the `-instcombine` change can't be tested without the assembly changes.)

An obvious alternative fix -- change `AllocaInst::AllocaInst()`,
`AsmWriter` and `LLParser` to treat `intptr_t` as the canonical type for
scalar allocations -- was rejected out of hand, since this required
teaching them each about the data layout.

A follow-up commit will add an `-instcombine` to canonicalize the scalar
allocation array size to `i32 1` rather than leaving `iX 1` alone.

rdar://problem/20075773

llvm-svn: 232200
2015-03-13 19:30:44 +00:00
Duncan P. N. Exon Smith
eb33647b74 instcombine: alloca: Remove nesting in simplifyAllocaArraySize(), NFC
llvm-svn: 232199
2015-03-13 19:26:33 +00:00
Duncan P. N. Exon Smith
56d9d5dfc6 instcombine: alloca: Split out simplifyAllocaArraySize(), NFC
Follow-up commits will change some of the logic here.  Splitting into a
separate function simplifies the logic by allowing early returns instead
of deeper nesting.

llvm-svn: 232197
2015-03-13 19:22:03 +00:00
Robert Lougher
c9db4beacb Revert: "[Reassociate] Add initial support for vector instructions."
This reverts revision 232190 due to buildbot failure reported on clang-hexagon-elf
for test arm64_vtst.c.  To be investigated.

llvm-svn: 232196
2015-03-13 19:20:46 +00:00
Robert Lougher
f99d4427a8 [Reassociate] Add initial support for vector instructions.
This patch adds initial support for vector instructions to the reassociation
pass. It enables most parts of the pass to work with vectors but to keep the
size of the patch small, optimization of Xor trees, canonicalization of
negative constants and converting shifts to muls, etc., have been left out.
This will be handled in later patches.

The patch is based on an initial patch by Chad Rosier.

Differential Revision: http://reviews.llvm.org/D7566

llvm-svn: 232190
2015-03-13 18:33:27 +00:00
Kevin Qin
5fb79c9e03 Reapply 'Run LICM pass after loop unrolling pass.'
It's firstly committed at r231630, and reverted at r231635.

Function pass InstructionSimplifier is inserted as barrier to
make sure loop unroll pass won't affect on LICM pass.

llvm-svn: 232011
2015-03-12 05:36:01 +00:00
Andrew Kaylor
6bed959c78 Extended support for native Windows C++ EH outlining
Differential Review: http://reviews.llvm.org/D7886

llvm-svn: 231981
2015-03-11 23:22:06 +00:00
David Majnemer
613caa1eaa InstCombine: Don't fold call bitcast into args if callee is byval
This fixes a bug reported here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150309/265341.html

llvm-svn: 231948
2015-03-11 18:03:05 +00:00
Sanjay Patel
f8b6a29222 Inliner should not add callgraph edges for intrinsic calls (PR22857)
The CallGraphNode function "addCalledFunction()" asserts that edges are not to intrinsics.

This patch makes sure that the Inliner does not add such an edge to the callgraph.

Fix for clang crash by assertion: https://llvm.org/bugs/show_bug.cgi?id=22857

Differential Revision: http://reviews.llvm.org/D8231

llvm-svn: 231927
2015-03-11 15:12:32 +00:00
Philip Reames
d1be160c96 If a conditional branch jumps to the same target, remove the condition
Given that large parts of inst combine is restricted to instructions which have one use, getting rid of a use on the condition can help the effectiveness of the optimizer. Also, it allows the condition to potentially be deleted by instcombine rather than waiting for another pass.

I noticed this completely by accident in another test case. It's not anything that actually came from a real workload.

p.s. We should probably do the same thing for switch instructions.

Differential Revision: http://reviews.llvm.org/D8220

llvm-svn: 231881
2015-03-10 22:52:37 +00:00
Sanjay Patel
e4ff5d4d5c remove function names from comments; NFC
llvm-svn: 231826
2015-03-10 19:42:57 +00:00
Michael Zolotukhin
5abcdaa7c0 Enable loop-rotate before loop-vectorize by default
llvm-svn: 231820
2015-03-10 19:07:41 +00:00
Adam Nemet
5ee2447b48 [LAA-memchecks 2/3] Move number of memcheck threshold checking to LV
Now the analysis won't "fail" if the memchecks exceed the threshold.  It
is the transform pass' responsibility to perform the check.

This allows the transform pass to further analyze/eliminate the
memchecks.  E.g. in Loop distribution we only need to check pointers
that end up in different partitions.

Note that there is a slight change of functionality here.  The logic in
analyzeLoop is that if dependence checking fails due to non-constant
distance between the pointers, another attempt is made to prove safety
of the dependences purely using run-time checks.

Before this patch we could fail the loop due to exceeding the memcheck
threshold after the first step, now we only check the threshold in the
client after the full analysis.  There is no measurable compile-time
effect but I wanted to record this here.

llvm-svn: 231817
2015-03-10 18:54:23 +00:00
Sanjay Patel
1a55477781 remove names from comments; NFC
llvm-svn: 231813
2015-03-10 18:41:22 +00:00
Sanjay Patel
25d06d29cd fix typos; NFC
llvm-svn: 231812
2015-03-10 18:37:05 +00:00
Sanjay Patel
83c4b90c27 remove function names from comments; NFC
llvm-svn: 231801
2015-03-10 16:42:24 +00:00
Owen Anderson
fa465e39c1 Fix a crash in InstCombine where we could try to truncate a switch comparison to zero width.
llvm-svn: 231761
2015-03-10 06:51:39 +00:00
Owen Anderson
c22a907f78 Fix an infinite loop in InstCombine when an instruction with no users and side effects can be constant folded.
ReplaceInstUsesWith needs to return nullptr when the input has no users,
because in that case it does not mutate the program.  Otherwise, we can
get stuck in an infinite loop of repeatedly attempting to constant fold
and instruction with no users.

llvm-svn: 231755
2015-03-10 05:13:47 +00:00
Mehdi Amini
f88efe5f8a DataLayout is mandatory, update the API to reflect it with references.
Summary:
Now that the DataLayout is a mandatory part of the module, let's start
cleaning the codebase. This patch is a first attempt at doing that.

This patch is not exactly NFC as for instance some places were passing
a nullptr instead of the DataLayout, possibly just because there was a
default value on the DataLayout argument to many functions in the API.
Even though it is not purely NFC, there is no change in the
validation.

I turned as many pointer to DataLayout to references, this helped
figuring out all the places where a nullptr could come up.

I had initially a local version of this patch broken into over 30
independant, commits but some later commit were cleaning the API and
touching part of the code modified in the previous commits, so it
seemed cleaner without the intermediate state.

Test Plan:

Reviewers: echristo

Subscribers: llvm-commits

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231740
2015-03-10 02:37:25 +00:00
Kostya Serebryany
1c5044f9e7 [sanitizer] fix instrumentation with -mllvm -sanitizer-coverage-block-threshold=0 to actually do something useful.
llvm-svn: 231736
2015-03-10 01:58:27 +00:00
Kostya Serebryany
a4f5e50c70 [sanitizer] decrease sanitizer-coverage-block-threshold from 1000 to 500 as another horrible workaround for PR17409
llvm-svn: 231733
2015-03-10 01:11:53 +00:00
Benjamin Kramer
02fbb23c87 Remove the remaining uses of abs64 and nuke it.
std::abs works just fine and we're already using it in many places. NFC intended.

llvm-svn: 231696
2015-03-09 20:20:16 +00:00
Benjamin Kramer
5ce1e9672a Make helper functions static.
Found by -Wmissing-prototypes. NFC.

llvm-svn: 231664
2015-03-09 16:23:46 +00:00
Benjamin Kramer
27ae7b62e0 SymbolRewriter: Hide implementation details
NFC.

llvm-svn: 231660
2015-03-09 15:50:47 +00:00
Kevin Qin
4ba876c24d Revert r231630 - Run LICM pass after loop unrolling pass.
As it broke llvm bootstrap.

llvm-svn: 231635
2015-03-09 07:26:37 +00:00
Kevin Qin
0f6643694d Introduce runtime unrolling disable matadata and use it to mark the scalar loop from vectorization.
Runtime unrolling is an expensive optimization which can bring benefit
only if the loop is hot and iteration number is relatively large enough.
For some loops, we know they are not worth to be runtime unrolled.
The scalar loop from vectorization is one of the cases.

llvm-svn: 231631
2015-03-09 06:14:18 +00:00
Kevin Qin
92a0be0434 Run LICM pass after loop unrolling pass.
Runtime unrollng will introduce a runtime check in loop prologue.
If the unrolled loop is a inner loop, then the proglogue will be inside
the outer loop. LICM pass can help to promote the runtime check out if
the checked value is loop invariant.

llvm-svn: 231630
2015-03-09 06:14:07 +00:00
David Blaikie
350b9cbf65 Simplify expressions involving boolean constants with clang-tidy
Patch by Richard (legalize at xmission dot com).

Differential Revision: http://reviews.llvm.org/D8154

llvm-svn: 231617
2015-03-09 01:57:13 +00:00
Olivier Sallenave
09fc2f3b93 Do not restrict interleaved unrolling to small loops, depending on the target.
llvm-svn: 231528
2015-03-06 23:12:04 +00:00
Benjamin Kramer
036f9f4b77 LoopInterchange: Remove empty method.
llvm-svn: 231503
2015-03-06 19:37:26 +00:00
Benjamin Kramer
8d23a2424d LoopInterchange: Rephrase instruction moving using ilist's splice and factor it into a function
+ Random cleanups. No functional change.

llvm-svn: 231501
2015-03-06 18:59:14 +00:00
Benjamin Kramer
92996a5287 Fold init() helpers into constructors. NFC.
llvm-svn: 231486
2015-03-06 16:21:15 +00:00
Daniel Jasper
6a75d92096 Change the way in which error case is being handled.
Specifically this:
* Prevents an "unused" warning in non-assert builds.
* In that error case return with out removing a child loop instead of
  looping forever.

llvm-svn: 231459
2015-03-06 10:39:14 +00:00
Karthik Bhat
78ffabf782 Add a new pass "Loop Interchange"
This pass interchanges loops to provide a more cache-friendly memory access.

For e.g. given a loop like -
  for(int i=0;i<N;i++)
    for(int j=0;j<N;j++)
      A[j][i] = A[j][i]+B[j][i];

is interchanged to -
  for(int j=0;j<N;j++)
    for(int i=0;i<N;i++)
      A[j][i] = A[j][i]+B[j][i];

This pass is currently disabled by default.

To give a brief introduction it consists of 3 stages-

LoopInterchangeLegality : Checks the legality of loop interchange based on Dependency matrix.
LoopInterchangeProfitability: A very basic heuristic has been added to check for profitibility. This will evolve over time.
LoopInterchangeTransform : Which does the actual transform.

LNT Performance tests shows improvement in Polybench/linear-algebra/kernels/mvt and Polybench/linear-algebra/kernels/gemver becnmarks.

TODO:
1) Add support for reductions and lcssa phi.
2) Improve profitability model.
3) Improve loop selection algorithm to select best loop for interchange. Currently the innermost loop is selected for interchange.
4) Improve compile time regression found in llvm lnt due to this pass.
5) Fix issues in Dependency Analysis module.

A special thanks to Hal for reviewing this code.
Review: http://reviews.llvm.org/D7499

llvm-svn: 231458
2015-03-06 10:11:25 +00:00
Yaron Keren
de3ef7d99d Silence C4715 'not all control paths return a value' warnings.
llvm-svn: 231455
2015-03-06 07:49:14 +00:00
Michael Gottesman
e3e9e77868 [objc-arc] Sprinkle some more auto on some iterators.
llvm-svn: 231447
2015-03-06 02:10:03 +00:00
Michael Gottesman
beba035edd [objc-arc] Move the detection of potential uses or altering of a ref count onto PtrState.
llvm-svn: 231446
2015-03-06 02:07:12 +00:00
Michael Gottesman
7fd3df0bda [objc-arc] Move the checking of whether or not we can match onto PtrStates and out of the main dataflow.
These refactored computations check whether or not we are at a stage
of the sequence where we can perform a match. This patch moves the
computation out of the main dataflow and into
{BottomUp,TopDown}PtrState.

llvm-svn: 231439
2015-03-06 00:34:42 +00:00
Michael Gottesman
44a01af1ff [objc-arc] Refactor (Re-)initialization of PtrState from dataflow -> {TopDown,BottomUp}PtrState Class.
This initialization occurs when we see a new retain or release. Before
we performed the actual initialization inline in the dataflow. That is
just messy.

llvm-svn: 231438
2015-03-06 00:34:39 +00:00
Michael Gottesman
e8f633ef7e [objc-arc] Create two subclasses of PtrState in preparation for moving per ptr state change behavior onto a PtrState class.
This will enable the main ObjCARCOpts dataflow to work with higher
level concepts such as "can this ptr state be modified by this ref
count" and not need to understand the nitty gritty details of how that
is determined. This makes the dataflow cleaner.

llvm-svn: 231437
2015-03-06 00:34:36 +00:00
Michael Gottesman
9eb7d50d22 [objc-arc] Extract out MDNodes into a cache structure so the information can be passed around.
llvm-svn: 231436
2015-03-06 00:34:33 +00:00
Michael Gottesman
18520dd445 [objc-arc] Remove annotations code.
It will always be in the history if it is needed again. Now it is just dead
code.

llvm-svn: 231435
2015-03-06 00:34:29 +00:00
Michael Gottesman
8db6488873 Fix build error.
llvm-svn: 231430
2015-03-05 23:57:07 +00:00
Michael Gottesman
c0588d3476 [objc-arc] Change some casts and loop iterators to use auto.
llvm-svn: 231427
2015-03-05 23:29:06 +00:00
Michael Gottesman
2c056fffac [objc-arc] Extract out state specific to a ref count from the main objc arc sequence dataflow. This will allow me to separate the actual ARC queries from the meat of the dataflow algorithm.
llvm-svn: 231426
2015-03-05 23:29:03 +00:00
Michael Gottesman
43db8da6a6 [objc-arc] Extract blot map vector into its own file. NFC.
llvm-svn: 231425
2015-03-05 23:28:58 +00:00
Michael Kuperstein
92d61decb9 [InstCombine] Fix an assertion when fmul has a ConstantExpr operand
isNormalFp and isFiniteNonZeroFp should not assume vector operands can not be constant expressions.

Patch by Pawel Jurek <pawel.jurek@intel.com>
Differential Revision: http://reviews.llvm.org/D8053

llvm-svn: 231359
2015-03-05 08:38:57 +00:00
Kostya Serebryany
2bdee60601 [sanitizer] add nosanitize metadata to more coverage instrumentation instructions
llvm-svn: 231333
2015-03-05 01:20:05 +00:00
Sanjoy Das
9ad7755e84 [IndVarSimplify] use the "canonical" way to infer no-wrap.
Summary:
rL225282 introduced an ad-hoc way to promote some additions to nuw or
nsw.  Since then SCEV has become smarter in directly proving no-wrap;
and using the canonical "ext(A op B) == ext(A) op ext(B)" method of
proving no-wrap is just as powerful now.  Rip out the existing
complexity in favor of getting SCEV to do all the heaving lifting
internally.

This change does not add any unit tests because it is supposed to be a
non-functional change.  Tests added in rL225282 and rL226075 are valid
tests for this change.

Reviewers: atrick, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7981

llvm-svn: 231306
2015-03-04 22:24:23 +00:00
Reid Kleckner
febeb46737 Try to satisfy sanitizer lint check
llvm-svn: 231284
2015-03-04 20:38:59 +00:00
Mehdi Amini
29ebc2d39f Make DataLayout Non-Optional in the Module
Summary:
DataLayout keeps the string used for its creation.

As a side effect it is no longer needed in the Module.
This is "almost" NFC, the string is no longer
canonicalized, you can't rely on two "equals" DataLayout
having the same string returned by getStringRepresentation().

Get rid of DataLayoutPass: the DataLayout is in the Module

The DataLayout is "per-module", let's enforce this by not
duplicating it more than necessary.
One more step toward non-optionality of the DataLayout in the
module.

Make DataLayout Non-Optional in the Module

Module->getDataLayout() will never returns nullptr anymore.

Reviewers: echristo

Subscribers: resistor, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D7992

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231270
2015-03-04 18:43:29 +00:00
Dmitry Vyukov
a173129afb asan: do not instrument direct inbounds accesses to stack variables
Do not instrument direct accesses to stack variables that can be
proven to be inbounds, e.g. accesses to fields of structs on stack.

But it eliminates 33% of instrumentation on webrtc/modules_unittests
(number of memory accesses goes down from 290152 to 193998) and
reduces binary size by 15% (from 74M to 64M) and improved compilation time by 6-12%.

The optimization is guarded by asan-opt-stack flag that is off by default.

http://reviews.llvm.org/D7583

llvm-svn: 231241
2015-03-04 13:27:53 +00:00
Philip Reames
59aed68d01 [RewriteStatepointsForGC] Fix a relocation bug w.r.t values defined by invoke instructions
RewriteStatepointsForGC pass emits an alloca for each GC pointer which will be relocated. It then inserts stores after def and all relocations, and inserts loads before each use as well. In the end, mem2reg is used to update IR with relocations in SSA form.

However, there is a problem with inserting stores for values defined by invoke instructions. The code didn't expect a def was a terminator instruction, and inserting instructions after these terminators resulted in malformed IR.

This patch fixes this problem by handling invoke instructions as a special case. If the def is an invoke instruction, the store will be inserted at the beginning of the normal destination block. Since return value from invoke instruction does not dominate the unwind destination block, no action is needed there.

Patch by: Chen Li
Differential Revision: http://reviews.llvm.org/D7923

llvm-svn: 231183
2015-03-04 00:13:52 +00:00
Kostya Serebryany
285f1f0e41 [sanitizer/coverage] Add AFL-style coverage counters (search heuristic for fuzzing).
Introduce -mllvm -sanitizer-coverage-8bit-counters=1
which adds imprecise thread-unfriendly 8-bit coverage counters.

The run-time library maps these 8-bit counters to 8-bit bitsets in the same way
AFL (http://lcamtuf.coredump.cx/afl/technical_details.txt) does:
counter values are divided into 8 ranges and based on the counter
value one of the bits in the bitset is set.
The AFL ranges are used here: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+.

These counters provide a search heuristic for single-threaded
coverage-guided fuzzers, we do not expect them to be useful for other purposes.

Depending on the value of -fsanitize-coverage=[123] flag,
these counters will be added to the function entry blocks (=1),
every basic block (=2), or every edge (=3).

Use these counters as an optional search heuristic in the Fuzzer library.
Add a test where this heuristic is critical.

llvm-svn: 231166
2015-03-03 23:27:02 +00:00
David Majnemer
0a7829af7e InstCombine: Ensure select condition types are identical before merging
Selection conditions may be vectors or scalars.  Make sure InstCombine
doesn't indiscriminately assume that a select which is value dependent
on another select have identical select condition types.

This fixes PR22773.

llvm-svn: 231156
2015-03-03 22:40:36 +00:00
David Blaikie
5914e69f39 RewriteStatepointsForGC::PhiState: Remove explicit copy ctor in favor of the Rule of Zero
The assertion was just checking a class invariant that's pretty easy to
verify by inspection (no mutating operations, and the two non-copy ctors
already ensure the state is maintained) so remove the explicit copy ctor
in favor of the default, thus allowing the use of the default copy
assignment operator without hitting the C++11 deprecation here.

llvm-svn: 231143
2015-03-03 21:49:07 +00:00
David Blaikie
5fd9cda286 Revert "Remove the explicit SDNodeIterator::operator= in favor of the implicit default"
Accidentally committed a few more of these cleanup changes than
intended. Still breaking these out & tidying them up.

This reverts commit r231135.

llvm-svn: 231136
2015-03-03 21:18:16 +00:00
David Blaikie
f9b228449d Remove the explicit SDNodeIterator::operator= in favor of the implicit default
There doesn't seem to be any need to assert that iterator assignment is
between iterators over the same node - if you want to reuse an iterator
variable to iterate another node, that's perfectly acceptable. Just
don't mix comparisons between iterators into disjoint sequences, as
usual.

llvm-svn: 231135
2015-03-03 21:17:08 +00:00
Peter Collingbourne
fca689c3d4 LowerBitSets: Use byte arrays instead of bit sets to represent in-memory bit sets.
By loading from indexed offsets into a byte array and applying a mask, a
program can test bits from the bit set with a relatively short instruction
sequence. For example, suppose we have 15 bit sets to lay out:

A (16 bits), B (15 bits), C (14 bits), D (13 bits), E (12 bits),
F (11 bits), G (10 bits), H (9 bits), I (7 bits), J (6 bits), K (5 bits),
L (4 bits), M (3 bits), N (2 bits), O (1 bit)

These bits can be laid out in a 16-byte array like this:

      Byte Offset
    0123456789ABCDEF
Bit
  7 HHHHHHHHHIIIIIII
  6 GGGGGGGGGGJJJJJJ
  5 FFFFFFFFFFFKKKKK
  4 EEEEEEEEEEEELLLL
  3 DDDDDDDDDDDDDMMM
  2 CCCCCCCCCCCCCCNN
  1 BBBBBBBBBBBBBBBO
  0 AAAAAAAAAAAAAAAA

For example, to test bit X of A, we evaluate ((bits[X] & 1) != 0), or to
test bit X of I, we evaluate ((bits[9 + X] & 0x80) != 0). This can be done
in 1-2 machine instructions on x86, or 4-6 instructions on ARM.

This uses the LPT multiprocessor scheduling algorithm to lay out the bits
efficiently.

Saves ~450KB of instructions in a recent build of Chromium.

Differential Revision: http://reviews.llvm.org/D7954

llvm-svn: 231043
2015-03-03 00:49:28 +00:00
Benjamin Kramer
11218d5663 LoopIdiom: Give globals for memset_pattern16 private linkage.
There's really no reason to have them have entries in the symbol table
anymore. Old versions of ld64 had some bugs in this area but those have
been fixed long ago.

llvm-svn: 231041
2015-03-03 00:17:09 +00:00
Sanjoy Das
3584fa135a Revert some changes that were made to fix PR20680.
This re-lands change r230921.  r230921 was reverted because it broke a
clang test; a checkin fixing the clang test will be commited shortly.

Summary:
As far as I can tell, the real bug causing the issue was fixed in
r230533.  SCEVExpander should mark an increment operation as nuw or nsw
only if it can *prove* that the operation does not overflow.  There
shouldn't be any situation where we have to do something different
because of no-wrap flags generated by SCEVExpander.

Revert "IndVarSimplify: Allow LFTR to fire more often"

This reverts commit 1ade0f0faa98877b688e0b9da58e876052c1e04e (SVN: 222213).

Revert "IndVarSimplify: Don't let LFTR compare against a poison value"

This reverts commit c0f2b8b528d8a37b0a1522aae90af649d6357eb5 (SVN: 217102).

Reviewers: majnemer, atrick, spatel

Differential Revision: http://reviews.llvm.org/D7979

llvm-svn: 231018
2015-03-02 21:41:07 +00:00
Michael Zolotukhin
b35e11704f Make ToVectorTy static.
llvm-svn: 231007
2015-03-02 20:43:24 +00:00
Benjamin Kramer
0d48361b44 SLPVectorizer: Rewrite ArrayRef slice compare to be more idiomatic.
NFC intended.

llvm-svn: 230965
2015-03-02 15:24:36 +00:00
NAKAMURA Takumi
233b254c9b Revert r230921, "Revert some changes that were made to fix PR20680.", for now.
It caused a failure on clang/test/Misc/backend-optimization-failure.cpp .

llvm-svn: 230929
2015-03-02 01:14:03 +00:00
Sanjoy Das
b11e512365 Revert some changes that were made to fix PR20680.
Summary:
As far as I can tell, the real bug causing the issue was fixed in
r230533.  SCEVExpander should mark an increment operation as nuw or nsw
only if it can *prove* that the operation does not overflow.  There
shouldn't be any situation where we have to do something different
because of no-wrap flags generated by SCEVExpander.

Revert "IndVarSimplify: Allow LFTR to fire more often"

This reverts commit 1ade0f0faa98877b688e0b9da58e876052c1e04e (SVN: 222213).

Revert "IndVarSimplify: Don't let LFTR compare against a poison value"

This reverts commit c0f2b8b528d8a37b0a1522aae90af649d6357eb5 (SVN: 217102).

Reviewers: majnemer, atrick, spatel

Differential Revision: http://reviews.llvm.org/D7979

llvm-svn: 230921
2015-03-01 23:36:26 +00:00
Benjamin Kramer
1a8767d1c0 TRE: Just erase dead BBs and tweak the iteration loop not to increment the deleted BB iterator.
Leaving empty blocks around just opens up a can of bugs like PR22704. Deleting
them early also slightly simplifies code.

Thanks to Sanjay for the IR test case.

llvm-svn: 230856
2015-02-28 16:47:27 +00:00
Benjamin Kramer
4718069d87 Convert push_back loops into append calls.
No functionality change intended.

llvm-svn: 230849
2015-02-28 13:20:15 +00:00
Yaron Keren
e4fac1eb58 Silence variable set but not used warning, NFC.
llvm-svn: 230848
2015-02-28 13:11:24 +00:00
Benjamin Kramer
0f0bd365d8 Replace std::copy with a back inserter with vector append where feasible
All of the cases were just appending from random access iterators to a
vector. Using insert/append can grow the vector to the perfect size
directly and moves the growing out of the loop. No intended functionalty
change.

llvm-svn: 230845
2015-02-28 10:11:12 +00:00
Philip Reames
c309763721 [RewriteStatepointsForGC] Reduce indentation via early continue [NFC]
llvm-svn: 230836
2015-02-28 01:57:44 +00:00
Philip Reames
7a2e8f33b1 [RewriteStatepointsForGC] Fix another order of iteration bug
It turns out the naming of inserted phis and selects is sensative to the order in which two sets are iterated.  We need to nail this down to avoid non-deterministic output and possible test failures.  

The modified test is the one I first noticed something odd in.  The change is making it more strict to report the error.  With the test change, but without the code change, the test fails roughly 1 in 5.  With the code change, I've run ~30 runs without error.

Long term, the right fix here is to adjust the naming scheme.  I'm checking in this hack to avoid any possible non-determinism in the tests over the weekend.  HJust because I only noticed one case doesn't mean it's actually the only case.  I hope to get to the right change Monday.

std->llvm data structure changes bugfix change #3

llvm-svn: 230835
2015-02-28 01:52:09 +00:00
Philip Reames
5658248d0a [RewriteStatepointsForGC] Reduce indentation via early continue [NFC]
llvm-svn: 230829
2015-02-28 00:54:41 +00:00
Philip Reames
c0081494c4 [RewriteStatepointsForGC] Fix iterator invalidation bug
Inserting into a DenseMap you're iterating over is not well defined.  This is unfortunate since this is well defined on a std::map.

"cleanup per llvm code style standards" bug #2

llvm-svn: 230827
2015-02-28 00:47:50 +00:00
Philip Reames
937676755d [RewriteStatepointsForGC] Add tests for the base pointer identification algorithm
These tests cover the 'base object' identification and rewritting portion of RewriteStatepointsForGC.  These aren't completely exhaustive, but they've proven to be reasonable effective over time at finding regressions.

In the process of porting these tests over, I found my first "cleanup per llvm code style standards" bug.  We were relying on the order of iteration when testing the base pointers found for a derived pointer.  When we switched from std::set to DenseSet, this stopped being a safe assumption.  I'm suspecting I'm going to find more of those.  In particular, I'm now really wondering about the main iteration loop for this algorithm.  I need to go take a closer look at the assumptions there.

I'm not really happy with the fact these are testing what is essentially debug output (i.e. enabled via command line flags).  Suggestions for how to structure this better are very welcome.  

llvm-svn: 230818
2015-02-28 00:20:48 +00:00
Sanjay Patel
c1da1c8295 remove function names from comments; NFC
llvm-svn: 230766
2015-02-27 17:27:15 +00:00
Anna Zaks
492074fb03 [asan] Skip promotable allocas to improve performance at -O0
Currently, the ASan executables built with -O0 are unnecessarily slow.
The main reason is that ASan instrumentation pass inserts redundant
checks around promotable allocas. These allocas do not get instrumented
under -O1 because they get converted to virtual registered by mem2reg.
With this patch, ASan instrumentation pass will only instrument non
promotable allocas, giving us a speedup of 39% on a collection of
benchmarks with -O0. (There is no measurable speedup at -O1.)

llvm-svn: 230724
2015-02-27 03:12:36 +00:00
Eric Christopher
302e78f721 Remove DebugLoc::print(LLVMContext, raw_ostream), it was just
forwarding to the one that didn't take a context.

llvm-svn: 230700
2015-02-26 23:32:17 +00:00
Hal Finkel
5cc8afebfd [InstCombine/PowerPC] Convert aligned QPX load/store intrinsics into loads/stores
InstCombine has long had logic to convert aligned Altivec load/store intrinsics
into regular loads and stores. This mirrors that functionality for QPX vector
load/store intrinsics.

llvm-svn: 230660
2015-02-26 18:56:03 +00:00
Sanjoy Das
a4bb8d9be1 IRCE: only touch loops that have been shown to have a high
backedge-taken count in profiliing data.

llvm-svn: 230619
2015-02-26 08:56:04 +00:00
Sanjoy Das
622e5ac37c IRCE: generalize to handle loops with decreasing induction variables.
IRCE can now split the iteration space for loops like:

   for (i = n; i >= 0; i--)
     a[i + k] = 42; // bounds check on access

llvm-svn: 230618
2015-02-26 08:19:31 +00:00
Sanjoy Das
7e890647a6 IRCE: print newline after printing an InductiveRangeCheck.
llvm-svn: 230607
2015-02-26 04:03:31 +00:00
Ramkumar Ramachandra
d6ef954184 PlaceSafepoints: use IRBuilder helpers
Use the IRBuilder helpers for gc.statepoint and gc.result, instead of
coding the construction by hand. Note that the gc.statepoint IRBuilder
handles only CallInst, not InvokeInst; retain that part of hand-coding.

Differential Revision: http://reviews.llvm.org/D7518

llvm-svn: 230591
2015-02-26 00:35:56 +00:00
Justin Bogner
a57594bf65 InstrProf: Make the __llvm_profile_runtime_user symbol hidden
This symbol exists only to pull in the required pieces of the runtime,
so nothing ever needs to refer to it. Making it hidden avoids the
potential for issues with duplicate symbols when linking profiled
libraries together.

llvm-svn: 230566
2015-02-25 22:52:20 +00:00
Sanjay Patel
7b4252051a only propagate equality comparisons of FP values that we are certain are non-zero
This is a follow-on to r227491 which tightens the check for propagating FP
values. If a non-constant value happens to be a zero, we would hit the same
bug as before.

Bug noted and patch suggested by Eli Friedman.

llvm-svn: 230564
2015-02-25 22:46:08 +00:00
JF Bastien
0f477f42f6 InstCombine: extract instead of shuffle when performing vector/array type punning
Summary: SROA generates code that isn't quite as easy to optimize and contains unusual-sized shuffles, but that code is generally correct. As discussed in D7487 the right place to clean things up is InstCombine, which will pick up the type-punning pattern and transform it into a more obvious bitcast+extractelement, while leaving the other patterns SROA encounters as-is.

Test Plan: make check

Reviewers: jvoung, chandlerc

Subscribers: llvm-commits
llvm-svn: 230560
2015-02-25 22:30:51 +00:00
Peter Collingbourne
e60f3e06fc LowerBitSets: Align referenced globals.
This change aligns globals to the next highest power of 2 bytes, up to a
maximum of 128. This makes it more likely that we will be able to compress
bit sets with a greater alignment. In many more cases, we can now take
advantage of a new optimization also introduced in this patch that removes
bit set checks if the bit set is all ones.

The 128 byte maximum was found to provide the best tradeoff between instruction
overhead and data overhead in a recent build of Chromium. It allows us to
remove ~2.4MB of instructions at the cost of ~250KB of data.

Differential Revision: http://reviews.llvm.org/D7873

llvm-svn: 230540
2015-02-25 20:42:41 +00:00
Charles Davis
f135d6973b [IC] Turn non-null MD on pointer loads to range MD on integer loads.
Summary:
This change fixes the FIXME that you recently added when you committed
(a modified version of) my patch.  When `InstCombine` combines a load and
store of an pointer to those of an equivalently-sized integer, it currently
drops any `!nonnull` metadata that might be present.  This change replaces
`!nonnull` metadata with `!range !{ 1, -1 }` metadata instead.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7621

llvm-svn: 230462
2015-02-25 05:10:25 +00:00
Peter Collingbourne
b005cb0cfc LowerBitSets: Introduce global layout builder.
The builder is based on a layout algorithm that tries to keep members of
small bit sets together. The new layout compresses Chromium's bit sets to
around 15% of their original size.

Differential Revision: http://reviews.llvm.org/D7796

llvm-svn: 230394
2015-02-24 23:17:02 +00:00
Sanjay Patel
b665feb92e remove function names from comments; NFC
llvm-svn: 230391
2015-02-24 22:43:06 +00:00
Kuba Brecka
fda2cdf0bc Fix alloca_instruments_all_paddings.cc test to work under higher -O levels (llvm part)
When AddressSanitizer only a single dynamic alloca and no static allocas, due to an early exit from FunctionStackPoisoner::poisonStack we forget to unpoison the dynamic alloca.  This patch fixes that.

Reviewed at http://reviews.llvm.org/D7810

llvm-svn: 230316
2015-02-24 09:47:05 +00:00
Sanjoy Das
de271b53ef New instcombine rule: max(~a,~b) -> ~min(a, b)
This case is interesting because ScalarEvolutionExpander lowers min(a,
b) as ~max(~a,~b).  I think the profitability heuristics can be made
more clever/aggressive, but this is a start.

Differential Revision: http://reviews.llvm.org/D7821

llvm-svn: 230285
2015-02-24 00:08:41 +00:00
Sanjay Patel
28567e334a add newline for easier reading; NFC
llvm-svn: 230265
2015-02-23 21:32:09 +00:00
Andrew Kaylor
4e424eb088 Remap frame variables for native Windows exception handling.
Differential Revision: http://reviews.llvm.org/D7770

llvm-svn: 230249
2015-02-23 20:01:56 +00:00
Chad Rosier
a95dc74f57 Prevent hoisting fmul from THEN/ELSE to IF if there is fmsub/fmadd opportunity.
This patch adds the isProfitableToHoist API.  For AArch64, we want to prevent a
fmul from being hoisted in cases where it is more profitable to form a
fmsub/fmadd.

Phabricator Review: http://reviews.llvm.org/D7299
Patch by Lawrence Hu <lawrence@codeaurora.org>

llvm-svn: 230241
2015-02-23 19:15:16 +00:00
Mehdi Amini
1b3cca0d56 InstSimplify: simplify 0 / X if nnan and nsz
From: Fiona Glaser <fglaser@apple.com>
llvm-svn: 230238
2015-02-23 18:30:25 +00:00
David Blaikie
7420353cf2 Roll condition into an assert then wrap it 'ifndef NDEBUG' to protect from the inevitable "unused variable" warning in a non-asserts build.
llvm-svn: 230181
2015-02-22 20:58:38 +00:00
Hal Finkel
20a3a3322a [LICM] Refactor to expose functionality as utility functions
This refactors the core functionality of LICM: HoistRegion, SinkRegion and
PromoteAliasSet (renamed to promoteLoopAccessesToScalars) as utility functions
in LoopUtils. This will enable other transformations to make use of them
directly.

Patch by Ashutosh Nema.

llvm-svn: 230178
2015-02-22 18:35:32 +00:00
NAKAMURA Takumi
6031ced8c3 RewriteStatepointsForGC.cpp: Fix for -Asserts to mark isNullConstant() as LLVM_ATTRIBUTE_UNUSED. [-Wunused-function]
llvm-svn: 230169
2015-02-22 09:58:19 +00:00
NAKAMURA Takumi
5cfb2b228d RewriteStatepointsForGC.cpp: Fix for -Asserts. [-Wunused-variable]
llvm-svn: 230168
2015-02-22 09:58:13 +00:00
NAKAMURA Takumi
fbfdfed2c5 LowerBitSets.cpp: Prune incorrect \param(s). [-Wdocumentation]
\param should be used as itemized.

llvm-svn: 230167
2015-02-22 09:51:42 +00:00
Sanjoy Das
c86a1c4788 IRCE: generalize InductiveRangeCheck::computeSafeIterationSpace to
work with a non-canonical induction variable.

This is currently a non-functional change because we only ever call
computeSafeIterationSpace on a canonical induction variable; but the
generalization will be useful in a later commit.

llvm-svn: 230151
2015-02-21 22:20:22 +00:00
Sanjoy Das
20241e74ac IRCE: use SCEVs instead of llvm::Value's for intermediate
calculations.  Semantically non-functional change.

This gets rid of some of the SCEV -> Value -> SCEV round tripping and
the Construct(SMin|SMax)Of and MaybeSimplify helper routines.

llvm-svn: 230150
2015-02-21 22:07:32 +00:00
Philip Reames
fe3f4cccba [PlaceSafepoints] Adjust enablement logic to default to off and be GC configurable per GC
Previously, this pass ran over every function in the Module if added to the pass order.  With this change, it runs only over those with a GC attribute where the GC explicitly opts in.  A GC can also choose which of entry safepoint polls, backedge safepoint polls, and call safepoints it wants.  I hope to get these exposed as checks on the GCStrategy at some point, but for now, the checks are manual string comparisons.

llvm-svn: 230097
2015-02-21 00:09:09 +00:00
David Blaikie
37ca96fd40 Remove some unnecessary unreachables in favor of (sometimes implicit) assertions
Also simplify some else-after-return cases including some standard
algorithm convenience/use.

llvm-svn: 230094
2015-02-20 23:44:24 +00:00
Philip Reames
78a8d31aa0 Hide a bunch of advanced testing options in default opt --help output
These are internal options.  I need to go through, evaluate which are worth keeping and which not.  Many of them should probably be renamed as well.  Until I have time to do that, we can at least stop poluting the standard opt -help output.

llvm-svn: 230088
2015-02-20 23:32:03 +00:00
Philip Reames
3169f227e7 [RewriteStatepointsForGC] Use DenseSet in place of std::set [NFC]
This should be the last cleanup on non-llvm preferred data structures.  I left one use of std::set in an assertion; DenseSet didn't seem to have a tombstone for CallSite defined.  That might be worth fixing, but wasn't worth it for a debug only use.

llvm-svn: 230084
2015-02-20 23:16:52 +00:00
Philip Reames
5288177787 [RewriteStatepointsForGC] Replace std::map with DenseMap
I'd done the work of extracting the typedef in a previous commit, but didn't actually change it.  Hopefully this will make any subtle changes easier to isolate.

llvm-svn: 230081
2015-02-20 22:48:20 +00:00
Philip Reames
fde5927541 [RewriteStatepointsForGC] Cleanup - replace std::vector usage [NFC]
Migrate std::vector usage to a combination of SmallVector and ArrayRef.

llvm-svn: 230079
2015-02-20 22:39:41 +00:00
Philip Reames
70849ca660 [RewriteStatepointsForGC] More style cleanup [NFC]
Use llvm_unreachable where appropriate, use SmallVector where easy to do so, introduce typedefs for planned type migrations.

llvm-svn: 230068
2015-02-20 22:05:18 +00:00
Philip Reames
ec08b709ea [RewriteStatepointsForGC] Remove notion of SafepointBounds [NFC]
The notion of a range of inserted safepoint related code is no longer really applicable.  This survived over from an earlier implementation.  Just saving the inserted gc.statepoint and working from that is far clearer given the current code structure.  Particularly when invokable statepoints get involved.

llvm-svn: 230063
2015-02-20 21:34:11 +00:00
Benjamin Kramer
787373e72c LoopRotate: When reconstructing loop simplify form don't split edges from indirectbrs.
Yet another chapter in the endless story. While this looks like we leave
the loop in a non-canonical state this replicates the logic in
LoopSimplify so it doesn't diverge from the canonical form in any way.

PR21968

llvm-svn: 230058
2015-02-20 20:49:25 +00:00
Peter Collingbourne
68aaa34960 Introduce bitset metadata format and bitset lowering pass.
This patch introduces a new mechanism that allows IR modules to co-operatively
build pointer sets corresponding to addresses within a given set of
globals. One particular use case for this is to allow a C++ program to
efficiently verify (at each call site) that a vtable pointer is in the set
of valid vtable pointers for the class or its derived classes. One way of
doing this is for a toolchain component to build, for each class, a bit set
that maps to the memory region allocated for the vtables, such that each 1
bit in the bit set maps to a valid vtable for that class, and lay out the
vtables next to each other, to minimize the total size of the bit sets.

The patch introduces a metadata format for representing pointer sets, an
'@llvm.bitset.test' intrinsic and an LTO lowering pass that lays out the globals
and builds the bitsets, and documents the new feature.

Differential Revision: http://reviews.llvm.org/D7288

llvm-svn: 230054
2015-02-20 20:30:47 +00:00
Philip Reames
10e2fab7a3 [GC, RewriteStatepointsForGC] Style cleanup and bug fix
When doing style cleanup, I noticed a minor bug in this code.  If we have a pointer that we think is unused after a statepoint and thus doesn't need relocation, we store a null pointer into the alloca we're about to promote.  This helps turn a mistake in liveness analysis into an easily debuggable crash.  It turned out this code had never been updated to handle invoke statepoints.  

There's no test for this.  Without a bug in liveness, it appears impossible to make this trigger in a way which is visible in the resulting IR.  We might store the null, but when promoting the alloca, there will be no uses and thus nothing to test against.  Suggestions on how to test are very welcome.

llvm-svn: 230047
2015-02-20 19:51:56 +00:00
Reid Kleckner
89bd75e03b Use unreachable instead of assert(false) to silence MSVC warning
llvm-svn: 230045
2015-02-20 19:46:02 +00:00
Philip Reames
30e186d0b4 [GC] Style cleanup for RewriteStatepointForGC (1 of many) [NFC]
Starting to update variable naming and types to match LLVM style.  This will be an incremental process to minimize the chance of breakage as I work.  Step one, rename member variables to LLVM CamelCase and use llvm's ADT.  Much more to come.

llvm-svn: 230042
2015-02-20 19:26:04 +00:00
Philip Reames
3c4461b7c6 Bugfix for 229954
Before calling Function::getGC to test for enablement, we need to make sure there's actually a GC at all via Function::hasGC.  Otherwise, we'd crash on functions without a GC.  Thankfully, this only mattered if you manually scheduled the pass, but still, oops. :(

llvm-svn: 230040
2015-02-20 18:56:14 +00:00
Benjamin Kramer
3cdf060580 RewriteStatepointsForGC: Move details into anonymous namespaces. NFC.
While there reduce the number of duplicated std::map lookups.

llvm-svn: 230012
2015-02-20 14:00:58 +00:00
Benjamin Kramer
ee7dc15827 Wrap recursive function only used in assert in #ifndef NDEBUG.
Avoids unused function warnings in Release builds.

llvm-svn: 230009
2015-02-20 13:15:49 +00:00
Nick Lewycky
2d860570ee Fix build in release mode, four cases of -Wunused-variable.
llvm-svn: 229976
2015-02-20 07:14:02 +00:00
Hal Finkel
09a36680d7 [InstCombine] Remove unnecessary variable indexing into single-element arrays
This change addresses a deficiency pointed out in PR22629. To copy from the bug
report:

[from the bug report]

Consider this code:

int f(int x) {
  int a[] = {12};
  return a[x];
}

GCC knows to optimize this to

movl     $12, %eax
ret

The code generated by recent Clang at -O3 is:

movslq   %edi, %rax
movl     .L_ZZ1fiE1a(,%rax,4), %eax
retq

.L_ZZ1fiE1a:
  .long    12                      # 0xc

[end from the bug report]

This definitely seems worth fixing. I've also seen this kind of code before (as
the base case of generic vector wrapper templates with one element).

The general idea is to look at the GEP feeding a load or a store, which has
some variable as its first non-zero index, and determine if that index must be
zero (or else an out-of-bounds access would occur). We can do this for allocas
and globals with constant initializers where we know the maximum size of the
underlying object. When we find such a GEP, we create a new one for the memory
access with that first variable index replaced with a constant zero.

Even if we can't eliminate the memory access (and sometimes we can't), it is
still useful because it removes unnecessary indexing calculations.

llvm-svn: 229959
2015-02-20 03:05:53 +00:00
Philip Reames
d1cf30e1b4 Adjust enablement of RewriteStatepointsForGC
When back merging the changes in 229945 I noticed that I forgot to mark the test cases with the appropriate GC.  We want the rewriting to be off by default (even when manually added to the pass order), not on-by default.  To keep the current test working, mark them as using the statepoint-example GC and whitelist that GC.  

Longer term, we need a better selection mechanism here for both actual usage and testing.  As I migrate more tests to the in tree version of this pass, I will probably need to update the enable/disable logic as well. 

llvm-svn: 229954
2015-02-20 02:34:49 +00:00
Philip Reames
623c8019b3 Add a pass for constructing gc.statepoint sequences w/explicit relocations
This patch consists of a single pass whose only purpose is to visit previous inserted gc.statepoints which do not have gc.relocates inserted yet, and insert them. This can be used either immediately after IR generation to perform 'early safepoint insertion' or late in the pass order to perform 'late insertion'.

This patch is setting the stage for work to continue in tree.  In particular, there are known naming and style violations in the current patch.  I'll try to get those resolved over the next week or so.  As I touch each area to make style changes, I need to make sure we have adequate testing in place.  As part of the cleanup, I will be cleaning up a collection of test cases we have out of tree and submitting them upstream. The tests included in this change are very basic and mostly to provide examples of usage.

The pass has several main subproblems it needs to address:
- First, it has identify any live pointers. In the current code, the use of address spaces to distinguish pointers to GC managed objects is hard coded, but this will become parametrizable in the near future.  Note that the current change doesn't actually contain a useful liveness analysis.  It was seperated into a followup change as the code wasn't ready to be shared.  Instead, the current implementation just considers any dominating def of appropriate pointer type to be live.
- Second, it has to identify base pointers for each live pointer. This is a fairly straight forward data flow algorithm. 
- Third, the information in the previous steps is used to actually introduce rewrites. Rather than trying to do this by hand, we simply re-purpose the code behind Mem2Reg to do this for us.

llvm-svn: 229945
2015-02-20 01:06:44 +00:00
Kostya Serebryany
b0d93d0f23 [sanitizer] when dumping the basic block trace, also dump the module names. Patch by Laszlo Szekeres
llvm-svn: 229940
2015-02-20 00:30:44 +00:00
Michael Gottesman
802d429449 [objc-arc-contract] We can not move retains over instructions which can not conservatively be proven to not decrement the retain's RCIdentity.
I also cleaned up the code to make it more understandable for mere mortals.

<rdar://problem/19853758>

llvm-svn: 229937
2015-02-20 00:02:49 +00:00
Michael Gottesman
e59c07a4e6 [objc-arc] Add the predicate CanDecrementRefCount.
This is different from CanAlterRefCount since CanDecrementRefCount is
attempting to prove specifically whether or not an instruction can
decrement instead of the more general question of whether it can
decrement or increment.

llvm-svn: 229936
2015-02-20 00:02:45 +00:00
Benjamin Kramer
f845cfd0f0 SSAUpdater: Use range-based for. NFC.
llvm-svn: 229908
2015-02-19 20:04:02 +00:00
Michael Gottesman
2c0c5cbd62 [objc-arc] Convert the bodies of ARCInstKind predicates into covered switches.
This is much better than the previous manner of just using
short-curcuiting booleans from:

1. A "naive" efficiency perspective: we do not have to rely on the
compiler to change the short circuiting boolean operations into a
switch.
2. An understanding perspective by making the implicit behavior of
negative predicates explicit.
3. A maintainability perspective through the covered switch flag making
it easy to know where to update code when adding new ARCInstKinds.

llvm-svn: 229906
2015-02-19 19:51:36 +00:00
Michael Gottesman
de9ce7c8a1 [objc-arc] Change the InstructionClass to be an enum class called ARCInstKind.
I also renamed ObjCARCUtil.cpp -> ARCInstKind.cpp. That file only contained
items related to ARCInstKind anyways.

llvm-svn: 229905
2015-02-19 19:51:32 +00:00
Adam Nemet
eb3a2a0c5f [LoopAccesses] Change LAA:getInfo to return a constant reference
As expected, this required a few more const-correctness fixes.

Based on Hal's feedback on D7684.

llvm-svn: 229899
2015-02-19 19:15:21 +00:00
Adam Nemet
c3fb49ac6a [LoopAccesses] Split out LoopAccessReport from VectorizerReport
The only difference between these two is that VectorizerReport adds a
vectorizer-specific prefix to its messages.  When LAA is used in the
vectorizer context the prefix is added when we promote the
LoopAccessReport into a VectorizerReport via one of the constructors.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229897
2015-02-19 19:15:15 +00:00
Adam Nemet
a19312773c [LoopAccesses] Add missing const to APIs in VectorizationReport
When I split out LoopAccessReport from this, I need to create some temps
so constness becomes necessary.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229896
2015-02-19 19:15:13 +00:00
Adam Nemet
a98e624489 [LoopAccesses] Change debug messages from LV to LAA
Also add pass name as an argument to VectorizationReport::emitAnalysis.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229894
2015-02-19 19:15:07 +00:00
Adam Nemet
6041d14ab3 [LoopAccesses] Create the analysis pass
This is a function pass that runs the analysis on demand.  The analysis
can be initiated by querying the loop access info via LAA::getInfo.  It
either returns the cached info or runs the analysis.

Symbolic stride information continues to reside outside of this analysis
pass. We may move it inside later but it's not a priority for me right
now.  The idea is that Loop Distribution won't support run-time stride
checking at least initially.

This means that when querying the analysis, symbolic stride information
can be provided optionally.  Whether stride information is used can
invalidate the cache entry and rerun the analysis.  Note that if the
loop does not have any symbolic stride, the entry should be preserved
across Loop Distribution and LV.

Since currently the only user of the pass is LV, I just check that the
symbolic stride information didn't change when using a cached result.

On the LV side, LoopVectorizationLegality requests the info object
corresponding to the loop from the analysis pass.  A large chunk of the
diff is due to LAI becoming a pointer from a reference.

A test will be added as part of the -analyze patch.

Also tested that with AVX, we generate identical assembly output for the
testsuite (including the external testsuite) before and after.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229893
2015-02-19 19:15:04 +00:00
Adam Nemet
2205a68ebc [LoopAccesses] Cache the result of canVectorizeMemory
LAA will be an on-demand analysis pass, so we need to cache the result
of the analysis.  canVectorizeMemory is renamed to analyzeLoop which
computes the result.  canVectorizeMemory becomes the query function for
the cached result.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229892
2015-02-19 19:15:00 +00:00
Adam Nemet
95a7235607 [LoopAccesses] Stash the report from the analysis rather than emitting it
The transformation passes will query this and then emit them as part of
their own report.  The currently only user LV is modified to do just
that.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229891
2015-02-19 19:14:56 +00:00
Adam Nemet
e77467312e [LoopAccesses] Make VectorizerParams global + fix for cyclic dep
As LAA is becoming a pass, we can no longer pass the params to its
constructor.  This changes the command line flags to have external
storage.  These can now be accessed both from LV and LAA.

VectorizerParams is moved out of LoopAccessInfo in order to shorten the
code to access it.

This commits also has the fix (D7731) to the break dependence cycle
between the analysis and vector libraries.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229890
2015-02-19 19:14:52 +00:00
Adam Nemet
ef703ef4d8 Revert "Reformat."
This reverts commit r229651.

I'd like to ultimately revert r229650 but this reformat stands in the
way.  I'll reformat the affected files once the the loop-access pass is
fully committed.

llvm-svn: 229889
2015-02-19 19:14:34 +00:00
Benjamin Kramer
deb8f2934a LSR: Move set instead of copying. NFC.
llvm-svn: 229871
2015-02-19 17:19:43 +00:00
Benjamin Kramer
c0850fa665 Demote vectors to arrays. No functionality change.
llvm-svn: 229861
2015-02-19 15:26:17 +00:00
Michael Gottesman
86672937bd [objc-arc] Introduce the concept of RCIdentity and rename all relevant functions to use that name. NFC.
The RCIdentity root ("Reference Count Identity Root") of a value V is a
dominating value U for which retaining or releasing U is equivalent to
retaining or releasing V. In other words, ARC operations on V are
equivalent to ARC operations on U.

This is a useful property to ascertain since we can use this in the ARC
optimizer to make it easier to match up ARC operations by always mapping
ARC operations to RCIdentityRoots instead of pointers themselves. Then
we perform pairing of retains, releases which are applied to the same
RCIdentityRoot.

In general, the two ways that we see RCIdentical values in ObjC are via:

  1. PointerCasts
  2. Forwarding Calls that return their argument verbatim.

As such in ObjC, two RCIdentical pointers must always point to the same
memory location.

Previously this concept was implicit in the code and various methods
that dealt with this concept were given functional names that did not
conform to any name in the "ARC" model. This often times resulted in
code that was hard for the non-ARC acquanted to understand resulting in
unhappiness and confusion.

llvm-svn: 229796
2015-02-19 00:42:38 +00:00
Michael Gottesman
b9380fca86 [objc-arc-contract] Rename contractRelease => tryToContractReleaseIntoStoreStrong.
NFC. Makes it clearer what this method is actually supposed to do.

llvm-svn: 229795
2015-02-19 00:42:34 +00:00
Michael Gottesman
5352c179b7 [objc-arc-contract] Refactor out tryToPeepholeInstruction into its own method. NFC.
The main method of ObjCARCContract is really large and busy. By refactoring this
out, it becomes easier to reason about.

llvm-svn: 229794
2015-02-19 00:42:30 +00:00
Michael Gottesman
28626ad221 [objc-arc-contract] Reorganize the code a bit and make the debug output easier to read.
llvm-svn: 229793
2015-02-19 00:42:27 +00:00
Sanjoy Das
48dc5bb962 Partial fix for bug 22589
Don't spend the entire iteration space in the scalar loop prologue if
computing the trip count overflows.  This change also gets rid of the
backedge check in the prologue loop and the extra check for
overflowing trip-count.

Differential Revision: http://reviews.llvm.org/D7715

llvm-svn: 229731
2015-02-18 19:32:25 +00:00
Andrew Kaylor
eca1819627 Adding implementation to outline C++ catch handlers for native Windows 64 exception handling.
Differential Revision: http://reviews.llvm.org/D7363

llvm-svn: 229715
2015-02-18 18:31:51 +00:00
Mohit K. Bhakkad
ac187bf468 [MSan][MIPS] VarArgHelper for MIPS64
Reviewers: Reviewers: eugenis, kcc, samsonov, petarj

Subscribers: dsanders, sagar, llvm-commits

Differential Revision: http://reviews.llvm.org/D7182

llvm-svn: 229667
2015-02-18 11:41:24 +00:00
NAKAMURA Takumi
e99052dc84 Reformat.
llvm-svn: 229651
2015-02-18 08:36:14 +00:00
NAKAMURA Takumi
922c5e2986 Revert r229622: "[LoopAccesses] Make VectorizerParams global" and others. r229622 brought cyclic dependencies between Analysis and Vector.
r229622: "[LoopAccesses] Make VectorizerParams global"
  r229623: "[LoopAccesses] Stash the report from the analysis rather than emitting it"
  r229624: "[LoopAccesses] Cache the result of canVectorizeMemory"
  r229626: "[LoopAccesses] Create the analysis pass"
  r229628: "[LoopAccesses] Change debug messages from LV to LAA"
  r229630: "[LoopAccesses] Add canAnalyzeLoop"
  r229631: "[LoopAccesses] Add missing const to APIs in VectorizationReport"
  r229632: "[LoopAccesses] Split out LoopAccessReport from VectorizerReport"
  r229633: "[LoopAccesses] Add -analyze support"
  r229634: "[LoopAccesses] Change LAA:getInfo to return a constant reference"
  r229638: "Analysis: fix buildbots"

llvm-svn: 229650
2015-02-18 08:34:47 +00:00
Craig Topper
886ea644c2 [X86] Remove AVX512 pslldq/psrldq shift intrinsics. They aren't implemented yet and when they are they should be done with shuffles like SSE2 and AVX2.
llvm-svn: 229641
2015-02-18 06:24:49 +00:00
Craig Topper
398dc737fa [X86] Remove AVX2 and SSE2 pslldq and psrldq intrinsics. We can represent them in IR with vector shuffles now. All their uses have been removed from clang in favor of shuffles.
llvm-svn: 229640
2015-02-18 06:24:44 +00:00
Adam Nemet
ec70942f90 [LoopAccesses] Change LAA:getInfo to return a constant reference
As expected, this required a few more const-correctness fixes.

Based on Hal's feedback on D7684.

llvm-svn: 229634
2015-02-18 03:44:33 +00:00
Adam Nemet
213b3dce3c [LoopAccesses] Split out LoopAccessReport from VectorizerReport
The only difference between these two is that VectorizerReport adds a
vectorizer-specific prefix to its messages.  When LAA is used in the
vectorizer context the prefix is added when we promote the
LoopAccessReport into a VectorizerReport via one of the constructors.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229632
2015-02-18 03:44:25 +00:00
Adam Nemet
2126adeebf [LoopAccesses] Add missing const to APIs in VectorizationReport
When I split out LoopAccessReport from this, I need to create some temps
so constness becomes necessary.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229631
2015-02-18 03:44:20 +00:00
Adam Nemet
d74d7f9eaa [LoopAccesses] Change debug messages from LV to LAA
Also add pass name as an argument to VectorizationReport::emitAnalysis.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229628
2015-02-18 03:43:37 +00:00
Adam Nemet
b645eb3a09 [LoopAccesses] Create the analysis pass
This is a function pass that runs the analysis on demand.  The analysis
can be initiated by querying the loop access info via LAA::getInfo.  It
either returns the cached info or runs the analysis.

Symbolic stride information continues to reside outside of this analysis
pass. We may move it inside later but it's not a priority for me right
now.  The idea is that Loop Distribution won't support run-time stride
checking at least initially.

This means that when querying the analysis, symbolic stride information
can be provided optionally.  Whether stride information is used can
invalidate the cache entry and rerun the analysis.  Note that if the
loop does not have any symbolic stride, the entry should be preserved
across Loop Distribution and LV.

Since currently the only user of the pass is LV, I just check that the
symbolic stride information didn't change when using a cached result.

On the LV side, LoopVectorizationLegality requests the info object
corresponding to the loop from the analysis pass.  A large chunk of the
diff is due to LAI becoming a pointer from a reference.

A test will be added as part of the -analyze patch.

Also tested that with AVX, we generate identical assembly output for the
testsuite (including the external testsuite) before and after.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229626
2015-02-18 03:43:24 +00:00
Adam Nemet
1c221f2fa8 [LoopAccesses] Make blockNeedsPredication static
blockNeedsPredication is in LoopAccess in order to share it with the
vectorizer.  It's a utility needed by LoopAccess not strictly provided
by it but it's a good place to share it.  This makes the function static
so that it no longer required to create an LoopAccessInfo instance in
order to access it from LV.

This was actually causing problems because it would have required
creating LAI much earlier that LV::canVectorizeMemory().

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229625
2015-02-18 03:43:19 +00:00
Adam Nemet
bad3bd2042 [LoopAccesses] Cache the result of canVectorizeMemory
LAA will be an on-demand analysis pass, so we need to cache the result
of the analysis.  canVectorizeMemory is renamed to analyzeLoop which
computes the result.  canVectorizeMemory becomes the query function for
the cached result.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229624
2015-02-18 03:42:57 +00:00
Adam Nemet
7664875dc2 [LoopAccesses] Stash the report from the analysis rather than emitting it
The transformation passes will query this and then emit them as part of
their own report.  The currently only user LV is modified to do just
that.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229623
2015-02-18 03:42:50 +00:00
Adam Nemet
24750973e2 [LoopAccesses] Make VectorizerParams global
As LAA is becoming a pass, we can no longer pass the params to its
constructor.  This changes the command line flags to have external
storage.  These can now be accessed both from LV and LAA.

VectorizerParams is moved out of LoopAccessInfo in order to shorten the
code to access it.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229622
2015-02-18 03:42:43 +00:00
Adam Nemet
292fb2e17f [LoopAccesses] Rename LoopAccessAnalysis to LoopAccessInfo
LoopAccessAnalysis will be used as the name of the pass.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229621
2015-02-18 03:42:35 +00:00
Akira Hatanaka
913155d842 [InstCombine] Do not insert a GEP instruction before a landingpad instruction.
InstCombiner::visitGetElementPtrInst was using getFirstNonPHI to compute the
insertion point, which caused the verifier to complain when a GEP was inserted
before a landingpad instruction. This commit fixes it to use getFirstInsertionPt
instead.

rdar://problem/19394964

llvm-svn: 229619
2015-02-18 03:30:11 +00:00
Hal Finkel
c078e835f2 [BDCE] Don't forget uses of root instructions seen before the instruction itself
When visiting the initial list of "root" instructions (those which must always
be alive), for those that are integer-valued (such as invokes returning an
integer), we mark their bits as (initially) all dead (we might, obviously, find
uses of those bits later, but all bits are assumed dead until proven
otherwise). Don't do so, however, if we're already seen a use of those bits by
another root instruction (such as a store).

Fixes a miscompile of the sanitizer unit tests on x86_64.

Also, add a debug line for visiting the root instructions, and remove a debug
line which tried to print instructions being removed (printing dead
instructions is dangerous, and can sometimes crash).

llvm-svn: 229618
2015-02-18 03:12:28 +00:00
Benjamin Kramer
21bee91af5 Prefer SmallVector::append/insert over push_back loops.
Same functionality, but hoists the vector growth out of the loop.

llvm-svn: 229500
2015-02-17 15:29:18 +00:00
Elena Demikhovsky
b1743d071c Fixed a bug in store sinking.
The problem was in store-sink barrier check.

Store sink barrier should be checked for ModRef (read-write) mode.

http://llvm.org/bugs/show_bug.cgi?id=22613

llvm-svn: 229495
2015-02-17 13:10:05 +00:00
Hal Finkel
c9890f4fe1 [BDCE] Add a bit-tracking DCE pass
BDCE is a bit-tracking dead code elimination pass. It is based on ADCE (the
"aggressive DCE" pass), with the added capability to track dead bits of integer
valued instructions and remove those instructions when all of the bits are
dead.

Currently, it does not actually do this all-bits-dead removal, but rather
replaces the instruction's uses with a constant zero, and lets instcombine (and
the later run of ADCE) do the rest. Because we essentially get a run of ADCE
"for free" while tracking the dead bits, we also do what ADCE does and removes
actually-dead instructions as well (this includes instructions newly trivially
dead because all bits were dead, but not all such instructions can be removed).

The motivation for this is a case like:

int __attribute__((const)) foo(int i);
int bar(int x) {
  x |= (4 & foo(5));
  x |= (8 & foo(3));
  x |= (16 & foo(2));
  x |= (32 & foo(1));
  x |= (64 & foo(0));
  x |= (128& foo(4));
  return x >> 4;
}

As it turns out, if you order the bit-field insertions so that all of the dead
ones come last, then instcombine will remove them. However, if you pick some
other order (such as the one above), the fact that some of the calls to foo()
are useless is not locally obvious, and we don't remove them (without this
pass).

I did a quick compile-time overhead check using sqlite from the test suite
(Release+Asserts). BDCE took ~0.4% of the compilation time (making it about
twice as expensive as ADCE).

I've not looked at why yet, but we eliminate instructions due to having
all-dead bits in:
External/SPEC/CFP2006/447.dealII/447.dealII
External/SPEC/CINT2006/400.perlbench/400.perlbench
External/SPEC/CINT2006/403.gcc/403.gcc
MultiSource/Applications/ClamAV/clamscan
MultiSource/Benchmarks/7zip/7zip-benchmark

llvm-svn: 229462
2015-02-17 01:36:59 +00:00
Mehdi Amini
ce11b626e7 InstCombine: fold more cases of (fp_to_u/sint (u/sint_to_fp val))
Fixes radar 15486701.

From: Fiona Glaser <fglaser@apple.com>
llvm-svn: 229437
2015-02-16 21:47:54 +00:00
James Molloy
c4b7d913d4 Run LICM as part of the cleanup phase from the scalar optimizer.
Things like LoopUnrolling can produce loop invariant values - make sure
we pick them up.

llvm-svn: 229419
2015-02-16 18:59:54 +00:00
Hal Finkel
4546f3de43 [ADCE] Don't indent inside an anonymous namespace
To be consistent with what clang-format does, don't add extra indentation
inside an anonymous namespace. NFC.

llvm-svn: 229412
2015-02-16 18:08:00 +00:00
James Molloy
317bb7473f [LoopReroll] Relax some assumptions a little.
We won't find a root with index zero in any loop that we are able to reroll.
However, we may find one in a non-rerollable loop, so bail gracefully instead
of failing hard.

llvm-svn: 229406
2015-02-16 17:02:00 +00:00
James Molloy
382c1caece [LoopReroll] Don't crash on dead code
If a PHI has no users, don't crash; bail gracefully. This shouldn't
happen often, but we can make no guarantees that previous passes didn't leave
dead code around.

llvm-svn: 229405
2015-02-16 17:01:52 +00:00
Evgeniy Stepanov
5ad40ba7e0 [asan] Reuse a common function.
Do not reimplement RoundUpToAlignment.

llvm-svn: 229397
2015-02-16 14:49:37 +00:00
Aaron Ballman
0b45511a2e Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition.
llvm-svn: 229340
2015-02-15 22:54:22 +00:00
Hal Finkel
afd400b12e [ADCE] Convert another loop for a range-based for
We can use a range-based for for the operands loop too; NFC.

llvm-svn: 229319
2015-02-15 15:51:25 +00:00
Hal Finkel
bc23b1a680 [ADCE] Use inst_range and range-based fors
Convert a few loops to range-based fors; NFC.

llvm-svn: 229318
2015-02-15 15:51:23 +00:00
Hal Finkel
20dadad9f5 [ADCE] Fix formatting of pointer types
We prefer to put the * with the variable, not with the type; NFC.

llvm-svn: 229317
2015-02-15 15:47:52 +00:00
Hal Finkel
f67e71db4d [ADCE] Fix capitalization of another local variable
Bring another local variable in compliance with our naming conventions, NFC.

llvm-svn: 229316
2015-02-15 15:45:30 +00:00
Hal Finkel
6660cfbeba [ADCE] Fix capitalization of some local variables
Bring some local variables in compliance with our naming conventions, NFC.

llvm-svn: 229315
2015-02-15 15:45:28 +00:00
Elena Demikhovsky
79bb080d9c Enabled cost calculation for masked memory operations.
We already have implementation for cost calculation for
masked memory operations. I just call it from the loop vectorizer.

llvm-svn: 229290
2015-02-15 08:08:48 +00:00
Ramkumar Ramachandra
af4f23c6ae InstCombine: propagate deref via new addDereferenceableAttr
The "dereferenceable" attribute cannot be added via .addAttribute(),
since it also expects a size in bytes. AttrBuilder#addAttribute or
AttributeSet#addAttribute is wrapped by classes Function, InvokeInst,
and CallInst. Add corresponding wrappers to
AttrBuilder#addDereferenceableAttr.

Having done this, propagate the dereferenceable attribute via
gc.relocate, adding a test to exercise it. Note that -datalayout is
required during execution over and above -instcombine, because
InstCombine only optionally requires DataLayoutPass.

Differential Revision: http://reviews.llvm.org/D7510

llvm-svn: 229265
2015-02-14 19:37:54 +00:00
Andrea Di Biagio
d0ba37a3e6 [optnone] Skip pass Constant Hoisting on optnone functions.
Added test CodeGen/X86/constant-hoisting-optnone.ll to verify that
pass Constant Hoisting is not run on optnone functions.

llvm-svn: 229258
2015-02-14 15:11:48 +00:00
Duncan P. N. Exon Smith
a0ef805386 Transforms: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

llvm-svn: 229202
2015-02-14 01:11:29 +00:00
Philip Reames
45adc98469 [InstCombine] When canonicalizing gep indices, prefer zext when possible
If we know that the sign bit of a value being sign extended is zero, we can use a zero extension instead.  This is motivated by the fact that zero extensions are generally cheaper on x86 (and most other architectures?).  We already apply a similar transform in DAGCombine, this just extends that to the IR level.

This comes up when we eagerly canonicalize gep indices to the width of a machine register (i64 on x86_64). To do so, we insert sign extensions (sext) to promote smaller types. 

Differential Revision: http://reviews.llvm.org/D7255

llvm-svn: 229189
2015-02-14 00:05:36 +00:00