mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 22:42:46 +02:00
Commit Graph

11730 Commits

Benjamin Kramer
43d91888f7 InstCombine: Strength reduce sadd.with.overflow into a regular nsw add if we can prove that it cannot overflow.
PR20194
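A minimal before/after sketch of the strength reduction (value names are
hypothetical):

    %res = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
    %sum = extractvalue { i32, i1 } %res, 0
    %ovf = extractvalue { i32, i1 } %res, 1
    ; when the addition provably cannot overflow, becomes:
    %sum = add nsw i32 %a, %b
    ; with the overflow bit folded to the constant i1 false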

llvm-svn: 212331
2014-07-04 10:22:21 +00:00
Gerolf Hoflehner
c42416bda3 Run interprocedural const prop before global optimizer
Exposes more constant globals that can be removed by
the global optimizer. A specific example is the removal
of the static global block address array in
clang/test/CodeGen/indirect-goto.c. This change impacts only
lower optimization levels. With LTO interprocedural
const prop runs already before global opt.

llvm-svn: 212284
2014-07-03 19:28:15 +00:00
Evgeniy Stepanov
053da684f6 [msan] Stop propagating shadow in blacklisted functions.
With this change all values passed through blacklisted functions
become fully initialized. Previous behavior was to initialize all
loads in blacklisted functions, but apply normal shadow propagation
logic for all other operations.

This makes blacklist applicable in a wider range of situations.

It also makes code for blacklisted functions a lot shorter, which
works as yet another workaround for PR17409.

llvm-svn: 212268
2014-07-03 11:56:30 +00:00
Evgeniy Stepanov
61fc97dfc2 Revert r212265.
llvm-svn: 212266
2014-07-03 11:35:08 +00:00
Evgeniy Stepanov
321b8fd6cf [msan] Stop propagating shadow in blacklisted functions.
With this change all values passed through blacklisted functions
become fully initialized. Previous behavior was to initialize all
loads in blacklisted functions, but apply normal shadow propagation
logic for all other operations.

This makes blacklist applicable in a wider range of situations.

It also makes code for blacklisted functions a lot shorter, which
works as yet another workaround for PR17409.

llvm-svn: 212265
2014-07-03 11:18:48 +00:00
Marcello Maggioni
f4e8a4110d Minor stylistic fix in SimplifyCFG (test commit)
llvm-svn: 212259
2014-07-03 08:29:06 +00:00
Alexey Samsonov
58b70a4353 Remove non-static field initializer to appease MSVC
llvm-svn: 212212
2014-07-02 20:25:42 +00:00
David Blaikie
f6bdd89397 Constify the Function pointers in the result of makeSubprogramMap
These don't need to be mutable and callers being added soon in CodeGen
won't have access to non-const Module&.

llvm-svn: 212202
2014-07-02 18:30:05 +00:00
Alexey Samsonov
036373fe61 [ASan] Print exact source location of global variables in error reports.
See https://code.google.com/p/address-sanitizer/issues/detail?id=299 for the
original feature request.

Introduce llvm.asan.globals metadata, which Clang (or any other frontend)
may use to report extra information about global variables to ASan
instrumentation pass in the backend. This metadata replaces
llvm.asan.dynamically_initialized_globals that was used to detect init-order
bugs. llvm.asan.globals contains the following data for each global:
  1) source location (file/line/column info);
  2) whether it is dynamically initialized;
  3) whether it is blacklisted (shouldn't be instrumented).

Source location data is then emitted in the binary and can be picked up
by ASan runtime in case it needs to print error report involving some global.
For example:

  0x... is located 4 bytes to the right of global variable 'C::array' defined in '/path/to/file:17:8' (0x...) of size 40

These source locations are printed even if the binary doesn't have any
debug info.

This is an ABI-breaking change. ASan initialization is renamed to
__asan_init_v4(). Pre-built libraries compiled with older Clang will not work
with the fresh runtime.

llvm-svn: 212188
2014-07-02 16:54:41 +00:00
David Majnemer
68ed1a9119 InstCombine: Optimize x/INT_MIN to x==INT_MIN
The result of x/INT_MIN is either 0 or 1, we can just use an icmp
instead.
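A before/after sketch (hypothetical values; -2147483648 is i32 INT_MIN):

    %div = sdiv i32 %x, -2147483648
    ; becomes: the quotient is 1 exactly when %x == INT_MIN, else 0
    %cmp = icmp eq i32 %x, -2147483648
    %div = zext i1 %cmp to i32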

llvm-svn: 212167
2014-07-02 06:42:13 +00:00
David Majnemer
5449bfbb6f InstCombine: Don't turn -(x/INT_MIN) -> x/INT_MIN
It is not safe to negate the smallest signed integer; doing so yields
the same number back.

This fixes PR20186.

llvm-svn: 212164
2014-07-02 06:07:09 +00:00
Reid Kleckner
83d46a1307 Optimize InstCombine stack memory consumption
This patch reduces the stack memory consumption of the InstCombine
function "isOnlyCopiedFromConstantGlobal() ", that in certain conditions
could overflow the stack because of excessive recursiveness.

For example, in a case like this:

%0 = alloca [50025 x i32], align 4
%1 = getelementptr inbounds [50025 x i32]* %0, i64 0, i64 0
store i32 0,                         i32* %1
%2 = getelementptr inbounds          i32* %1, i64 1
store i32 1,                         i32* %2
%3 = getelementptr inbounds          i32* %2, i64 1
store i32 2,                         i32* %3
%4 = getelementptr inbounds          i32* %3, i64 1
store i32 3,                         i32* %4
%5 = getelementptr inbounds          i32* %4, i64 1
store i32 4,                         i32* %5
%6 = getelementptr inbounds          i32* %5, i64 1
store i32 5,                         i32* %6
...

This piece of code crashes llvm when trying to apply instcombine on
desktop. On embedded devices this could happen with a much lower
recursion limit.  Some instructions (getelementptr and bitcasts) make
the function recursively call itself on their uses, which is what makes
the example above consume so much stack (it becomes a recursive
depth-first tree visit with a very big depth).

The patch changes the algorithm to be semantically equivalent, but
iterative instead of recursive and the visiting order to be from a
depth-first visit to a breadth-first visit (visit all the instructions
of the current level before the ones of the next one).

Now if a lot of memory is required, a heap allocation is done instead of
the stack allocation, avoiding the possible crash.

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D4355

Patch by Marcello Maggioni!  We don't generally commit large stress tests
that look for out-of-memory conditions, so I didn't request that one be
added to the patch.

llvm-svn: 212133
2014-07-01 21:36:20 +00:00
David Blaikie
f1747799d3 DebugInfo: Keep track of subprograms whose arguments have been promoted.
Matching behavior with DeadArgumentElimination (and leveraging some
now-common infrastructure), keep track of the function from debug info
metadata if arguments are promoted.

This may produce interesting debug info - since the arguments may be
missing or of different types... but at least backtraces, inlining, etc,
will be correct.

llvm-svn: 212128
2014-07-01 21:13:37 +00:00
David Blaikie
f4df27716a DebugInfo: Provide a utility for building a mapping from llvm::Function*s to llvm::DISubprograms
Update DeadArgumentElimintation to use this, with the intent of reusing
the functionality for ArgumentPromotion as well.

llvm-svn: 212122
2014-07-01 20:05:26 +00:00
David Majnemer
bbb6d0ba1d GlobalOpt: Don't swap private for internal linkage
There were transforms whose *intent* was to downgrade the linkage of
external objects to have internal linkage.

However, it fired on things with private linkage as well.

llvm-svn: 212104
2014-07-01 15:26:50 +00:00
David Majnemer
63f98c453a GlobalOpt: Handle non-zero offsets for aliases
An alias with an aliasee of a non-zero GEP is not trivially replaceable
with its aliasee.

llvm-svn: 212079
2014-07-01 00:30:56 +00:00
David Blaikie
c5256f5830 DebugInfo: Preserve debug location information when transforming a call into an invoke during inlining.
This both improves basic debug info quality, but also fixes a larger
hole whenever we inline a call/invoke without a location (debug info for
the entire inlining is lost and other badness that the debug info
emission code is currently working around but shouldn't have to).

llvm-svn: 212065
2014-06-30 20:30:39 +00:00
Reid Kleckner
4fd46c38c6 msan: Stop stripping the 'tail' modifier off of calls
This probably isn't necessary since msan started to unpoison the return
value shadow memory before all calls.

llvm-svn: 212061
2014-06-30 20:12:27 +00:00
David Majnemer
abf7854d05 IR: Add COMDATs to the IR
This new IR facility allows us to represent the object-file semantic of
a COMDAT group.

COMDATs allow us to tie together sections and make the inclusion of one
dependent on another. This is required to implement features like MS
ABI VFTables and optimizing away certain kinds of initialization in C++.

This functionality is only representable in COFF and ELF; Mach-O has no
similar mechanism.

Differential Revision: http://reviews.llvm.org/D4178

llvm-svn: 211920
2014-06-27 18:19:56 +00:00
Dinesh Dwivedi
73e3709b2c Added instruction combine to transform few more negative values addition to subtraction (Part 3)
This patch enables transforms for

(x + (~(y | c) + 1)) --> x - (y | c) if c is odd
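A sketch of the pattern in IR, using a hypothetical odd constant c = 3
(~v + 1 is the two's-complement negation of v):

    %or  = or i32 %y, 3
    %not = xor i32 %or, -1        ; ~(y | 3)
    %neg = add i32 %not, 1        ; -(y | 3)
    %res = add i32 %x, %neg
    ; becomes:
    %or  = or i32 %y, 3
    %res = sub i32 %x, %or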

Differential Revision: http://reviews.llvm.org/D4210

llvm-svn: 211881
2014-06-27 07:47:35 +00:00
David Blaikie
232726db30 ArgumentPromotion: Propagate debug locations on calls for which arguments are promoted.
llvm-svn: 211872
2014-06-27 05:32:09 +00:00
Alp Toker
97022b0c1f Revert "Introduce a string_ostream string builder facilty"
Temporarily back out commits r211749, r211752 and r211754.

llvm-svn: 211814
2014-06-26 22:52:05 +00:00
Arnold Schwaighofer
7e23314405 GVN: Preserve invariant.load metadata
If both instructions to be replaced are marked invariant, the resulting
instruction is invariant.
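A minimal sketch of the preserved metadata (hypothetical values, 2014-era
metadata syntax):

    %a = load i32* %p, !invariant.load !0
    %b = load i32* %p, !invariant.load !0
    ; GVN replaces %b with %a; the surviving load keeps !invariant.load
    ; only because both loads carried it
    !0 = metadata !{}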

rdar://13358910

Fix by Erik Eckstein!

llvm-svn: 211801
2014-06-26 19:51:19 +00:00
Dinesh Dwivedi
9d122cf780 This patch removed duplicate code for matching patterns
which are now handled in SimplifyUsingDistributiveLaws() 
(after r211261)

Differential Revision: http://reviews.llvm.org/D4253

llvm-svn: 211768
2014-06-26 08:57:33 +00:00
Dinesh Dwivedi
b98a2e9f49 Added instruction combine to transform few more negative values addition to subtraction (Part 2)
This patch enables transforms for

(x + (~(y | c) + 1))   -->   x - (y | c) if c is even

Differential Revision: http://reviews.llvm.org/D4209

llvm-svn: 211765
2014-06-26 05:40:22 +00:00
David Majnemer
96d60954b8 GlobalOpt: Don't optimize thread_local for initializers
Folding a reference to a thread_local variable into another global
variable's initializer is very problematic, there is no relocation that
exists to represent such an access.

llvm-svn: 211762
2014-06-26 03:02:19 +00:00
Hans Wennborg
0b74da6314 Don't build switch tables for dllimport and TLS variables in GEPs
This is a follow-up to r211331, which failed to notice that we were
returning early from ValidLookupTableConstant for GEPs.

llvm-svn: 211753
2014-06-26 00:30:52 +00:00
Alp Toker
fd9ead3b6f Introduce a string_ostream string builder facility
string_ostream is a safe and efficient string builder that combines opaque
stack storage with a built-in ostream interface.

small_string_ostream<bytes> additionally permits an explicit stack storage size
other than the default 128 bytes to be provided. Beyond that, storage is
transferred to the heap.

This convenient class can be used in most places an
std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair
would previously have been used, in order to guarantee consistent access
without byte truncation.

The patch also converts much of LLVM to use the new facility. These changes
include several probable bug fixes for truncated output, a programming error
that's no longer possible with the new interface.

llvm-svn: 211749
2014-06-26 00:00:48 +00:00
Tyler Nowicki
feb5a3355b Add Rpass-missed and Rpass-analysis reports to the loop vectorizer. The remarks give the vector width of vectorized loops and a brief analysis of loops that fail to be vectorized. For example, an analysis will be generated for loops containing control flow that cannot be simplified to a select. The optimization remarks also give the debug location of expressions that cannot be vectorized, for example the location of an unvectorizable call.
Reviewed by: Arnold Schwaighofer

llvm-svn: 211721
2014-06-25 17:50:15 +00:00
Eli Bendersky
def2619060 Rename loop unrolling and loop vectorizer metadata to have a common prefix.
[LLVM part]

These patches rename the loop unrolling and loop vectorizer metadata
such that they have a common 'llvm.loop.' prefix.  Metadata name
changes:

llvm.vectorizer.* => llvm.loop.vectorizer.*
llvm.loopunroll.* => llvm.loop.unroll.*
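A hedged sketch of the renamed form on a loop latch branch (the operand
details are illustrative; 2014-era metadata syntax):

    br i1 %exitcond, label %exit, label %loop, !llvm.loop !0
    ...
    !0 = metadata !{metadata !0, metadata !1}
    !1 = metadata !{metadata !"llvm.loop.vectorizer.width", i32 4}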

This was a suggestion from an earlier review
(http://reviews.llvm.org/D4090) which added the loop unrolling
metadata. 

Patch by Mark Heffernan.

llvm-svn: 211710
2014-06-25 15:41:00 +00:00
Evgeniy Stepanov
cda29aab74 [msan] Fix bad interaction between with-calls mode and chained origin tracking.
Origin history should only be recorded for uninitialized values, because it is
meaningless otherwise. This change moves __msan_chain_origin to the runtime
library side and makes it conditional on the corresponding shadow value.

Previous code was correct, but _very_ inefficient.

llvm-svn: 211700
2014-06-25 14:41:57 +00:00
Evgeniy Stepanov
5dad33bfe8 Factor out part of LICM::sink into a helper function.
llvm-svn: 211678
2014-06-25 09:17:21 +00:00
Evgeniy Stepanov
aeb724c213 [LICM] Don't create more than one copy of an instruction per loop exit block when sinking.
Fixes exponential compilation complexity in PR19835, caused by
LICM::sink not handling the following pattern well:

f = op g
e = op f, g
d = op e
c = op d, e
b = op c
a = op b, c

When an instruction with N uses is sunk, each of its operands gets N
new uses (all of them PHI nodes). In the example above, if a had 1
use, c would have 2, e would have 4, and g would have 8.

llvm-svn: 211673
2014-06-25 07:54:58 +00:00
Benjamin Kramer
66a50c1c4d InstCombine: Disable umul.with.overflow recognition for vectors.
It doesn't make a lot of sense on most targets and the code isn't ready for it. PR20113.

llvm-svn: 211583
2014-06-24 10:47:52 +00:00
Benjamin Kramer
65c1072e77 InstCombine: Don't try to reorder shuffles where the mask is a ConstantExpr.
We can't analyze the individual values of a vector expression. PR20114.

llvm-svn: 211581
2014-06-24 10:38:10 +00:00
David Majnemer
02900750e0 GlobalOpt: Don't optimize dllimport for initializers
Referencing a dllimport variable requires actual instructions, not
just a relocation.  This fixes PR19955.

Differential Revision: http://reviews.llvm.org/D4249

llvm-svn: 211571
2014-06-24 06:53:45 +00:00
Arnold Schwaighofer
8bc898f489 LoopVectorizer: Fix a dominance issue
The induction variable's start value needs to be defined before we branch
(the overflow check) to the scalar preheader where it is used.

llvm-svn: 211460
2014-06-22 03:38:59 +00:00
Stepan Dyatkovskiy
6c57127cdc MergeFunctions Pass, removed DenseMap helpers.
This patch removes the remaining code from the old implementation.

This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

This one was the final patch.

llvm-svn: 211457
2014-06-22 01:53:30 +00:00
Stepan Dyatkovskiy
0431a0d1a0 MergeFunctions Pass, updated header comments.
Added a short description of the new comparison algorithm, which
introduces a total ordering over the set of functions.

This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 211456
2014-06-22 00:57:09 +00:00
Stepan Dyatkovskiy
13371efcb8 MergeFunctions Pass, FnSet has been replaced with FnTree.
This patch activates the new implementation.
From now on, the merging process should take O(N*log(N)) time,
where N is the size of the module (we are free to measure it in
functions or in instructions). Internally FnTree is a
binary tree, so every lookup operation takes O(log(N)) time.

This is still not the last patch in the series; we also have to
clean up the pass's old code and update the pass comments.

This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 211445
2014-06-21 20:54:36 +00:00
Stepan Dyatkovskiy
e5ac34b416 MergeFunctions Pass, removed unused methods from old implementation.
This patch removed the following old FunctionComparator methods:
    * enumerate
    * isEquivalentOperation
    * isEquivalentGEP
    * isEquivalentType 
    
This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 211444
2014-06-21 20:13:24 +00:00
Stepan Dyatkovskiy
569e92615e MergeFunctions, doSanityCheck: fixed body comments.
llvm-svn: 211443
2014-06-21 19:07:51 +00:00
Stepan Dyatkovskiy
f2a0775dc2 MergeFunctions Pass, introduced a sanity check that verifies the order
relation introduced over the set of functions.
    
This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 211442
2014-06-21 18:58:11 +00:00
Stepan Dyatkovskiy
844ace0874 MergeFunctions Pass, introduced total ordering among top-level comparison
methods.
    
Patch changes the return type of FunctionComparator::compare() and
FunctionComparator::compare(const BasicBlock*, const BasicBlock*)
methods from bool (equal or not) to {-1, 0, 1} (less, equal, greater).
    
This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 211437
2014-06-21 17:55:51 +00:00
Benjamin Kramer
7821fb4e74 LoopUnrollRuntime: Check for overflow in the trip count calculation.
Fixes PR19823.

llvm-svn: 211436
2014-06-21 13:46:25 +00:00
Richard Trieu
b7d5af56cb Add back functionality removed in r210497.
Instead of asserting, output a message stating that a null pointer was found.

llvm-svn: 211430
2014-06-21 02:43:02 +00:00
Stepan Dyatkovskiy
9b141122e2 Committed patch from Björn Steinbrink:
Summary:
Different range metadata can lead to different optimizations in later
passes, possibly breaking the semantics of the merged function. So range
metadata must be taken into consideration when comparing Load
instructions.

Thanks!

llvm-svn: 211391
2014-06-20 19:11:56 +00:00
Karthik Bhat
fb8456f1af Add Support to Recognize and Vectorize NON SIMD instructions in SLPVectorizer.
This patch adds support to recognize patterns such as fadd,fsub,fadd,fsub.../add,sub,add,sub... and
vectorizes them as vector shuffles if they are profitable.
These patterns of vector shuffle can later be converted to instructions such as addsubpd etc on X86.
Thanks to Arnold and Hal for the reviews. http://reviews.llvm.org/D4015 
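A rough sketch of the shuffle form, assuming the addsubpd lane order
(subtract in even lanes, add in odd lanes):

    %sub = fsub <2 x double> %a, %b
    %add = fadd <2 x double> %a, %b
    %res = shufflevector <2 x double> %sub, <2 x double> %add,
                         <2 x i32> <i32 0, i32 3>
    ; lane 0 from %sub, lane 1 from %add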

llvm-svn: 211339
2014-06-20 04:32:48 +00:00
Hans Wennborg
3869faa1a3 Don't build switch lookup tables for dllimport or TLS variables
We would previously put dllimport variables in switch lookup tables, which
doesn't work because the address cannot be used in a constant initializer.
This is basically the same problem that we have in PR19955.

Putting TLS variables in switch tables also doesn't work, because the
address of such a variable is not constant.

Differential Revision: http://reviews.llvm.org/D4220

llvm-svn: 211331
2014-06-20 00:38:12 +00:00
Dinesh Dwivedi
681d2465fa Updated comments as suggested by Rafael. Thanks.
llvm-svn: 211268
2014-06-19 14:11:53 +00:00
Dinesh Dwivedi
9d6bf38387 Added instruction combine to transform few more negative values addition to subtraction (Part 1)
This patch enables transforms for following patterns.
  (x + (~(y & c) + 1)   -->   x - (y & c)
  (x + (~((y >> z) & c) + 1)   -->   x - ((y>>z) & c)

Differential Revision: http://reviews.llvm.org/D3733

llvm-svn: 211266
2014-06-19 10:36:52 +00:00
Dinesh Dwivedi
9b9db6e172 Refactored and updated SimplifyUsingDistributiveLaws() to
 * Find factorization opportunities using identity values.
 * Find factorization opportunities by treating shl(X, C) as mul(X, shl(1, C)).
 * Keep the NSW flag while simplifying instructions using factorization.

This fixes PR19263.

Differential Revision: http://reviews.llvm.org/D3799

llvm-svn: 211261
2014-06-19 08:29:18 +00:00
David Majnemer
a95def3a31 InstCombine: Stop two transforms dueling
InstCombineMulDivRem has:
// Canonicalize (X+C1)*CI -> X*CI+C1*CI.

InstCombineAddSub has:
// W*X + Y*Z --> W * (X+Z)  iff W == Y

These two transforms could fight with each other if C1*CI would not fold
away to something simpler than a ConstantExpr mul.

The InstCombineMulDivRem transform only acted on ConstantInts until
r199602 when it was changed to operate on all Constants in order to
let it fire on ConstantVectors.

To fix this, make this transform more careful by checking to see if we
actually folded away C1*CI.

This fixes PR20079.

llvm-svn: 211258
2014-06-19 07:14:33 +00:00
Nick Lewycky
051f63ab97 Move optimization of some cases of (A & C1)|(B & C2) from instcombine to instsimplify. Patch by Rahul Jain, plus some last minute changes by me -- you can blame me for any bugs.
llvm-svn: 211252
2014-06-19 03:51:46 +00:00
Nick Lewycky
acc06f02dc Remove redundant code in InstCombineShift, no functionality change because instsimplify already does this and instcombine calls instsimplify a few lines above. Patch by Suyog Sarda!
llvm-svn: 211250
2014-06-19 03:28:28 +00:00
Matt Arsenault
b82983ef6a R600/SI: Add intrinsics for various math instructions.
These will be used for custom lowering and for library
implementations of various math functions, so it's useful
to expose these as builtins.

llvm-svn: 211247
2014-06-19 01:19:19 +00:00
Evgeniy Stepanov
5b86f69879 [msan] Handle X86 *.psad.* and *.pmadd.* intrinsics.
llvm-svn: 211156
2014-06-18 12:02:29 +00:00
Dinesh Dwivedi
3f5165cd6a Fixed jump threading going into an infinite loop.
This patch adds code to remove unreachable blocks from a function,
as they may cause jump threading to get stuck in an infinite loop.

Differential Revision: http://reviews.llvm.org/D3991

llvm-svn: 211103
2014-06-17 14:34:19 +00:00
Evgeniy Stepanov
fc4a06728f [msan] Fix a comment.
llvm-svn: 211094
2014-06-17 11:26:00 +00:00
Evgeniy Stepanov
98ddd4a0cc [msan] Fix handling of multiplication by a constant with a number of trailing zeroes.
Multiplication by an integer with a number of trailing zero bits leaves
the same number of lower bits of the result initialized to zero.
This change makes MSan take this into account in the case of multiplication by
a compile-time constant.
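A worked illustration (the shadow handling here is a sketch): for
%r = mul i32 %a, 8, every bit of %a is shifted left by 3, so the low 3 bits
of %r are always 0 and hence initialized; the shadow of %r can
correspondingly be the operand's shadow shifted left by 3, clearing the low
3 shadow bits.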

We don't handle the general, non-constant, case because
(a) it's not going to be cheap (computation-wise);
(b) multiplication by a partially uninitialized value in user code is
    a bad idea anyway.

The constant case must be handled because it can arise from LLVM optimizing
completely valid user code, as the test case in compiler-rt demonstrates.

llvm-svn: 211092
2014-06-17 09:23:12 +00:00
Jingyue Wu
089dd5d55e [InstCombine] mark ADD with nuw if no unsigned overflow
Summary:
As a starting step, we only use one simple heuristic: if the sign bits
of both a and b are zero, we can prove that "add a, b" does not overflow
unsigned, and thus convert it to "add nuw a, b".
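A minimal sketch (hypothetical values; the sign-bit facts would come from
known-bits analysis):

    %a = lshr i32 %p, 1      ; sign bit of %a is known zero
    %b = lshr i32 %q, 1      ; sign bit of %b is known zero
    %s = add i32 %a, %b
    ; since %a, %b <= 2^31 - 1, the sum fits in 32 bits unsigned:
    %s = add nuw i32 %a, %b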

Updated all affected tests and added two new tests (@zero_sign_bit and
@zero_sign_bit2) in AddOverflow.ll

Test Plan: make check-all

Reviewers: eliben, rafael, meheff, chandlerc

Reviewed By: chandlerc

Subscribers: chandlerc, llvm-commits

Differential Revision: http://reviews.llvm.org/D4144

llvm-svn: 211084
2014-06-17 00:42:07 +00:00
Duncan P. N. Exon Smith
dbf049d50f SROA: Only split loads on byte boundaries
r199771 accidently broke the logic that makes sure that SROA only splits
load on byte boundaries.  If such a split happens, some bits get lost
when reassembling loads of wider types, causing data corruption.

Move the width check up to reject such splits early, avoiding the
corruption.  Fixes PR19250.

Patch by: Björn Steinbrink <bsteinbr@gmail.com>

llvm-svn: 211082
2014-06-17 00:19:35 +00:00
Eli Bendersky
2adb3a762b Teach LoopUnrollPass to respect loop unrolling hints in metadata.
[This is resubmitting r210721, which was reverted due to suspected breakage
which turned out to be unrelated].

Some extra review comments were addressed. See D4090 and D4147 for more details.

The Clang change that produces this metadata was committed in r210667

Patch by Mark Heffernan.

llvm-svn: 211076
2014-06-16 23:53:02 +00:00
Jim Grosbach
2272906641 LowerSwitch: track bounding range for the condition tree.
When LowerSwitch transforms a switch instruction into a tree of ifs it
is actually performing a binary search into the various case ranges, to
see if the current value falls into one cases range of values.

So, if we have a program with something like this:

switch (a) {
case 0:
  do0();
  break;
case 1:
  do1();
  break;
case 2:
  do2();
  break;
default:
  break;
}

the code produced is something like this:

  if (a < 1) {
    if (a == 0) {
      do0();
    }
  } else {
    if (a < 2) {
      if (a == 1) {
        do1();
      }
    } else {
      if (a == 2) {
        do2();
      }
    }
  }

This code is inefficient because the check (a == 1) to execute do1() is
not needed.

The reason is that we already checked (a >= 1) initially, so by
also checking that (a < 2) we have already inferred that (a == 1),
without the need for an extra basic block spawned to check whether
(a == 1) actually holds.

The patch addresses this problem by keeping track of already
checked bounds in the LowerSwitch algorithm, so that when the time
arrives to produce a Leaf Block that checks equality with the case
value/range, the algorithm can decide whether that block is really needed
depending on the already checked bounds.

For example, the above with "a = 1" would work like this:

the bounds start as LB: NONE, UB: NONE.
As (a < 1) is emitted, the bounds for the else path become LB: 1, UB:
NONE. This happens because by failing the test (a < 1) we know that the
value "a" cannot be smaller than 1 if we enter the else branch.
After emitting the check (a < 2), the bounds in the if branch become
LB: 1, UB: 1. This is because by checking that "a" is smaller than 2,
the upper bound becomes 2 - 1 = 1.

When it is time to emit the leaf block for "case 1:" we notice that 1
can be squeezed exactly in between the LB and UB, which means that if we
arrived at that block there is no need to emit a block that checks if (a
== 1).

Patch by: Marcello Maggioni <hayarms@gmail.com>

llvm-svn: 211038
2014-06-16 16:55:20 +00:00
Jingyue Wu
ae39e54823 Canonicalize addrspacecast ConstExpr between different pointer types
As a follow-up to r210375 which canonicalizes addrspacecast
instructions, this patch canonicalizes addrspacecast constant
expressions.

Given clang uses ConstantExpr::getAddrSpaceCast to emit addrspacecast
constant expressions, this patch is also a step towards having the
frontend emit canonicalized addrspacecasts.

Piggyback a minor refactor in InstCombineCasts.cpp

Update three affected tests in addrspacecast-alias.ll,
access-non-generic.ll and constant-fold-gep.ll, and add one new test in
constant-fold-address-space-pointer.ll

llvm-svn: 211004
2014-06-15 21:40:57 +00:00
Nick Lewycky
fd813dfe75 Remove extra whitespace in function declaration. No functionality change.
llvm-svn: 210965
2014-06-14 03:48:29 +00:00
Jiangning Liu
c7a7a14ca1 Move GlobalMerge from Transform to CodeGen.
This patch moves the GlobalMerge pass from Transform/Scalar
to CodeGen, because GlobalMerge depends on TargetMachine.
In the meantime, the macro INITIALIZE_TM_PASS is also moved
to CodeGen/Passes.h. With this fix we can avoid making
libScalarOpts depend on libCodeGen.

llvm-svn: 210951
2014-06-13 22:57:59 +00:00
Alexey Samsonov
55fd79efac Remove top-level Clang -fsanitize= flags for optional ASan features.
Init-order and use-after-return modes can currently be enabled
by runtime flags. use-after-scope mode is not really working at the
moment.

The only problem I see is that users won't be able to disable extra
instrumentation for init-order and use-after-scope by a top-level Clang flag.
But this instrumentation was implicitly enabled for quite a while and
we didn't hear from users hurt by it.

llvm-svn: 210924
2014-06-13 17:53:44 +00:00
Tim Northover
d7c66da372 SCCP: update for cmpxchg returning { iN, i1 } now.
I accidentally missed this one since its use looked OK locally.

llvm-svn: 210909
2014-06-13 14:54:09 +00:00
Tim Northover
b9ec29d7c5 IR: add "cmpxchg weak" variant to support permitted failure.
This commit adds a weak variant of the cmpxchg operation, as described
in C++11. A cmpxchg instruction with this modifier is permitted to
fail to store, even if the comparison indicated it should.

As a result, cmpxchg instructions must return a flag indicating
success in addition to their original iN value loaded. Thus, for
uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The
second flag is 1 when the store succeeded.
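A hedged sketch of the new form (hypothetical values and labels):

    %pair    = cmpxchg weak i32* %ptr, i32 %expected, i32 %new seq_cst seq_cst
    %loaded  = extractvalue { i32, i1 } %pair, 0   ; the original iN result
    %success = extractvalue { i32, i1 } %pair, 1   ; 1 if the store happened
    br i1 %success, label %done, label %retry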

At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been
added as the natural representation for the new cmpxchg instructions.
It is a strong cmpxchg.

By default this gets Expanded to the existing ATOMIC_CMP_SWAP during
Legalization, so existing backends should see no change in behaviour.
If they wish to deal with the enhanced node instead, they can call
setOperationAction on it. Beware: as a node with 2 results, it cannot
be selected from TableGen.

Currently, no use is made of the extra information provided in this
patch. Test updates are almost entirely adapting the input IR to the
new scheme.

Summary for out of tree users:
------------------------------

+ Legacy Bitcode files are upgraded during read.
+ Legacy assembly IR files will be invalid.
+ Front-ends must adapt to different type for "cmpxchg".
+ Backends should be unaffected by default.

llvm-svn: 210903
2014-06-13 14:24:07 +00:00
Rafael Espindola
98710599c1 Remove 'using std::error_code' from lib.
llvm-svn: 210871
2014-06-13 02:24:39 +00:00
Rafael Espindola
e0e308ff6d Don't use 'using std::error_code' in include/llvm.
This should make sure that most new uses use the std prefix.

llvm-svn: 210835
2014-06-12 21:46:39 +00:00
Duncan P. N. Exon Smith
b330f79c8f GVN: Enable value forwarding for calloc
Enable value forwarding for loads from `calloc()` without an intervening
store.

This change extends GVN to handle the following case:

    %1 = tail call noalias i8* @calloc(i64 1, i64 4)
    %2 = bitcast i8* %1 to i32*
    ; This load is trivially constant zero
    %3 = load i32* %2, align 4

This is analogous to the handling for `malloc()` in the same places.
`malloc()` returns `undef`; `calloc()` returns a zero value.  Note that
it is correct to return zero even for out of bounds GEPs since the
result of such a GEP would be undefined.

Patch by Philip Reames!

llvm-svn: 210828
2014-06-12 21:16:19 +00:00
Eli Bendersky
b86a37b84e Revert r210721 as it causes breakage in internal builds (and possibly GDB).
llvm-svn: 210807
2014-06-12 18:05:39 +00:00
Rafael Espindola
38dc624425 Remove system_error.h.
This is a minimal change to remove the header. I will remove the occurrences
of "using std::error_code" in a followup patch.

llvm-svn: 210803
2014-06-12 17:38:55 +00:00
Dinesh Dwivedi
496b5db08d This removes the TODO added in http://reviews.llvm.org/D3658
The patch transforms

ABS(NABS(X)) -> ABS(X)
NABS(ABS(X)) -> NABS(X)

Differential Revision: http://reviews.llvm.org/D4040

llvm-svn: 210782
2014-06-12 14:06:00 +00:00
Eli Bendersky
8185a02b68 Teach LoopUnrollPass to respect loop unrolling hints in metadata.
See http://reviews.llvm.org/D4090 for more details.

The Clang change that produces this metadata was committed in r210667

Patch by Mark Heffernan.

llvm-svn: 210721
2014-06-11 23:15:35 +00:00
Jiangning Liu
a490614c0a Create macro INITIALIZE_TM_PASS.
Pass initialization requires initializing TargetMachine for back-end
specific passes. This commit creates a new macro INITIALIZE_TM_PASS to
simplify this kind of initialization.

llvm-svn: 210641
2014-06-11 07:04:37 +00:00
Jiangning Liu
4291d4247a Global merge for global symbols.
This commit improves the global merge pass and supports global symbol merge.
The global symbol merge is not enabled by default. For aarch64, we need some
more back-end fixes to make it really benefit ADRP CSE.

llvm-svn: 210640
2014-06-11 06:44:53 +00:00
Jiangning Liu
87d06ae1f2 Rename global-merge to enable-global-merge.
llvm-svn: 210639
2014-06-11 06:35:26 +00:00
Eric Christopher
532c5f4bd6 We already have a reference to the TargetMachine, use that.
llvm-svn: 210580
2014-06-10 20:39:39 +00:00
Richard Trieu
8c7b353cd7 Removing an "if (!this)" check from two print methods. The condition will
never be true in a well-defined context.  The checking for null pointers
has been moved into the caller logic so it does not rely on undefined behavior.

llvm-svn: 210497
2014-06-09 22:53:16 +00:00
Matt Arsenault
0864675a02 Look through addrspacecasts when turning ptr comparisons into
index comparisons.

llvm-svn: 210488
2014-06-09 19:20:29 +00:00
Evgeniy Stepanov
7578333c12 [msan] Workaround for invalid origins in shufflevector.
Makes origin propagation ignore literal undef operands, and,
in general, any operand we don't have origin for.

https://code.google.com/p/memory-sanitizer/issues/detail?id=56

llvm-svn: 210472
2014-06-09 14:29:34 +00:00
Evgeniy Stepanov
8af6942193 Fix line numbers for code inlined from __nodebug__ functions.
Instructions from __nodebug__ functions don't have file:line
information even when inlined into non-nodebug functions. As a result,
intrinsics (SSE and other) from <*intrin.h> clang headers _never_
have file:line information.

With this change, an instruction without !dbg metadata gets one from
the call instruction when inlined.

Fixes PR19001.

llvm-svn: 210459
2014-06-09 09:09:19 +00:00
Evgeniy Stepanov
d17ada7988 [msan] Fix vector pack intrinsic handling.
This fixes a crash on MMX intrinsics, as well as a corner case in handling of
all unsigned pack intrinsics.

PR19953.

llvm-svn: 210454
2014-06-09 08:40:16 +00:00
Jingyue Wu
d561d0a14f [SeparateConstOffsetFromGEP] inbounds zext => sext for better splitting
For each array index that is in the form of zext(a), convert it to sext(a)
if we can prove zext(a) <= max signed value of typeof(a). The conversion
helps to split zext(x + y) into sext(x) + sext(y).

Reviewed in http://reviews.llvm.org/D4060

llvm-svn: 210444
2014-06-08 23:49:34 +00:00
Craig Topper
b00824c629 [C++11] Use 'nullptr'.
llvm-svn: 210442
2014-06-08 22:29:17 +00:00
Jingyue Wu
eb29e86a73 [SeparateConstOffsetFromGEP] Fix an illegitimate optimization on zext
zext(a + b) != zext(a) + zext(b) even if a + b >= 0 && b >= 0.

e.g., a = i4 0b1111, b = i4 0b0001
zext a + b to i8 = zext 0b0000 to i8 = 0b00000000
(zext a to i8) + (zext b to i8) = 0b00001111 + 0b00000001 = 0b00010000

llvm-svn: 210439
2014-06-08 20:19:38 +00:00
Jingyue Wu
c4c2a9ced2 Refactor canonicalizing array indices to a helper function
No functionality changes.

llvm-svn: 210438
2014-06-08 20:15:45 +00:00
Rafael Espindola
78f79310a7 Revert 209903 and 210040.
The messages were

 "PR19753: Optimize comparisons with "ashr exact" of a constanst."
 "Added support to optimize comparisons with "lshr exact" of a constant."

They were not correctly handling signed/unsigned operation differences,
causing PR19958.

llvm-svn: 210393
2014-06-07 04:12:35 +00:00
Jingyue Wu
d4bc86a7e0 InstCombine: Canonicalize addrspacecast between different element types
addrspacecast X addrspace(M)* to Y addrspace(N)*

-->

bitcast X addrspace(M)* to Y addrspace(M)*
addrspacecast Y addrspace(M)* to Y addrspace(N)*

Update all affected tests and add several new tests in addrspacecast.ll.

This patch is based on http://reviews.llvm.org/D2186 (authored by Matt
Arsenault) with fixes and more tests.

llvm-svn: 210375
2014-06-06 21:52:55 +00:00
Michael Zolotukhin
9d707e2cbe [SLP] Enable vectorization of GEP expressions.
The use cases look like the following:
    x->a = y->a + 10
    x->b = y->b + 12

llvm-svn: 210342
2014-06-06 15:34:24 +00:00
Dinesh Dwivedi
75092058d5 Added select flavour for ABS and NEG(ABS)
This patch can identify 
  ABS(X) ==> (X >s 0) ? X : -X and (X >s -1) ? X : -X
  ABS(X) ==> (X <s 0) ? -X : X and (X <s 1) ? -X : X
  NABS(X) ==> (X >s 0) ? -X : X and (X >s -1) ? -X : X
  NABS(X) ==> (X <s 0) ? X : -X and (X <s 1) ? X : -X
  
and can transform
  ABS(ABS(X)) -> ABS(X)
  NABS(NABS(X)) -> NABS(X)
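A minimal IR sketch of one ABS flavour (hypothetical values):

    %cmp = icmp sgt i32 %x, -1                 ; (X >s -1)
    %neg = sub i32 0, %x                       ; -X
    %abs = select i1 %cmp, i32 %x, i32 %neg    ; ABS(X)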
  
Differential Revision: http://reviews.llvm.org/D3658

llvm-svn: 210312
2014-06-06 06:54:45 +00:00
Karthik Bhat
d233c05860 Fix PR19657 (scalar loads not combined into vector load)
If we have common uses on separate paths in the tree, process the one with greater common depth first.
This makes sure that we do not assume we need to extract a load when it is actually going to be part of a vectorized tree.

Review: http://reviews.llvm.org/D3800
llvm-svn: 210310
2014-06-06 06:20:08 +00:00
Jingyue Wu
490f50746a Fixed several correctness issues in SeparateConstOffsetFromGEP
Most issues involve mishandling s/zext.

Fixes:

1. When rebuilding new indices, s/zext should be distributed to
sub-expressions. e.g., sext(a +nsw (b +nsw 5)) = sext(a) + sext(b) + 5 but not
sext(a + b) + 5. This also affects the logic of recursively looking for a
constant offset: we need to include s/zext in the context of the search.

2. Function find should return the bitwidth of the constant offset instead of
always sign-extending it to i64.

3. Stop shortcutting zext'ed GEP indices. LLVM conceptually sign-extends GEP
indices to pointer-size before computing the address. Therefore, gep base,
zext(a + b) != gep base, a + b

Improvements:

1. Add an optimization for splitting sext(a + b): if a + b is proven
non-negative (e.g., used as an index of an inbounds GEP) and one of a, b is
non-negative, sext(a + b) = sext(a) + sext(b)

2. Function Distributable checks whether both sext and zext can be distributed
to operands of a binary operator. This helps us split zext(sext(a + b)) into
zext(sext(a)) + zext(sext(b)) when a + b overflows neither signed nor unsigned.

Refactoring:

Merge some common logic of handling add/sub/or in find.

Testing:

Add many tests in split-gep.ll and split-gep-and-gvn.ll to verify the changes
we made.

llvm-svn: 210291
2014-06-05 22:07:33 +00:00
Bill Schmidt
5def9d253a [PPC64LE] Correct vperm -> shuffle transform for little endian
As discussed in cfe commit r210279, the correct little-endian
semantics for the vec_perm Altivec interfaces are implemented by
reversing the order of the input vectors and complementing the permute
control vector.  This converts the desired permute from little endian
element order into the big endian element order that the underlying
PowerPC vperm instruction uses.  This is represented with a
ppc_altivec_vperm intrinsic function.

The instruction combining pass contains code to convert a
ppc_altivec_vperm intrinsic into a vector shuffle operation when the
intrinsic has a permute control vector (mask) that is a constant.
However, the vector shuffle operation assumes that vector elements are
in natural order for their endianness, so for little endian code we
will get the wrong result with the existing transformation.

This patch reverses the semantic change to vec_perm that was performed
in altivec.h by once again swapping the input operands and
complementing the permute control vector, returning the element
ordering to little endian.

The correctness of this code is tested by the new perm.c test added in
a previous patch, and by other tests in the test suite that fail
without this patch.

llvm-svn: 210282
2014-06-05 19:46:04 +00:00
Tom Roeder
37dc34eee6 Removing spurious dependency of IPO on JumpInstrTables
llvm-svn: 210281
2014-06-05 19:43:57 +00:00
Tom Roeder
740d86dc79 Add a new attribute called 'jumptable' that creates jump-instruction tables for functions marked with this attribute.
It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables.

This also adds backend support for generating the jump-instruction tables on ARM and X86.
Note that since the jumptable attribute creates a second function pointer for a
function, any function marked with jumptable must also be marked with unnamed_addr.

llvm-svn: 210280
2014-06-05 19:29:43 +00:00
Evgeniy Stepanov
73fd57d33c [asancov] Fix coverage line info some more.
Now it should always point to the opening brace of the function (in
-asan-coverage=1 mode).

llvm-svn: 210266
2014-06-05 14:34:45 +00:00
Nick Lewycky
f952ebc0df Fix coverage for files with global constructors again. Adds a testcase to the commit from r206671, as requested by David Blaikie.
llvm-svn: 210239
2014-06-05 04:31:43 +00:00
Nick Lewycky
8d366da71c Explain why we skip DbgInfoIntrinsics when looking at line numbers in .gcno file emission.
llvm-svn: 210218
2014-06-04 21:47:19 +00:00
Rafael Espindola
0746266d63 Add a Constant version of stripPointerCasts.
Thanks to rnk for the suggestion.

llvm-svn: 210205
2014-06-04 19:01:48 +00:00
Rafael Espindola
133baba536 Clauses in a landingpad are always Constant. Use a stricter type.
llvm-svn: 210203
2014-06-04 18:51:31 +00:00
Rafael Espindola
4518a1dc81 InstCombine: Improvement to check if signed addition overflows.
This patch implements two things:

1. If we know one number is positive and another is negative, we return true as
    signed addition of two opposite signed numbers will never overflow.

2. Implemented TODO: If one of the operands only has one non-zero bit, and if
    the other operand has a known-zero bit in a more significant place than it
    (not including the sign bit) the ripple may go up to and fill the zero, but
    won't change the sign, e.g., (x & ~4) + 1.

We make sure that we are ignoring 0 at MSB.
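A worked instance of point 2 (hypothetical i32 values): in (x & ~4) + 1 the
RHS has a single non-zero bit (bit 0), and the LHS is known to have bit 2
clear; a carry rippling up from bit 0 is absorbed by bit 2 at the latest, so
it can never reach the sign bit.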

Patch by Suyog Sarda.

llvm-svn: 210186
2014-06-04 15:39:14 +00:00
Evgeniy Stepanov
09fc9f9f08 [asan] Fix coverage instrumentation with -asan-globals=0.
llvm-svn: 210103
2014-06-03 14:16:00 +00:00
Nick Lewycky
6c356b865d Ignore line numbers on debug intrinsics. Add an assert to ensure that we aren't emitting line number zero, the .gcno format uses this to indicate that the next field is a filename.
llvm-svn: 210068
2014-06-03 04:25:36 +00:00
Rafael Espindola
87cd774844 Allow alias to point to an arbitrary ConstantExpr.
This patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is
up to MC (or the system assembler) to decide if that expression is valid or not.

This reduces our ability to diagnose invalid uses and how early we can spot
them, but it also lets us do things like

@test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32),
                                 i32 ptrtoint (i32* @bar to i32)) to i32*)

An important implication of this patch is that the notion of aliased global
doesn't exist any more. The alias has to encode the information needed to
access it in its metadata (linkage, visibility, type, etc).

Another consequence to notice is that getSection has to return a "const char *".
It could return a NullTerminatedStringRef if there was such a thing, but when
that was proposed the decision was to just use "const char*" for that.

llvm-svn: 210062
2014-06-03 02:41:57 +00:00
Rafael Espindola
ca50d60283 Add back commit r210029.
The code was actually correct. Sorry for the confusion. I have expanded the
comment saying why the analysis is valid to avoid misunderstanding it
again in the future.

llvm-svn: 210052
2014-06-02 22:01:04 +00:00
Rafael Espindola
68e7702970 Revert "Add the nsw flag when we detect that an add will not signed overflow."
This reverts commit r210029.

It was not correctly handling cases where LHS and RHS had multiple but different
sign bits.

llvm-svn: 210048
2014-06-02 21:12:19 +00:00
Rafael Espindola
affcd78e1b Added support to optimize comparisons with "lshr exact" of a constant.
Patch by Rahul Jain.

llvm-svn: 210040
2014-06-02 19:19:04 +00:00
Alexey Samsonov
2ce8c2f26f Remove sanitizer blacklist from ASan/TSan/MSan function passes.
Instrumentation passes now use attributes
address_safety/thread_safety/memory_safety which are added by Clang frontend.
Clang parses the blacklist file and adds the attributes accordingly.

Currently the blacklist is still used in the ASan module pass to disable instrumentation
for certain global variables. We should fix this as well by collecting the
set of globals we're going to instrument in Clang and passing it to ASan
in metadata (as we already do for dynamically-initialized globals and init-order
checking).

This change also removes -tsan-blacklist and -msan-blacklist LLVM commandline
flags in favor of -fsanitize-blacklist= Clang flag.

llvm-svn: 210038
2014-06-02 18:08:27 +00:00
Rafael Espindola
6457c5ef17 Add the nsw flag when we detect that an add will not signed overflow.
We already had a function for checking this; we were just using it only in
specialized cases.

llvm-svn: 210029
2014-06-02 14:32:58 +00:00
Evgeniy Stepanov
f8c69caa5e [msan] Remove an out-of-date comment.
MSan is no longer an "early prototype".

llvm-svn: 210023
2014-06-02 12:58:08 +00:00
Evgeniy Stepanov
d9731c7abd [msan] Handle x86 vector pack intrinsics.
llvm-svn: 210020
2014-06-02 12:31:44 +00:00
Dinesh Dwivedi
e95c2e918a Added instcombine transform for (1 << X) & C patterns where C is (some PowerOf2 - 1)
This patch handles the following cases from http://nondot.org/sabre/LLVMNotes/InstCombine.txt
  "((1 << X) & 7) == 0" ==> "X > 2"
  "((1 << X) & 7) != 0" ==> "X < 3".

Differential Revision: http://reviews.llvm.org/D3678

llvm-svn: 210007
2014-06-02 07:57:24 +00:00
Dinesh Dwivedi
11044281aa Added inst combine transforms for single bit tests from Chris's note
if ((x & C) == 0) x |= C becomes x |= C
if ((x & C) != 0) x ^= C becomes x &= ~C
if ((x & C) == 0) x ^= C becomes x |= C
if ((x & C) != 0) x &= ~C becomes x &= ~C
if ((x & C) == 0) x &= ~C becomes nothing

Differential Revision: http://reviews.llvm.org/D3777

llvm-svn: 210006
2014-06-02 07:24:36 +00:00
Benjamin Kramer
9a25b19dba [Reassociate] Similar to "X + -X" -> "0", added code to handle "X + ~X" -> "-1".
Handle "X + ~X" -> "-1" in the function Value *Reassociate::OptimizeAdd(Instruction *I, SmallVectorImpl<ValueEntry> &Ops);
This patch implements:
TODO: We could handle "X + ~X" -> "-1" if we wanted, since "-X = ~X+1".
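A minimal sketch of the fold (hypothetical values):

    %not = xor i32 %x, -1     ; ~X
    %sum = add i32 %x, %not   ; X + ~X
    ; all uses of %sum are replaced with the constant i32 -1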

Patch by Rahul Jain!

Differential Revision: http://reviews.llvm.org/D3835

llvm-svn: 209973
2014-05-31 15:01:54 +00:00
Alexey Samsonov
b4f9b9e167 [ASan] Behave the same for functions w/o sanitize_address attribute and blacklisted functions
llvm-svn: 209946
2014-05-31 00:33:05 +00:00
Alexey Samsonov
b0ff4d0ab1 [TSan] Behave the same for functions w/o sanitize_thread attribute and blacklisted functions
llvm-svn: 209939
2014-05-31 00:11:37 +00:00
Matt Arsenault
09fc34b10a Make bitcast, extractelement, and insertelement considered cheap for speculation.
This helps turn more branches into selects. On R600,
vectors are cheap and anything that helps
remove branches is very good.

llvm-svn: 209914
2014-05-30 18:34:43 +00:00
Rafael Espindola
db0bd4b30f PR19753: Optimize comparisons with "ashr exact" of a constant.
Patch by Suyog Sarda.

llvm-svn: 209903
2014-05-30 15:54:32 +00:00
Karthik Bhat
d6622171c7 Allow vectorization of intrinsics such as powi,cttz and ctlz in Loop and SLP Vectorizer.
This patch adds support to vectorize intrinsics such as powi, cttz and ctlz in the Vectorizer. These intrinsics are different from other
intrinsics in that their second argument must be the same in order to vectorize them, and it should be represented as a scalar.
Review: http://reviews.llvm.org/D3851#inline-32769 and http://reviews.llvm.org/D3937#inline-32857

llvm-svn: 209873
2014-05-30 04:31:24 +00:00
Nick Lewycky
cf5b9e086e When analyzing params/args for readnone/readonly, don't forget to consider that a pointer argument may be passed through a callsite to the return, and that we may need to analyze it. Fixes a bug reported on llvm-dev: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073098.html
llvm-svn: 209870
2014-05-30 02:31:27 +00:00
Chandler Carruth
6ba62ce28b And fix my fix to sink down through the type at the right time. My
original fix would actually trigger the *exact* same crasher as the
original bug for a different reason. Awesomesauce.

Working on test cases now, but wanted to get bots healthier.

llvm-svn: 209860
2014-05-29 23:21:12 +00:00
Chandler Carruth
b7d4a92bec Fix one bug in the latest incarnation of r209843 -- combining GEPs
across PHI nodes. The code was computing the Idxs from the 'GEP'
variable's indices when what it wanted was Op1's indices. This caused an
ASan heap-overflow for me that pin pointed the issue when Op1 had more
indices than GEP did. =] I'll let Louis add a specific test case for
this if he wants.

llvm-svn: 209857
2014-05-29 23:05:52 +00:00
Arnold Schwaighofer
9de10db908 LoopVectorizer: Add a check that the backedge taken count + 1 does not overflow
The loop vectorizer instantiates the backedge-taken count + 1 as the loop iteration count.
If this expression overflows the generated code was invalid.

In case of overflow the code now jumps to the scalar loop.

Fixes PR17288.

llvm-svn: 209854
2014-05-29 22:10:01 +00:00
Louis Gerbarg
7777715988 Add support for combining GEPs across PHI nodes
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is a PHI in between the two GEPs:

  GEP1--\
        |-->PHI1-->GEP3
  GEP2--/

This patch checks to see if GEP1 and GEP2 are similar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):

  GEP1--\                     --\                           --\
        |-->PHI1-->GEP3  ==>    |-->PHI2->GEP12->GEP3 == >    |-->PHI2->GEP123
  GEP2--/                     --/                           --/

This also breaks certain use chains that are preventing GEP->GEP merges that
the existing instcombine would otherwise perform.

Tests included.

llvm-svn: 209843
2014-05-29 20:29:47 +00:00
Alexey Samsonov
36e820894f Use range-based for loops in ASan, TSan and MSan
llvm-svn: 209834
2014-05-29 18:40:48 +00:00
Rafael Espindola
ad02b4340d Revert "Revert "Revert "InstCombine: Improvement to check if signed addition overflows."""
This reverts commit r209776.

It was miscompiling llvm::SelectionDAGISel::MorphNode.

llvm-svn: 209817
2014-05-29 14:39:16 +00:00
Dinesh Dwivedi
1561b10679 LCSSA should be performed on the outermost affected loop while unrolling loop.
During loop unrolling, loop exits from the current loop may end up in a different
outer loop. This requires re-forming LCSSA recursively, one level down from
the outermost loop where the loop exits land during unrolling. This fixes PR18861.

Differential Revision: http://reviews.llvm.org/D2976

llvm-svn: 209796
2014-05-29 06:47:23 +00:00
Michael J. Spencer
1510dc5700 Add LoadCombine pass.
This pass is disabled by default. Use -combine-loads to enable in -O[1-3]

Differential revision: http://reviews.llvm.org/D3580

llvm-svn: 209791
2014-05-29 01:55:07 +00:00
Alexey Samsonov
89c1e0d5c4 [ASan] Hoist blacklisting globals from init-order checking to Clang.
Clang knows about the sanitizer blacklist and it makes no sense to
add a global to the list of llvm.asan.dynamically_initialized_globals if it
will be blacklisted in the instrumentation pass anyway. Instead, we should
do as much blacklisting as possible (if not all) in the frontend.

llvm-svn: 209790
2014-05-29 01:44:13 +00:00
Alexey Samsonov
b815f467ac Fix typo in variable name
llvm-svn: 209784
2014-05-29 01:10:14 +00:00
Alexey Samsonov
bf11fe8de4 [ASan] Use llvm.global_ctors to insert init-order checking calls into ASan runtime.
Don't assume that dynamically initialized globals are all initialized from
the _GLOBAL__<module_name>I_ function. Instead, scan llvm.global_ctors and
insert poison/unpoison calls for each function there.

Patch by Nico Weber!

llvm-svn: 209780
2014-05-29 00:51:15 +00:00
Rafael Espindola
3c5e2aa2f0 Revert "Revert "InstCombine: Improvement to check if signed addition overflows.""
This reverts commit r209762, bringing back r209746. It was not responsible for the libc++ build failure.

llvm-svn: 209776
2014-05-28 21:43:52 +00:00
Rafael Espindola
fa5165f5d9 Revert "Add support for combining GEPs across PHI nodes"
This reverts commit r209755.

It was the real cause of the libc++ build failure.

llvm-svn: 209775
2014-05-28 21:41:21 +00:00
Rafael Espindola
fecca3eb77 Revert "InstCombine: Improvement to check if signed addition overflows."
This reverts commit r209746.

It looks like it is causing a crash while building libcxx. I am trying to get a
reduced testcase.

llvm-svn: 209762
2014-05-28 18:48:10 +00:00
Louis Gerbarg
55c91dae6c Add support for combining GEPs across PHI nodes
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is a PHI in between the two GEPs:

  GEP1--\
        |-->PHI1-->GEP3
  GEP2--/

This patch checks to see if GEP1 and GEP2 are similar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):

  GEP1--\                     --\                           --\
        |-->PHI1-->GEP3  ==>    |-->PHI2->GEP12->GEP3 == >    |-->PHI2->GEP123
  GEP2--/                     --/                           --/

This also breaks certain use chains that are preventing GEP->GEP merges that
the existing instcombine would otherwise perform.

Tests included.

llvm-svn: 209755
2014-05-28 17:38:31 +00:00
Rafael Espindola
b4860cc965 InstCombine: Improvement to check if signed addition overflows.
This patch implements two things:

1. If we know one number is positive and another is negative, we return true as
   signed addition of two opposite signed numbers will never overflow.

2. Implemented TODO: If one of the operands only has one non-zero bit, and if
   the other operand has a known-zero bit in a more significant place than it
   (not including the sign bit) the ripple may go up to and fill the zero, but
    won't change the sign, e.g., (x & ~4) + 1.

We make sure that we are ignoring 0 at MSB.

Patch by Suyog Sarda.

llvm-svn: 209746
2014-05-28 15:30:40 +00:00
Evgeniy Stepanov
6d6f0eca22 [asancov] Don't emit extra runtime calls when compiling without coverage.
llvm-svn: 209721
2014-05-28 09:26:46 +00:00
Jingyue Wu
2c842fe49c Distribute sext/zext to the operands of and/or/xor
This is an enhancement to SeparateConstOffsetFromGEP. With this patch, we can
extract a constant offset from "s/zext and/or/xor A, B".

Added a new test @ext_or to verify this enhancement.

Refactoring the code, I also extracted some common logic to function
Distributable. 

llvm-svn: 209670
2014-05-27 18:00:00 +00:00
Filipe Cabecinhas
a6159dbf1c Post-commit fixes for r209643
Detected by Daniel Jasper, Ilia Filippov, and Andrea Di Biagio.
Fixed the argument order to select (the mask semantics to blendv* are the
inverse of select) and fixed the tests.
Added parentheses to the assert condition.
Ran clang-format.

llvm-svn: 209667
2014-05-27 16:54:33 +00:00
Evgeniy Stepanov
f09fa832df [asancov] Emit an initializer passing number of coverage code locations in each module.
llvm-svn: 209654
2014-05-27 12:39:31 +00:00
Daniel Jasper
f7e05eb5c0 Fix bad assert.
llvm-svn: 209648
2014-05-27 09:55:37 +00:00
Filipe Cabecinhas
7df1221986 Convert some X86 blendv* intrinsics into IR.
Summary:
Implemented an InstCombine transformation that takes a blendv* intrinsic
call and translates it into an IR select, if the mask is constant.

This will eventually get lowered into blends with immediates if possible,
or pblendvb (with an option to further optimize if we can transform the
pblendvb into a blend+immediate instruction, depending on the selector).
It will also enable optimizations by the IR passes, which give up at the
sight of the intrinsic.

Both the transformation and the lowering of its result to asm got shiny
new tests.

The transformation is a bit convoluted because of blendvp[sd]'s
definition:

Its mask is a floating point value! This forces us to convert it and get
the highest bit. I suppose this happened because the mask has type
__m128 in Intel's intrinsic and v4sf (for blendps) in gcc's builtin.

I will send an email to llvm-dev to discuss if we want to change this or
not.

Reviewers: grosbach, delena, nadav

Differential Revision: http://reviews.llvm.org/D3859

llvm-svn: 209643
2014-05-27 03:42:20 +00:00
Kostya Serebryany
828b647bdc [asan] decrease asan-instrumentation-with-call-threshold from 10000 to 7000, see PR17409
llvm-svn: 209623
2014-05-26 11:57:16 +00:00
Owen Anderson
8782b6f52b Make the LoopRotate pass's maximum header size configurable both programmatically
and via the command line, mirroring similar functionality in LoopUnroll.  In
situations where clients used custom unrolling thresholds, their intent could
previously be foiled by LoopRotate having a hardcoded threshold.

llvm-svn: 209617
2014-05-26 08:58:51 +00:00
Peter Collingbourne
e368465e8d Add an extension point for peephole optimizers.
This extension point allows adding passes that perform peephole optimizations
similar to the instruction combiner. These passes will be inserted after
each instance of the instruction combiner pass.

Differential Revision: http://reviews.llvm.org/D3905

llvm-svn: 209595
2014-05-25 10:27:02 +00:00
Tim Northover
ca0f4dc4f0 AArch64/ARM64: move ARM64 into AArch64's place
This commit starts with a "git mv ARM64 AArch64" and continues out
from there, renaming the C++ classes, intrinsics, and other
target-local objects for consistency.

"ARM64" test directories are also moved, and tests that began their
life in ARM64 use an arm64 triple, those from AArch64 use an aarch64
triple. Both should be equivalent though.

This finishes the AArch64 merge, and everyone should feel free to
continue committing as normal now.

llvm-svn: 209577
2014-05-24 12:50:23 +00:00