1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-27 22:12:47 +01:00
Commit Graph

11638 Commits

Author SHA1 Message Date
Serge Pavlov
978a422aed Fix type of shuffle resulted from shuffle merge.
This fix resolves PR19730.

llvm-svn: 208666
2014-05-13 06:07:21 +00:00
Serge Pavlov
1ff1d49a09 Fix type of shuffle obtained from reordering with binary operation
In transformation:
    BinOp(shuffle(v1,undef), shuffle(v2,undef)) -> shuffle(BinOp(v1, v2),undef)
type of the undef argument must be same as type of BinOp.

llvm-svn: 208531
2014-05-12 10:11:27 +00:00
Serge Pavlov
8e3b52d51f Fix reordering of shuffles and binary operations
Do not apply transformation:

    BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))

if operands v1 and v2 are of different size.
This change fixes PR19717, which was caused by r208488.
    

llvm-svn: 208518
2014-05-12 05:44:53 +00:00
Benjamin Kramer
bbd9bf5d58 SLPVectorizer: Instead of just performing CSE on dead blocks ignore them completely.
Turns out that there is a very cheap way of testing whether a block is dead,
just look it up in the DomTree. We have to do this anyways so just ignore
unreachable blocks before sorting by domination. This restores a proper
ordering for std::stable_sort when dead code is present.

Covered by existing tests & buildbots running in STL debug mode (MSVC).

llvm-svn: 208492
2014-05-11 10:28:58 +00:00
Serge Pavlov
d043cc92e5 Reorder shuffle and binary operation.
This patch enables transformations:

    BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))
    BinOp(shuffle(v1), const1) -> shuffle(BinOp, const2)

They allow to eliminate extra shuffles in some cases.

Differential Revision: http://reviews.llvm.org/D3525

llvm-svn: 208488
2014-05-11 08:46:12 +00:00
Benjamin Kramer
2039813d0e SLPVectorizer: When sorting by domination for CSE don't assert on unreachable code.
There is no total ordering if the CFG is disconnected. We don't care if we
catch all CSE opportunities in dead code either so just exclude ignore them in
the assert.

PR19646

llvm-svn: 208461
2014-05-09 23:28:49 +00:00
Louis Gerbarg
011e17086c Add ExtractValue instruction to SimplifyCFG's ComputeSpeculationCost
Since ExtractValue is not included in ComputeSpeculationCost CFGs containing
ExtractValueInsts cannot be simplified. In particular this interacts with
InstCombineCompare's tendency to insert add.with.overflow intrinsics for
certain idiomatic math operations, preventing optimization.

This patch adds ExtractValue to the ComputeSpeculationCost. Test case included

rdar://14853450

llvm-svn: 208434
2014-05-09 17:02:46 +00:00
Rafael Espindola
421423af16 Use auto and clang-format this snippet.
llvm-svn: 208421
2014-05-09 16:01:06 +00:00
Nick Lewycky
4fd21ddfc8 Improve wording to make it sounds more like a change than an analysis.
llvm-svn: 208370
2014-05-08 23:04:46 +00:00
Michael Zolotukhin
de5b0a206d [InstCombine] Some cleanup in optimization of redundant insertvalue instructions.
And one more test added.

llvm-svn: 208355
2014-05-08 19:50:24 +00:00
Richard Smith
90c10bd484 Simplify and fix incorrect comment. No functionality change.
llvm-svn: 208272
2014-05-08 01:08:43 +00:00
Duncan P. N. Exon Smith
f5258e4d2c GlobalValue: Assert symbols with local linkage have default visibility
The change to ExtractGV.cpp has no functionality change except to avoid
the asserts.  Existing testcases already cover this, so I didn't add a
new one.

llvm-svn: 208264
2014-05-07 23:00:22 +00:00
Chandler Carruth
36ad1ec2b0 Tidy up whitespace with clang-format prior to making significant
changes.

llvm-svn: 208229
2014-05-07 17:36:59 +00:00
Michael Zolotukhin
5fe056ff21 [InstCombine] Add optimization of redundant insertvalue instructions.
rdar://problem/11861387

llvm-svn: 208214
2014-05-07 14:30:18 +00:00
Evgeniy Stepanov
c02f1a9f96 [msan] Fix -fsanitize=memory -fno-integrated-as.
llvm-svn: 208211
2014-05-07 14:10:51 +00:00
Stepan Dyatkovskiy
328bcb73da MergeFunctions Pass, introduced total ordering among values.
This is a third patch of patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

This patch description:
Being comparing functions we need to compare values we meet at left and
right sides.
Its easy to sort things out for external values. It just should be
the same value at left and right.
But for local values (those were introduced inside function body)
we have to ensure they were introduced at exactly the same place,
and plays the same role.

In short, patch introduces values serial numbering and comparison routine.
The last one compares two values by their serial numbers.

llvm-svn: 208189
2014-05-07 11:11:39 +00:00
Zinovy Nis
ce225593e1 [BUG][REFACTOR]
1) Fix for printing debug locations for absolute paths.
2) Location printing is moved into public method DebugLoc::print() to avoid re-inventing the wheel.

Differential Revision: http://reviews.llvm.org/D3513

llvm-svn: 208177
2014-05-07 09:51:22 +00:00
Stepan Dyatkovskiy
7984a44afe Second patch of patch series that improves MergeFunctions performance time from O(N*N) to
O(N*log(N)). The idea is to introduce total ordering among functions set.
It allows to build binary tree and perform function look-up procedure in O(log(N)) time. 

This patch description:
Introduced total ordering among constants implemented in cmpConstants method.
Method performs lexicographical comparison between constants represented as
hypothetical numbers of next format:
<bitcastability-trait><raw-bit-contents>

Please, read cmpConstants declaration comments for more details.

llvm-svn: 208173
2014-05-07 09:05:10 +00:00
Nico Weber
01b4c1bca8 Fix ASan init function detection after clang r208128.
llvm-svn: 208141
2014-05-06 23:17:26 +00:00
Richard Smith
55381ccf0a Re-commit r208025, reverted in r208030, with a fix for a conformance issue
which GCC detects and Clang does not!

llvm-svn: 208033
2014-05-06 01:44:26 +00:00
Richard Smith
b38145eb67 Revert r208025, which made buildbots unhappy for unknown reasons.
llvm-svn: 208030
2014-05-06 01:26:00 +00:00
Richard Smith
e9d2d57a7c Add llvm::function_ref (and a couple of uses of it), representing a type-erased reference to a callable object.
llvm-svn: 208025
2014-05-06 01:01:29 +00:00
Nick Lewycky
75f47267ef Detabify.
llvm-svn: 208019
2014-05-06 00:46:20 +00:00
Nick Lewycky
d7c0e22d5f Improve 'tail' call marking in TRE. A bootstrap of clang goes from 375k calls marked tail in the IR to 470k, however this improvement does not carry into an improvement of the call/jmp ratio on x86. The most common pattern is a tail call + br to a block with nothing but a 'ret'.
The number of tail call to loop conversions remains the same (1618 by my count).

The new algorithm does a local scan over the use-def chains to identify local "alloca-derived" values, as well as points where the alloca could escape. Then, a visit over the CFG marks blocks as being before or after the allocas have escaped, and annotates the calls accordingly.

llvm-svn: 208017
2014-05-05 23:59:03 +00:00
Yi Jiang
8e66fa3904 Reapply: Add slp vectorization to LTO passes. The bug it exposed has been fixed by r207983. <radar://16641956>
llvm-svn: 208013
2014-05-05 23:14:46 +00:00
Yi Jiang
86dccb7e97 Always set alignment of vectorized LD/ST in SLP-Vectorizer. <rdar://problem/16812145>
llvm-svn: 207983
2014-05-05 17:59:14 +00:00
Duncan P. N. Exon Smith
a7513de2e4 LTO: -internalize sets visibility to default
Visibility is meaningless when the linkage is local.  Change
`-internalize` to reset the visibility to `default`.

<rdar://problem/16141113>

llvm-svn: 207979
2014-05-05 17:40:44 +00:00
Timur Iskhodzhanov
16ebb61afc [ASan/Win] Fix issue 305 -- don't instrument .CRT initializer/terminator callbacks
See https://code.google.com/p/address-sanitizer/issues/detail?id=305
Reviewed at http://reviews.llvm.org/D3607

llvm-svn: 207968
2014-05-05 14:28:38 +00:00
Benjamin Kramer
8125aa2cd4 LoopUnroll: If we're doing partial unrolling, use the PartialThreshold to limit unrolling.
Otherwise we use the same threshold as for complete unrolling, which is
way too high. This made us unroll any loop smaller than 150 instructions
by 8 times, but only if someone specified -march=core2 or better,
which happens to be the default on darwin.

llvm-svn: 207940
2014-05-04 19:12:38 +00:00
Arnold Schwaighofer
d752a09f8b SLPVectorizer: Bring back the insertelement patch (r205965) with fixes
When can't assume a vectorized tree is rooted in an instruction. The IRBuilder
could have constant folded it. When we rebuild the build_vector (the series of
InsertElement instructions) use the last original InsertElement instruction. The
vectorized tree root is guaranteed to be before it.

Also, we can't assume that the n-th InsertElement inserts the n-th element into
a vector.

This reverts r207746 which reverted the revert of the revert of r205018 or so.

Fixes the test case in PR19621.

llvm-svn: 207939
2014-05-04 17:10:15 +00:00
Benjamin Kramer
fbeb105fa6 SLPVectorizer: Lazily allocate the map for block numbering.
There is no point in creating it if we're not going to vectorize
anything. Creating the map is expensive as it creates large values.
No functionality change.

llvm-svn: 207916
2014-05-03 15:50:37 +00:00
Karthik Bhat
4591b173c7 Vectorize intrinsic math function calls in SLPVectorizer.
This patch adds support to recognize and vectorize intrinsic math functions in SLPVectorizer.
Review: http://reviews.llvm.org/D3560 and http://reviews.llvm.org/D3559

llvm-svn: 207901
2014-05-03 09:59:54 +00:00
Eric Christopher
b8d435f9a6 Clean up constructor logic and member access for LoopVectorizeHints.
There are public functions that mutate various members as well as
another private member already, so make all the members private to
avoid the discontinuity and add accessors for the values. Should
be no functional change.

llvm-svn: 207868
2014-05-02 20:40:04 +00:00
Nico Weber
fdced47a40 Teach GlobalDCE how to remove empty global_ctor entries.
This moves most of GlobalOpt's constructor optimization
code out of GlobalOpt into Transforms/Utils/CDtorUtils.{h,cpp}. The
public interface is a single function OptimizeGlobalCtorsList() that
takes a predicate returning which constructors to remove.

GlobalOpt calls this with a function that statically evaluates all
constructors, just like it did before. This part of the change is
behavior-preserving.

Also add a call to this from GlobalDCE with a filter that removes global
constructors that contain a "ret" instruction and nothing else – this
fixes PR19590.

llvm-svn: 207856
2014-05-02 18:35:25 +00:00
Akira Hatanaka
f64b00f1d7 [GVN] Pass the phi-translated address of a load instead of the untranslated
address to AnalyzeLoadFromClobberingLoad. This fixes a bug in load-PRE where
PRE is applied to a load that is not partially redundant.

<rdar://problem/16638765>.

llvm-svn: 207853
2014-05-02 17:59:17 +00:00
Nick Lewycky
e4f0d18fe0 Fold strlen(expr ? "str1" : "str2") to x ? len1 : len2. This fires about 330 times in a bootstrap of clang.
llvm-svn: 207828
2014-05-02 04:11:45 +00:00
Benjamin Kramer
040af9d6b4 Update and sort CMakeLists.
llvm-svn: 207785
2014-05-01 18:59:11 +00:00
Eli Bendersky
0602e236ae Add an optimization that does CSE in a group of similar GEPs.
This optimization merges the common part of a group of GEPs, so we can compute
each pointer address by adding a simple offset to the common part.

The optimization is currently only enabled for the NVPTX backend, where it has
a large payoff on some benchmarks.

Review: http://reviews.llvm.org/D3462

Patch by Jingyue Wu.

llvm-svn: 207783
2014-05-01 18:38:36 +00:00
Chandler Carruth
fa8592dc61 Revert r205965, which essentially reverts r205018 for the second time.
=[

Turns out that this was the root cause of PR19621. We found a crasher
only recently (likely due to improvements elsewhere in the SLP
vectorizer) but the reduced test case failed all the way back to here.
I've confirmed that reverting this patch both fixes the reduced test
case in PR19621 and the actual source file that led to it, so it seems
to really be rooted here. I've replied to the commit thread with
discussion of my (feeble) attempts to debug this. Didn't make it very
far, so reverting now that we have a good test case so that things can
get back to healthy while the debugging carries on.

llvm-svn: 207746
2014-05-01 11:24:11 +00:00
Gerolf Hoflehner
aec10987f2 Patch for function cloning to inline all blocks whose address is taken
Not all address taken blocks get inlined. The reason is
that a blocks new address is known only when it is cloned. But e.g.
a branch instruction in a different block could need that address earlier
while it gets cloned. The solution is to collect the set of all
blocks that can potentially get inlined and compute a new block address
up front. Then clone and cleanup.

rdar://16427209

llvm-svn: 207713
2014-04-30 22:05:02 +00:00
Yi Jiang
5faccf9d45 Revert r207571 - Add slp vectorization to LTO passes
llvm-svn: 207693
2014-04-30 19:27:24 +00:00
Carlo Kok
3c208bd22a [IPO/MergeFunctions] changes so it doesn't try to bitcast a struct return type but instead recreates it with insert/extract value.
llvm-svn: 207679
2014-04-30 17:53:04 +00:00
Benjamin Kramer
805d786f28 Add a <tuple> include to more files that aren't getting it transitively on MSVC.
llvm-svn: 207617
2014-04-30 07:21:01 +00:00
NAKAMURA Takumi
45cffbdde3 ConstantHoisting.cpp: Add <tuple> for std::tie, since r207593 removed FileSystem.h, it includes <tuple>.
llvm-svn: 207614
2014-04-30 06:44:50 +00:00
Jim Grosbach
c9daf08519 Tidy up.
llvm-svn: 207585
2014-04-29 22:41:58 +00:00
Jim Grosbach
d4614a7a4c Spelling.
llvm-svn: 207584
2014-04-29 22:41:55 +00:00
Rafael Espindola
f399f03ccc Also handle ConstantAggregateZero when optimizing vpermilvar*.
llvm-svn: 207582
2014-04-29 22:20:40 +00:00
Rafael Espindola
b4468db535 Remove tabs.
Sorry, new machine and I forgot to change the editor setting.

llvm-svn: 207578
2014-04-29 21:02:37 +00:00
Rafael Espindola
7372093fcc Two fixes to the vpermilvar optimization.
The instcomine logic to handle vpermilvar's pd and 256 variants was incorrect.
The _256 variants have indexes into the individual 128 bit lanes and in all
cases it also has to mask out unused bits.

llvm-svn: 207577
2014-04-29 20:41:54 +00:00
Diego Novillo
210bb92cec Fix vectorization remarks.
This patch changes the vectorization remarks to also inform when
vectorization is possible but not beneficial.

Added tests to exercise some loop remarks.

llvm-svn: 207574
2014-04-29 20:06:10 +00:00
Yi Jiang
f658582852 Continue slp vectorization even the BB already has vectorized store radar://16641956
llvm-svn: 207572
2014-04-29 19:37:20 +00:00
Yi Jiang
1c75ac2f65 Add slp vectorization to LTO passes
llvm-svn: 207571
2014-04-29 19:35:39 +00:00
Adam Nemet
109f756317 Reapply r207271 without the testcase
PR19608 was filed to find a suitable testcase.

llvm-svn: 207569
2014-04-29 18:25:28 +00:00
Diego Novillo
81a83e3d17 Add optimization remarks to the loop unroller and vectorizer.
Summary:
This calls emitOptimizationRemark from the loop unroller and vectorizer
at the point where they make a positive transformation. For the
vectorizer, it reports vectorization and interleave factors. For the
loop unroller, it reports all the different supported types of
unrolling.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3456

llvm-svn: 207528
2014-04-29 14:27:31 +00:00
Zinovy Nis
26320a9150 [BUG] Fix -Wunused-variable warning in Release mode. Thnx to Kostya Serebryany for pointing.
llvm-svn: 207516
2014-04-29 09:45:08 +00:00
Kostya Serebryany
6630aee422 fix -Wunused-variable warning in Release mode
llvm-svn: 207514
2014-04-29 09:33:02 +00:00
Zinovy Nis
a7f05bffeb [OPENMP][LV][D3423] Respect Hints.Force meta-data for loops in LoopVectorizer
llvm-svn: 207512
2014-04-29 08:55:11 +00:00
Michael Zolotukhin
354ce4fae7 Fix a typo in comment
llvm-svn: 207499
2014-04-29 07:35:33 +00:00
Chandler Carruth
718a254f79 Revert r207271 for now. This commit introduced a test case that ran
clang directly from the LLVM test suite! That doesn't work. I've
followed up on the review thread to try and get a viable solution sorted
out, but trying to get the tree clean here.

llvm-svn: 207462
2014-04-28 23:07:49 +00:00
Hans Wennborg
6405294846 InstCombine: don't drop 'inalloca' in PromoteCastOfAllocation (PR19569)
llvm-svn: 207426
2014-04-28 17:40:03 +00:00
Chandler Carruth
d81a0d614d Fix rampant quadratic behavior in UpdatePHINodes. The operation of
mapping from a basic block to an incoming value, either for removal or
just lookup, is linear in the number of predecessors, and we were doing
this for every entry in the 'Preds' list which is in many cases almost
all of them!

Unfortunately, the fixes are quite ugly. PHI nodes just don't make this
operation easy. The efficient way to fix this is to have a clever
'remove_if' operation on PHI nodes that lets us do a single pass over
all the incoming values of the original PHI node, extracting the ones we
care about. Then we could quickly construct the new phi node from this
list. This would remove the remaining underlying quadratic movement of
unrelated incoming values and the need for silly backwards looping to
"minimize" how often we hit the quadratic case.

This is the last obvious fix for PR19499. It shaves another 20% off the
compile time for me, and while UpdatePHINodes remains in the profile,
most of the time is now stemming from the well known inefficiencies of
LVI and jump threading.

llvm-svn: 207409
2014-04-28 10:37:30 +00:00
Craig Topper
b663bffa27 [C++] Use 'nullptr'.
llvm-svn: 207394
2014-04-28 04:05:08 +00:00
Gerolf Hoflehner
13cf626de6 RecursivelyDeleteTriviallyDeadInstructions() could remove
more than 1 instruction. The caller need to be aware of this
and adjust instruction iterators accordingly.

rdar://16679376

Repaired r207302.

llvm-svn: 207309
2014-04-26 05:58:11 +00:00
Gerolf Hoflehner
c780c786a3 Restore CloneFunction.cpp which got accidently
overwritten by previous backout of r207303

llvm-svn: 207308
2014-04-26 05:43:41 +00:00
Gerolf Hoflehner
99a3e5b9f5 Revert commit r207302 since build failures
have been reported.

llvm-svn: 207303
2014-04-26 02:03:17 +00:00
Gerolf Hoflehner
54bf53d166 RecursivelyDeleteTriviallyDeadInstructions() could remove
more than 1 instruction. The caller need to be aware of this
and adjust instruction iterators accordingly.

rdar://16679376

llvm-svn: 207302
2014-04-26 01:19:16 +00:00
Andrea Di Biagio
815dfb7574 [InstCombine][X86] Teach how to fold calls to SSE2/AVX2 packed logical shift
right intrinsics.

A packed logical shift right with a shift count bigger than or equal to the
element size always produces a zero vector. In all other cases, it can be
safely replaced by a 'lshr' instruction.

llvm-svn: 207299
2014-04-26 01:03:22 +00:00
Adrian Prantl
ca946260a4 Unbreak the gdb buildbot by not lowering dbg.declare intrinsics for arrays.
llvm-svn: 207284
2014-04-25 23:00:25 +00:00
Adam Nemet
c4071d355a [LoopStrengthReduce] Don't trim formula that uses a subset of required registers
Consider this use from the new testcase:

  LSR Use: Kind=ICmpZero, Offsets={0}, widest fixup type: i32
    reg({1000,+,-1}<nw><%for.body>)
    -3003 + reg({3,+,3}<nw><%for.body>)
    -1001 + reg({1,+,1}<nuw><nsw><%for.body>)
    -1000 + reg({0,+,1}<nw><%for.body>)
    -3000 + reg({0,+,3}<nuw><%for.body>)
    reg({-1000,+,1}<nw><%for.body>)
    reg({-3000,+,3}<nsw><%for.body>)

This is the last use we consider for a solution in SolveRecurse, so CurRegs is
a large set.  (CurRegs is the set of registers that are needed by the
previously visited uses in the in-progress solution.)

ReqRegs is {
  {3,+,3}<nw><%for.body>,
  {1,+,1}<nuw><nsw><%for.body>
}

This is the intersection of the regs used by any of the formulas for the
current use and CurRegs.

Now, the code requires a formula to contain *all* these regs (the comment is
simply wrong), otherwise the formula is immediately disqualified.  Obviously,
no formula for this use contains two regs so they will all get disqualified.

The fix modifies the check to allow the formula in this case.  The idea is
that neither of these formulae is introducing any new registers which is the
point of this early pruning as far as I understand.

In terms of set arithmetic, we now allow formulas whose used regs are a subset
of the required regs not just the other way around.

There are few more loops in the test-suite that are now successfully LSRed.  I
have benchmarked those and found very minimal change.

Fixes <rdar://problem/13965777>

llvm-svn: 207271
2014-04-25 21:02:21 +00:00
Adrian Prantl
7566e72bb8 This reapplies r207235 with an additional bugfixes caught by the msan
buildbot - do not insert debug intrinsics before phi nodes.

Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.

Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.

This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source

rdar://problem/16679879
http://reviews.llvm.org/D3374

llvm-svn: 207269
2014-04-25 20:49:25 +00:00
Duncan P. N. Exon Smith
ea68e6a3d5 SCC: Change clients to use const, NFC
It's fishy to be changing the `std::vector<>` owned by the iterator, and
no one actual does it, so I'm going to remove the ability in a
subsequent commit.  First, update the users.

<rdar://problem/14292693>

llvm-svn: 207252
2014-04-25 18:24:50 +00:00
Adrian Prantl
319db7c542 Revert "This reapplies r207130 with an additional testcase+and a missing check for"
This reverts commit 207235 to investigate msan buildbot breakage.

llvm-svn: 207250
2014-04-25 18:18:09 +00:00
Manman Ren
5d62183fae [inline cold threshold] Command line argument for inline threshold will
override the default cold threshold.

When we use command line argument to set the inline threshold, the default
cold threshold will not be used. This is in line with how we use
OptSizeThreshold. When we want a higher threshold for all functions, we
do not have to set both inline threshold and cold threshold.

llvm-svn: 207245
2014-04-25 17:34:55 +00:00
Adrian Prantl
c3a2bdef85 Reapply r207135 without modifications.
Debug info: Let dbg.values inserted by LowerDbgDeclare inherit the location
of the dbg.value. This gets rid of tons of redundant variable DIEs in
subscopes.

rdar://problem/14874886, rdar://problem/16679936

llvm-svn: 207236
2014-04-25 17:01:04 +00:00
Adrian Prantl
7f9d1e9fd6 This reapplies r207130 with an additional testcase+and a missing check for
AllocaInst that was missing in one location.
Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.

Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.

This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source

rdar://problem/16679879
http://reviews.llvm.org/D3374

llvm-svn: 207235
2014-04-25 17:01:00 +00:00
Craig Topper
c0a2a29f4e [C++] Use 'nullptr'. Transforms edition.
llvm-svn: 207196
2014-04-25 05:29:35 +00:00
Karthik Bhat
a6070a9b75 Allow vectorization of bit intrinsics in BB Vectorizer.
This patch adds support for vectorization of  bit intrinsics such as bswap,ctpop,ctlz,cttz.

llvm-svn: 207174
2014-04-25 03:33:48 +00:00
Adrian Prantl
0338f80f17 Revert "This reapplies r207130 with an additional testcase+and a missing check for"
Typo in testcase.

llvm-svn: 207166
2014-04-25 00:42:50 +00:00
Adrian Prantl
bf019d19e9 This reapplies r207130 with an additional testcase+and a missing check for
AllocaInst that was missing in one location.
Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.

Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.

This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source

rdar://problem/16679879
http://reviews.llvm.org/D3374

llvm-svn: 207165
2014-04-25 00:38:40 +00:00
Adrian Prantl
0b669e8f79 Revert "Debug info for optimized code: Support variables that are on the stack and"
This reverts commit 207130 for buildbot breakage.

llvm-svn: 207162
2014-04-25 00:04:49 +00:00
Adrian Prantl
ff50aa7bca Revert "Debug info: Let dbg.values inserted by LowerDbgDeclare inherit the location"
This reverts commit 207130 for buildbot breakage.

llvm-svn: 207159
2014-04-24 23:53:29 +00:00
Adrian Prantl
daee7e4c17 Debug info: Let dbg.values inserted by LowerDbgDeclare inherit the location
of the dbg.value. This gets rid of tons of redundant variable DIEs in
subscopes.

rdar://problem/14874886, rdar://problem/16679936

llvm-svn: 207135
2014-04-24 18:44:15 +00:00
Adrian Prantl
807e5d8a9a Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.

Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.

This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine-intrinsics testcase and included source


rdar://problem/16679879
http://reviews.llvm.org/D3374

llvm-svn: 207130
2014-04-24 17:41:45 +00:00
Karthik Bhat
fd6c53ce06 Allow vectorization of few missed llvm intrinsic calls in BBVectorizor by handling them in isVectorizableIntrinsic function.
llvm-svn: 207085
2014-04-24 07:29:55 +00:00
Michael J. Spencer
65065bf94f [InstCombine][x86] Constant fold psll intrinsics.
This excludes avx512 as I don't have hardware to verify. It excludes _dq
variants because they are represented in the IR as <{2,4} x i64> when it's
actually a byte shift of the entire i{128,265}.

This also excludes _dq_bs as they aren't at all supported by the backend.
There are also no corresponding instructions in the ISA. I have no idea why
they exist...

llvm-svn: 207058
2014-04-24 00:58:18 +00:00
Filipe Cabecinhas
696e2aae90 Optimize some special cases for SSE4a insertqi
Summary:
Since the upper 64 bits of the destination register are undefined when
performing this operation, we can substitute it and let the optimizer
figure out that only a copy is needed.

Also added range merging, if an instruction copies a range that can be
merged with a previous copied range.

Added test cases for both optimizations.

Reviewers: grosbach, nadav

CC: llvm-commits

Differential Revision: http://reviews.llvm.org/D3357

llvm-svn: 207055
2014-04-24 00:38:14 +00:00
Matt Arsenault
bc017b7e11 Handle addrspacecast when looking at memcpys from globals
llvm-svn: 207054
2014-04-24 00:01:09 +00:00
Matt Arsenault
b37a455d1d Remove more default address space argument usage.
These places are inconsequential in practice.

llvm-svn: 207021
2014-04-23 20:58:57 +00:00
Matt Arsenault
71fe4a9cad Don't use default address space arguments in GlobalOpt
llvm-svn: 207019
2014-04-23 20:36:10 +00:00
Alexander Potapenko
eb29afd291 [ASan] Move the shadow range on 32-bit iOS (and iOS Simulator)
to 0x40000000-0x60000000 to avoid address space clash with system libraries.
The solution has been proposed by tahabekireren@gmail.com in https://code.google.com/p/address-sanitizer/issues/detail?id=210
This is also known to fix some Chromium iOS tests.

llvm-svn: 207002
2014-04-23 17:14:45 +00:00
Matt Arsenault
457de98b62 Remove dead code in instcombine.
Don't replace shifts greater than the type with the maximum shift.

This isn't hit anywhere in the tests, and somewhere else is replacing
these with undef.

llvm-svn: 207000
2014-04-23 16:48:40 +00:00
Evgeniy Stepanov
87ddc3340b Fix handling of missing DataLayout in sanitizers.
Pass::doInitialization is supposed to return False when it did not
change the program, not when a fatal error occurs.

llvm-svn: 206975
2014-04-23 12:51:32 +00:00
Alexander Musman
721e76d2fa [LV] Statistics numbers for LoopVectorize introduced: a number of analyzed loops & a number of vectorized loops.
Use -stats to see how many loops were analyzed for possible vectorization and how many of them were actually vectorized.
Patch by Zinovy Nis

Differential Revision: http://reviews.llvm.org/D3438

llvm-svn: 206956
2014-04-23 08:40:37 +00:00
Juergen Ributzka
e6dc2fa469 [Constant Hoisting] Materialize the constant before the cloned cast instruction.
In the case where the constant comes from a cloned cast instruction, the
materialization code has to go before the cloned cast instruction.

This commit fixes the method that finds the materialization insertion point
by making it aware of this case.

This fixes <rdar://problem/15532441>

llvm-svn: 206913
2014-04-22 18:06:58 +00:00
Juergen Ributzka
7287943879 [Constant Hoisting] Print the instructions in the correct order for debugging. No functional change.
llvm-svn: 206912
2014-04-22 18:06:51 +00:00
Kostya Serebryany
291de5fd50 [asan] Support outline instrumentation for wide types and delete dead code, patch by Yuri Gribov
llvm-svn: 206883
2014-04-22 11:19:45 +00:00
Chandler Carruth
6f9ba6a633 [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE
definition below all of the header #include lines, lib/Transforms/...
edition.

This one is tricky for two reasons. We again have a couple of passes
that define something else before the includes as well. I've sunk their
name macros with the DEBUG_TYPE.

Also, InstCombine contains headers that need DEBUG_TYPE, so now those
headers #define and #undef DEBUG_TYPE around their code, leaving them
well formed modular headers. Fixing these headers was a large motivation
for all of these changes, as "leaky" macros of this form are hard on the
modules implementation.

llvm-svn: 206844
2014-04-22 02:55:47 +00:00
Chandler Carruth
15c7b91ac2 [Modules] Make Support/Debug.h modular. This requires it to not change
behavior based on other files defining DEBUG_TYPE, which means it cannot
define DEBUG_TYPE at all. This is actually better IMO as it forces folks
to define relevant DEBUG_TYPEs for their files. However, it requires all
files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't
already. I've updated all such files in LLVM and will do the same for
other upstream projects.

This still leaves one important change in how LLVM uses the DEBUG_TYPE
macro going forward: we need to only define the macro *after* header
files have been #include-ed. Previously, this wasn't possible because
Debug.h required the macro to be pre-defined. This commit removes that.
By defining DEBUG_TYPE after the includes two things are fixed:

- Header files that need to provide a DEBUG_TYPE for some inline code
  can do so by defining the macro before their inline code and undef-ing
  it afterward so the macro does not escape.

- We no longer have rampant ODR violations due to including headers with
  different DEBUG_TYPE definitions. This may be mostly an academic
  violation today, but with modules these types of violations are easy
  to check for and potentially very relevant.

Where necessary to suppor headers with DEBUG_TYPE, I have moved the
definitions below the includes in this commit. I plan to move the rest
of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big
enough.

The comments in Debug.h, which were hilariously out of date already,
have been updated to reflect the recommended practice going forward.

llvm-svn: 206822
2014-04-21 22:55:11 +00:00
Rafael Espindola
5bfe46aee5 Simplify a vpermil* with constant mask.
With a constant mask a vpermil* is just a shufflevector. This patch implements
that simplification. This allows us to produce denser code. It should also
allow more folding down the line.

llvm-svn: 206801
2014-04-21 22:06:04 +00:00
David Blaikie
091f1b9444 Use unique_ptr to handle GlobalOpt's Evaluator members
llvm-svn: 206790
2014-04-21 20:49:36 +00:00
Reid Kleckner
5013d7ca2d Fix PR7272 in -tailcallelim instead of the inliner
The -tailcallelim pass should be checking if byval or inalloca args can
be captured before marking calls as tail calls.  This was the real root
cause of PR7272.

With a better fix in place, revert the inliner change from r105255.  The
test case it introduced still passes and has been moved to
test/Transforms/Inline/byval-tail-call.ll.

Reviewers: chandlerc

Differential Revision: http://reviews.llvm.org/D3403

llvm-svn: 206789
2014-04-21 20:48:47 +00:00
David Blaikie
057cb4407d Simplify expression that was explicitly naming an operator overload in a call.
llvm-svn: 206788
2014-04-21 20:43:51 +00:00
David Blaikie
f04826c7d5 Use unique_ptr to handle ownership of GCOVFunctions in GCOVProfiler.
llvm-svn: 206786
2014-04-21 20:41:55 +00:00
Chandler Carruth
e07407deea [Modules] Sink all the DEBUG_TYPE defines for InstCombine out of the
header files and into the cpp files.

These files will require more touches as the header files actually use
DEBUG(). Eventually, I'll have to introduce a matched #define and #undef
of DEBUG_TYPE for the header files, but that comes as step N of many to
clean all of this up.

llvm-svn: 206777
2014-04-21 19:51:41 +00:00
Evgeniy Stepanov
b8b4d1d879 [msan] Enable out-of-line instrumentation for large functions by default.
llvm-svn: 206759
2014-04-21 15:04:05 +00:00
Kostya Serebryany
0ca459b956 [asan] add a run-time flag detect_container_overflow=true/false
llvm-svn: 206756
2014-04-21 14:35:00 +00:00
Kostya Serebryany
8369b857f6 [asan] instead of inserting inline instrumentation around memset/memcpy/memmove, replace the intrinsic with __asan_memset/etc. This makes the memset/etc handling more complete and consistent with what we do in msan. It may slowdown some cases (when the intrinsic was actually inlined) and speedup other cases (when it was not inlined)
llvm-svn: 206746
2014-04-21 11:50:42 +00:00
Kostya Serebryany
25679a433d [asan] temporary disable generating __asan_loadN/__asan_storeN
llvm-svn: 206741
2014-04-21 10:28:13 +00:00
Kostya Serebryany
0405013a8c [asan] insert __asan_loadN/__asan_storeN as out-lined asan checks, llvm part
llvm-svn: 206734
2014-04-21 07:10:43 +00:00
Alp Toker
faee7c31dd Remove some empty statements
Cleanup only.

llvm-svn: 206710
2014-04-19 23:56:35 +00:00
Nick Lewycky
bd9ff641e7 Check whether functions have any lines associated before emitting coverage info for them. This isn't just a size/time saving, gcov may crash on these.
llvm-svn: 206671
2014-04-18 23:32:28 +00:00
Evgeniy Stepanov
de38078fd6 [msan] Add -msan-instrumentation-with-call-threshold.
This flag replaces inline instrumentation for checks and origin stores with
calls into MSan runtime library. This is a workaround for PR17409.

Disabled by default.

llvm-svn: 206585
2014-04-18 12:17:20 +00:00
Kostya Serebryany
2b02920109 [asan] one more workaround for PR17409: don't do BB-level coverage instrumentation if there are more than N (=1500) basic blocks. This makes ASanCoverage work on libjpeg_turbo/jchuff.c used by Chrome, which has 1824 BBs
llvm-svn: 206564
2014-04-18 08:02:42 +00:00
Duncan P. N. Exon Smith
7063f2846a PMBuilder: Expose an option to disable tail calls
Adds API to allow frontends to disable tail calls in PassManagerBuilder.

<rdar://problem/16050591>

llvm-svn: 206542
2014-04-18 01:05:15 +00:00
Diego Novillo
45811c5ea3 Fix bug 19437 - Only add discriminators for DWARF 4 and above.
Summary:
This prevents the discriminator generation pass from triggering if
the DWARF version being used in the module is prior to 4.

Reviewers: echristo, dblaikie

CC: llvm-commits

Differential Revision: http://reviews.llvm.org/D3413

llvm-svn: 206507
2014-04-17 22:33:50 +00:00
Nuno Lopes
4a36b584a3 remove some dead code
lib/Analysis/IPA/InlineCost.cpp         |   18 ------------------
 lib/Analysis/RegionPass.cpp             |    1 -
 lib/Analysis/TypeBasedAliasAnalysis.cpp |    1 -
 lib/Transforms/Scalar/LoopUnswitch.cpp  |   21 ---------------------
 lib/Transforms/Utils/LCSSA.cpp          |    2 --
 lib/Transforms/Utils/LoopSimplify.cpp   |    6 ------
 utils/TableGen/AsmWriterEmitter.cpp     |   13 -------------
 utils/TableGen/DFAPacketizerEmitter.cpp |    7 -------
 utils/TableGen/IntrinsicEmitter.cpp     |    2 --
 9 files changed, 71 deletions(-)

llvm-svn: 206506
2014-04-17 22:26:44 +00:00
NAKAMURA Takumi
51c35adf06 Inliner::OptimizationRemark: Fix crash in clang/test/Frontend/optimization-remark.c on some hosts, including --vg.
DebugLoc in Callsite would not live after Inliner. It should be copied before Inliner.

llvm-svn: 206459
2014-04-17 12:22:14 +00:00
Kostya Serebryany
9bec638044 [asan] add two new hidden compile-time flags for asan: asan-instrumentation-with-call-threshold and asan-memory-access-callback-prefix. This is part of the workaround for PR17409 (instrument huge functions with callbacks instead of inlined code). These flags will also help us experiment with kasan (kernel-asan) and clang
llvm-svn: 206383
2014-04-16 12:12:19 +00:00
Julien Lerouge
dd5842a2e5 Add lifetime markers for allocas created to hold byval arguments, make them
appear in the InlineFunctionInfo.

llvm-svn: 206308
2014-04-15 18:06:46 +00:00
Julien Lerouge
9ecdd0ce5b Split byval argument initialization so the memcpy(s) are injected at the
beginning of the first new block after inlining.

llvm-svn: 206307
2014-04-15 18:01:54 +00:00
Duncan P. N. Exon Smith
571c11d959 LTO: Add more loop simplification passes to LTO
Similar to r202051, add missing loop simplification passes to the LTO
optimization pipeline.

Patch by Rafael Espindola.

llvm-svn: 206306
2014-04-15 17:48:15 +00:00
Duncan P. N. Exon Smith
58154f2238 verify-di: Implement DebugInfoVerifier
Implement DebugInfoVerifier, which steals verification relying on
DebugInfoFinder from Verifier.

  - Adds LegacyDebugInfoVerifierPassPass, a ModulePass which wraps
    DebugInfoVerifier.  Uses -verify-di command-line flag.

  - Change verifyModule() to invoke DebugInfoVerifier as well as
    Verifier.

  - Add a call to createDebugInfoVerifierPass() wherever there was a
    call to createVerifierPass().

This implementation as a module pass should sidestep efficiency issues,
allowing us to turn debug info verification back on.

<rdar://problem/15500563>

llvm-svn: 206300
2014-04-15 16:27:38 +00:00
Alexey Bataev
135cfee77c D3348 - [BUG] "Rotate Loop" pass kills "llvm.vectorizer.enable" metadata
llvm-svn: 206266
2014-04-15 09:37:30 +00:00
Matt Arsenault
2a6aada789 Revert "Revert r206045, "Fix shift by constants for vector.""
Fix cases where the Value itself is used, and not the constant value.

llvm-svn: 206214
2014-04-14 21:50:37 +00:00
NAKAMURA Takumi
1a21e608ca Whitespace.
llvm-svn: 206154
2014-04-14 07:03:13 +00:00
NAKAMURA Takumi
c6fb0494ea Revert r206045, "Fix shift by constants for vector."
It broke some builders, at least, i686.

llvm-svn: 206153
2014-04-14 07:02:57 +00:00
Serge Pavlov
c145d314a1 Use APInt arithmetic, fixed typo. Thanks to Benjamin Kramer for noticing that.
llvm-svn: 206144
2014-04-14 02:20:19 +00:00
Serge Pavlov
816d014c52 Recognize test for overflow in integer multiplication.
If multiplication involves zero-extended arguments and the result is
compared as in the patterns:

    %mul32 = trunc i64 %mul64 to i32
    %zext = zext i32 %mul32 to i64
    %overflow = icmp ne i64 %mul64, %zext
or
    %overflow = icmp ugt i64 %mul64 , 0xffffffff

then the multiplication may be replaced by call to umul.with.overflow.
This change fixes PR4917 and PR4918.

Differential Revision: http://llvm-reviews.chandlerc.com/D2814

llvm-svn: 206137
2014-04-13 18:23:41 +00:00
Matt Arsenault
c399a3f659 Fix shift by constants for vector.
ashr <N x iM>, <N x iM> M -> undef

llvm-svn: 206045
2014-04-11 17:57:53 +00:00
David Blaikie
1573e6e09f Implement depth_first and inverse_depth_first range factory functions.
Also updated as many loops as I could find using df_begin/idf_begin -
strangely I found no uses of idf_begin. Is that just used out of tree?

Also a few places couldn't use df_begin because either they used the
member functions of the depth first iterators or had specific ordering
constraints (I added a comment in the latter case).

Based on a patch by Jim Grosbach. (Jim - you just had iterator_range<T>
where you needed iterator_range<idf_iterator<T>>)

llvm-svn: 206016
2014-04-11 01:50:01 +00:00
Arnold Schwaighofer
c65ae6074a Reapply "SLPVectorizer: Ignore users that are insertelements we can reschedule them"
This commit reapplies 205018. After 205855 we should correctly vectorize
intrinsics.

llvm-svn: 205965
2014-04-10 13:41:35 +00:00
Alp Toker
111bd28e59 Fix some doc and comment typos
llvm-svn: 205899
2014-04-09 14:47:27 +00:00
Arnold Schwaighofer
1a503c9322 SLPVectorizer: Only vectorize intrinsics whose operands are widened equally
The vectorizer only knows how to vectorize intrinics by widening all operands by
the same factor.

Patch by Tyler Nowicki!

llvm-svn: 205855
2014-04-09 14:20:47 +00:00
Diego Novillo
224c7b79fe Add support for optimization reports.
Summary:
This patch adds backend support for -Rpass=, which indicates the name
of the optimization pass that should emit remarks stating when it
made a transformation to the code.

Pass names are taken from their DEBUG_NAME definitions.

When emitting an optimization report diagnostic, the lack of debug
information causes the diagnostic to use "<unknown>:0:0" as the
location string.

This is the back end counterpart for

http://llvm-reviews.chandlerc.com/D3226

Reviewers: qcolombet

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D3227

llvm-svn: 205774
2014-04-08 16:42:34 +00:00
Eric Christopher
23b79cb873 Add NDEBUG markers around debug only function.
llvm-svn: 205706
2014-04-07 12:46:30 +00:00
Eric Christopher
06a9cfdefa Add debug location information to the vectorizer debug statements.
Patch by Zinovy Nis.

llvm-svn: 205705
2014-04-07 12:32:17 +00:00
David Blaikie
e0b9857e92 Fixing typo.
Differential Revision: http://reviews.llvm.org/D3154

llvm-svn: 205674
2014-04-05 20:30:31 +00:00
Eli Bendersky
be453afe99 Fix PR19270 - type mismatch caused by invalid optimization.
Patch by Jingyue Wu.

llvm-svn: 205547
2014-04-03 17:51:58 +00:00
Juergen Ributzka
da301c01ab Revert "[Constant Hoisting] Lazily compute the idom and cache the result."
This code is no longer usefull, because we only compute and use the
IDom once. There is no benefit in caching it anymore.

llvm-svn: 205498
2014-04-03 01:38:47 +00:00
Duncan P. N. Exon Smith
7f2af7c18b Revert "Reapply "LTO: add API to set strategy for -internalize""
This reverts commit r199244.

Conflicts:
	include/llvm-c/lto.h
	include/llvm/LTO/LTOCodeGenerator.h
	lib/LTO/LTOCodeGenerator.cpp

llvm-svn: 205471
2014-04-02 22:05:57 +00:00
Tim Northover
466b3a39e1 SLPVectorizer: compare entire intrinsic for SLP compatibility.
Some Intrinsics are overloaded to the extent that return type equality (all
that's been checked up to now) does not guarantee that the arguments are the
same. In these cases SLP vectorizer should not recurse into the operands, which
can be achieved by comparing them as "Function *" rather than simply the ID.

llvm-svn: 205424
2014-04-02 14:39:02 +00:00
Hal Finkel
5a327230eb [LoopVectorizer] Count dependencies of consecutive pointers as uniforms
For the purpose of calculating the cost of the loop at various vectorization
factors, we need to count dependencies of consecutive pointers as uniforms
(which means that the VF = 1 cost is used for all overall VF values).

For example, the TSVC benchmark function s173 has:
  ...
  %3 = add nsw i64 %indvars.iv, 16000
  %arrayidx8 = getelementptr inbounds %struct.GlobalData* @global_data, i64 0, i32 0, i64 %3
  ...
and we must realize that the add will be a scalar in order to correctly deduce
it to be profitable to vectorize this on PowerPC with VSX enabled. In fact, all
dependencies of a consecutive pointer must be a scalar (uniform), and so we
simply need to add all consecutive pointers to the worklist that currently
detects collects uniforms.

Fixes PR19296.

llvm-svn: 205387
2014-04-02 02:34:49 +00:00
Hal Finkel
b3f2a21eed Add some additional fields to TTI::UnrollingPreferences
In preparation for an upcoming commit implementing unrolling preferences for
x86, this adds additional fields to the UnrollingPreferences structure:

 - PartialThreshold and PartialOptSizeThreshold - Like Threshold and
   OptSizeThreshold, but used when not fully unrolling. These are necessary
   because we need different thresholds for full unrolling from those used when
   partially unrolling (the full unrolling thresholds are generally going to be
   larger).

 - MaxCount - A cap on the unrolling factor when partially unrolling. This can
   be used by a target to prevent the unrolled loop from exceeding some
   resource limit independent of the loop size (such as number of branches).

There should be no functionality change for any in-tree targets.

llvm-svn: 205347
2014-04-01 18:50:30 +00:00
Hal Finkel
dc0e116444 Move partial/runtime unrolling late in the pipeline
The generic (concatenation) loop unroller is currently placed early in the
standard optimization pipeline. This is a good place to perform full unrolling,
but not the right place to perform partial/runtime unrolling. However, most
targets don't enable partial/runtime unrolling, so this never mattered.

However, even some x86 cores benefit from partial/runtime unrolling of very
small loops, and follow-up commits will enable this. First, we need to move
partial/runtime unrolling late in the optimization pipeline (importantly, this
is after SLP and loop vectorization, as vectorization can drastically change
the size of a loop), while keeping the full unrolling where it is now. This
change does just that.

llvm-svn: 205264
2014-03-31 23:23:51 +00:00
Arnold Schwaighofer
219f6a43e0 Revert "SLPVectorizer: Ignore users that are insertelements we can reschedule them"
This reverts commit r205018.

Conflicts:
	lib/Transforms/Vectorize/SLPVectorizer.cpp
	test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll

This is breaking libclc build.

llvm-svn: 205260
2014-03-31 23:05:56 +00:00
Rafael Espindola
18c992ab85 Add a missing break.
Patch by Tobias Güntner.

I tried to write a test, but the only difference is the Changed value that
gets returned. It can be tested with "opt -debug-pass=Executions -functionattrs,
but that doesn't seem worth it.

llvm-svn: 205121
2014-03-30 03:26:17 +00:00
Tim Northover
2f13163a84 ARM64: initial backend import
This adds a second implementation of the AArch64 architecture to LLVM,
accessible in parallel via the "arm64" triple. The plan over the
coming weeks & months is to merge the two into a single backend,
during which time thorough code review should naturally occur.

Everything will be easier with the target in-tree though, hence this
commit.

llvm-svn: 205090
2014-03-29 10:18:08 +00:00
Arnold Schwaighofer
bf6c68c0be SLPVectorizer: Take credit for free extractelement instructions
Extract element instructions that will be removed when vectorzing lower the
cost.

Patch by Arch D. Robison!

llvm-svn: 205020
2014-03-28 17:21:32 +00:00
Arnold Schwaighofer
ffb5e31163 SLPVectorizer: Fix typos
Patch by Arch D. Robison!

llvm-svn: 205019
2014-03-28 17:21:27 +00:00
Arnold Schwaighofer
8510d16f52 SLPVectorizer: Ignore users that are insertelements we can reschedule them
Patch by Arch D. Robison!

llvm-svn: 205018
2014-03-28 17:21:22 +00:00
Erik Verbruggen
11e61b79e5 Revert "InstCombine: merge constants in both operands of icmp."
This reverts commit r204912, and follow-up commit r204948.

This introduced a performance regression, and the fix is not completely
clear yet.

llvm-svn: 205010
2014-03-28 14:50:57 +00:00
Erik Verbruggen
06cf5cbf74 Revert "GVN: merge overflow intrinsics with non-overflow instructions."
This reverts commit r203553, and follow-up commits r203558 and r203574.

I will follow this up on the mailinglist to do it in a way that won't
cause subtle PRE bugs.

llvm-svn: 205009
2014-03-28 14:42:34 +00:00
Adrian Prantl
3324d55193 C++11: convert verbose loops to range-based loops.
llvm-svn: 204981
2014-03-27 23:30:04 +00:00
Reid Kleckner
c826dce075 InstCombine: Don't combine constants on unsigned icmps
Fixes a miscompile introduced in r204912.  It would miscompile code like
(unsigned)(a + -49) <= 5U.  The transform would turn this into
(unsigned)a < 55U, which would return true for values in [0, 49], when
it should not.

llvm-svn: 204948
2014-03-27 17:49:27 +00:00
Rafael Espindola
5c8926deed Prevent alias from pointing to weak aliases.
This adds back r204781.

Original message:

Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given

define void @my_func() {
  ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias

We produce without this patch:

        .weak   my_alias
my_alias = my_func
        .globl  my_alias2
my_alias2 = my_alias

That is, in the resulting ELF file my_alias, my_func and my_alias are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a

@my_alias = alias void ()* @other_func

would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.

There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.

llvm-svn: 204934
2014-03-27 15:26:56 +00:00
Erik Verbruggen
5e4efd4306 InstCombine: merge constants in both operands of icmp.
Transform:
    icmp X+Cst2, Cst
into:
    icmp X, Cst-Cst2
when Cst-Cst2 does not overflow, and the add has nsw.

llvm-svn: 204912
2014-03-27 11:16:05 +00:00
Nick Lewycky
a6e0e1eae1 Treat lifetime.start'd memory like we treat freshly alloca'd memory. Patch by Björn Steinbrink!
llvm-svn: 204876
2014-03-26 23:45:15 +00:00
Reid Kleckner
509530b2ae CloneFunction: Clone all attributes, including the CC
Summary:
Tested with a unit test because we don't appear to have any transforms
that use this other than ASan, I think.

Fixes PR17935.

Reviewers: nicholas

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D3194

llvm-svn: 204866
2014-03-26 22:26:35 +00:00
Rafael Espindola
63a8ff6883 Revert "Prevent alias from pointing to weak aliases."
This reverts commit r204781.

I will follow up to with msan folks to see what is what they
were trying to do with aliases to weak aliases.

llvm-svn: 204784
2014-03-26 06:14:40 +00:00
Rafael Espindola
c9179b8b50 Prevent alias from pointing to weak aliases.
Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given

define void @my_func() {
  ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias

We produce without this patch:

        .weak   my_alias
my_alias = my_func
        .globl  my_alias2
my_alias2 = my_alias

That is, in the resulting ELF file my_alias, my_func and my_alias are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a

@my_alias = alias void ()* @other_func

would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.

There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.

llvm-svn: 204781
2014-03-26 04:48:47 +00:00
Juergen Ributzka
822c198051 [Constant Hoisting] Make the constant candidate map local to the collectConstantCandidates method.
llvm-svn: 204758
2014-03-25 21:21:10 +00:00
Richard Osborne
fd123c2caf [InstCombine] Don't fold bitcast into store if it would need addrspacecast
Summary:
Previously the code didn't check if the before and after types for the
store were pointers to different address spaces. This resulted in
instcombine using a bitcast to convert between pointers to different
address spaces, causing an assertion due to the invalid cast.

It is not be appropriate to use addrspacecast this case because it is
not guaranteed to be a no-op cast. Instead bail out and do not do the
transformation.

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D3117

llvm-svn: 204733
2014-03-25 17:21:41 +00:00
Richard Osborne
db5d56840b Reuse earlier variables to make it clear the types involved in the cast.
No functionality change.

llvm-svn: 204732
2014-03-25 17:21:35 +00:00
Evgeniy Stepanov
ad64faed33 [msan] More precise instrumentation of select IR.
Some bits of select result may be initialized even if select condition
is not.

https://code.google.com/p/memory-sanitizer/issues/detail?id=50

llvm-svn: 204716
2014-03-25 13:08:34 +00:00
Andrew Trick
16d04697fd SLP vectorizer: Don't hoist vector extracts of phis.
Extracts coming from phis were being hoisted, while all others were
sunk to their uses. This was inconsistent and didn't seem to serve a
purpose. Changing all extracts to be sunk to uses is a prerequisite
for adding block frequency to the SLP vectorizer's cost model.

I benchmarked the change in isolation (without block frequency). I
only saw noise on x86 and some potentially significant improvements on
ARM. No major regressions is good enough for me.

llvm-svn: 204699
2014-03-25 02:18:47 +00:00
Nuno Lopes
79d18a66ec remove a bunch of unused private methods
found with a smarter version of -Wunused-member-function that I'm playwing with.
Appologies in advance if I removed someone's WIP code.

 include/llvm/CodeGen/MachineSSAUpdater.h            |    1 
 include/llvm/IR/DebugInfo.h                         |    3 
 lib/CodeGen/MachineSSAUpdater.cpp                   |   10 --
 lib/CodeGen/PostRASchedulerList.cpp                 |    1 
 lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp    |   10 --
 lib/IR/DebugInfo.cpp                                |   12 --
 lib/MC/MCAsmStreamer.cpp                            |    2 
 lib/Support/YAMLParser.cpp                          |   39 ---------
 lib/TableGen/TGParser.cpp                           |   16 ---
 lib/TableGen/TGParser.h                             |    1 
 lib/Target/AArch64/AArch64TargetTransformInfo.cpp   |    9 --
 lib/Target/ARM/ARMCodeEmitter.cpp                   |   12 --
 lib/Target/ARM/ARMFastISel.cpp                      |   84 --------------------
 lib/Target/Mips/MipsCodeEmitter.cpp                 |   11 --
 lib/Target/Mips/MipsConstantIslandPass.cpp          |   12 --
 lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp              |   21 -----
 lib/Target/NVPTX/NVPTXISelDAGToDAG.h                |    2 
 lib/Target/PowerPC/PPCFastISel.cpp                  |    1 
 lib/Transforms/Instrumentation/AddressSanitizer.cpp |    2 
 lib/Transforms/Instrumentation/BoundsChecking.cpp   |    2 
 lib/Transforms/Instrumentation/MemorySanitizer.cpp  |    1 
 lib/Transforms/Scalar/LoopIdiomRecognize.cpp        |    8 -
 lib/Transforms/Scalar/SCCP.cpp                      |    1 
 utils/TableGen/CodeEmitterGen.cpp                   |    2 
 24 files changed, 2 insertions(+), 261 deletions(-)

llvm-svn: 204560
2014-03-23 17:09:26 +00:00
Lang Hames
75ea6aebf8 Revert r204076 for now - it caused significant regressions in a number of
benchmarks.

<rdar://problem/16368461>

llvm-svn: 204558
2014-03-23 04:22:31 +00:00
Juergen Ributzka
9a985a07a3 [Constant Hoisting] Erase dead cast instructions.
The cleanup code that removes dead cast instructions only removed them from the
basic block, but didn't delete them. This fix erases them now too.

llvm-svn: 204538
2014-03-22 01:49:30 +00:00
Juergen Ributzka
a05de79cbb [Constant Hoisting] Fix multiple entries for the same basic block in PHI nodes.
A PHI node usually has only one value/basic block pair per incoming basic block.
In the case of a switch statement it is possible that a following PHI node may
have more than one such pair per incoming basic block. E.g.:
%0 = phi i64 [ 123456, %case2 ], [ 654321, %Entry ], [ 654321, %Entry ]
This is valid and the verfier doesn't complain, because both values are the
same.

Constant hoisting materializes the constant for each operand separately and the
value is still the same, but the variable names have changed. As a result the
verfier can't recognize anymore that they are the same value and complains.

This fix adds special update code for PHI node in constant hoisting to prevent
this corner case.

This fixes <rdar://problem/16394449>

llvm-svn: 204537
2014-03-22 01:49:27 +00:00
Arnaud A. de Grandmaison
c417a5d328 Remove some dead assignements found by scan-build
llvm-svn: 204526
2014-03-21 21:54:46 +00:00
Tom Stellard
fe1239b8cb Sink: Don't sink static allocas from the entry block
CodeGen treats allocas outside the entry block as dynamically sized
stack objects.

llvm-svn: 204473
2014-03-21 15:51:51 +00:00
Juergen Ributzka
4470e9c92d [Constant Hoisting] Make the constant materialization cost operand dependent
Extend the target hook to take also the operand index into account when
calculating the cost of the constant materialization.

Related to <rdar://problem/16381500>

llvm-svn: 204435
2014-03-21 06:04:45 +00:00
Juergen Ributzka
a65ce371c7 [Constant Hoisting] Lazily compute the idom and cache the result.
Related to <rdar://problem/16381500>

llvm-svn: 204434
2014-03-21 06:04:39 +00:00
Juergen Ributzka
b52c2c678c [Constant Hoisting] Change the algorithm to only track constants for instructions.
Originally the algorithm would search for expensive constants and track their
users, which could be instructions and constant expressions. This change only
tracks the constants for instructions, but constant expressions are indirectly
covered too. If an operand is an constant expression, then we look through the
expression to find anny expensive constants.

The algorithm keep now track of the instruction and the operand index where the
constant is used. This allows more precise hoisting of constant materialization
code for PHI instructions, because we only hoist to the basic block of the
incoming operand. Before we had to find the idom of all PHI operands and hoist
the materialization code there.

This also makes updating of instructions easier. Before we had to keep track of
the original constant, find it in the instructions, and then replace it. Now we
can just simply update the operand.

Related to <rdar://problem/16381500>

llvm-svn: 204433
2014-03-21 06:04:36 +00:00
Juergen Ributzka
2e77fbe182 [Constant Hoisting] Fix capitalization of function names.
llvm-svn: 204432
2014-03-21 06:04:33 +00:00
Juergen Ributzka
60d9807b0e [Constant Hoisting] Replace the MapVector with a separate Map and Vector to keep track of constant candidates.
This simplifies working with the constant candidates and removes the tight
coupling between the map and the vector.

Related to <rdar://problem/16381500>

llvm-svn: 204431
2014-03-21 06:04:30 +00:00
Juergen Ributzka
c55e0f3fc7 Revert "[Constant Hoisting] Extend coverage of the constant hoisting pass."
I will break this up into smaller pieces for review and recommit.

llvm-svn: 204393
2014-03-20 20:17:13 +00:00
Juergen Ributzka
7dae5f7baa [Constant Hoisting] Extend coverage of the constant hoisting pass.
This commit extends the coverage of the constant hoisting pass, adds additonal
debug output and updates the function names according to the style guide.

Related to <rdar://problem/16381500>

llvm-svn: 204389
2014-03-20 19:55:52 +00:00
Mark Seaborn
ade468f2c3 Remove LowerInvoke's obsolete "-enable-correct-eh-support" option
This option caused LowerInvoke to generate code using SJLJ-based
exception handling, but there is no code left that interprets the
jmp_buf stack that the resulting code maintained (llvm.sjljeh.jblist).
This option has been obsolete for a while, and replaced by
SjLjEHPrepare.

This leaves the default behaviour of LowerInvoke, which is to convert
invokes to calls.

Differential Revision: http://llvm-reviews.chandlerc.com/D3136

llvm-svn: 204388
2014-03-20 19:54:47 +00:00
Alexander Potapenko
0eb130e34f [ASan] Do not instrument globals from the llvm.metadata section.
Fixes https://code.google.com/p/address-sanitizer/issues/detail?id=279.

llvm-svn: 204331
2014-03-20 10:48:34 +00:00
Evgeniy Stepanov
17d50b69f6 Set debug info for instructions inserted in SplitBlockAndInsertIfThen.
llvm-svn: 204230
2014-03-19 12:56:38 +00:00
Duncan P. N. Exon Smith
6fc0b70a7b Fix use_iterator crash in ObjCArc from r203364
The use_iterator redesign in r203364 introduced an increment past the
end of a range in -objc-arc-contract.  Added an explicit check for the
end of the range.

<rdar://problem/16333235>

llvm-svn: 204195
2014-03-18 22:32:43 +00:00
Chandler Carruth
536fb8893d [LV] While I'm here, use range based for loops which are so much cleaner
for this kind of walk.

llvm-svn: 204188
2014-03-18 22:00:32 +00:00
Chandler Carruth
9fa85c489d [LV] The actual change I intended to commit in r204148. Sorry for the
noise.

Original commit log:
Replace some dead code with an assert. When I first ported this pass
from a loop pass to a function pass I did so in the naive, recursive
way. It doesn't actually work, we need a worklist instead. When
I switched to the worklist I didn't delete the naive recursion. That
recursion was also buggy because it was dead and never really exercised.

llvm-svn: 204187
2014-03-18 21:58:38 +00:00
Chandler Carruth
4ac3e74751 [LV] Replace some dead code with an assert. When I first ported this
pass from a loop pass to a function pass I did so in the naive,
recursive way. It doesn't actually work, we need a worklist instead.
When I switched to the worklist I didn't delete the naive recursion.
That recursion was also buggy because it was dead and never really
exercised.

llvm-svn: 204184
2014-03-18 21:51:46 +00:00
Evgeniy Stepanov
4e42dcfe00 [msan] Origin tracking with history.
LLVM part of MSan implementation of advanced origin tracking,
when we record not only creation point, but all locations where
an uninitialized value was stored to memory, too.

llvm-svn: 204151
2014-03-18 13:30:56 +00:00
Diego Novillo
119221ccbc Tolerate unmangled names in sample profiles.
Summary:
The compiler does not always generate linkage names. If a function
has been inlined and its body elided, its linkage name may not be
generated.

When the binary executes, the profiler will use its unmangled name
when attributing samples. This results in unmangled names in the
input profile.

We are currently failing hard when this happens. However, in this case
all that happens is that we fail to attribute samples to the inlined
function. While this means fewer optimization opportunities, it should
not cause a compilation failure.

This patch accepts all valid function names, regardless of whether
they were mangled or not.

Reviewers: chandlerc

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D3087

llvm-svn: 204142
2014-03-18 12:03:12 +00:00
Evgeniy Stepanov
cfd1cf2b01 [msan] Kill -msan-store-clean-origin flag.
Not only is it slower than the alternative, but also subtly broken.
This commit does not change the default behavior.

llvm-svn: 204131
2014-03-18 09:47:06 +00:00
Alon Mishne
70ba46ff38 [C++11] Change DebugInfoFinder to use range-based loops
Also changes the iterators to return actual DI type over MDNode.

llvm-svn: 204130
2014-03-18 09:41:07 +00:00
Evgeniy Stepanov
7ad8a1f5a2 [msan] Remove unused code.
llvm-svn: 204125
2014-03-18 08:29:42 +00:00
Dan Gohman
b0339af0e1 Use range metadata instead of introducing selects.
When GlobalOpt has determined that a GlobalVariable only ever has two values,
it would convert the GlobalVariable to a boolean, and introduce SelectInsts
at every load, to choose between the two possible values. These SelectInsts
introduce overhead and other unpleasantness.

This patch makes GlobalOpt just add range metadata to loads from such
GlobalVariables instead. This enables the same main optimization (as seen in
test/Transforms/GlobalOpt/integer-bool.ll), without introducing selects.

The main downside is that it doesn't get the memory savings of shrinking such
GlobalVariables, but this is expected to be negligible.

llvm-svn: 204076
2014-03-17 19:57:04 +00:00
Eli Bendersky
631277f3dd Consistent use of the noduplicate attribute.
The "noduplicate" attribute of call instructions is sometimes queried directly
and sometimes through the cannotDuplicate() predicate. This patch streamlines
all queries to use the cannotDuplicate() predicate. It also adds this predicate
to InvokeInst, to mirror what CallInst has.

llvm-svn: 204049
2014-03-17 16:19:07 +00:00
David Blaikie
60ddd2b93c Remove named Twine.
While technically correct, we generally disallow any instance of named
Twines due to their subtlety.

llvm-svn: 204016
2014-03-16 01:36:18 +00:00
Arnaud A. de Grandmaison
4544f80f7c Remove some dead assignements found by scan-build
llvm-svn: 204013
2014-03-15 22:13:15 +00:00
Benjamin Kramer
8e0892b4f3 LSR: Compress a pair (and get rid of the DenseMapInfo for it).
Also convert a horrible hash function to use our hashing infrastructure.
No functionality change.

llvm-svn: 204008
2014-03-15 17:17:48 +00:00
NAKAMURA Takumi
2d8e94aadc SampleProfile.cpp: Fix take #2. The issue was abuse of StringRef here.
llvm-svn: 203996
2014-03-15 01:56:17 +00:00
NAKAMURA Takumi
8cb5938310 SampleProfile.cpp: Quick fix to r203976 about abuse of Twine. The life of Twine was too short.
FIXME: DiagnosticInfoSampleProfile should not hold Twine&.
llvm-svn: 203990
2014-03-15 00:10:12 +00:00
Diego Novillo
6368420655 Re-format SampleProfile.cpp with clang-format. No functional changes.
llvm-svn: 203977
2014-03-14 22:07:18 +00:00
Diego Novillo
a9a26c6236 Use DiagnosticInfo facility.
Summary:
The sample profiler pass emits several error messages. Instead of
just aborting the compiler with report_fatal_error, we can emit
better messages using DiagnosticInfo.

This adds a new sub-class of DiagnosticInfo to handle the sample
profiler.

Reviewers: chandlerc, qcolombet

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D3086

llvm-svn: 203976
2014-03-14 21:58:59 +00:00
Alexander Potapenko
a22163b83f [ASan] Fix https://code.google.com/p/address-sanitizer/issues/detail?id=274
by ignoring globals from __TEXT,__cstring,cstring_literals during instrumenation.
Add a regression test.

llvm-svn: 203916
2014-03-14 10:41:49 +00:00
Stepan Dyatkovskiy
2789f696ba MergeFunctions, cmpType: fixed variable names from XXTy1 and XXTy2 to XXTyL and XXTyR.
llvm-svn: 203907
2014-03-14 08:48:52 +00:00
Stepan Dyatkovskiy
0504c3145a MergeFunctions, cmpType: Fixed comments wrapping.
llvm-svn: 203905
2014-03-14 08:17:19 +00:00
Owen Anderson
3a006737fe Fix a bug in InstCombine where we would incorrectly attempt to construct a
bitcast between pointers of two different address spaces if they happened to have
the same pointer size.

llvm-svn: 203862
2014-03-13 22:51:43 +00:00
Evgeniy Stepanov
04442bc559 [msan] Fix handling of byval arguments in VarArg calls.
llvm-svn: 203794
2014-03-13 13:17:11 +00:00
Stepan Dyatkovskiy
47660351ae First patch of patch series that improves MergeFunctions performance time from O(N*N) to
O(N*log(N)). The idea is to introduce total ordering among functions set.
That allows to build binary tree and perform function look-up procedure in O(log(N)) time. 

This patch description:
Introduced total ordering among Type instances. Actually it is improvement for existing
isEquivalentType.
0. Coerce pointer of 0 address space to integer.
1. If left and right types are equal (the same Type* value), return 0 (means equal).
2. If types are of different kind (different type IDs). Return result of type IDs
comparison, treating them as numbers.
3. If types are vectors or integers, return result of its
pointers comparison (casted to numbers).
4. Check whether type ID belongs to the next group: 
* Void 
* Float 
* Double 
* X86_FP80 
* FP128 
* PPC_FP128 
* Label 
* Metadata 
If so, return 0.
5. If left and right are pointers, return result of address space
comparison (numbers comparison).
6. If types are complex.
Then both LEFT and RIGHT will be expanded and their element types will be checked with
the same way. If we get Res != 0 on some stage, return it. Otherwise return 0.
7. For all other cases put llvm_unreachable.

llvm-svn: 203788
2014-03-13 11:54:50 +00:00
Mark Seaborn
91085966b7 Fix typo in comment: "inwoke" -> "invoke"
llvm-svn: 203739
2014-03-13 00:04:17 +00:00
Raul E. Silvera
1c39640e2d Resubmit "[SLPV] Recognize vectorizable intrinsics during SLP vectorization ..."
This reverts commit 86cb795388643710dab34941ddcb5a9470ac39d8.
The problems previously found have been resolved through other CLs.

llvm-svn: 203707
2014-03-12 20:21:50 +00:00
Hans Wennborg
a2aafbb7b5 Allow switch-to-lookup table for tables with holes by adding bitmask check
This allows us to generate table lookups for code such as:

  unsigned test(unsigned x) {
    switch (x) {
      case 100: return 0;
      case 101: return 1;
      case 103: return 2;
      case 105: return 3;
      case 107: return 4;
      case 109: return 5;
      case 110: return 6;
      default: return f(x);
    }
  }

Since cases 102, 104, etc. are not constants, the lookup table has holes
in those positions. We therefore guard the table lookup with a bitmask check.

Patch by Jasper Neumann!

llvm-svn: 203694
2014-03-12 18:35:40 +00:00
Evan Cheng
f2d3d2bf92 Revert r203488 and r203520.
llvm-svn: 203687
2014-03-12 18:09:37 +00:00
Eli Bendersky
3af4500090 Revive SizeOptLevel-explaining comments that were dropped in r203669
llvm-svn: 203675
2014-03-12 16:44:17 +00:00
Eli Bendersky
fa2b4f20f2 Move duplicated code into a helper function (exposed through overload).
There's a bit of duplicated "magic" code in opt.cpp and Clang's CodeGen that
computes the inliner threshold from opt level and size opt level.

This patch moves the code to a function that lives alongside the inliner itself,
providing a convenient overload to the inliner creation.

A separate patch can be committed to Clang to use this once it's committed to
LLVM. Standalone tools that use the inlining pass can also avoid duplicating
this code and fearing it will go out of sync.

Note: this patch also restructures the conditinal logic of the computation to
be cleaner.

llvm-svn: 203669
2014-03-12 16:12:36 +00:00
Alon Mishne
00d720ff32 Cloning a function now also clones its debug metadata if 'ModuleLevelChanges' is true.
llvm-svn: 203662
2014-03-12 14:42:51 +00:00
Erik Verbruggen
11cc704d2c Fix crash in PRE.
After r203553 overflow intrinsics and their non-intrinsic (normal)
instruction get hashed to the same value. This patch prevents PRE from
moving an instruction into a predecessor block, and trying to add a phi
node that gets two different types (the intrinsic result and the
non-intrinsic result), resulting in a failing assert.

llvm-svn: 203574
2014-03-11 15:07:32 +00:00
Tim Northover
68c567a38a IR: add a second ordering operand to cmpxhg for failure
The syntax for "cmpxchg" should now look something like:

	cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic

where the second ordering argument gives the required semantics in the case
that no exchange takes place. It should be no stronger than the first ordering
constraint and cannot be either "release" or "acq_rel" (since no store will
have taken place).

rdar://problem/15996804

llvm-svn: 203559
2014-03-11 10:48:52 +00:00
Erik Verbruggen
c2bf18261b GVN: fix hashing of extractvalue.
My last commit did not add the indexes to the hashed value for
extractvalue. Adding that back in.

llvm-svn: 203558
2014-03-11 10:21:30 +00:00
Erik Verbruggen
638ff95018 GVN: merge overflow intrinsics with non-overflow instructions.
When an overflow intrinsic is followed by a non-overflow instruction,
replace the latter with an extract. For example:

  %sadd = tail call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
  %sadd3 = add i32 %a, %b

Here the add statement will be replaced by an extract.

When an overflow intrinsic follows a non-overflow instruction, a clone
of the intrinsic is inserted before the normal instruction, which makes
it the same as the previous case. Subsequent runs of GVN can then clean
up the duplicate instructions and insert the extract.

This fixes PR8817.

llvm-svn: 203553
2014-03-11 09:36:48 +00:00
Duncan P. N. Exon Smith
f9624311ce Cleanup whitespace
llvm-svn: 203529
2014-03-11 02:44:45 +00:00
Evan Cheng
9a155c5f78 Follow up to r203488. Code clean up to eliminate a lot of copy+paste.
llvm-svn: 203520
2014-03-11 00:24:20 +00:00
Diego Novillo
dd37be24ca Use discriminator information in sample profiles.
Summary:
When the sample profiles include discriminator information,
use the discriminator values to distinguish instruction weights
in different basic blocks.

This modifies the BodySamples mapping to map <line, discriminator> pairs
to weights. Instructions on the same line but different blocks, will
use different discriminator values. This, in turn, means that the blocks
may have different weights.

Other changes in this patch:

- Add tests for positive values of line offset, discriminator and samples.
- Change data types from uint32_t to unsigned and int and do additional
  validation.

Reviewers: chandlerc

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2857

llvm-svn: 203508
2014-03-10 22:41:28 +00:00
Benjamin Kramer
108d24886e MemCpyOpt: When merging memsets also merge the trivial case of two memsets with the same destination.
The testcase is from PR19092, but I think the bug described there is actually a clang issue.

llvm-svn: 203489
2014-03-10 21:05:13 +00:00
Evan Cheng
b0fdca31bc For functions with ARM target specific calling convention, when simplify-libcall
optimize a call to a llvm intrinsic to something that invovles a call to a C
library call, make sure it sets the right calling convention on the call.

e.g.
extern double pow(double, double);
double t(double x) {
  return pow(10, x);
}

Compiles to something like this for AAPCS-VFP:
define arm_aapcs_vfpcc double @t(double %x) #0 {
entry:
  %0 = call double @llvm.pow.f64(double 1.000000e+01, double %x)
  ret double %0
}

declare double @llvm.pow.f64(double, double) #1

Simplify libcall (part of instcombine) will turn the above into:
define arm_aapcs_vfpcc double @t(double %x) #0 {
entry:
  %__exp10 = call double @__exp10(double %x) #1
  ret double %__exp10
}

declare double @__exp10(double)

The pre-instcombine code works because calls to LLVM builtins are special.
Instruction selection will chose the right calling convention for the call.
However, the code after instcombine is wrong. The call to __exp10 will use
the C calling convention.

I can think of 3 options to fix this.

1. Make "C" calling convention just work since the target should know what CC
   is being used.

   This doesn't work because each function can use different CC with the "pcs"
   attribute.

2. Have Clang add the right CC keyword on the calls to LLVM builtin.

   This will work but it doesn't match the LLVM IR specification which states
   these are "Standard C Library Intrinsics".

3. Fix simplify libcall so the resulting calls to the C routines will have the
   proper CC keyword. e.g.
   %__exp10 = call arm_aapcs_vfpcc double @__exp10(double %x) #1

   This works and is the solution I implemented here.

Both solutions #2 and #3 would work. After carefully considering the pros and
cons, I decided to implement #3 for the following reasons.

1. It doesn't change the "spec" of the intrinsics.
2. It's a self-contained fix.

There are a couple of potential downsides.
1. There could be other places in the optimizer that is broken in the same way
   that's not addressed by this.
2. There could be other calling conventions that need to be propagated by
   simplify-libcall that's not handled.

But for now, this is the fix that I'm most comfortable with.

llvm-svn: 203488
2014-03-10 20:49:45 +00:00
Benjamin Kramer
488ab03435 SimplifyCFG: Simplify the weight scaling algorithm.
No change in functionality.

llvm-svn: 203413
2014-03-09 14:42:55 +00:00
Ahmed Charles
e4b10534bd Fix build break.
llvm-svn: 203366
2014-03-09 03:50:36 +00:00
Chandler Carruth
fad39ebe19 [C++11] Add range based accessors for the Use-Def chain of a Value.
This requires a number of steps.
1) Move value_use_iterator into the Value class as an implementation
   detail
2) Change it to actually be a *Use* iterator rather than a *User*
   iterator.
3) Add an adaptor which is a User iterator that always looks through the
   Use to the User.
4) Wrap these in Value::use_iterator and Value::user_iterator typedefs.
5) Add the range adaptors as Value::uses() and Value::users().
6) Update *all* of the callers to correctly distinguish between whether
   they wanted a use_iterator (and to explicitly dig out the User when
   needed), or a user_iterator which makes the Use itself totally
   opaque.

Because #6 requires churning essentially everything that walked the
Use-Def chains, I went ahead and added all of the range adaptors and
switched them to range-based loops where appropriate. Also because the
renaming requires at least churning every line of code, it didn't make
any sense to split these up into multiple commits -- all of which would
touch all of the same lies of code.

The result is still not quite optimal. The Value::use_iterator is a nice
regular iterator, but Value::user_iterator is an iterator over User*s
rather than over the User objects themselves. As a consequence, it fits
a bit awkwardly into the range-based world and it has the weird
extra-dereferencing 'operator->' that so many of our iterators have.
I think this could be fixed by providing something which transforms
a range of T&s into a range of T*s, but that *can* be separated into
another patch, and it isn't yet 100% clear whether this is the right
move.

However, this change gets us most of the benefit and cleans up
a substantial amount of code around Use and User. =]

llvm-svn: 203364
2014-03-09 03:16:01 +00:00
Benjamin Kramer
aaa10dc26a [C++11] Revert uses of lambdas with array_pod_sort.
Looks like GCC implements the lambda->function pointer conversion differently.

llvm-svn: 203294
2014-03-07 21:52:38 +00:00
Benjamin Kramer
f042a6ba0a [C++11] Convert sort predicates into lambdas.
No functionality change.

llvm-svn: 203288
2014-03-07 21:35:39 +00:00
Tim Northover
b74aa030d9 InstCombine: form shuffles from wider range of insert/extractelements
Sequences of insertelement/extractelements are sometimes used to build
vectorsr; this code tries to put them back together into shuffles, but
could only produce a completely uniform shuffle types (<N x T> from two
<N x T> sources).

This should allow shuffles with different numbers of elements on the
input and output sides as well.

llvm-svn: 203229
2014-03-07 10:24:44 +00:00
Ahmed Charles
52ce0c101e Replace OwningPtr<T> with std::unique_ptr<T>.
This compiles with no changes to clang/lld/lldb with MSVC and includes
overloads to various functions which are used by those projects and llvm
which have OwningPtr's as parameters. This should allow out of tree
projects some time to move. There are also no changes to libs/Target,
which should help out of tree targets have time to move, if necessary.

llvm-svn: 203083
2014-03-06 05:51:42 +00:00
Chandler Carruth
a48d15a676 [Layering] Move InstVisitor.h into the IR library as it is pretty
obviously coupled to the IR.

llvm-svn: 203064
2014-03-06 03:23:41 +00:00
Chandler Carruth
0873afae39 [Layering] Move DebugInfo.h into the IR library where its implementation
already lives.

llvm-svn: 203046
2014-03-06 00:46:21 +00:00
Chandler Carruth
2b135c4e9f [Layering] Move DIBuilder.h into the IR library where its implementation
already lives.

llvm-svn: 203038
2014-03-06 00:22:06 +00:00
Arnold Schwaighofer
adebac793b LoopVectorizer: Preserve fast-math flags
Fixes PR19045.

llvm-svn: 203008
2014-03-05 21:10:47 +00:00
Chandler Carruth
797ae6fd0d [Layering] Move DebugLoc.h into the IR library. The implementation
already lived there and it is where it belongs -- this is the in-memory
debug location representation.

This is just cleanup -- Modules can actually cope with this, but that
doesn't make it right. After chatting with folks that have out-of-tree
stuff, going ahead and moving the rest of the headers seems preferable.

llvm-svn: 202960
2014-03-05 10:30:38 +00:00
Chandler Carruth
0e2a8390e0 [C++11] Make this interface accept const Use pointers and use override
to ensure we don't mess up any of the overrides. Necessary for cleaning
up the Value use iterators and enabling range-based traversing of use
lists.

llvm-svn: 202958
2014-03-05 10:21:48 +00:00
Ahmed Charles
4a96a15754 [C++11] Replace OwningPtr::take() with OwningPtr::release().
llvm-svn: 202957
2014-03-05 10:19:29 +00:00
Craig Topper
a3683ec835 [C++11] Add 'override' keyword to virtual methods that override their base class.
llvm-svn: 202953
2014-03-05 09:10:37 +00:00
Chandler Carruth
436597fe00 [Modules] Move the ConstantRange class into the IR library. This is
a bit surprising, as the class is almost entirely abstracted away from
any particular IR, however it encodes the comparsion predicates which
mutate ranges as ICmp predicate codes. This is reasonable as they're
used for both instructions and constants. Thus, it belongs in the IR
library with instructions and constants.

llvm-svn: 202838
2014-03-04 12:24:34 +00:00
Chandler Carruth
4b66708834 [Modules] Move the PredIteratorCache into the IR library -- it is
hardcoded to use IR BasicBlocks.

llvm-svn: 202835
2014-03-04 12:09:19 +00:00
Chandler Carruth
248195469c [Modules] Move the NoFolder into the IR library as it creates
instructions.

llvm-svn: 202834
2014-03-04 12:05:47 +00:00
Chandler Carruth
b4f244209e [Modules] Move the TargetFolder into the Analysis library. Historically,
this would have been required because of the use of DataLayout, but that
has moved into the IR proper. It is still required because this folder
uses the constant folding in the analysis library (which uses the
datalayout) as the more aggressive basis of its folder.

llvm-svn: 202832
2014-03-04 11:59:06 +00:00
Chandler Carruth
075812f27c [Modules] Move CFG.h to the IR library as it defines graph traits over
IR types.

llvm-svn: 202827
2014-03-04 11:45:46 +00:00
Chandler Carruth
63713e9f95 [Modules] Move ValueMap to the IR library. While this class does not
directly care about the Value class (it is templated so that the key can
be any arbitrary Value subclass), it is in fact concretely tied to the
Value class through the ValueHandle's CallbackVH interface which relies
on the key type being some Value subclass to establish the value handle
chain.

Ironically, the unittest is already in the right library.

llvm-svn: 202824
2014-03-04 11:26:31 +00:00
Chandler Carruth
649f6270aa [Modules] Move ValueHandle into the IR library where Value itself lives.
Move the test for this class into the IR unittests as well.

This uncovers that ValueMap too is in the IR library. Ironically, the
unittest for ValueMap is useless in the Support library (honestly, so
was the ValueHandle test) and so it already lives in the IR unittests.
Mmmm, tasty layering.

llvm-svn: 202821
2014-03-04 11:17:44 +00:00
Chandler Carruth
d0657fe39f [Modules] Move the LLVM IR pattern match header into the IR library, it
obviously is coupled to the IR.

llvm-svn: 202818
2014-03-04 11:08:18 +00:00
Chandler Carruth
cfb81122cc [Modules] Move CallSite into the IR library where it belogs. It is
abstracting between a CallInst and an InvokeInst, both of which are IR
concepts.

llvm-svn: 202816
2014-03-04 11:01:28 +00:00
Chandler Carruth
0bf5689f06 [Modules] Move GetElementPtrTypeIterator into the IR library. As its
name might indicate, it is an iterator over the types in an instruction
in the IR.... You see where this is going.

Another step of modularizing the support library.

llvm-svn: 202815
2014-03-04 10:40:04 +00:00
Chandler Carruth
d7b36fdea7 [Modules] Move InstIterator out of the Support library, where it had no
business.

This header includes Function and BasicBlock and directly uses the
interfaces of both classes. It has to do with the IR, it even has that
in the name. =] Put it in the library it belongs to.

This is one step toward making LLVM's Support library survive a C++
modules bootstrap.

llvm-svn: 202814
2014-03-04 10:30:26 +00:00
Chandler Carruth
cd48c56575 [cleanup] Re-sort all the includes with utils/sort_includes.py.
llvm-svn: 202811
2014-03-04 10:07:28 +00:00
Diego Novillo
2ccb22f509 Pass to emit DWARF path discriminators.
DWARF discriminators are used to distinguish multiple control flow paths
on the same source location. When this happens, instructions across
basic block boundaries will share the same debug location.

This pass detects this situation and creates a new lexical scope to one
of the two instructions. This lexical scope is a child scope of the
original and contains a new discriminator value. This discriminator is
then picked up from MCObjectStreamer::EmitDwarfLocDirective to be
written on the object file.

This fixes http://llvm.org/bugs/show_bug.cgi?id=18270.

llvm-svn: 202752
2014-03-03 20:06:11 +00:00
Benjamin Kramer
6b03dd4034 [C++11] Use std::tie to simplify compare operators.
No functionality change.

llvm-svn: 202751
2014-03-03 19:58:30 +00:00