1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

2509 Commits

Author SHA1 Message Date
Marcello Maggioni
86729369ee Fixup PHIs in LowerSwitch when a Leaf node is not emitted.
This commit fixes bug http://llvm.org/bugs/show_bug.cgi?id=20103.

Thanks to Qwertyuiop for the report and the proposed fix.

llvm-svn: 212802
2014-07-11 10:34:36 +00:00
Mark Heffernan
937bab3688 Partially fix PR20058: reduce compile time for loop unrolling with very high count by reducing calls to SE->forgetLoop
llvm-svn: 212782
2014-07-10 23:30:06 +00:00
Hal Finkel
661274e401 Feeding isSafeToSpeculativelyExecute its DataLayout pointer
isSafeToSpeculativelyExecute can optionally take a DataLayout pointer. In the
past, this was mainly used to make better decisions regarding divisions known
not to trap, and so was not all that important for users concerned with "cheap"
instructions. However, now it also helps look through bitcasts for
dereferencable loads, and will also be important if/when we add a
dereferencable pointer attribute.

This is some initial work to feed a DataLayout pointer through to callers of
isSafeToSpeculativelyExecute, generally where one was already available.

llvm-svn: 212720
2014-07-10 14:41:31 +00:00
Alexey Samsonov
bea2f2e5d8 Decouple llvm::SpecialCaseList text representation and its LLVM IR semantics.
Turn llvm::SpecialCaseList into a simple class that parses text files in
a specified format and knows nothing about LLVM IR. Move this class into
LLVMSupport library. Implement two users of this class:
  * DFSanABIList in DFSan instrumentation pass.
  * SanitizerBlacklist in Clang CodeGen library.
The latter will be modified to use actual source-level information from frontend
(source file names) instead of unstable LLVM IR things (LLVM Module identifier).

Remove dependency edge from ClangCodeGen/ClangDriver to LLVMTransformUtils.

No functionality change.

llvm-svn: 212643
2014-07-09 19:40:08 +00:00
Benjamin Kramer
0ad234df3b Fix some Twine locals.
Two of those are use after frees. Found by clang-tidy, fixed by me.

llvm-svn: 212537
2014-07-08 14:55:06 +00:00
Sanjay Patel
e29d90184d Fix for PR17073 ( http://llvm.org/pr17073 ), simplifycfg illegally hoists an operation in a phi node that can trap.
This patch adds to an existing loop over phi nodes in SimplifyCondBranchToCondBranch() to check for trapping ops and bails out of the optimization if we find one of those.

The test cases verify that trapping ops are not hoisted and non-trapping ops are still optimized as expected.

llvm-svn: 212490
2014-07-07 21:19:00 +00:00
Sanjay Patel
81fb1ce2eb fixed some typos in comments
llvm-svn: 212423
2014-07-06 23:10:24 +00:00
Rafael Espindola
858b9e1423 Update the MemoryBuffer API to use ErrorOr.
llvm-svn: 212405
2014-07-06 17:43:13 +00:00
Marcello Maggioni
f4e8a4110d Minor stylistic fix in SimplifyCFG (test commit)
llvm-svn: 212259
2014-07-03 08:29:06 +00:00
David Blaikie
c5256f5830 DebugInfo: Preserve debug location information when transforming a call into an invoke during inlining.
This both improves basic debug info quality, but also fixes a larger
hole whenever we inline a call/invoke without a location (debug info for
the entire inlining is lost and other badness that the debug info
emission code is currently working around but shouldn't have to).

llvm-svn: 212065
2014-06-30 20:30:39 +00:00
Alp Toker
97022b0c1f Revert "Introduce a string_ostream string builder facilty"
Temporarily back out commits r211749, r211752 and r211754.

llvm-svn: 211814
2014-06-26 22:52:05 +00:00
Hans Wennborg
0b74da6314 Don't build switch tables for dllimport and TLS variables in GEPs
This is a follow-up to r211331, which failed to notice that we were
returning early from ValidLookupTableConstant for GEPs.

llvm-svn: 211753
2014-06-26 00:30:52 +00:00
Alp Toker
fd9ead3b6f Introduce a string_ostream string builder facilty
string_ostream is a safe and efficient string builder that combines opaque
stack storage with a built-in ostream interface.

small_string_ostream<bytes> additionally permits an explicit stack storage size
other than the default 128 bytes to be provided. Beyond that, storage is
transferred to the heap.

This convenient class can be used in most places an
std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair
would previously have been used, in order to guarantee consistent access
without byte truncation.

The patch also converts much of LLVM to use the new facility. These changes
include several probable bug fixes for truncated output, a programming error
that's no longer possible with the new interface.

llvm-svn: 211749
2014-06-26 00:00:48 +00:00
Benjamin Kramer
7821fb4e74 LoopUnrollRuntime: Check for overflow in the trip count calculation.
Fixes PR19823.

llvm-svn: 211436
2014-06-21 13:46:25 +00:00
Hans Wennborg
3869faa1a3 Don't build switch lookup tables for dllimport or TLS variables
We would previously put dllimport variables in switch lookup tables, which
doesn't work because the address cannot be used in a constant initializer.
This is basically the same problem that we have in PR19955.

Putting TLS variables in switch tables also desn't work, because the
address of such a variable is not constant.

Differential Revision: http://reviews.llvm.org/D4220

llvm-svn: 211331
2014-06-20 00:38:12 +00:00
Jim Grosbach
2272906641 LowerSwitch: track bounding range for the condition tree.
When LowerSwitch transforms a switch instruction into a tree of ifs it
is actually performing a binary search into the various case ranges, to
see if the current value falls into one cases range of values.

So, if we have a program with something like this:

switch (a) {
case 0:
  do0();
  break;
case 1:
  do1();
  break;
case 2:
  do2();
  break;
default:
  break;
}

the code produced is something like this:

  if (a < 1) {
    if (a == 0) {
      do0();
    }
  } else {
    if (a < 2) {
      if (a == 1) {
        do1();
      }
    } else {
      if (a == 2) {
        do2();
      }
    }
  }

This code is inefficient because the check (a == 1) to execute do1() is
not needed.

The reason is that because we already checked that (a >= 1) initially by
checking that also  (a < 2) we basically already inferred that (a == 1)
without the need of an extra basic block spawned to check if actually (a
== 1).

The patch addresses this problem by keeping track of already
checked bounds in the LowerSwitch algorithm, so that when the time
arrives to produce a Leaf Block that checks the equality with the case
value / range the algorithm can decide if that block is really needed
depending on the already checked bounds .

For example, the above with "a = 1" would work like this:

the bounds start as LB: NONE , UB: NONE
as (a < 1) is emitted the bounds for the else path become LB: 1 UB:
NONE. This happens because by failing the test (a < 1) we know that the
value "a" cannot be smaller than 1 if we enter the else branch.
After the emitting the check (a < 2) the bounds in the if branch become
LB: 1 UB: 1. This is because by checking that "a" is smaller than 2 then
the upper bound becomes 2 - 1 = 1.

When it is time to emit the leaf block for "case 1:" we notice that 1
can be squeezed exactly in between the LB and UB, which means that if we
arrived to that block there is no need to emit a block that checks if (a
== 1).

Patch by: Marcello Maggioni <hayarms@gmail.com>

llvm-svn: 211038
2014-06-16 16:55:20 +00:00
Rafael Espindola
98710599c1 Remove 'using std::errro_code' from lib.
llvm-svn: 210871
2014-06-13 02:24:39 +00:00
Rafael Espindola
e0e308ff6d Don't use 'using std::error_code' in include/llvm.
This should make sure that most new uses use the std prefix.

llvm-svn: 210835
2014-06-12 21:46:39 +00:00
Rafael Espindola
38dc624425 Remove system_error.h.
This is a minimal change to remove the header. I will remove the occurrences
of "using std::error_code" in a followup patch.

llvm-svn: 210803
2014-06-12 17:38:55 +00:00
Evgeniy Stepanov
8af6942193 Fix line numbers for code inlined from __nodebug__ functions.
Instructions from __nodebug__ functions don't have file:line
information even when inlined into no-nodebug functions. As a result,
intrinsics (SSE and other) from <*intrin.h> clang headers _never_
have file:line information.

With this change, an instruction without !dbg metadata gets one from
the call instruction when inlined.

Fixes PR19001.

llvm-svn: 210459
2014-06-09 09:09:19 +00:00
Rafael Espindola
87cd774844 Allow alias to point to an arbitrary ConstantExpr.
This  patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is
up to MC (or the system assembler) to decide if that expression is valid or not.

This reduces our ability to diagnose invalid uses and how early we can spot
them, but it also lets us do things like

@test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32),
                                 i32 ptrtoint (i32* @bar to i32)) to i32*)

An important implication of this patch is that the notion of aliased global
doesn't exist any more. The alias has to encode the information needed to
access it in its metadata (linkage, visibility, type, etc).

Another consequence to notice is that getSection has to return a "const char *".
It could return a NullTerminatedStringRef if there was such a thing, but when
that was proposed the decision was to just uses "const char*" for that.

llvm-svn: 210062
2014-06-03 02:41:57 +00:00
Matt Arsenault
09fc34b10a Make bitcast, extractelement, and insertelement considered cheap for speculation.
This helps more branches into selects. On R600,
vectors are cheap and anything that helps
remove branches is very good.

llvm-svn: 209914
2014-05-30 18:34:43 +00:00
Dinesh Dwivedi
1561b10679 LCSSA should be performed on the outermost affected loop while unrolling loop.
During loop-unroll, loop exits from the current loop may end up in in different
outer loop. This requires to re-form LCSSA recursively for one level down from
the outer most loop where loop exits are landed during unroll. This fixes PR18861.

Differential Revision: http://reviews.llvm.org/D2976

llvm-svn: 209796
2014-05-29 06:47:23 +00:00
Diego Novillo
8e5a2fdc0e Add support for missed and analysis optimization remarks.
Summary:
This adds two new diagnostics: -pass-remarks-missed and
-pass-remarks-analysis. They take the same values as -pass-remarks but
are intended to be triggered in different contexts.

-pass-remarks-missed is used by LLVMContext::emitOptimizationRemarkMissed,
which passes call when they tried to apply a transformation but
couldn't.

-pass-remarks-analysis is used by LLVMContext::emitOptimizationRemarkAnalysis,
which passes call when they want to inform the user about analysis
results.

The patch also:

1- Adds support in the inliner for the two new remarks and a
   test case.

2- Moves emitOptimizationRemark* functions to the llvm namespace.

3- Adds an LLVMContext argument instead of making them member functions
   of LLVMContext.

Reviewers: qcolombet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3682

llvm-svn: 209442
2014-05-22 14:19:46 +00:00
Eric Christopher
2c8f65cf87 Revert "Patch for function cloning to inline all blocks whose address is taken"
as it was causing build failures in ruby.

This reverts commit r207713.

llvm-svn: 209135
2014-05-19 16:04:10 +00:00
Rafael Espindola
63b50aa4fa Use create methods since msvc doesn't handle delegating constructors.
llvm-svn: 209076
2014-05-17 21:29:57 +00:00
Rafael Espindola
abf16ae0ea Reduce abuse of default values in the GlobalAlias constructor.
This is in preparation for adding an optional offset.

llvm-svn: 209073
2014-05-17 19:57:46 +00:00
Reid Kleckner
fd1a7dfe04 Add comdat key field to llvm.global_ctors and llvm.global_dtors
This allows us to put dynamic initializers for weak data into the same
comdat group as the data being initialized.  This is necessary for MSVC
ABI compatibility.  Once we have comdats for guard variables, we can use
the combination to help GlobalOpt fire more often for weak data with
guarded initialization on other platforms.

Reviewers: nlewycky

Differential Revision: http://reviews.llvm.org/D3499

llvm-svn: 209015
2014-05-16 20:39:27 +00:00
Rafael Espindola
c5f7a8c70e Fix most of PR10367.
This patch changes the design of GlobalAlias so that it doesn't take a
ConstantExpr anymore. It now points directly to a GlobalObject, but its type is
independent of the aliasee type.

To avoid changing all alias related tests in this patches, I kept the common
syntax

@foo = alias i32* @bar

to mean the same as now. The cases that used to use cast now use the more
general syntax

@foo = alias i16, i32* @bar.

Note that GlobalAlias now behaves a bit more like GlobalVariable. We
know that its type is always a pointer, so we omit the '*'.

For the bitcode, a nice surprise is that we were writing both identical types
already, so the format change is minimal. Auto upgrade is handled by looking
through the casts and no new fields are needed for now. New bitcode will
simply have different types for Alias and Aliasee.

One last interesting point in the patch is that replaceAllUsesWith becomes
smart enough to avoid putting a ConstantExpr in the aliasee. This seems better
than checking and updating every caller.

A followup patch will delete getAliasedGlobal now that it is redundant. Another
patch will add support for an explicit offset.

llvm-svn: 209007
2014-05-16 19:35:39 +00:00
Rafael Espindola
7cf05b14cd Change the GlobalAlias constructor to look a bit more like GlobalVariable.
This is part of the fix for pr10367. A GlobalAlias always has a pointer type,
so just have the constructor build the type.

llvm-svn: 208983
2014-05-16 13:34:04 +00:00
Reid Kleckner
15714d9732 Don't insert lifetime.end markers between a musttail call and ret
The allocas going out of scope are immediately killed by the return
instruction.

This is a resend of r208912, which was committed accidentally.

Reviewers: chandlerc

Differential Revision: http://reviews.llvm.org/D3792

llvm-svn: 208920
2014-05-15 21:10:46 +00:00
Reid Kleckner
61107d51d3 Revert "Don't insert lifetime.end markers between a musttail call and ret"
This reverts commit r208912.

It was committed accidentally without review.

llvm-svn: 208914
2014-05-15 20:41:05 +00:00
Reid Kleckner
e91e809df4 Remove unused variable in inliner
We have to iterate over all the calls that were inlined to find out if
any were musttail.

Sink another variable down to where its used.

llvm-svn: 208913
2014-05-15 20:39:42 +00:00
Reid Kleckner
1b333b64a3 Don't insert lifetime.end markers between a musttail call and ret
The allocas going out of scope are immediately killed by the return
instruction.

Reviewers: chandlerc

Differential Revision: http://reviews.llvm.org/D3630

llvm-svn: 208912
2014-05-15 20:39:13 +00:00
Reid Kleckner
f2dda46680 Teach the inliner how to preserve musttail invariants
The interesting case is what happens when you inline a musttail call
through a musttail call site.  In this case, we can't break perfect
forwarding or allow any stack growth.

Instead of merging control flow from the inlined return instruction
after a musttail call into the body of the caller, leave the inlined
return instruction in the caller so that the musttail call stays in the
tail position.

More work is required in http://reviews.llvm.org/D3630 to handle the
case where the inlined function has dynamic allocas or byval arguments.

Reviewers: chandlerc

Differential Revision: http://reviews.llvm.org/D3491

llvm-svn: 208910
2014-05-15 20:11:28 +00:00
Jay Foad
e0eac700cb Rename ComputeMaskedBits to computeKnownBits. "Masked" has been
inappropriate since it lost its Mask parameter in r154011.

llvm-svn: 208811
2014-05-14 21:14:37 +00:00
Rafael Espindola
e6465aeb2d Split GlobalValue into GlobalValue and GlobalObject.
This allows code to statically accept a Function or a GlobalVariable, but
not an alias. This is already a cleanup by itself IMHO, but the main
reason for it is that it gives a lot more confidence that the refactoring to fix
the design of GlobalAlias is correct. That will be a followup patch.

llvm-svn: 208716
2014-05-13 18:45:48 +00:00
Louis Gerbarg
011e17086c Add ExtractValue instruction to SimplifyCFG's ComputeSpeculationCost
Since ExtractValue is not included in ComputeSpeculationCost CFGs containing
ExtractValueInsts cannot be simplified. In particular this interacts with
InstCombineCompare's tendency to insert add.with.overflow intrinsics for
certain idiomatic math operations, preventing optimization.

This patch adds ExtractValue to the ComputeSpeculationCost. Test case included

rdar://14853450

llvm-svn: 208434
2014-05-09 17:02:46 +00:00
Rafael Espindola
421423af16 Use auto and clang-format this snippet.
llvm-svn: 208421
2014-05-09 16:01:06 +00:00
Richard Smith
55381ccf0a Re-commit r208025, reverted in r208030, with a fix for a conformance issue
which GCC detects and Clang does not!

llvm-svn: 208033
2014-05-06 01:44:26 +00:00
Richard Smith
b38145eb67 Revert r208025, which made buildbots unhappy for unknown reasons.
llvm-svn: 208030
2014-05-06 01:26:00 +00:00
Richard Smith
e9d2d57a7c Add llvm::function_ref (and a couple of uses of it), representing a type-erased reference to a callable object.
llvm-svn: 208025
2014-05-06 01:01:29 +00:00
Nico Weber
fdced47a40 Teach GlobalDCE how to remove empty global_ctor entries.
This moves most of GlobalOpt's constructor optimization
code out of GlobalOpt into Transforms/Utils/CDtorUtils.{h,cpp}. The
public interface is a single function OptimizeGlobalCtorsList() that
takes a predicate returning which constructors to remove.

GlobalOpt calls this with a function that statically evaluates all
constructors, just like it did before. This part of the change is
behavior-preserving.

Also add a call to this from GlobalDCE with a filter that removes global
constructors that contain a "ret" instruction and nothing else – this
fixes PR19590.

llvm-svn: 207856
2014-05-02 18:35:25 +00:00
Nick Lewycky
e4f0d18fe0 Fold strlen(expr ? "str1" : "str2") to x ? len1 : len2. This fires about 330 times in a bootstrap of clang.
llvm-svn: 207828
2014-05-02 04:11:45 +00:00
Gerolf Hoflehner
aec10987f2 Patch for function cloning to inline all blocks whose address is taken
Not all address taken blocks get inlined. The reason is
that a blocks new address is known only when it is cloned. But e.g.
a branch instruction in a different block could need that address earlier
while it gets cloned. The solution is to collect the set of all
blocks that can potentially get inlined and compute a new block address
up front. Then clone and cleanup.

rdar://16427209

llvm-svn: 207713
2014-04-30 22:05:02 +00:00
Diego Novillo
81a83e3d17 Add optimization remarks to the loop unroller and vectorizer.
Summary:
This calls emitOptimizationRemark from the loop unroller and vectorizer
at the point where they make a positive transformation. For the
vectorizer, it reports vectorization and interleave factors. For the
loop unroller, it reports all the different supported types of
unrolling.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3456

llvm-svn: 207528
2014-04-29 14:27:31 +00:00
Michael Zolotukhin
354ce4fae7 Fix a typo in comment
llvm-svn: 207499
2014-04-29 07:35:33 +00:00
Chandler Carruth
d81a0d614d Fix rampant quadratic behavior in UpdatePHINodes. The operation of
mapping from a basic block to an incoming value, either for removal or
just lookup, is linear in the number of predecessors, and we were doing
this for every entry in the 'Preds' list which is in many cases almost
all of them!

Unfortunately, the fixes are quite ugly. PHI nodes just don't make this
operation easy. The efficient way to fix this is to have a clever
'remove_if' operation on PHI nodes that lets us do a single pass over
all the incoming values of the original PHI node, extracting the ones we
care about. Then we could quickly construct the new phi node from this
list. This would remove the remaining underlying quadratic movement of
unrelated incoming values and the need for silly backwards looping to
"minimize" how often we hit the quadratic case.

This is the last obvious fix for PR19499. It shaves another 20% off the
compile time for me, and while UpdatePHINodes remains in the profile,
most of the time is now stemming from the well known inefficiencies of
LVI and jump threading.

llvm-svn: 207409
2014-04-28 10:37:30 +00:00
Craig Topper
b663bffa27 [C++] Use 'nullptr'.
llvm-svn: 207394
2014-04-28 04:05:08 +00:00
Gerolf Hoflehner
13cf626de6 RecursivelyDeleteTriviallyDeadInstructions() could remove
more than 1 instruction. The caller need to be aware of this
and adjust instruction iterators accordingly.

rdar://16679376

Repaired r207302.

llvm-svn: 207309
2014-04-26 05:58:11 +00:00
Gerolf Hoflehner
c780c786a3 Restore CloneFunction.cpp which got accidently
overwritten by previous backout of r207303

llvm-svn: 207308
2014-04-26 05:43:41 +00:00
Gerolf Hoflehner
99a3e5b9f5 Revert commit r207302 since build failures
have been reported.

llvm-svn: 207303
2014-04-26 02:03:17 +00:00
Gerolf Hoflehner
54bf53d166 RecursivelyDeleteTriviallyDeadInstructions() could remove
more than 1 instruction. The caller need to be aware of this
and adjust instruction iterators accordingly.

rdar://16679376

llvm-svn: 207302
2014-04-26 01:19:16 +00:00
Adrian Prantl
ca946260a4 Unbreak the gdb buildbot by not lowering dbg.declare intrinsics for arrays.
llvm-svn: 207284
2014-04-25 23:00:25 +00:00
Adrian Prantl
7566e72bb8 This reapplies r207235 with an additional bugfixes caught by the msan
buildbot - do not insert debug intrinsics before phi nodes.

Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.

Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.

This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source

rdar://problem/16679879
http://reviews.llvm.org/D3374

llvm-svn: 207269
2014-04-25 20:49:25 +00:00
Adrian Prantl
319db7c542 Revert "This reapplies r207130 with an additional testcase+and a missing check for"
This reverts commit 207235 to investigate msan buildbot breakage.

llvm-svn: 207250
2014-04-25 18:18:09 +00:00
Adrian Prantl
c3a2bdef85 Reapply r207135 without modifications.
Debug info: Let dbg.values inserted by LowerDbgDeclare inherit the location
of the dbg.value. This gets rid of tons of redundant variable DIEs in
subscopes.

rdar://problem/14874886, rdar://problem/16679936

llvm-svn: 207236
2014-04-25 17:01:04 +00:00
Adrian Prantl
7f9d1e9fd6 This reapplies r207130 with an additional testcase+and a missing check for
AllocaInst that was missing in one location.
Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.

Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.

This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source

rdar://problem/16679879
http://reviews.llvm.org/D3374

llvm-svn: 207235
2014-04-25 17:01:00 +00:00
Craig Topper
c0a2a29f4e [C++] Use 'nullptr'. Transforms edition.
llvm-svn: 207196
2014-04-25 05:29:35 +00:00
Adrian Prantl
0338f80f17 Revert "This reapplies r207130 with an additional testcase+and a missing check for"
Typo in testcase.

llvm-svn: 207166
2014-04-25 00:42:50 +00:00
Adrian Prantl
bf019d19e9 This reapplies r207130 with an additional testcase+and a missing check for
AllocaInst that was missing in one location.
Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.

Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.

This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source

rdar://problem/16679879
http://reviews.llvm.org/D3374

llvm-svn: 207165
2014-04-25 00:38:40 +00:00
Adrian Prantl
0b669e8f79 Revert "Debug info for optimized code: Support variables that are on the stack and"
This reverts commit 207130 for buildbot breakage.

llvm-svn: 207162
2014-04-25 00:04:49 +00:00
Adrian Prantl
ff50aa7bca Revert "Debug info: Let dbg.values inserted by LowerDbgDeclare inherit the location"
This reverts commit 207130 for buildbot breakage.

llvm-svn: 207159
2014-04-24 23:53:29 +00:00
Adrian Prantl
daee7e4c17 Debug info: Let dbg.values inserted by LowerDbgDeclare inherit the location
of the dbg.value. This gets rid of tons of redundant variable DIEs in
subscopes.

rdar://problem/14874886, rdar://problem/16679936

llvm-svn: 207135
2014-04-24 18:44:15 +00:00
Adrian Prantl
807e5d8a9a Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.

Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.

This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine-intrinsics testcase and included source


rdar://problem/16679879
http://reviews.llvm.org/D3374

llvm-svn: 207130
2014-04-24 17:41:45 +00:00
Matt Arsenault
b37a455d1d Remove more default address space argument usage.
These places are inconsequential in practice.

llvm-svn: 207021
2014-04-23 20:58:57 +00:00
Chandler Carruth
6f9ba6a633 [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE
definition below all of the header #include lines, lib/Transforms/...
edition.

This one is tricky for two reasons. We again have a couple of passes
that define something else before the includes as well. I've sunk their
name macros with the DEBUG_TYPE.

Also, InstCombine contains headers that need DEBUG_TYPE, so now those
headers #define and #undef DEBUG_TYPE around their code, leaving them
well formed modular headers. Fixing these headers was a large motivation
for all of these changes, as "leaky" macros of this form are hard on the
modules implementation.

llvm-svn: 206844
2014-04-22 02:55:47 +00:00
Chandler Carruth
15c7b91ac2 [Modules] Make Support/Debug.h modular. This requires it to not change
behavior based on other files defining DEBUG_TYPE, which means it cannot
define DEBUG_TYPE at all. This is actually better IMO as it forces folks
to define relevant DEBUG_TYPEs for their files. However, it requires all
files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't
already. I've updated all such files in LLVM and will do the same for
other upstream projects.

This still leaves one important change in how LLVM uses the DEBUG_TYPE
macro going forward: we need to only define the macro *after* header
files have been #include-ed. Previously, this wasn't possible because
Debug.h required the macro to be pre-defined. This commit removes that.
By defining DEBUG_TYPE after the includes two things are fixed:

- Header files that need to provide a DEBUG_TYPE for some inline code
  can do so by defining the macro before their inline code and undef-ing
  it afterward so the macro does not escape.

- We no longer have rampant ODR violations due to including headers with
  different DEBUG_TYPE definitions. This may be mostly an academic
  violation today, but with modules these types of violations are easy
  to check for and potentially very relevant.

Where necessary to suppor headers with DEBUG_TYPE, I have moved the
definitions below the includes in this commit. I plan to move the rest
of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big
enough.

The comments in Debug.h, which were hilariously out of date already,
have been updated to reflect the recommended practice going forward.

llvm-svn: 206822
2014-04-21 22:55:11 +00:00
Reid Kleckner
5013d7ca2d Fix PR7272 in -tailcallelim instead of the inliner
The -tailcallelim pass should be checking if byval or inalloca args can
be captured before marking calls as tail calls.  This was the real root
cause of PR7272.

With a better fix in place, revert the inliner change from r105255.  The
test case it introduced still passes and has been moved to
test/Transforms/Inline/byval-tail-call.ll.

Reviewers: chandlerc

Differential Revision: http://reviews.llvm.org/D3403

llvm-svn: 206789
2014-04-21 20:48:47 +00:00
Diego Novillo
45811c5ea3 Fix bug 19437 - Only add discriminators for DWARF 4 and above.
Summary:
This prevents the discriminator generation pass from triggering if
the DWARF version being used in the module is prior to 4.

Reviewers: echristo, dblaikie

CC: llvm-commits

Differential Revision: http://reviews.llvm.org/D3413

llvm-svn: 206507
2014-04-17 22:33:50 +00:00
Nuno Lopes
4a36b584a3 remove some dead code
lib/Analysis/IPA/InlineCost.cpp         |   18 ------------------
 lib/Analysis/RegionPass.cpp             |    1 -
 lib/Analysis/TypeBasedAliasAnalysis.cpp |    1 -
 lib/Transforms/Scalar/LoopUnswitch.cpp  |   21 ---------------------
 lib/Transforms/Utils/LCSSA.cpp          |    2 --
 lib/Transforms/Utils/LoopSimplify.cpp   |    6 ------
 utils/TableGen/AsmWriterEmitter.cpp     |   13 -------------
 utils/TableGen/DFAPacketizerEmitter.cpp |    7 -------
 utils/TableGen/IntrinsicEmitter.cpp     |    2 --
 9 files changed, 71 deletions(-)

llvm-svn: 206506
2014-04-17 22:26:44 +00:00
Julien Lerouge
dd5842a2e5 Add lifetime markers for allocas created to hold byval arguments, make them
appear in the InlineFunctionInfo.

llvm-svn: 206308
2014-04-15 18:06:46 +00:00
Julien Lerouge
9ecdd0ce5b Split byval argument initialization so the memcpy(s) are injected at the
beginning of the first new block after inlining.

llvm-svn: 206307
2014-04-15 18:01:54 +00:00
David Blaikie
1573e6e09f Implement depth_first and inverse_depth_first range factory functions.
Also updated as many loops as I could find using df_begin/idf_begin -
strangely I found no uses of idf_begin. Is that just used out of tree?

Also a few places couldn't use df_begin because either they used the
member functions of the depth first iterators or had specific ordering
constraints (I added a comment in the latter case).

Based on a patch by Jim Grosbach. (Jim - you just had iterator_range<T>
where you needed iterator_range<idf_iterator<T>>)

llvm-svn: 206016
2014-04-11 01:50:01 +00:00
Adrian Prantl
3324d55193 C++11: convert verbose loops to range-based loops.
llvm-svn: 204981
2014-03-27 23:30:04 +00:00
Reid Kleckner
509530b2ae CloneFunction: Clone all attributes, including the CC
Summary:
Tested with a unit test because we don't appear to have any transforms
that use this other than ASan, I think.

Fixes PR17935.

Reviewers: nicholas

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D3194

llvm-svn: 204866
2014-03-26 22:26:35 +00:00
Mark Seaborn
ade468f2c3 Remove LowerInvoke's obsolete "-enable-correct-eh-support" option
This option caused LowerInvoke to generate code using SJLJ-based
exception handling, but there is no code left that interprets the
jmp_buf stack that the resulting code maintained (llvm.sjljeh.jblist).
This option has been obsolete for a while, and replaced by
SjLjEHPrepare.

This leaves the default behaviour of LowerInvoke, which is to convert
invokes to calls.

Differential Revision: http://llvm-reviews.chandlerc.com/D3136

llvm-svn: 204388
2014-03-20 19:54:47 +00:00
Evgeniy Stepanov
17d50b69f6 Set debug info for instructions inserted in SplitBlockAndInsertIfThen.
llvm-svn: 204230
2014-03-19 12:56:38 +00:00
Alon Mishne
70ba46ff38 [C++11] Change DebugInfoFinder to use range-based loops
Also changes the iterators to return actual DI type over MDNode.

llvm-svn: 204130
2014-03-18 09:41:07 +00:00
Hans Wennborg
a2aafbb7b5 Allow switch-to-lookup table for tables with holes by adding bitmask check
This allows us to generate table lookups for code such as:

  unsigned test(unsigned x) {
    switch (x) {
      case 100: return 0;
      case 101: return 1;
      case 103: return 2;
      case 105: return 3;
      case 107: return 4;
      case 109: return 5;
      case 110: return 6;
      default: return f(x);
    }
  }

Since cases 102, 104, etc. are not constants, the lookup table has holes
in those positions. We therefore guard the table lookup with a bitmask check.

Patch by Jasper Neumann!

llvm-svn: 203694
2014-03-12 18:35:40 +00:00
Evan Cheng
f2d3d2bf92 Revert r203488 and r203520.
llvm-svn: 203687
2014-03-12 18:09:37 +00:00
Alon Mishne
00d720ff32 Cloning a function now also clones its debug metadata if 'ModuleLevelChanges' is true.
llvm-svn: 203662
2014-03-12 14:42:51 +00:00
Evan Cheng
9a155c5f78 Follow up to r203488. Code clean up to eliminate a lot of copy+paste.
llvm-svn: 203520
2014-03-11 00:24:20 +00:00
Evan Cheng
b0fdca31bc For functions with ARM target specific calling convention, when simplify-libcall
optimize a call to a llvm intrinsic to something that invovles a call to a C
library call, make sure it sets the right calling convention on the call.

e.g.
extern double pow(double, double);
double t(double x) {
  return pow(10, x);
}

Compiles to something like this for AAPCS-VFP:
define arm_aapcs_vfpcc double @t(double %x) #0 {
entry:
  %0 = call double @llvm.pow.f64(double 1.000000e+01, double %x)
  ret double %0
}

declare double @llvm.pow.f64(double, double) #1

Simplify libcall (part of instcombine) will turn the above into:
define arm_aapcs_vfpcc double @t(double %x) #0 {
entry:
  %__exp10 = call double @__exp10(double %x) #1
  ret double %__exp10
}

declare double @__exp10(double)

The pre-instcombine code works because calls to LLVM builtins are special.
Instruction selection will chose the right calling convention for the call.
However, the code after instcombine is wrong. The call to __exp10 will use
the C calling convention.

I can think of 3 options to fix this.

1. Make "C" calling convention just work since the target should know what CC
   is being used.

   This doesn't work because each function can use different CC with the "pcs"
   attribute.

2. Have Clang add the right CC keyword on the calls to LLVM builtin.

   This will work but it doesn't match the LLVM IR specification which states
   these are "Standard C Library Intrinsics".

3. Fix simplify libcall so the resulting calls to the C routines will have the
   proper CC keyword. e.g.
   %__exp10 = call arm_aapcs_vfpcc double @__exp10(double %x) #1

   This works and is the solution I implemented here.

Both solutions #2 and #3 would work. After carefully considering the pros and
cons, I decided to implement #3 for the following reasons.

1. It doesn't change the "spec" of the intrinsics.
2. It's a self-contained fix.

There are a couple of potential downsides.
1. There could be other places in the optimizer that is broken in the same way
   that's not addressed by this.
2. There could be other calling conventions that need to be propagated by
   simplify-libcall that's not handled.

But for now, this is the fix that I'm most comfortable with.

llvm-svn: 203488
2014-03-10 20:49:45 +00:00
Benjamin Kramer
488ab03435 SimplifyCFG: Simplify the weight scaling algorithm.
No change in functionality.

llvm-svn: 203413
2014-03-09 14:42:55 +00:00
Chandler Carruth
fad39ebe19 [C++11] Add range based accessors for the Use-Def chain of a Value.
This requires a number of steps.
1) Move value_use_iterator into the Value class as an implementation
   detail
2) Change it to actually be a *Use* iterator rather than a *User*
   iterator.
3) Add an adaptor which is a User iterator that always looks through the
   Use to the User.
4) Wrap these in Value::use_iterator and Value::user_iterator typedefs.
5) Add the range adaptors as Value::uses() and Value::users().
6) Update *all* of the callers to correctly distinguish between whether
   they wanted a use_iterator (and to explicitly dig out the User when
   needed), or a user_iterator which makes the Use itself totally
   opaque.

Because #6 requires churning essentially everything that walked the
Use-Def chains, I went ahead and added all of the range adaptors and
switched them to range-based loops where appropriate. Also because the
renaming requires at least churning every line of code, it didn't make
any sense to split these up into multiple commits -- all of which would
touch all of the same lies of code.

The result is still not quite optimal. The Value::use_iterator is a nice
regular iterator, but Value::user_iterator is an iterator over User*s
rather than over the User objects themselves. As a consequence, it fits
a bit awkwardly into the range-based world and it has the weird
extra-dereferencing 'operator->' that so many of our iterators have.
I think this could be fixed by providing something which transforms
a range of T&s into a range of T*s, but that *can* be separated into
another patch, and it isn't yet 100% clear whether this is the right
move.

However, this change gets us most of the benefit and cleans up
a substantial amount of code around Use and User. =]

llvm-svn: 203364
2014-03-09 03:16:01 +00:00
Ahmed Charles
52ce0c101e Replace OwningPtr<T> with std::unique_ptr<T>.
This compiles with no changes to clang/lld/lldb with MSVC and includes
overloads to various functions which are used by those projects and llvm
which have OwningPtr's as parameters. This should allow out of tree
projects some time to move. There are also no changes to libs/Target,
which should help out of tree targets have time to move, if necessary.

llvm-svn: 203083
2014-03-06 05:51:42 +00:00
Chandler Carruth
0873afae39 [Layering] Move DebugInfo.h into the IR library where its implementation
already lives.

llvm-svn: 203046
2014-03-06 00:46:21 +00:00
Chandler Carruth
2b135c4e9f [Layering] Move DIBuilder.h into the IR library where its implementation
already lives.

llvm-svn: 203038
2014-03-06 00:22:06 +00:00
Ahmed Charles
4a96a15754 [C++11] Replace OwningPtr::take() with OwningPtr::release().
llvm-svn: 202957
2014-03-05 10:19:29 +00:00
Craig Topper
a3683ec835 [C++11] Add 'override' keyword to virtual methods that override their base class.
llvm-svn: 202953
2014-03-05 09:10:37 +00:00
Chandler Carruth
436597fe00 [Modules] Move the ConstantRange class into the IR library. This is
a bit surprising, as the class is almost entirely abstracted away from
any particular IR, however it encodes the comparsion predicates which
mutate ranges as ICmp predicate codes. This is reasonable as they're
used for both instructions and constants. Thus, it belongs in the IR
library with instructions and constants.

llvm-svn: 202838
2014-03-04 12:24:34 +00:00
Chandler Carruth
4b66708834 [Modules] Move the PredIteratorCache into the IR library -- it is
hardcoded to use IR BasicBlocks.

llvm-svn: 202835
2014-03-04 12:09:19 +00:00
Chandler Carruth
248195469c [Modules] Move the NoFolder into the IR library as it creates
instructions.

llvm-svn: 202834
2014-03-04 12:05:47 +00:00
Chandler Carruth
075812f27c [Modules] Move CFG.h to the IR library as it defines graph traits over
IR types.

llvm-svn: 202827
2014-03-04 11:45:46 +00:00
Chandler Carruth
649f6270aa [Modules] Move ValueHandle into the IR library where Value itself lives.
Move the test for this class into the IR unittests as well.

This uncovers that ValueMap too is in the IR library. Ironically, the
unittest for ValueMap is useless in the Support library (honestly, so
was the ValueHandle test) and so it already lives in the IR unittests.
Mmmm, tasty layering.

llvm-svn: 202821
2014-03-04 11:17:44 +00:00
Chandler Carruth
d0657fe39f [Modules] Move the LLVM IR pattern match header into the IR library, it
obviously is coupled to the IR.

llvm-svn: 202818
2014-03-04 11:08:18 +00:00
Chandler Carruth
cfb81122cc [Modules] Move CallSite into the IR library where it belogs. It is
abstracting between a CallInst and an InvokeInst, both of which are IR
concepts.

llvm-svn: 202816
2014-03-04 11:01:28 +00:00
Chandler Carruth
0bf5689f06 [Modules] Move GetElementPtrTypeIterator into the IR library. As its
name might indicate, it is an iterator over the types in an instruction
in the IR.... You see where this is going.

Another step of modularizing the support library.

llvm-svn: 202815
2014-03-04 10:40:04 +00:00
Chandler Carruth
cd48c56575 [cleanup] Re-sort all the includes with utils/sort_includes.py.
llvm-svn: 202811
2014-03-04 10:07:28 +00:00
Diego Novillo
2ccb22f509 Pass to emit DWARF path discriminators.
DWARF discriminators are used to distinguish multiple control flow paths
on the same source location. When this happens, instructions across
basic block boundaries will share the same debug location.

This pass detects this situation and creates a new lexical scope to one
of the two instructions. This lexical scope is a child scope of the
original and contains a new discriminator value. This discriminator is
then picked up from MCObjectStreamer::EmitDwarfLocDirective to be
written on the object file.

This fixes http://llvm.org/bugs/show_bug.cgi?id=18270.

llvm-svn: 202752
2014-03-03 20:06:11 +00:00
Benjamin Kramer
e4eb1b495f [C++11] Replace llvm::next and llvm::prior with std::next and std::prev.
Remove the old functions.

llvm-svn: 202636
2014-03-02 12:27:27 +00:00
Rafael Espindola
32da4bdd4b Make DataLayout a plain object, not a pass.
Instead, have a DataLayoutPass that holds one. This will allow parts of LLVM
don't don't handle passes to also use DataLayout.

llvm-svn: 202168
2014-02-25 17:30:31 +00:00
Rafael Espindola
1f7e9d4bed Rename a few more DataLayout variables.
llvm-svn: 201833
2014-02-21 01:53:35 +00:00
Rafael Espindola
83f8550fb2 Rename many DataLayout variables from TD to DL.
I am really sorry for the noise, but the current state where some parts of the
code use TD (from the old name: TargetData) and other parts use DL makes it
hard to write a patch that changes where those variables come from and how
they are passed along.

llvm-svn: 201827
2014-02-21 00:06:31 +00:00
Rafael Espindola
d85e4eb0f5 Rename some member variables from TD to DL.
TargetData was renamed DataLayout back in r165242.

llvm-svn: 201581
2014-02-18 15:33:12 +00:00
Chandler Carruth
90e016d8fc [LPM] A terribly simple fix to a terribly complex bug: PR18773.
The crux of the issue is that LCSSA doesn't preserve stateful alias
analyses. Before r200067, LICM didn't cause LCSSA to run in the LTO pass
manager, where LICM runs essentially without any of the other loop
passes. As a consequence the globalmodref-aa pass run before that loop
pass manager was able to survive the loop pass manager and be used by
DSE to eliminate stores in the function called from the loop body in
Adobe-C++/loop_unroll (and similar patterns in other benchmarks).

When LICM was taught to preserve LCSSA it had to require it as well.
This caused it to be run in the loop pass manager and because it did not
preserve AA, the stateful AA was lost. Most of LLVM's AA isn't stateful
and so this didn't manifest in most cases. Also, in most cases LCSSA was
already running, and so there was no interesting change.

The real kicker is that LCSSA by its definition (injecting PHI nodes
only) trivially preserves AA! All we need to do is mark it, and then
everything goes back to working as intended. It probably was blocking
some other weird cases of stateful AA but the only one I have is
a 1000-line IR test case from loop_unroll, so I don't really have a good
test case here.

Hopefully this fixes the regressions on performance that have been seen
since that revision.

llvm-svn: 201104
2014-02-10 19:39:35 +00:00
Benjamin Kramer
700474a946 SimplifyLibCalls: Push TLI through the exp2->ldexp transform.
For the odd case of platforms with exp2 available but not ldexp.

llvm-svn: 200795
2014-02-04 20:27:23 +00:00
Tim Northover
d6fb863f04 OS X: the correct function is __sincospif_stret, not __sincospi_stretf
rdar://problem/13729466

llvm-svn: 200771
2014-02-04 16:28:20 +00:00
Kai Nacke
a3477b4ff6 Add strchr(p, 0) -> p + strlen(p) to SimplifyLibCalls
Add the missing transformation strchr(p, 0) -> p + strlen(p) to SimplifyLibCalls
and remove the ToDo comment.

Reviewer: Duncan P.N. Exan Smith
llvm-svn: 200736
2014-02-04 05:55:16 +00:00
Duncan P. N. Exon Smith
4f1b28340d Lower llvm.expect intrinsic correctly for i1
LowerExpectIntrinsic previously only understood the idiom of an expect
intrinsic followed by a comparison with zero. For llvm.expect.i1, the
comparison would be stripped by the early-cse pass.

Patch by Daniel Micay.

llvm-svn: 200664
2014-02-02 22:43:55 +00:00
Eli Bendersky
62efb50a57 Remove some unused #includes
llvm-svn: 200611
2014-02-01 13:12:54 +00:00
Chandler Carruth
6ba48b6c38 [LPM] Fix PR18643, another scary place where loop transforms failed to
preserve loop simplify of enclosing loops.

The problem here starts with LoopRotation which ends up cloning code out
of the latch into the new preheader it is buidling. This can create
a new edge from the preheader into the exit block of the loop which
breaks LoopSimplify form. The code tries to fix this by splitting the
critical edge between the latch and the exit block to get a new exit
block that only the latch dominates. This sadly isn't sufficient.

The exit block may be an exit block for multiple nested loops. When we
clone an edge from the latch of the inner loop to the new preheader
being built in the outer loop, we create an exiting edge from the outer
loop to this exit block. Despite breaking the LoopSimplify form for the
inner loop, this is fine for the outer loop. However, when we split the
edge from the inner loop to the exit block, we create a new block which
is in neither the inner nor outer loop as the new exit block. This is
a predecessor to the old exit block, and so the split itself takes the
outer loop out of LoopSimplify form. We need to split every edge
entering the exit block from inside a loop nested more deeply than the
exit block in order to preserve all of the loop simplify constraints.

Once we try to do that, a problem with splitting critical edges
surfaces. Previously, we tried a very brute force to update LoopSimplify
form by re-computing it for all exit blocks. We don't need to do this,
and doing this much will sometimes but not always overlap with the
LoopRotate bug fix. Instead, the code needs to specifically handle the
cases which can start to violate LoopSimplify -- they aren't that
common. We need to see if the destination of the split edge was a loop
exit block in simplified form for the loop of the source of the edge.
For this to be true, all the predecessors need to be in the exact same
loop as the source of the edge being split. If the dest block was
originally in this form, we have to split all of the deges back into
this loop to recover it. The old mechanism of doing this was
conservatively correct because at least *one* of the exiting blocks it
rewrote was the DestBB and so the DestBB's predecessors were fixed. But
this is a much more targeted way of doing it. Making it targeted is
important, because ballooning the set of edges touched prevents
LoopRotate from being able to split edges *it* needs to split to
preserve loop simplify in a coherent way -- the critical edge splitting
would sometimes find the other edges in need of splitting but not
others.

Many, *many* thanks for help from Nick reducing these test cases
mightily. And helping lots with the analysis here as this one was quite
tricky to track down.

llvm-svn: 200393
2014-01-29 13:16:53 +00:00
Rafael Espindola
e8856107f0 Fix pr14893.
When simplifycfg moves an instruction, it must drop metadata it doesn't know
is still valid with the preconditions changes. In particular, it must drop
the range and tbaa metadata.

The patch implements this with an utility function to drop all metadata not
in a white list.

llvm-svn: 200322
2014-01-28 16:56:46 +00:00
Chandler Carruth
b19a7319a9 [LPM] Fix PR18616 where the shifts to the loop pass manager to extract
LCSSA from it caused a crasher with the LoopUnroll pass.

This crasher is really nasty. We destroy LCSSA form in a suprising way.
When unrolling a loop into an outer loop, we not only need to restore
LCSSA form for the outer loop, but for all children of the outer loop.
This is somewhat obvious in retrospect, but hey!

While this seems pretty heavy-handed, it's not that bad. Fundamentally,
we only do this when we unroll a loop, which is already a heavyweight
operation. We're unrolling all of these hypothetical inner loops as
well, so their size and complexity is already on the critical path. This
is just adding another pass over them to re-canonicalize.

I have a test case from PR18616 that is great for reproducing this, but
pretty useless to check in as it relies on many 10s of nested empty
loops that get unrolled and deleted in just the right order. =/ What's
worse is that investigating this has exposed another source of failure
that is likely to be even harder to test. I'll try to come up with test
cases for these fixes, but I want to get the fixes into the tree first
as they're causing crashes in the wild.

llvm-svn: 200273
2014-01-28 01:25:38 +00:00
Manman Ren
c3f51e8e54 PGO branch weight: keep halving the weights until they can fit into
uint32.

When folding branches to common destination, the updated branch weights
can exceed uint32 by more than factor of 2. We should keep halving the
weights until they can fit into uint32.

llvm-svn: 200262
2014-01-27 23:39:03 +00:00
Chandler Carruth
3998de34a0 [LPM] Make LCSSA a utility with a FunctionPass that applies it to all
the loops in a function, and teach LICM to work in the presance of
LCSSA.

Previously, LCSSA was a loop pass. That made passes requiring it also be
loop passes and unable to depend on function analysis passes easily. It
also caused outer loops to have a different "canonical" form from inner
loops during analysis. Instead, we go into LCSSA form and preserve it
through the loop pass manager run.

Note that this has the same problem as LoopSimplify that prevents
enabling its verification -- loop passes which run at the end of the loop
pass manager and don't preserve these are valid, but the subsequent loop
pass runs of outer loops that do preserve this pass trigger too much
verification and fail because the inner loop no longer verifies.

The other problem this exposed is that LICM was completely unable to
handle LCSSA form. It didn't preserve it and it actually would give up
on moving instructions in many cases when they were used by an LCSSA phi
node. I've taught LICM to support detecting LCSSA-form PHI nodes and to
hoist and sink around them. This may actually let LICM fire
significantly more because we put everything into LCSSA form to rotate
the loop before running LICM. =/ Now LICM should handle that fine and
preserve it correctly. The down side is that LICM has to require LCSSA
in order to preserve it. This is just a fact of life for LCSSA. It's
entirely possible we should completely remove LCSSA from the optimizer.

The test updates are essentially accomodating LCSSA phi nodes in the
output of LICM, and the fact that we now completely sink every
instruction in ashr-crash below the loop bodies prior to unrolling.

With this change, LCSSA is computed only three times in the pass
pipeline. One of them could be removed (and potentially a SCEV run and
a separate LoopPassManager entirely!) if we had a LoopPass variant of
InstCombine that ran InstCombine on the loop body but refused to combine
away LCSSA PHI nodes. Currently, this also prevents loop unrolling from
being in the same loop pass manager is rotate, LICM, and unswitch.

There is one thing that I *really* don't like -- preserving LCSSA in
LICM is quite expensive. We end up having to re-run LCSSA twice for some
loops after LICM runs because LICM can undo LCSSA both in the current
loop and the parent loop. I don't really see good solutions to this
other than to completely move away from LCSSA and using tools like
SSAUpdater instead.

llvm-svn: 200067
2014-01-25 04:07:24 +00:00
Alp Toker
1c4b33e8e5 Fix known typos
Sweep the codebase for common typos. Includes some changes to visible function
names that were misspelt.

llvm-svn: 200018
2014-01-24 17:20:08 +00:00
Chandler Carruth
46bbc995de [LPM] Make LoopSimplify no longer a LoopPass and instead both a utility
function and a FunctionPass.

This has many benefits. The motivating use case was to be able to
compute function analysis passes *after* running LoopSimplify (to avoid
invalidating them) and then to run other passes which require
LoopSimplify. Specifically passes like unrolling and vectorization are
critical to wire up to BranchProbabilityInfo and BlockFrequencyInfo so
that they can be profile aware. For the LoopVectorize pass the only
things in the way are LoopSimplify and LCSSA. This fixes LoopSimplify
and LCSSA is next on my list.

There are also a bunch of other benefits of doing this:
- It is now very feasible to make more passes *preserve* LoopSimplify
  because they can simply run it after changing a loop. Because
  subsequence passes can assume LoopSimplify is preserved we can reduce
  the runs of this pass to the times when we actually mutate a loop
  structure.
- The new pass manager should be able to more easily support loop passes
  factored in this way.
- We can at long, long last observe that LoopSimplify is preserved
  across SCEV. This *halves* the number of times we run LoopSimplify!!!

Now, getting here wasn't trivial. First off, the interfaces used by
LoopSimplify are all over the map regarding how analysis are updated. We
end up with weird "pass" parameters as a consequence. I'll try to clean
at least some of this up later -- I'll have to have it all clean for the
new pass manager.

Next up I discovered a really frustrating bug. LoopUnroll *claims* to
preserve LoopSimplify. That's actually a lie. But the way the
LoopPassManager ends up running the passes, it always ran LoopSimplify
on the unrolled-into loop, rectifying this oversight before any
verification could kick in and point out that in fact nothing was
preserved. So I've added code to the unroller to *actually* simplify the
surrounding loop when it succeeds at unrolling.

The only functional change in the test suite is that we now catch a case
that was previously missed because SCEV and other loop transforms see
their containing loops as simplified and thus don't miss some
opportunities. One test case has been converted to check that we catch
this case rather than checking that we miss it but at least don't get
the wrong answer.

Note that I have #if-ed out all of the verification logic in
LoopSimplify! This is a temporary workaround while extracting these bits
from the LoopPassManager. Currently, there is no way to have a pass in
the LoopPassManager which preserves LoopSimplify along with one which
does not. The LPM will try to verify on each loop in the nest that
LoopSimplify holds but the now-Function-pass cannot distinguish what
loop is being verified and so must try to verify all of them. The inner
most loop is clearly no longer simplified as there is a pass which
didn't even *attempt* to preserve it. =/ Once I get LCSSA out (and maybe
LoopVectorize and some other fixes) I'll be able to re-enable this check
and catch any places where we are still failing to preserve
LoopSimplify. If this causes problems I can back this out and try to
commit *all* of this at once, but so far this seems to work and allow
much more incremental progress.

llvm-svn: 199884
2014-01-23 11:23:19 +00:00
Hans Wennborg
efa9ef0e63 Switch-to-lookup tables: set threshold to 3 cases
There has been an old FIXME to find the right cut-off for when it's worth
analyzing and potentially transforming a switch to a lookup table.

The switches always have two or more cases. I could not measure any speed-up
by transforming a switch with two cases. A switch with three cases gets a nice
speed-up, and I couldn't measure any compile-time regression, so I think this
is the right threshold.

In a Clang self-host, this causes 480 new switches to be transformed,
and reduces the final binary size with 8 KB.

llvm-svn: 199294
2014-01-15 05:00:27 +00:00
Chandler Carruth
98adff6224 [PM] Split DominatorTree into a concrete analysis result object which
can be used by both the new pass manager and the old.

This removes it from any of the virtual mess of the pass interfaces and
lets it derive cleanly from the DominatorTreeBase<> template. In turn,
tons of boilerplate interface can be nuked and it turns into a very
straightforward extension of the base DominatorTree interface.

The old analysis pass is now a simple wrapper. The names and style of
this split should match the split between CallGraph and
CallGraphWrapperPass. All of the users of DominatorTree have been
updated to match using many of the same tricks as with CallGraph. The
goal is that the common type remains the resulting DominatorTree rather
than the pass. This will make subsequent work toward the new pass
manager significantly easier.

Also in numerous places things became cleaner because I switched from
re-running the pass (!!! mid way through some other passes run!!!) to
directly recomputing the domtree.

llvm-svn: 199104
2014-01-13 13:07:17 +00:00
Chandler Carruth
ee051af6e2 [cleanup] Move the Dominators.h and Verifier.h headers into the IR
directory. These passes are already defined in the IR library, and it
doesn't make any sense to have the headers in Analysis.

Long term, I think there is going to be a much better way to divide
these matters. The dominators code should be fully separated into the
abstract graph algorithm and have that put in Support where it becomes
obvious that evn Clang's CFGBlock's can use it. Then the verifier can
manually construct dominance information from the Support-driven
interface while the Analysis library can provide a pass which both
caches, reconstructs, and supports a nice update API.

But those are very long term, and so I don't want to leave the really
confusing structure until that day arrives.

llvm-svn: 199082
2014-01-13 09:26:24 +00:00
Hans Wennborg
f5c5f6e123 Switch-to-lookup tables: Don't require a result for the default
case when the lookup table doesn't have any holes.

This means we can build a lookup table for switches like this:

  switch (x) {
    case 0: return 1;
    case 1: return 2;
    case 2: return 3;
    case 3: return 4;
    default: exit(1);
  }

The default case doesn't yield a constant result here, but that doesn't matter,
since a default result is only necessary for filling holes in the lookup table,
and this table doesn't have any holes.

This makes us transform 505 more switches in a clang bootstrap, and shaves 164 KB
off the resulting clang binary.

llvm-svn: 199025
2014-01-12 00:44:41 +00:00
Chandler Carruth
87f14b4eec Re-sort all of the includes with ./utils/sort_includes.py so that
subsequent changes are easier to review. About to fix some layering
issues, and wanted to separate out the necessary churn.

Also comment and sink the include of "Windows.h" in three .inc files to
match the usage in Memory.inc.

llvm-svn: 198685
2014-01-07 11:48:04 +00:00
Andrew Trick
12dfc32452 Reapply r198478 "Fix PR18361: Invalidate LoopDispositions after LoopSimplify hoists things."
Now with a fix for PR18384: ValueHandleBase::ValueIsDeleted.

We need to invalidate SCEV's loop info when we delete a block, even if no values are hoisted.

llvm-svn: 198631
2014-01-06 19:43:14 +00:00
Alp Toker
2d17611e90 Revert "Fix PR18361: Invalidate LoopDispositions after LoopSimplify hoists things."
This commit was the source of crasher PR18384:

While deleting: label %for.cond127
An asserting value handle still pointed to this value!
UNREACHABLE executed at llvm/lib/IR/Value.cpp:671!

Reverting to get the builders green, feel free to re-land after fixing up.
(Renato has a handy isolated repro if you need it.)

This reverts commit r198478.

llvm-svn: 198503
2014-01-04 17:00:45 +00:00
Andrew Trick
45ef495b91 Fix PR18361: Invalidate LoopDispositions after LoopSimplify hoists things.
getSCEV for an ashr instruction creates an intermediate zext
expression when it truncates its operand.

The operand is initially inside the loop, so the narrow zext
expression has a non-loop-invariant loop disposition.

LoopSimplify then runs on an outer loop, hoists the ashr operand, and
properly invalidate the SCEVs that are mapped to value.

The SCEV expression for the ashr is now an AddRec with the hoisted
value as the now loop-invariant start value.

The LoopDisposition of this wide value was properly invalidated during
LoopSimplify.

However, if we later get the ashr SCEV again, we again try to create
the intermediate zext expression. We get the same SCEV that we did
earlier, and it is still cached because it was never mapped to a
Value. When we try to create a new AddRec we abort because we're using
the old non-loop-invariant LoopDisposition.

I don't have a solution for this other than to clear LoopDisposition
when LoopSimplify hoists things.

I think the long-term strategy should be to perform LoopSimplify on
all loops before computing SCEV and before running any loop opts on
individual loops. It's possible we may want to rerun LoopSimplify on
individual loops, but it should rarely do anything, so rarely require
invalidating SCEV.

llvm-svn: 198478
2014-01-04 05:52:49 +00:00
Andrew Trick
e7f9f5556d Add support to indvars for optimizing sadd.with.overflow.
Split sadd.with.overflow into add + sadd.with.overflow to allow
analysis and optimization. This should ideally be done after
InstCombine, which can perform code motion (eventually indvars should
run after all canonical instcombines). We want ISEL to recombine the
add and the check, at least on x86.

This is currently under an option for reducing live induction
variables: -liv-reduce. The next step is reducing liveness of IVs that
are live out of the overflow check paths. Once the related
optimizations are fully developed, reviewed and tested, I do expect
this to become default.

llvm-svn: 197926
2013-12-23 23:31:49 +00:00
Kostya Serebryany
a148c8c9ed [asan] don't unpoison redzones on function exit in use-after-return mode.
Summary:
Before this change the instrumented code before Ret instructions looked like:
  <Unpoison Frame Redzones>
  if (Frame != OriginalFrame) // I.e. Frame is fake
     <Poison Complete Frame>

Now the instrumented code looks like:
  if (Frame != OriginalFrame) // I.e. Frame is fake
     <Poison Complete Frame>
  else
     <Unpoison Frame Redzones>

Reviewers: eugenis

Reviewed By: eugenis

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2458

llvm-svn: 197907
2013-12-23 14:15:08 +00:00
Justin Bogner
3b4e34606e Transforms: Don't create bad weights when eliminating dead cases
If we happen to eliminate every case in a switch that has branch
weights, we currently try to create metadata for the one remaining
branch, triggering an assert. Instead, we need to check that the
metadata we're trying to create is sensible.

llvm-svn: 197791
2013-12-20 08:21:30 +00:00
Evgeniy Stepanov
0cd4eea1b6 Add an explicit insert point argument to SplitBlockAndInsertIfThen.
Currently SplitBlockAndInsertIfThen requires that branch condition is an
Instruction itself, which is very inconvenient, because it is sometimes an
Operator, or even a Constant.

llvm-svn: 197677
2013-12-19 13:29:56 +00:00
Yi Jiang
67f2e8e3f8 Enable double to float shrinking optimizations for binary functions like 'fmin/fmax'. Fix radar:15283121
llvm-svn: 197434
2013-12-16 22:42:40 +00:00
Yi Jiang
f648b7ec82 Resubmit r196544: Apply transformation on OS X 10.9+ and iOS 7.0+: pow(10, x) ―> __exp10(x)
llvm-svn: 197109
2013-12-12 01:55:04 +00:00
Justin Bogner
42f9d0cf93 Transforms: Don't create bad branch weights when folding a switch
This avoids creating branch weight metadata of length one when we fold
cases into the default of a switch instruction, which was triggering
an assert.

llvm-svn: 196845
2013-12-10 00:13:41 +00:00
Manman Ren
4fcc808139 Revert 196544 due to internal bot failures.
llvm-svn: 196732
2013-12-08 20:28:33 +00:00
Mark Seaborn
72517bc221 Fix inlining to not lose the "cleanup" clause from landingpads
This fixes PR17872.  This bug can lead to C++ destructors not being
called when they should be, when an exception is thrown.

llvm-svn: 196711
2013-12-08 00:51:21 +00:00
Mark Seaborn
2d856cb007 Fix inlining to not produce duplicate landingpad clauses
Before this change, inlining one "invoke" into an outer "invoke" call
site can lead to the outer landingpad's catch/filter clauses being
copied multiple times into the resulting landingpad.  This happens:

 * when the inlined function contains multiple "resume" instructions,
   because forwardResume() copies the clauses but is called multiple
   times;

 * when the inlined function contains a "resume" and a "call", because
   HandleCallsInBlockInlinedThroughInvoke() copies the clauses but is
   redundant with forwardResume().

Fix this by deduplicating the code.

This problem doesn't lead to any incorrect execution; it's only
untidy.

This change will make fixing PR17872 a little easier.

llvm-svn: 196710
2013-12-08 00:50:58 +00:00
Jakub Staszak
11e1c882f7 Don't #include heavy Dominators.h file in LoopInfo.h. This change reduces
overall time of LLVM compilation by ~1%.

llvm-svn: 196667
2013-12-07 21:20:17 +00:00
Kostya Serebryany
f0897919b4 [asan] fix ndebug build with strict warnings (-Wunused-variable)
llvm-svn: 196574
2013-12-06 09:26:09 +00:00
Kostya Serebryany
fc32a3e5d2 [asan] rewrite asan's stack frame layout
Summary:
Rewrite asan's stack frame layout.
First, most of the stack layout logic is moved into a separte file
to make it more testable and (potentially) useful for other projects.
Second, make the frames more compact by using adaptive redzones
(smaller for small objects, larger for large objects).
Third, try to minimized gaps due to large alignments (this is hypothetical since
today we don't see many stack vars aligned by more than 32).

The frames indeed become more compact, but I'll still need to run more benchmarks
before committing, but I am sking for review now to get early feedback.

This change will be accompanied by a trivial change in compiler-rt tests
to match the new frame sizes.

Reviewers: samsonov, dvyukov

Reviewed By: samsonov

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2324

llvm-svn: 196568
2013-12-06 09:00:17 +00:00
Yi Jiang
0bd0569be5 Apply transformation on OS X 10.9+ and iOS 7.0+: pow(10, x) ―> __exp10(x)
llvm-svn: 196544
2013-12-05 22:42:50 +00:00
Alp Toker
e845f8af67 Correct word hyphenations
This patch tries to avoid unrelated changes other than fixing a few
hyphen-related ambiguities and contractions in nearby lines.

llvm-svn: 196471
2013-12-05 05:44:44 +00:00
Mark Seaborn
b08ae3c322 InlineFunction.cpp: Remove a return value that is always false
Remove some associated dead code.

This cleanup is associated with PR17872.

llvm-svn: 196147
2013-12-02 20:50:59 +00:00
Michael Ilseman
0f69f39907 Add support for software expansion of 64-bit integer division instructions.
Patch by Dmitri Shtilman!

llvm-svn: 195116
2013-11-19 06:54:19 +00:00
Adrian Prantl
afeac86924 Debug info: Let LowerDbgDeclare perfom the dbg.declare -> dbg.value
lowering only for load/stores to scalar allocas. The resulting values
confuse the backend and don't add anything because we can describe
array-allocas with a dbg.declare intrinsic just fine.

rdar://problem/15464571

llvm-svn: 195052
2013-11-18 23:04:38 +00:00
NAKAMURA Takumi
413002f21c Utils/LoopUnroll.cpp: Tweak (StringRef)OldName to be valid until it is used, since r194601.
eraseFromParent() invalidates OldName.

llvm-svn: 194970
2013-11-17 18:05:34 +00:00
Hal Finkel
f09058ab9e Add the cold attribute to error-reporting call sites
Generally speaking, control flow paths with error reporting calls are cold.
So far, error reporting calls are calls to perror and calls to fprintf,
fwrite, etc. with stderr as the stream. This can be extended in the future.

The primary motivation is to improve block placement (the cold attribute
affects the static branch prediction heuristics).

llvm-svn: 194943
2013-11-17 02:06:35 +00:00
Jakub Staszak
5e8ab0ddcf Use StringRef instead of std::string
llvm-svn: 194601
2013-11-13 20:09:11 +00:00
Nadav Rotem
8fbd606127 FoldBranchToCommonDest merges branches into a single branch with or/and of the condition. It has a heuristics for estimating when some of the dependencies are processed by out-of-order processors. This patch adds another rule to the heuristics that says that if the "BonusInstruction" that we speculatively execute is used by the condition of the second branch then it is okay to hoist it. This change exposes more opportunities for other passes to transform the code. It does not matter that much that we if-convert the code because the selectiondag builder splits or/and branches into multiple branches when profitable.
llvm-svn: 194524
2013-11-12 22:37:16 +00:00
Benjamin Kramer
d93d22eb86 SimplifyCFG: Use existing constant folding logic when forming switch tables.
Both simpler and more powerful than the hand-rolled folding logic.

llvm-svn: 194475
2013-11-12 12:24:36 +00:00
Matt Arsenault
e3debcae62 Use type form of getIntPtrType.
This should be inconsequential and is work
towards removing the default address space
arguments.

llvm-svn: 194347
2013-11-10 04:46:57 +00:00
Nadav Rotem
537b5180e9 SimplifyCFG has a heuristics for out-of-order processors that decides when it is worthwhile to merge branches. It tries to estimate if the operands of the instruction that we want to hoist are ready. This commit marks function arguments as 'ready' because they require no calculation. This boosts libquantum and a few other workloads from the testsuite.
llvm-svn: 194346
2013-11-10 04:13:31 +00:00
David Majnemer
2bbbbaaeee Revert "Inliner: Handle readonly attribute per argument when adding memcpy"
This reverts commit r193356, it caused PR17781.

A reduced test case covering this regression has been added to the test suite.

llvm-svn: 193955
2013-11-03 12:22:13 +00:00
Bob Wilson
eb7943100b Convert calls to __sinpi and __cospi into __sincospi_stret
This adds an SimplifyLibCalls case which converts the special __sinpi and
__cospi (float & double variants) into a __sincospi_stret where appropriate to
remove duplicated work.

Patch by Tim Northover

llvm-svn: 193943
2013-11-03 06:48:38 +00:00
Manman Ren
bba5655c39 Do not convert "call asm" to "invoke asm" in Inliner.
Given that backend does not handle "invoke asm" correctly ("invoke asm" will be
handled by SelectionDAGBuilder::visitInlineAsm, which does not have the right
setup for LPadToCallSiteMap) and we already made the assumption that inline asm
does not throw in InstCombiner::visitCallSite, we are going to make the same
assumption in Inliner to make sure we don't convert "call asm" to "invoke asm".

If it becomes necessary to add support for "invoke asm" later on, we will need
to modify the backend as well as remove the assumptions that inline asm does
not throw.

Fix rdar://15317907

llvm-svn: 193808
2013-10-31 21:56:03 +00:00
Wan Xiaofei
f3100f24fa Quick look-up for block in loop.
This patch implements quick look-up for block in loop by maintaining a hash set for blocks.
It improves the efficiency of loop analysis a lot, the biggest improvement could be 5-6%(458.sjeng).
Below are the compilation time for our benchmark in llc before & after the patch.

Benchmark	llc - trunk		llc - patched	
401.bzip2	0.339081	100.00%	0.329657	102.86%
403.gcc		19.853966	100.00%	19.605466	101.27%
429.mcf		0.049823	100.00%	0.048451	102.83%
433.milc	0.514898	100.00%	0.510217	100.92%
444.namd	1.109328	100.00%	1.103481	100.53%
445.gobmk	4.988028	100.00%	4.929114	101.20%
456.hmmer	0.843871	100.00%	0.825865	102.18%
458.sjeng	0.754238	100.00%	0.714095	105.62%
464.h264ref	2.9668		100.00%	2.90612		102.09%
471.omnetpp	4.556533	100.00%	4.511886	100.99%
bitmnp01	0.038168	100.00%	0.0357		106.91%
idctrn01	0.037745	100.00%	0.037332	101.11%
libquake2	3.78689		100.00%	3.76209		100.66%
libquake_	2.251525	100.00%	2.234104	100.78%
linpack		0.033159	100.00%	0.032788	101.13%
matrix01	0.045319	100.00%	0.043497	104.19%
nbench		0.333161	100.00%	0.329799	101.02%
tblook01	0.017863	100.00%	0.017666	101.12%
ttsprk01	0.054337	100.00%	0.053057	102.41%

Reviewer	: Andrew Trick <atrick@apple.com>, Hal Finkel <hfinkel@anl.gov>
Approver	: Andrew Trick <atrick@apple.com>
Test		: Pass make check-all & llvm test-suite

llvm-svn: 193460
2013-10-26 03:08:02 +00:00
Rafael Espindola
37e88054ad Handle calls and invokes in GlobalStatus.
This patch teaches GlobalStatus to analyze a call that uses the global value as
a callee, not as an argument.

With this change internalize call handle the common use of linkonce_odr
functions. This reduces the number of linkonce_odr functions in a LTO build of
clang (checked with the emit-llvm gold plugin option) from 1730 to 60.

llvm-svn: 193436
2013-10-25 21:29:52 +00:00
Tom Stellard
3863b94253 Inliner: Handle readonly attribute per argument when adding memcpy
Patch by: Vincent Lejeune

llvm-svn: 193356
2013-10-24 16:38:33 +00:00
Tom Stellard
6f3b22d6ce SimplifyCFG: Don't duplicate calls to functions marked noduplicate v2
v2:
  - Use CI->cannotDuplicate()

llvm-svn: 193115
2013-10-21 20:07:30 +00:00
Matt Arsenault
8faf1f4444 Teach SimplifyCFG about address spaces
llvm-svn: 193104
2013-10-21 18:55:08 +00:00
Rafael Espindola
34304975c5 Optimize more linkonce_odr values during LTO.
When a linkonce_odr value that is on the dso list is not unnamed_addr
we can still look to see if anything is actually using its address. If
not, it is safe to hide it.

This patch implements that by moving GlobalStatus to Transforms/Utils
and using it in Internalize.

llvm-svn: 193090
2013-10-21 17:14:55 +00:00
Michael Gottesman
80054d1ccd Fix the predecessor removal logic in r193045.
Additionally some small comment/stylistic fixes are included as well.

llvm-svn: 193068
2013-10-21 05:20:11 +00:00
Bill Wendling
ab46af5515 Don't eliminate a partially redundant load if it's in a landing pad.
A landing pad can be jumped to only by the unwind edge of an invoke
instruction. If we eliminate a partially redundant load in a landing pad, it
will create a basic block that violates this constraint. It then leads to other
problems down the line if it tries to merge that basic block with the landing
pad. Avoid this by not eliminating the load in a landing pad.

PR17621

llvm-svn: 193064
2013-10-21 04:09:17 +00:00
Michael Gottesman
f056facc24 Teach simplify-cfg how to correctly create covered lookup tables for switches on iN with N >= 3.
One optimization simplify-cfg performs is the converting of switches to
lookup tables if the switch has > 4 cases. This is done by:

1. Finding the max/min case value and calculating the switch case range.
2. Create a lookup table basic block.
3. Perform a check in the switch's BB to see if the input value is in
the switch's case range. If the input value satisfies said predicate
branch to the lookup table BB, otherwise branch to the switch's default
destination BB using the default value as the result.

The conditional check consists of subtracting the min case value of the
table from any input iN value and then ensuring that said value is
unsigned less than the size of the lookup table represented as an iN
value.

If the lookup table is a covered lookup table, the size of the table will be N
which is 0 as an iN value. Thus the comparison will be an `icmp ult` of an iN
value against 0 which is always false yielding the incorrect result.

This patch fixes this problem by recognizing if we have a covered lookup table
and if we do, unconditionally jumps to the lookup table BB since the covering
property of the lookup table implies no input values could not be handled by
said BB.

rdar://15268442

llvm-svn: 193045
2013-10-20 07:04:37 +00:00
Bill Wendling
fda67cdf91 Perform an intelligent splice of the predecessor with the single successor.
If the predecessor's being spliced into a landing pad, then we need the PHIs to
come first and the rest of the predecessor's code to come *after* the landing
pad instruction.

llvm-svn: 193035
2013-10-19 11:27:12 +00:00
Chris Lattner
18bf2f7f32 Basic blocks typically have few predecessors. Use a SmallDenseMap to
avoid a heap allocation when this is the case.

llvm-svn: 192602
2013-10-14 16:05:55 +00:00
Hal Finkel
83df3058ed UpdatePHINodes in BasicBlockUtils should not crash on duplicate predecessors
UpdatePHINodes has an optimization to reuse an existing PHI node, where it
first deletes all of its entries and then replaces them. Unfortunately, in the
case where we had duplicate predecessors (which are allowed so long as the
associated PHI entries have the same value), the loop removing the existing PHI
entries from the to-be-reused PHI would assert (if that PHI was not the one
which had the duplicates).

llvm-svn: 192001
2013-10-04 23:41:05 +00:00
Chandler Carruth
ee12d58370 Remove the very substantial, largely unmaintained legacy PGO
infrastructure.

This was essentially work toward PGO based on a design that had several
flaws, partially dating from a time when LLVM had a different
architecture, and with an effort to modernize it abandoned without being
completed. Since then, it has bitrotted for several years further. The
result is nearly unusable, and isn't helping any of the modern PGO
efforts. Instead, it is getting in the way, adding confusion about PGO
in LLVM and distracting everyone with maintenance on essentially dead
code. Removing it paves the way for modern efforts around PGO.

Among other effects, this removes the last of the runtime libraries from
LLVM. Those are being developed in the separate 'compiler-rt' project
now, with somewhat different licensing specifically more approriate for
runtimes.

llvm-svn: 191835
2013-10-02 15:42:23 +00:00
Rafael Espindola
a279462828 Remove several unused variables.
Patch by Alp Toker.

llvm-svn: 191757
2013-10-01 13:32:03 +00:00
Benjamin Kramer
7b5eaaacfd Convert manual insert point restores to the new RAII object.
llvm-svn: 191675
2013-09-30 15:40:17 +00:00
Robert Wilhelm
6b36431ffa Fix spelling intruction -> instruction.
llvm-svn: 191610
2013-09-28 11:46:15 +00:00
Benjamin Kramer
109a525643 Push analysis passes to InstSimplify when they're around anyways.
llvm-svn: 191309
2013-09-24 16:37:40 +00:00
Benjamin Kramer
1308256cf8 Provide basic type safety for array_pod_sort comparators.
This makes using array_pod_sort significantly safer. The implementation relies
on function pointer casting but that should be safe as we're dealing with void*
here.

llvm-svn: 191175
2013-09-22 14:09:50 +00:00
Benjamin Kramer
b6950f09cc Replace some unnecessary vector copies with references.
llvm-svn: 190770
2013-09-15 22:04:42 +00:00
Robert Wilhelm
0ba533b69c Fix spelling.
llvm-svn: 190750
2013-09-14 09:34:59 +00:00
Matt Arsenault
eeee915ad0 Use StringRef::npos for StringRef instead of std::string one
llvm-svn: 190375
2013-09-10 00:41:53 +00:00
Bob Wilson
6c7d3717b3 Revert patches to add case-range support for PR1255.
The work on this project was left in an unfinished and inconsistent state.
Hopefully someone will eventually get a chance to implement this feature, but
in the meantime, it is better to put things back the way the were.  I have
left support in the bitcode reader to handle the case-range bitcode format,
so that we do not lose bitcode compatibility with the llvm 3.3 release.

This reverts the following commits: 155464, 156374, 156377, 156613, 156704,
156757, 156804 156808, 156985, 157046, 157112, 157183, 157315, 157384, 157575,
157576, 157586, 157612, 157810, 157814, 157815, 157880, 157881, 157882, 157884,
157887, 157901, 158979, 157987, 157989, 158986, 158997, 159076, 159101, 159100,
159200, 159201, 159207, 159527, 159532, 159540, 159583, 159618, 159658, 159659,
159660, 159661, 159703, 159704, 160076, 167356, 172025, 186736

llvm-svn: 190328
2013-09-09 19:14:35 +00:00
Matt Arsenault
4c2083b14a Use type helper functions.
llvm-svn: 190113
2013-09-06 00:37:24 +00:00
Benjamin Kramer
4bd23de2ce SimplifyLibCalls: When emitting an overloaded fp function check that it's available.
The existing code missed some edge cases when e.g. we're going to emit sqrtf but
only the availability of sqrt was checked. This happens on odd platforms like
windows.

llvm-svn: 189724
2013-08-31 18:19:35 +00:00
Benjamin Kramer
02d328a0f9 Add a function object to compare the first or second component of a std::pair.
Replace instances of this scattered around the code base.

llvm-svn: 189169
2013-08-24 12:54:27 +00:00
Yunzhong Gao
9a8c1892ea No functionality change.
Replace "(255 & value)" with "(0xFF & value)" to improve clarity.

llvm-svn: 188941
2013-08-21 22:11:15 +00:00
Peter Collingbourne
aa48e92de9 Introduce SpecialCaseList::isIn overload for GlobalAliases.
Differential Revision: http://llvm-reviews.chandlerc.com/D1437

llvm-svn: 188688
2013-08-19 19:00:35 +00:00
Michael Kuperstein
9bedf7fc8c Adds missing TLI check for library simplification of
* pow(x, 0.5) -> fabs(sqrt(x)) 
* pow(2.0, x) -> exp2(x)

llvm-svn: 188656
2013-08-19 06:55:47 +00:00
Peter Collingbourne
ea402298e2 Remove SpecialCaseList::findCategory.
It turned out that I didn't need this for DFSan.

llvm-svn: 188646
2013-08-19 00:24:20 +00:00
Yunzhong Gao
37d3ce60e8 Fixing a corner-case bug in strchr and strrchr lib call optimizations where
the input character is not converted to char before comparing with zero.

The patch was discussed in this thread:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130812/184069.html

llvm-svn: 188489
2013-08-15 20:58:59 +00:00
Mark Lacey
c3ab3bb4bc Fix small typo: s/succ/Succ/
llvm-svn: 188415
2013-08-14 22:11:42 +00:00
Chandler Carruth
219c3d81d0 Fix a really terrifying but improbable bug in mem2reg. If you have seen
extremely subtle miscompilations (such as a load getting replaced with
the value stored *below* the load within a basic block) related to
promoting an alloca to an SSA value, there is the dim possibility that
you hit this. Please let me know if you won this unfortunate lottery.

The first half of mem2reg's core logic (as it is used both in the
standalone mem2reg pass and in SROA) builds up a mapping from
'Instruction *' to the index of that instruction within its basic block.
This allows quickly establishing which store dominate a particular load
even for large basic blocks. We cache this information throughout the
run of mem2reg over a function in order to amortize the cost of
computing it.

This is not in and of itself a strange pattern in LLVM. However, it
introduces a very important constraint: absolutely no instruction can be
deleted from the program without updating the mapping. Otherwise a newly
allocated instruction might get the same pointer address, and then end
up with a wrong index. Yes, LLVM routinely suffers from a *single
threaded* variant of the ABA problem. Most places in LLVM don't find
avoiding this an imposition because they don't both delete and create
new instructions iteratively, but mem2reg *loves* to do this... All the
time. Fortunately, the mem2reg code was really careful about updating
this cache to handle this eventuallity... except when it comes to the
debug declare intrinsic. Oops. The fix is to invalidate that pointer in
the cache when we delete it, the same as we do when deleting alloca
instructions and other instructions.

I've also caused the same bug in new code while working on a fix to
PR16867, so this seems to be a really unfortunate pattern. Hopefully in
subsequent patches the deletion of dead instructions can be consolidated
sufficiently to make it less likely that we'll see future occurences of
this bug.

Sorry for not having a test case, but I have literally no idea how to
reliably trigger this kind of thing. It may be single-threaded, but it
remains an ABA problem. It would require a really amazing number of
stars to align.

llvm-svn: 188367
2013-08-14 08:56:41 +00:00
Nick Lewycky
eab287a60a Revert r187191, which broke opt -mem2reg on the testcases included in PR16867.
However, opt -O2 doesn't run mem2reg directly so nobody noticed until r188146
when SROA started sending more things directly down the PromoteMemToReg path.

In order to revert r187191, I also revert dependent revisions r187296, r187322
and r188146. Fixes PR16867. Does not add the testcases from that PR, but both
of them should get added for both mem2reg and sroa when this revert gets
unreverted.

llvm-svn: 188327
2013-08-13 22:51:58 +00:00
Peter Collingbourne
723e9b89ec Reapply r188119 now that the bug it exposed is fixed.
llvm-svn: 188217
2013-08-12 22:38:43 +00:00
Alexey Samsonov
d27599b42b Remove unused SpecialCaseList constructors
llvm-svn: 188171
2013-08-12 11:50:44 +00:00
Alexey Samsonov
cdc0f339aa Add SpecialCaseList::createOrDie() factory and use it in sanitizer passes
llvm-svn: 188169
2013-08-12 11:46:09 +00:00
Alexey Samsonov
d27061d7e7 Introduce factory methods for SpecialCaseList
Summary:
Doing work in constructors is bad: this change suggests to
call SpecialCaseList::create(Path, Error) instead of
"new SpecialCaseList(Path)". Currently the latter may crash with
report_fatal_error, which is undesirable - sometimes we want to report
the error to user gracefully - for example, if he provides an incorrect
file as an argument of Clang's -fsanitize-blacklist flag.

Reviewers: pcc

Reviewed By: pcc

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1327

llvm-svn: 188156
2013-08-12 07:49:36 +00:00
Arnold Schwaighofer
3525aff8f6 Revert r188119 "Kill some duplicated code for removing unreachable BBs."
It is breaking builbots with libgmalloc enabled on Mac OS X.

$ cd llvm ; mkdir release ; cd release
$ ../configure --enable-optimized —prefix=$PWD/install
$ make
$ make check
$ Release+Asserts/bin/llvm-lit -v --param use_gmalloc=1 --param \
  gmalloc_path=/usr/lib/libgmalloc.dylib \
  ../test/Instrumentation/DataFlowSanitizer/args-unreachable-bb.ll

llvm-svn: 188142
2013-08-10 20:16:06 +00:00
Peter Collingbourne
0b56f9dd44 Kill some duplicated code for removing unreachable BBs.
This moves removeUnreachableBlocksFromFn from SimplifyCFGPass.cpp
to Utils/Local.cpp and uses it to replace the implementation of
llvm::removeUnreachableBlocks, which appears to do a strict subset
of what removeUnreachableBlocksFromFn does.

Differential Revision: http://llvm-reviews.chandlerc.com/D1334

llvm-svn: 188119
2013-08-09 22:47:24 +00:00
Serge Pavlov
a57ba3eab8 Unbreak Debug build on Windows
llvm-svn: 187786
2013-08-06 08:44:18 +00:00
Tom Stellard
e4e3be6f50 Factor FlattenCFG out from SimplifyCFG
Patch by: Mei Ye

llvm-svn: 187764
2013-08-06 02:43:45 +00:00
Peter Collingbourne
42b450c977 Introduce an optimisation for special case lists with large numbers of literal entries.
Our internal regex implementation does not cope with large numbers
of anchors very efficiently.  Given a ~3600-entry special case list,
regex compilation can take on the order of seconds.  This patch solves
the problem for the special case of patterns matching literal global
names (i.e. patterns with no regex metacharacters).  Rather than
forming regexes from literal global name patterns, add them to
a StringSet which is checked before matching against the regex.
This reduces regex compilation time by an order of roughly thousands
when reading the aforementioned special case list, according to a
completely unscientific study.

No test cases.  I figure that any new tests for this code should
check that regex metacharacters are properly recognised.  However,
I could not find any documentation which documents the fact that the
syntax of global names in special case lists is based on regexes.
The extent to which regex syntax is supported in special case lists
should probably be decided on/documented before writing tests.

Differential Revision: http://llvm-reviews.chandlerc.com/D1150

llvm-svn: 187732
2013-08-05 17:48:04 +00:00
Alexey Samsonov
4124641a22 Fix dereferencing end iterator in SimplifyCFG. Patch by Ye Mei.
llvm-svn: 187646
2013-08-02 08:06:43 +00:00
Matt Arsenault
f9b8fbb891 Teach getOrEnforceKnownAlignment about address spaces
llvm-svn: 187629
2013-08-01 22:42:18 +00:00
Rafael Espindola
ca86cefe79 Fix -Wdocumentation warnings.
llvm-svn: 187336
2013-07-28 23:43:28 +00:00