1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00
Commit Graph

2538 Commits

Author SHA1 Message Date
Chandler Carruth
cf6f5436f5 [attrs] Split off the forced attributes utility into its own pass that
is (by default) run much earlier than FuncitonAttrs proper.

This allows forcing optnone or other widely impactful attributes. It is
also a bit simpler as the force attribute behavior needs no specific
iteration order.

I've added the pass into the default module pass pipeline and LTO pass
pipeline which mirrors where function attrs itself was being run.

Differential Revision: http://reviews.llvm.org/D15668

llvm-svn: 256465
2015-12-27 08:13:45 +00:00
Benjamin Kramer
b2614d51ce [FunctionImport] Move pass into anonymous namespace.
No functional change.

llvm-svn: 256374
2015-12-24 10:03:35 +00:00
David Majnemer
22cb4eb850 [OperandBundles] Have DeadArgElim play nice with operand bundles
A call site's use of a Value might not correspond to an argument
operand but to a bundle operand.

llvm-svn: 256326
2015-12-23 09:58:36 +00:00
Akira Hatanaka
6a7dbf68a2 Provide a way to specify inliner's attribute compatibility and merging.
This reapplies r256277 with two changes:

- In emitFnAttrCompatCheck, change FuncName's type to std::string to fix
  a use-after-free bug.
- Remove an unnecessary install-local target in lib/IR/Makefile. 

Original commit message for r252949:

Provide a way to specify inliner's attribute compatibility and merging
rules using table-gen. NFC.

This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.

rdar://problem/19836465

llvm-svn: 256304
2015-12-22 23:57:37 +00:00
Rafael Espindola
067bfb9e99 Also add unnamed_addr to functions.
llvm-svn: 256281
2015-12-22 20:43:30 +00:00
Akira Hatanaka
dfd76e927a Revert r256277 and r256279.
Some of the bots failed again.

llvm-svn: 256280
2015-12-22 20:29:09 +00:00
Akira Hatanaka
fa235f0243 Provide a way to specify inliner's attribute compatibility and merging.
This reapplies r252990 and r252949. I've added member function getKind
to the Attr classes which returns the enum or string of the attribute.

Original commit message for r252949:

Provide a way to specify inliner's attribute compatibility and merging
rules using table-gen. NFC.

This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.

rdar://problem/19836465

llvm-svn: 256277
2015-12-22 20:00:05 +00:00
Rafael Espindola
f795707790 Delete dead GlobalAliases.
llvm-svn: 256276
2015-12-22 19:50:22 +00:00
Rafael Espindola
2af3ff098d Merge duplicated code.
The code for deleting dead global variables and functions was
duplicated.

This is in preparation for also deleting dead global aliases.

llvm-svn: 256274
2015-12-22 19:38:07 +00:00
Rafael Espindola
7f6f7bca5d Use early continue to reduce indentation.
llvm-svn: 256272
2015-12-22 19:26:18 +00:00
Rafael Espindola
611a6e336d Simplify iterator management. NFC.
Not passing an iterator to processGlobal will allow it to work with
other GlobalValues.

llvm-svn: 256271
2015-12-22 19:16:50 +00:00
Easwaran Raman
66e5fa28c2 Determine callee's hotness and adjust threshold based on that. NFC.
This uses the same criteria used in CFE's CodeGenPGO to identify hot and cold
callees and uses values of inlinehint-threshold and inlinecold-threshold
respectively as the thresholds for such callees.

Differential Revision: http://reviews.llvm.org/D15245

llvm-svn: 256222
2015-12-22 00:32:35 +00:00
Evgeniy Stepanov
7ed9f33690 [cfi] Fix LowerBitSets on 32-bit targets.
This code attempts to truncate IntPtrTy to i32, which may be the same
type.

llvm-svn: 256205
2015-12-21 22:14:04 +00:00
Vedant Kumar
358f3ea995 Re-reapply "[IR] Move optional data in llvm::Function into a hungoff uselist"
Make personality functions, prefix data, and prologue data hungoff
operands of Function.

This is based on the email thread "[RFC] Clean up the way we store
optional Function data" on llvm-dev.

Thanks to sanjoyd, majnemer, rnk, loladiro, and dexonsmith for feedback!

Includes a fix to scrub value subclass data in dropAllReferences. Does not
use binary literals.

Differential Revision: http://reviews.llvm.org/D13829

llvm-svn: 256095
2015-12-19 08:52:49 +00:00
Vedant Kumar
2e1a683bae Revert "Reapply "[IR] Move optional data in llvm::Function into a hungoff uselist""
This reverts commit r256093.

This broke lld-x86_64-win7 because of -Werror,-Wc++1y-extensions.

llvm-svn: 256094
2015-12-19 08:48:43 +00:00
Vedant Kumar
c33a34516e Reapply "[IR] Move optional data in llvm::Function into a hungoff uselist"
Make personality functions, prefix data, and prologue data hungoff
operands of Function.

This is based on the email thread "[RFC] Clean up the way we store
optional Function data" on llvm-dev.

Thanks to sanjoyd, majnemer, rnk, loladiro, and dexonsmith for feedback!

Includes a fix to scrub value subclass data in dropAllReferences.

Differential Revision: http://reviews.llvm.org/D13829

llvm-svn: 256093
2015-12-19 08:29:51 +00:00
Vedant Kumar
6843b30188 Revert "[IR] Move optional data in llvm::Function into a hungoff uselist"
This reverts commit r256090.

This broke llvm-clang-lld-x86_64-debian-fast.

llvm-svn: 256091
2015-12-19 07:30:44 +00:00
Vedant Kumar
46b3967fa2 [IR] Move optional data in llvm::Function into a hungoff uselist
Make personality functions, prefix data, and prologue data hungoff
operands of Function.

This is based on the email thread "[RFC] Clean up the way we store
optional Function data" on llvm-dev.

Thanks to sanjoyd, majnemer, rnk, loladiro, and dexonsmith for feedback!

Differential Revision: http://reviews.llvm.org/D13829

llvm-svn: 256090
2015-12-19 07:08:56 +00:00
Teresa Johnson
0dce8d436c [ThinLTO] Metadata linking for imported functions
Summary:
Second patch split out from http://reviews.llvm.org/D14752.

Maps metadata as a post-pass from each module when importing complete,
suturing up final metadata to the temporary metadata left on the
imported instructions.

This entails saving the mapping from bitcode value id to temporary
metadata in the importing pass, and from bitcode value id to final
metadata during the metadata linking postpass.

Depends on D14825.

Reviewers: dexonsmith, joker.eph

Subscribers: davidxl, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D14838

llvm-svn: 255909
2015-12-17 17:14:09 +00:00
Rafael Espindola
f7a0054c75 Change linkInModule to take a std::unique_ptr.
Passing in a std::unique_ptr should help find errors when the module
is used after being linked into another module.

llvm-svn: 255842
2015-12-16 23:16:33 +00:00
Justin Bogner
58647df890 LPM: Make callers of LPM.deleteLoopFromQueue update LoopInfo directly. NFC
As of r255720, the loop pass manager will DTRT when passes update the
loop info for removed loops, so they no longer need to reach into
LPPassManager APIs to do this kind of transformation. This change very
nearly removes the need for the LPPassManager to even be passed into
loop passes - the only remaining pass that uses the LPM argument is
LoopUnswitch.

llvm-svn: 255797
2015-12-16 18:40:20 +00:00
Richard Trieu
9af860a927 Remove one of the void casts used to suppress unused variable warning.
llvm-svn: 255709
2015-12-15 23:47:17 +00:00
Evgeniy Stepanov
493b24312f Suppress unused variable warning in the no-asserts build.
llvm-svn: 255706
2015-12-15 23:30:29 +00:00
Richard Trieu
bf34638e13 Cast variable to void to resolve unused variable warning in non-asserts builds.
llvm-svn: 255704
2015-12-15 23:25:34 +00:00
Evgeniy Stepanov
39e538e166 Cross-DSO control flow integrity (LLVM part).
An LTO pass that generates a __cfi_check() function that validates a
call based on a hash of the call-site-known type and the target
pointer.

llvm-svn: 255693
2015-12-15 23:00:08 +00:00
James Molloy
fb86086405 [PassManagerBuilder] Add a few more scalar optimization passes
This patch does two things:
  1. mem2reg is now run immediately after globalopt. Now that globalopt
     can localize variables more aggressively, it makes sense to lower
     them to SSA form earlier rather than later so they can benefit from
     the full set of optimization passes.

  2. More scalar optimizations are run after the loop optimizations in
     LTO mode. The loop optimizations (especially indvars) can clean up
     scalar code sufficiently to make it worthwhile running more scalar
     passes. I've particularly added SCCP here as it isn't run anywhere
     else in the LTO pass pipeline.

Mem2reg is super cheap and shouldn't affect compilation time at all. The
rest of the added passes are in the LTO pipeline only so doesn't affect
the vast majority of compilations, just the link step.

llvm-svn: 255634
2015-12-15 09:24:01 +00:00
Rafael Espindola
2d1739bf50 A better attempt to add a missing include
llvm-svn: 255578
2015-12-14 23:34:35 +00:00
Rafael Espindola
5f225be291 Trying to fix the build in a bot.
llvm-svn: 255577
2015-12-14 23:31:08 +00:00
Rafael Espindola
5b397256de Use diagnostic handler in the LLVMContext
This patch converts code that has access to a LLVMContext to not take a
diagnostic handler.

This has a few advantages

* It is easier to use a consistent diagnostic handler in a single program.
* Less clutter since we are not passing a handler around.

It does make it a bit awkward to implement some C APIs that return a
diagnostic string. I will propose new versions of these APIs and
deprecate the current ones.

llvm-svn: 255571
2015-12-14 23:17:03 +00:00
Sanjoy Das
987d70ed26 [MergeFunctions] Use II instead of CI for InvokeInst; NFC
Using `CI` is slightly misleading.

llvm-svn: 255529
2015-12-14 19:11:45 +00:00
Sanjoy Das
97417780af Teach MergeFunctions about operand bundles
llvm-svn: 255528
2015-12-14 19:11:40 +00:00
Diego Novillo
9bbd13f9a0 SamplePGO - Reduce memory utilization by 10x.
DenseMap is the wrong data structure to use for sample records and call
sites.  The keys are too large, causing massive core memory growth when
reading profiles.

Before this patch, a 21Mb input profile was causing the compiler to grow
to 3Gb in memory.  By switching to std::map, the compiler now grows to
300Mb in memory.

There still are some opportunities for memory footprint reduction. I'll
be looking at those next.

llvm-svn: 255389
2015-12-11 23:21:38 +00:00
Artur Pilipenko
8b6635b2f6 PruneEH pass incorrectly reports that a change was made
Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D14097

llvm-svn: 255343
2015-12-11 16:30:26 +00:00
Teresa Johnson
087bc3b677 [ThinLTO] Debug message cleanup (NFC)
Added some missing spaces between the module identifier and the start of
the debug message. Also added a ":" after the module identifier to make
this look a little nicer.

llvm-svn: 255259
2015-12-10 16:39:07 +00:00
Sanjoy Das
d85ded90d0 Add arg_begin() and arg_end() to CallInst and InvokeInst; NFCI
- This simplifies the CallSite class, arg_begin / arg_end are now
   simple wrapper getters.

 - In several places, we were creating CallSite instances solely to call
   arg_begin and arg_end.  With this change, that's no longer required.

llvm-svn: 255226
2015-12-10 06:39:02 +00:00
Rafael Espindola
2e47184a91 Don't assign a temporary string to a StringRef.
Should fix the windows debug and asan bots.

llvm-svn: 255149
2015-12-09 20:41:10 +00:00
Teresa Johnson
83a7df21b2 [ThinLTO] FunctionImport pass can take a const index pointer (NFC)
llvm-svn: 255140
2015-12-09 19:39:47 +00:00
Mehdi Amini
b282e7bd00 The current importing scheme is processing one function at a time,
loading the source Module, linking the function in the destination
module, and destroying the source Module before repeating with the
next function to import (potentially from the same Module).

Ideally we would keep the source Module alive and import the next
Function needed from this Module. Unfortunately this is not possible
because the linker does not leave it in a usable state.

However we can do better by first computing the list of all candidates
per Module, and only then load the source Module and import all the
function we need for it.

The trick to process callees is to materialize function in the source
module when building the list of function to import, and inspect them
in their source module, collecting the list of callees for each
callee.

When we move the the actual import, we will import from each source
module exactly once. Each source module is loaded exactly once.
The only drawback it that it requires to have all the lazy-loaded
source Module in memory at the same time.

Currently this patch already improves considerably the link time,
a multithreaded link of llvm-dis on my laptop was:

  real  1m12.175s  user  6m32.430s sys  0m10.529s

and is now:

  real  0m40.697s  user  2m10.237s sys  0m4.375s

Note: this is the full link time (linker+Import+Optimizer+CodeGen)

Differential Revision: http://reviews.llvm.org/D15178

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255100
2015-12-09 08:17:35 +00:00
Sanjoy Das
cb770fbcb6 [OperandBundles] Have PruneEH work correct with operand bundles.
For an invoke with operand bundles, the [op_begin(), op_end()-3] range
can contain things other than invoke arguments.  This change teaches
PruneEH to use arg_begin() and arg_end() explicitly.

llvm-svn: 255073
2015-12-08 23:16:52 +00:00
Mehdi Amini
5d4cc87b91 Fix/Improve Debug print in FunctionImport pass
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255071
2015-12-08 23:04:19 +00:00
Mehdi Amini
ba2c064383 Remove caching in FunctionImport: a Module can't be reused after being linked from
The Linker destroys the source module (API change coming to make it explicit)

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255064
2015-12-08 22:39:40 +00:00
Teresa Johnson
1fb89d62fb [ThinLTO] Support for specifying function index from pass manager
Summary:
Add a field on the PassManagerBuilder that clang or gold can use to pass
down a pointer to the function index in memory to use for importing when
the ThinLTO backend is triggered. Add support to supply this to the
function import pass.

Reviewers: joker.eph, dexonsmith

Subscribers: davidxl, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D15024

llvm-svn: 254926
2015-12-07 19:21:11 +00:00
Mehdi Amini
e0e1d33bec clang-format FunctionImport after refactoring (NFC)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 254585
2015-12-03 02:58:14 +00:00
Mehdi Amini
30d3bd787c Refactor FunctionImporter::importFunctions with a helper function to process the Worklist (NFC)
This precludes some more functional changes to perform bulk imports.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 254583
2015-12-03 02:37:33 +00:00
David Majnemer
56dee65385 Move EH-specific helper functions to a more appropriate place
No functionality change is intended.

llvm-svn: 254562
2015-12-02 23:06:39 +00:00
Mehdi Amini
34766825ef Change ModuleLinker to take a set of GlobalValues to import instead of a single one
For efficiency reason, when importing multiple functions for the same Module,
we can avoid reparsing it every time.

Differential Revision: http://reviews.llvm.org/D15102

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 254486
2015-12-02 04:34:28 +00:00
Mehdi Amini
1909ee53b2 Modify FunctionImport to take a callback to load modules
When linking static archive, there is no individual module files to
load. Instead they can be mmap'ed and could be initialized from a
buffer directly. The callback provide flexibility to override the
scheme for loading module from the summary.

Differential Revision: http://reviews.llvm.org/D15101

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 254479
2015-12-02 02:00:29 +00:00
Rafael Espindola
e3fda2ca99 Use references now that it is natural to do so.
The linker never takes ownership of a module or changes which module it
is refering to, making it natural to use references.

llvm-svn: 254449
2015-12-01 19:50:54 +00:00
Teresa Johnson
0bc8d23948 [ThinLTO] Wrap dbgs() output in DEBUG macro
Missed in a couple places.

llvm-svn: 254422
2015-12-01 17:12:10 +00:00
Teresa Johnson
0e55603ffe [ThinLTO] Remove stale comment (NFC)
Stale as of r254036 which added basic profitability check.

llvm-svn: 254421
2015-12-01 16:45:23 +00:00
Diego Novillo
d74fdde81f SamplePGO - Do not use std::to_string in diagnostics.
This fixes buildbots in systems that std::to_string is not present. It
also tidies the output of the diagnostic to render doubles a bit better
(thanks Ben Kramer for help with string streams and format).

llvm-svn: 254261
2015-11-29 18:23:26 +00:00
Diego Novillo
d08de97276 SamplePGO - Add initial support for inliner annotations.
This adds two thresholds to the sample profiler to affect inlining
decisions: the concept of global hotness and coldness.

Functions that have accumulated more than a certain fraction of samples at
runtime, are annotated with the InlineHint attribute. Conversely,
functions that accumulate less than a certain fraction of samples, are
annotated with the Cold attribute.

This is very similar to the hints emitted by Clang when using
instrumentation profiles.

Notice that this is a very blunt instrument. A function may have
globally collected a significant fraction of samples, but that does not
necessarily mean that every callsite for that function is hot.

Ideally, we would annotate each callsite with the samples collected at
that callsite. This way, the inliner can incorporate all these weights
into its cost model.

Once the inliner offers this functionality, we can change the hints
emitted here to a more precise per-callsite annotation. For now, this is
providing some measure of speedups with our internal benchmarks. I've
observed speedups of up to 23% (though the geo mean is about 3%). I expect
these numbers to improve as the inliner gets better annotations.

llvm-svn: 254212
2015-11-27 23:14:51 +00:00
Diego Novillo
c52a667205 SamplePGO - Fix default threshold for hot callsites.
Based on testing of internal benchmarks, I'm lowering this threshold to
a value of 0.1%.  This means that SamplePGO will respect 99.9% of the
original inline decisions when following a profile.

The performance difference is noticeable in some tests. With the
previous threshold, the speedups over baseline -O2 was about 0.63%. With
the new default, the speedups are around 3% on average.

The point of this threshold is not to do more aggressive inlining. When
an inlined callsite crosses this threshold, SamplePGO will redo the
inline decision so that it can better apply the input profile.

By respecting most original inline decisions, we can apply more of the
input profile because the shape of the code follows the profile more
closely.

In the next series, I'll be looking at adding some inline hints for the
cold callsites and for toplevel functions that are hot/cold as well.

llvm-svn: 254211
2015-11-27 23:14:49 +00:00
Rafael Espindola
d215bba299 Disallow aliases to available_externally.
They are as much trouble as aliases to declarations. They are requiring
the code generator to define a symbol with the same value as another
symbol, but the second symbol is undefined.

If representing this is important for some optimization, we could add
support for available_externally aliases. They would be *required* to
point to a declaration (or available_externally definition).

llvm-svn: 254170
2015-11-26 19:22:59 +00:00
Rong Xu
c4f897c441 [PGO] Revert revision r254021,r254028,r254035
Revert the above revision due to multiple issues.

llvm-svn: 254040
2015-11-24 23:49:08 +00:00
Teresa Johnson
cbf6e0bf1b [ThinLTO] Add option to limit importing based on instruction count
Add a simple initial heuristic to control importing based on the number
of instructions recorded in the function's summary. Add option to
control the limit, and test using option.

llvm-svn: 254036
2015-11-24 22:55:46 +00:00
Diego Novillo
2b7c3c54ab SamplePGO - Add test for hot/cold inlined functions.
When the original binary is executed and sampled, the resulting profile
contains information on the original inline stack. We currently follow
the original inline plan if we notice that the inlined callsite has more
than 0 samples to it.

A better way is to determine whether the callsite is actually worth
inlining. If the callsite accumulates a small fraction of the samples
spent in the parent function, then we don't want to bother inlining it
(as it means that the callsite is actually cold).

This patch introduces a threshold expressed in percentage of samples
in relation to the parent function.  If the callsite uses less than N%
of the total samples used by its parent, the original inline decision is
not re-applied.

I've set the threshold to the very arbitrary value of 5%. I'm yet to do
any actual experiments to see what's a good value. I wanted to separate
the basic mechanism from the tuning.

llvm-svn: 254034
2015-11-24 22:38:37 +00:00
Rong Xu
025bf7be0c [PGO] MST based PGO instrumentation infrastructure
This patch implements a minimum spanning tree (MST) based instrumentation for
PGO. The use of MST guarantees minimum number of CFG edges getting
instrumented. An addition optimization is to instrument the less executed
edges to further reduce the instrumentation overhead. The patch contains both the
instrumentation and the use of the profile to set the branch weights.

Differential Revision: http://reviews.llvm.org/D12781

llvm-svn: 254021
2015-11-24 21:31:25 +00:00
Teresa Johnson
697f6bcd05 [ThinLTO] Refactor function body scan during importing into helper (NFC)
llvm-svn: 254020
2015-11-24 21:15:19 +00:00
Teresa Johnson
a3214913e6 [ThinLTO] Enable iterative importing in FunctionImport pass
Analyze imported function bodies and add any new external calls to
the worklist for importing. Currently no controls on the importing
so this will end up importing everything possible in the call tree
below the importing module. Basic profitability checks coming next.

Update test to check for iteratively inlined functions.

llvm-svn: 254011
2015-11-24 19:55:04 +00:00
Teresa Johnson
9c0a1779ce [ThinLTO] Fix FunctionImport alias checking and test
Skip imports for weak_any aliases as well. Fix the test to check
non-import of weak aliases and functions, and import of normal alias.

llvm-svn: 253991
2015-11-24 16:10:43 +00:00
Ismail Donmez
266a7da4e3 Fix build after r253954
llvm-svn: 253969
2015-11-24 09:48:09 +00:00
Mehdi Amini
2fe02188ef Add a FunctionImporter helper to perform summary-based cross-module function importing
Summary:
This is a helper to perform cross-module import for ThinLTO. Right now
it is importing naively every possible called functions.

Reviewers: tejohnson

Subscribers: dexonsmith, llvm-commits

Differential Revision: http://reviews.llvm.org/D14914

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 253954
2015-11-24 06:07:49 +00:00
Diego Novillo
d28d079aa7 SamplePGO - Add coverage tracking for samples.
The existing coverage tracker counts the number of records that were used
from the input profile. An alternative view of coverage is to check how
many available samples were applied.

This way, if the profile contains several records with few samples, it
doesn't really matter much that they were not applied. The more
interesting records to apply are the ones that contribute many samples.

llvm-svn: 253912
2015-11-23 20:12:21 +00:00
Diego Novillo
e85b89498c SamplePGO - Clear coverage tracking when clearing per-function data.
llvm-svn: 253877
2015-11-23 16:30:17 +00:00
Diego Novillo
f27e33d714 SamplePGO - Use newly introduced local variable. NFC.
llvm-svn: 253868
2015-11-23 15:24:13 +00:00
Diego Novillo
372bf7dc64 SamplePGO - Do not count never-executed inlined functions when computing coverage.
If a function was originally inlined but not actually hot at runtime,
its samples will not be counted inside the parent function. This throws
off the coverage calculation because it expects to find more used
records than it should.

Fixed by ignoring functions that will not be inlined into the parent.
Currently, this is inlined functions with 0 samples.  In subsequent
patches, I'll change this to mean "cold" functions.

llvm-svn: 253716
2015-11-20 21:46:38 +00:00
Tilmann Scheller
a99f5d534e Revert "[FunctionAttrs] Remove redundant assignment."
This reverts r253661.

Turns out that the assignment is not redundant (despite the Clang static analyzer claiming the opposite).

The variable is being used by the lambda function AddUsersToWorklistIfCapturing().

llvm-svn: 253696
2015-11-20 19:17:10 +00:00
Diego Novillo
4255fabac8 SamplePGO - Add line offset and discriminator information to sample reports.
While debugging some sampling coverage problems, I found this useful:
When applying samples from a profile, it helps to also know what line
offset and discriminator the sample belongs to. This makes it easy to
correlate against the input profile.

llvm-svn: 253670
2015-11-20 15:39:42 +00:00
Tilmann Scheller
baba8378a4 [FunctionAttrs] Remove redundant assignment.
Identified by the Clang static analyzer.

llvm-svn: 253661
2015-11-20 12:51:58 +00:00
James Molloy
2208ca52dd [GlobalOpt] Localize some globals that have non-instruction users
We currently bail out of global localization if the global has non-instruction users. However, often these can be simple bitcasts or constant-GEPs, which we can easily turn into instructions before localizing. Be a bit more aggressive.

llvm-svn: 253584
2015-11-19 18:04:33 +00:00
James Molloy
b585b0aee8 [FunctionAttrs] Provide a mechanism for adding function attributes from the command line
This provides a way to force a function to have certain attributes from the command line. This can be useful when debugging or doing workload exploration, where manually editing IR is tedious or not possible (due to build systems etc).

The syntax is -force-attribute=function_name:attribute_name

All function attributes are parsed except alignstack as it requires an argument.

llvm-svn: 253550
2015-11-19 08:49:57 +00:00
James Molloy
b92ba28077 [LTO] Add an early run of functionattrs
Because we internalize early, we can potentially mark a bunch of functions as norecurse. Do this before globalopt.

llvm-svn: 253451
2015-11-18 11:24:42 +00:00
Elena Demikhovsky
0600baa03e Vector of pointers in function attributes calculation
While setting function attributes we check all instructions that may access memory. For a call instruction we check all arguments. The special check is required for pointers.
I added vector-of-pointers to the call arguments types that should be checked.

Differential Revision: http://reviews.llvm.org/D14693

llvm-svn: 253363
2015-11-17 19:30:51 +00:00
James Molloy
63f33470bb [GlobalOpt] Address post-commit review comments on r253168
Address Duncan Exon Smith's comments on D14148, which was added after the patch had been LGTM'd and committed:
  * clang-format one area where whitespace diffs occurred.
  * Add a threshold to limit the store/load dominance checks as they are quadratic.

llvm-svn: 253192
2015-11-16 10:16:22 +00:00
Benjamin Kramer
c51f76e89b Move helper classes into anonymous namespaces. NFC.
llvm-svn: 253189
2015-11-16 09:01:28 +00:00
James Molloy
85bd37fc58 [GlobalOpt] Demote globals to locals more aggressively
Global to local demotion can speed up programs that use globals a lot. It is particularly useful with LTO, when the entire call graph is known and most functions have been internalized.

For a global to be demoted, it must only be accessed by one function and that function:
  1. Must never recurse directly or indirectly, else the GV would be clobbered.
  2. Must never rely on the value in GV at the start of the function (apart from the initializer).

GlobalOpt can already do this, but it is hamstrung and only ever tries to demote globals inside "main", because C++ gives extra guarantees about how main is called - once and only once.

In LTO mode, we can often prove the first property (if the function is internal by this point, we know enough about the callgraph to determine if it could possibly recurse). FunctionAttrs now infers the "norecurse" attribute for this reason.

The second property can be proven for a subset of functions by proving that all loads from GV are dominated by a store to GV. This is conservative in the name of compile time - this only requires a DominatorTree which is fairly cheap in the grand scheme of things. We could do more fancy stuff with MemoryDependenceAnalysis too to catch more cases but this appears to catch most of the useful ones in my testing.

llvm-svn: 253168
2015-11-15 14:21:37 +00:00
James Molloy
7d379efeea [GlobalOpt] Make sure all debug lines end with '\n'
GlobalVariable::print() used to emit a newline. It hasn't for a while now, but these debug lines weren't updated.

llvm-svn: 253030
2015-11-13 11:05:13 +00:00
James Molloy
e22291bd4d [GlobalOpt] Coding style - remove function names from doxygen comments
Suggested by Mehdi in the review of D14148.

llvm-svn: 253029
2015-11-13 11:05:07 +00:00
Akira Hatanaka
a41bf2744e Revert r252990.
Some of the buildbots are still failing.

llvm-svn: 252999
2015-11-13 01:44:32 +00:00
Akira Hatanaka
5df31fbb8f Provide a way to specify inliner's attribute compatibility and merging.
This reapplies r252949. I've changed the type of FuncName to be
std::string instead of StringRef in emitFnAttrCompatCheck.

Original commit message for r252949:

Provide a way to specify inliner's attribute compatibility and merging
rules using table-gen. NFC.

This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.

rdar://problem/19836465

llvm-svn: 252990
2015-11-13 01:23:11 +00:00
Akira Hatanaka
8642e85c17 Revert r252949.
It broke some of the bots including clang-x64-ninja-win7.

llvm-svn: 252951
2015-11-12 21:19:18 +00:00
Akira Hatanaka
ca7dc7a319 Provide a way to specify inliner's attribute compatibility and merging
rules using table-gen. NFC.

This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.

rdar://problem/19836465

llvm-svn: 252949
2015-11-12 20:59:43 +00:00
James Molloy
64e756bd07 Revert "Revert "[FunctionAttrs] Identify norecurse functions""
This reapplies this patch, with test fixes.

llvm-svn: 252871
2015-11-12 10:55:20 +00:00
James Molloy
9cd723ec63 Revert "[FunctionAttrs] Identify norecurse functions"
This reverts commit r252862. This introduced test failures and I'm reverting while I investigate how this happened.

llvm-svn: 252863
2015-11-12 09:05:43 +00:00
James Molloy
4da975f03a [FunctionAttrs] Identify norecurse functions
A function can be marked as norecurse if:
  * The SCC to which it belongs has cardinality 1; and either
    a) It does not call any non-norecurse function. This includes self-recursion; or
    b) It only has one callsite and the function that callsite is within is marked norecurse.

a) is best propagated bottom-up and b) is best propagated top-down.

We build up the norecurse attributes bottom-up using the existing SCC pass, and mark functions with no obvious recursion (but not provably norecurse) to sweep later, top-down.

llvm-svn: 252862
2015-11-12 08:53:04 +00:00
David Majnemer
3deb8be573 [IR] Add support for empty tokens
When working with tokens, it is often the case that one has instructions
which consume a token and produce a new token.  Currently, we have no
mechanism to represent an initial token state.

Instead, we can create a notional "empty token" by inventing a new
constant which captures the semantics we would like.  This new constant
is called ConstantTokenNone and is written textually as "token none".

Differential Revision: http://reviews.llvm.org/D14581

llvm-svn: 252811
2015-11-11 21:57:16 +00:00
Oliver Stannard
989496fc9c GlobalOpt should maintain externally_initialized when splitting aggregates
When GlobalOpt splits an internal, global variable with an aggregate type, it
should propagate the externally_initialized flag to the newly created globals.

This makes the pass safe for our downstream use of this flag, while still
allowing some useful optimisations (such as removing dead parts of the split
aggregate) to be performed.

Differential Revision: http://reviews.llvm.org/D13382

llvm-svn: 252490
2015-11-09 16:47:16 +00:00
Sanjoy Das
62bf9f3dd6 Unbreak the build
My code clashed with some ilist iterator changes upstream.  Fix by
adding an explicit "&*" coercion.

llvm-svn: 252392
2015-11-07 02:26:53 +00:00
Sanjoy Das
c550b2c217 [FunctionAttrs] Add comment and clarify assertion message; NFC
llvm-svn: 252389
2015-11-07 01:56:07 +00:00
Sanjoy Das
6b6a5c9388 [FunctionAttrs] Add handling for operand bundles
Summary:
Teach the FunctionAttrs to do the right thing for IR with operand
bundles.

Reviewers: reames, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14408

llvm-svn: 252387
2015-11-07 01:56:00 +00:00
Sanjoy Das
9dbcbc0397 [FunctionAttrs] Fix an iterator wraparound bug
Summary:
This change fixes an iterator wraparound bug in
`determinePointerReadAttrs`.

Ideally, ++'ing off the `end()` of an iplist should result in a failed
assert, but currently iplist seems to silently wrap to the head of the
list on `end()++`.  This is why the bad behavior is difficult to
demonstrate.

Reviewers: chandlerc, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14350

llvm-svn: 252386
2015-11-07 01:55:53 +00:00
Duncan P. N. Exon Smith
2c4e1a2f50 ADT: Remove last implicit ilist iterator conversions, NFC
Some implicit ilist iterator conversions have crept back into Analysis,
Transforms, Hexagon, and llvm-stress.  This removes them.

I'll commit a patch immediately after this to disallow them (in a
separate patch so that it's easy to revert if necessary).

llvm-svn: 252371
2015-11-07 00:01:16 +00:00
Peter Collingbourne
5b721561aa DI: Reverse direction of subprogram -> function edge.
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.

For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.

This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.

Since this is an IR change, a bitcode upgrade has been provided.

Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.

Differential Revision: http://reviews.llvm.org/D14265

llvm-svn: 252219
2015-11-05 22:03:56 +00:00
Sanjoy Das
3c5b8b4566 [FunctionAttrs] Remove a loop, NFC refactor
Summary:
Remove the loop over the uses of the CallSite in ArgumentUsesTracker.
Since we have the `Use *` for actual argument operand, we can just use
pointer subtraction.

The time complexity remains the same though (except for a vararg
argument) -- `std::advance` is O(UseIndex) for the ArgumentList
iterator.

The real motivation is to make a later change adding support for operand
bundles simpler.

Reviewers: reames, chandlerc, nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14363

llvm-svn: 252141
2015-11-05 03:04:40 +00:00
Adam Nemet
933537bc6b LLE 6/6: Add LoopLoadElimination pass
Summary:
The goal of this pass is to perform store-to-load forwarding across the
backedge of a loop.  E.g.:

  for (i)
     A[i + 1] = A[i] + B[i]

  =>

  T = A[0]
  for (i)
     T = T + B[i]
     A[i + 1] = T

The pass relies on loop dependence analysis via LoopAccessAnalisys to
find opportunities of loop-carried dependences with a distance of one
between a store and a load.  Since it's using LoopAccessAnalysis, it was
easy to also add support for versioning away may-aliasing intervening
stores that would otherwise prevent this transformation.

This optimization is also performed by Load-PRE in GVN without the
option of multi-versioning.  As was discussed with Daniel Berlin in
http://reviews.llvm.org/D9548, this is inferior to a more loop-aware
solution applied here.  Hopefully, we will be able to remove some
complexity from GVN/MemorySSA as a consequence.

In the long run, we may want to extend this pass (or create a new one if
there is little overlap) to also eliminate loop-indepedent redundant
loads and store that *require* versioning due to may-aliasing
intervening stores/loads.  I have some motivating cases for store
elimination. My plan right now is to wait for MemorySSA to come online
first rather than using memdep for this.

The main motiviation for this pass is the 456.hmmer loop in SPECint2006
where after distributing the original loop and vectorizing the top part,
we are left with the critical path exposed in the bottom loop.  Being
able to promote the memory dependence into a register depedence (even
though the HW does perform store-to-load fowarding as well) results in a
major gain (~20%).  This gain also transfers over to x86: it's
around 8-10%.

Right now the pass is off by default and can be enabled
with -enable-loop-load-elim.  On the LNT testsuite, there are two
performance changes (negative number -> improvement):

  1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the
     critical paths is reduced
  2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce this
     outside of LNT

The pass is scheduled after the loop vectorizer (which is after loop
distribution).  The rational is to try to reuse LAA state, rather than
recomputing it.  The order between LV and LLE is not critical because
normally LV does not touch scalar st->ld forwarding cases where
vectorizing would inhibit the CPU's st->ld forwarding to kick in.

LoopLoadElimination requires LAA to provide the full set of dependences
(including forward dependences).  LAA is known to omit loop-independent
dependences in certain situations.  The big comment before
removeDependencesFromMultipleStores explains why this should not occur
for the cases that we're interested in.

Reviewers: dberlin, hfinkel

Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D13259

llvm-svn: 252017
2015-11-03 23:50:08 +00:00
Teresa Johnson
90b0eac682 Restore "Support for ThinLTO function importing and symbol linking."
This restores commit r251837, with the new library dependence added to
llvm-link/Makefile to address bot failures.

llvm-svn: 251866
2015-11-03 00:14:15 +00:00
Teresa Johnson
077216c4b1 Revert "Support for ThinLTO function importing and symbol linking."
This reverts commit r251837, due to a number of bot failures of the form:

/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function
loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined
reference to
'llvm::object::FunctionIndexObjectFile::create(llvm::MemoryBufferRef,
llvm::LLVMContext&, llvm::Module const*, bool)'
/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function
loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined
reference to 'llvm::object::FunctionIndexObjectFile::takeIndex()'

I'm not sure why these are happening - I added Object to the requred
libraries in tools/llvm-link/LLVMBuild.txt and the LLVM_LINK_COMPONENTS
in tools/llvm-link/CMakeLists.txt. Confirmed for my build that these
symbols come out of libLLVMObject.a. What am I missing?

llvm-svn: 251841
2015-11-02 22:17:32 +00:00
Teresa Johnson
30954fb923 Support for ThinLTO function importing and symbol linking.
Summary:
Support for necessary linkage changes and symbol renaming during
ThinLTO function importing.

Also includes llvm-link support for manually importing functions
and associated llvm-link based tests.

Note that this does not include support for intelligently importing
metadata, which is currently imported duplicate times. That support will
be in the follow-on patch, and currently is ignored by the tests.

Reviewers: dexonsmith, joker.eph, davidxl

Subscribers: tobiasvk, tejohnson, llvm-commits

Differential Revision: http://reviews.llvm.org/D13515

llvm-svn: 251837
2015-11-02 21:39:10 +00:00
David Blaikie
a98b9cfa54 StringRef-ify DiagnosticInfoSampleProfile::Filename
llvm-svn: 251823
2015-11-02 20:01:13 +00:00
Teresa Johnson
1cf56803da Clang format a few prior patches (NFC)
I had clang formatted my earlier patches using the wrong style.
Reformatted with the LLVM style.

llvm-svn: 251812
2015-11-02 18:02:11 +00:00
Diego Novillo
b170ccb76f SamplePGO - Count sample records in embedded profiles when computing coverage.
The initial coverage checking code for sample records failed to count
records inside inlined profiles. This change fixes the oversight.

llvm-svn: 251752
2015-10-31 21:53:58 +00:00
Chandler Carruth
cabded8f2b [FunctionAttrs] Inline the prototype attribute inference to an existing
loop over the SCC.

The separate function wasn't really adding much, NFC.

llvm-svn: 251728
2015-10-31 00:28:37 +00:00
Justin Bogner
4a77d84be2 [PM] Port StripDeadPrototypes to the new pass manager
This is a really straightforward port. Also adds a test for the pass,
since it only seemed to be tested tangentially before.

llvm-svn: 251726
2015-10-30 23:28:12 +00:00
Justin Bogner
7d746e1b5b Whitespace. NFC
llvm-svn: 251724
2015-10-30 23:02:38 +00:00
Chandler Carruth
5b1a50aeaa [FunctionAttrs] Separate another chunk of the logic for functionattrs
from its pass harness by providing a lambda to query for AA results.

This allows the legacy pass to easily provide a lambda that uses the
special helpers to construct function AA results from a legacy CGSCC
pass. With the new pass manager (the next patch) the lambda just
directly wraps the intuitive query API.

llvm-svn: 251715
2015-10-30 16:48:08 +00:00
Chandler Carruth
f6cef1566e [FunctionAttrs] Provide a single SCC node set to all of the
transformations in FunctionAttrs rather than building a new one each
time.

This isn't trivial because there are different heuristics from different
passes for exactly what set they want. The primary difference is whether
an *overridable* function completely disables the synthesis of
attributes. I've modeled this by directly testing for overridable, and
using the common set that excludes external and opt-none functions.

This does cause some changes by disabling more optimizations in the face
of opt-none. Specifically, we were still optimizing *calls* to opt-none
functions based on their attributes, just not the bodies. It seems
better to be conservative on both fronts given the intended semanticas
here (best effort to not assume or disturb anything). I've not tried to
test this change as it seems complex, brittle, and not important to the
implicit contract of opt-none. Instead, it seems more like a choice that
should be dictated by the simplified implementation and the change to be
acceptable differences within the space of opt-none.

A big benefit here is that these transformations no longer rely on the
legacy pass manager's SCC types, they just work on generic sets of
function pointers. This will make it easy to re-use their logic in the
new pass manager.

I've also made the transforms static functions instead of members where
trivial while I was touching the signatures.

llvm-svn: 251640
2015-10-29 18:29:15 +00:00
Daniel Jasper
d024d95356 Fix use-after-free. Thanks ASAN for giving me a detailed report :-).
llvm-svn: 251623
2015-10-29 12:49:37 +00:00
Diego Novillo
9692d53ba5 SamplePGO - Add flag to check sampling coverage.
This adds the flag -mllvm -sample-profile-check-coverage=N to the
SampleProfile pass. N is the percent of input sample records that the
user expects to apply.  If the pass does not use N% (or more) of the
sample records in the input, it emits a warning.

This is useful to detect some forms of stale profiles. If the code has
drifted enough from the original profile, there will be records that do
not match the IR anymore.

This will not detect cases where a sample profile record for line L is
referring to some other instructions that also used to be at line L.

llvm-svn: 251568
2015-10-28 22:30:25 +00:00
Diego Novillo
512c00aec3 SamplePGO - Clear per-function data after applying a profile.
The pass was keeping around a lot of per-function data (visited blocks,
edges, dominance, etc) that is just taking up memory for no reason. In
fact, from function to function it could potentially confuse the
propagator since some maps are indexed by line offsets which can be
common between functions.

llvm-svn: 251531
2015-10-28 17:40:22 +00:00
James Molloy
b82cf99fec [GlobalOpt] Add newlines to DEBUG messages
I think these were affected by a change way back when to stop printing newlines in Value::dump() by default. This change simply allows the debug output to be readable.

NFC.

llvm-svn: 251517
2015-10-28 14:30:53 +00:00
Diego Novillo
0ad41ba447 Tidy a comment. NFC.
llvm-svn: 251434
2015-10-27 18:41:46 +00:00
Diego Novillo
13365b7f1f Fix SamplePGO segfault when debug info is missing.
When emitting a remark for a conditional branch annotation, the remark
uses the line location information of the conditional branch in the
message.  In some cases, that information is unavailable and the
optimization would segfaul. I'm still not sure whether this is a bug or
WAI, but the optimizer should not die because of this.

llvm-svn: 251420
2015-10-27 17:37:00 +00:00
Chandler Carruth
33adb879ee [function-attrs] Refactor code to handle shorter code with early exits.
No functionality changed here, but the indentation is substantially
reduced and IMO the code is much easier to read. I've also added some
helpful comments.

This is just a clean-up I wrote while studying the code, and that has
been in my backlog for a while.

llvm-svn: 251381
2015-10-27 01:41:43 +00:00
Diego Novillo
4d3e41c2db Remove unused local variable. NFC.
llvm-svn: 251344
2015-10-26 20:50:26 +00:00
Diego Novillo
73fa815dcb SamplePGO - Add optimization reports.
This adds a couple of optimization remarks to the SamplePGO
transformation. When it decides to inline a hot function (to mimic the
inline stack and repeat useful inline decisions in the original build).

It will also report branch destinations. For instance, given the code
fragment:

     6      if (i < 1000)
     7        sum -= i;
     8      else
     9        sum += -i * rand();

If the 'else' branch is taken most of the time, building this code with
-Rpass=sample-profile will produce:

a.cc:9:14: remark: most popular destination for conditional branches at small.cc:6:9 [-Rpass=sample-profile]
      sum += -i * rand();
             ^

llvm-svn: 251330
2015-10-26 18:52:53 +00:00
Dehao Chen
de5bbe325e Tolerate negative offset when matching sample profile.
In some cases (as illustrated in the unittest), lineno can be less than the heade_lineno because the function body are included from some other files. In this case, offset will be negative. This patch makes clang still able to match the profile to IR in this situation.

http://reviews.llvm.org/D13914

llvm-svn: 250873
2015-10-21 01:22:27 +00:00
Diego Novillo
52c1392c42 Sample Profiles - Adjust integer types. Mostly NFC.
This adjusts all integers in the reader/writer to reflect the types
stored on profile files. They should all be unsigned 32-bit or 64-bit
values. Changed all associated internal types to be uint32_t or
uint64_t.

The only place that needed some adjustments is in the sample profile
transformation. Altough the weight read from the profile are 64-bit
values, the internal API for branch weights only accepts 32-bit values.
The pass now saturates weights that overflow uint32_t.

llvm-svn: 250427
2015-10-15 16:36:21 +00:00
Duncan P. N. Exon Smith
e4d7b5af96 IPO: Remove implicit ilist iterator conversions, NFC
llvm-svn: 250187
2015-10-13 17:51:03 +00:00
James Molloy
3eec7b82c2 [GlobalsAA] Turn GlobalsAA on again by default
Now that all the known faults with GlobalsAA have been fixed, flip the big switch on -enable-non-lto-gmr again.

Feel free to pester me with any more bugs found, and don't hesitate to flip the switch back off.

llvm-svn: 250157
2015-10-13 10:43:57 +00:00
Oliver Stannard
f7f8e4fda8 GlobalOpt does not treat externally_initialized globals correctly
GlobalOpt currently merges stores into the initialisers of internal,
externally_initialized globals, but should not do so as the value of the global
may change between the initialiser and any code in the module being run.

llvm-svn: 250035
2015-10-12 13:20:52 +00:00
Dehao Chen
2c6e16ef0f Make HeaderLineno a local variable.
http://reviews.llvm.org/D13576

As we are using hierarchical profile, there is no need to keep HeaderLineno a member variable. This is because each level of the inline stack will have its own header lineno. One should use the head lineno of its own inline stack level instead of the actual symbol.

llvm-svn: 249848
2015-10-09 16:50:16 +00:00
Hans Wennborg
7d1f4ff326 Fix Clang-tidy modernize-use-nullptr warnings in source directories and generated files; other minor cleanups.
Patch by Eugene Zelenko!

Differential Revision: http://reviews.llvm.org/D13321

llvm-svn: 249482
2015-10-06 23:24:35 +00:00
Arnold Schwaighofer
0e64b55852 MergeFunctions: Clear GlobalNumbers ValueMap
Otherwise, the map will observe changes as long as MergeFunctions is alive. This
is bad because follow-up passes could replace-all-uses-with on the key of an
entry in the map. The value handle callback of ValueMap however asserts that the
key type matches.

rdar://22971893

llvm-svn: 249327
2015-10-05 17:26:36 +00:00
Dehao Chen
ab10674b4f Update sample profile propagation algorithm.
http://reviews.llvm.org/D13218

llvm-svn: 248968
2015-10-01 00:26:56 +00:00
Dehao Chen
0024ada446 http://reviews.llvm.org/D13145
Support hierarachical sample profile format.

llvm-svn: 248865
2015-09-30 00:42:46 +00:00
Dehao Chen
a883fbf7ee http://reviews.llvm.org/D13231
Change lookup functions to const functions.

llvm-svn: 248818
2015-09-29 18:28:15 +00:00
Dehao Chen
c53a8ea9f1 Revert r248810 which breaks tests.
llvm-svn: 248814
2015-09-29 18:18:49 +00:00
Dehao Chen
dcb22a73b2 http://reviews.llvm.org/D13231
Change lookup functions to const functions.

llvm-svn: 248810
2015-09-29 17:59:58 +00:00
Evgeniy Stepanov
166d47b089 Move dbg.declare intrinsics when merging and replacing allocas.
Place new and update dbg.declare calls immediately after the
corresponding alloca.

Current code in replaceDbgDeclareForAlloca puts the new dbg.declare
at the end of the basic block. LLVM codegen has problems emitting
debug info in a situation when dbg.declare appears after all uses of
the variable. This usually kinda works for inlining and ASan (two
users of this function) but not for SafeStack (see the pending change
in http://reviews.llvm.org/D13178).

llvm-svn: 248769
2015-09-29 00:30:19 +00:00
Sean Silva
41beb27585 [GlobalOpt] Sort members of llvm.used deterministically
Patch by Jake VanAdrighem!

Summary:
Fix the way we sort the llvm.used and llvm.compiler.used members.

This bug seems to have been introduced in rL183756 through a set of improper casts to GlobalValue*. In subsequent patches this problem was missed and transformed into a getName call on a ConstantExpr.

Reviewers: silvas

Subscribers: silvas, llvm-commits

Differential Revision: http://reviews.llvm.org/D12851

llvm-svn: 248728
2015-09-28 19:02:11 +00:00
Michael Zolotukhin
3efc2c1480 Add CFG Simplification pass after Loop Unswitching.
Loop unswitching produces conditional branches with constant condition,
and it's beneficial for later passes to clean this up with simplify-cfg.
We do this after the second invocation of loop-unswitch, but not after
the first one. Not doing so might cause problem for passes like
LoopUnroll, whose estimate of loop body size would be less accurate.

Reviewers: hfinkel

Differential Revision: http://reviews.llvm.org/D13064

llvm-svn: 248460
2015-09-24 03:50:17 +00:00
David Majnemer
db93b7df86 [DeadArgElim] Split the invoke successor edge
Invoking a function which returns an aggregate can sometimes be
transformed to return a scalar value.  However, this means that we need
to create an insertvalue instruction(s) to recreate the correct
aggregate type.  We achieved this by inserting an insertvalue
instruction at the invoke's normal successor.  However, this is not
feasible if the normal successor uses the invoke's return value inside a
PHI node.

Instead, split the edge between the invoke and the unwind successor and
create the insertvalue instruction in the new basic block.  The new
basic block's successor will be the old invoke successor which leaves
us with IR which is well behaved.

This fixes PR24906.

llvm-svn: 248387
2015-09-23 15:41:09 +00:00
Chandler Carruth
121df1c86d [FunctionAttrs] Extract a helper function for the core logic used to
evaluate whether 'readonly' or 'readnone' apply to a given function.
This both reduces indentation and will make it easy to share the logic
with a new pass manager implementation.

llvm-svn: 248181
2015-09-21 17:39:41 +00:00
James Molloy
a0c4399e43 [GlobalsAA] Disable globals-aa by default
Several issues have been found with it - disabling in the meantime.

llvm-svn: 247674
2015-09-15 10:44:06 +00:00
David Blaikie
a319aa10b6 [opaque pointer types] Switch a few cases of getElementType over, since I had them lying around anyway
llvm-svn: 247610
2015-09-14 20:29:26 +00:00
David Blaikie
e04e393feb Revert "[opaque pointer type] Pass GlobalAlias the actual pointer type rather than decomposing it into pointee type + address space"
This was a flawed change - it just caused the getElementType call to be
deferred until later, when we really need to remove it. Now that the IR
for GlobalAliases has been updated, the root cause is addressed that way
instead and this change is no longer needed (and in fact gets in the way
- because we want to pass the pointee type directly down further).

Follow up patches to push this through GlobalValue, bitcode format, etc,
will come along soon.

This reverts commit 236160.

llvm-svn: 247585
2015-09-14 18:01:59 +00:00
JF Bastien
298f59c2b7 [MergeFuncs] Fix bug in merging GetElementPointers
GetElementPointers must have the first argument's type compared
for structural equivalence. Previously the code erroneously compared the
pointer's type, but this code was dead because all pointer types (of the
same address space) are the same. The pointee must be compared instead
(using the type stored in the GEP, not from the pointer type which will
be erased anyway).

Author: jrkoenig
Reviewers: dschuff, nlewycky, jfb
Subscribers: nlewycky, llvm-commits
Differential revision: http://reviews.llvm.org/D12820

llvm-svn: 247570
2015-09-14 15:37:48 +00:00
Chandler Carruth
bcbc41fdb2 [FunctionAttrs] Move the malloc-like test to a static helper function
that could be used from a new pass manager. This one makes particular
sense as a static helper as it doesn't even need TLI.

llvm-svn: 247525
2015-09-13 08:23:27 +00:00
Chandler Carruth
55bf36b635 [FunctionAttrs] Factor the logic to test for a known non-null return out
of a method and into a re-usable static helper. We can potentially use
this function from the implementation of a new pass manager oriented
version of the pass. Also add some better documentation of exactly what
the semantic model of this routine is (it isn't trivial) and use a more
modern naming convention for it.

llvm-svn: 247524
2015-09-13 08:17:14 +00:00
Chandler Carruth
fd5bcc53a3 [FunctionAttrs] Make the per-function attribute inference a boring
static function rather than a method. It just needed access to
TargetLibraryInfo, and this way it can be easily reused between the
current FunctionAttrs implementation and any port for the new pass
manager.

llvm-svn: 247522
2015-09-13 08:03:23 +00:00
Chandler Carruth
3fa570d0fb [FunctionAttrs] Collect utility functions as static helpers rather than
methods. They don't need anything from the class anyways.

Also, collect the declarations into the private section of the pass.

llvm-svn: 247521
2015-09-13 07:50:43 +00:00
Chandler Carruth
5d5fff4806 Clean up doxygen comments in FunctionAttrs, promoting some non-doxygen
comments, deleting duplicate comments, moving comments to consistently
live on the definition since these are all really internal routines,
etc. NFC.

llvm-svn: 247520
2015-09-13 06:57:25 +00:00
Chandler Carruth
a49c1a1040 Do some spring cleaning on FunctionAttrs.cpp with clang-format prior to
other refactorings and cleanups here.

llvm-svn: 247519
2015-09-13 06:47:20 +00:00
JF Bastien
bb46c62876 [MergeFuncs] Fix callsite attributes in thunk generation
This change correctly sets the attributes on the callsites
generated in thunks. This makes sure things such as sret, sext, etc.
are correctly set, so that the call can be a proper tailcall.

Also, the transfer of attributes in the replaceDirectCallers function
appears to be unnecessary, but until this is confirmed it will remain.

Author: jrkoenig
Reviewers: dschuff, jfb
Subscribers: llvm-commits, nlewycky
Differential revision: http://reviews.llvm.org/D12581

llvm-svn: 247313
2015-09-10 18:08:35 +00:00
James Molloy
52c8c7c3ff Enable GlobalsAA by default
This can give significant improvements to alias analysis in some situations, and improves its testing coverage in all situations.

llvm-svn: 247264
2015-09-10 10:22:20 +00:00
Peter Collingbourne
04c6c402ba LowerBitSets: Fix non-determinism bug.
Visit disjoint sets in a deterministic order based on the maximum BitSetNM
index, otherwise the order in which we visit them will depend on pointer
comparisons. This was being exposed by MSan.

llvm-svn: 247201
2015-09-09 22:30:32 +00:00
Chandler Carruth
d7003090ac [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.

This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:

- FunctionAAResults is a type-erasing alias analysis results aggregation
  interface to walk a single query across a range of results from
  different alias analyses. Currently this is function-specific as we
  always assume that aliasing queries are *within* a function.

- AAResultBase is a CRTP utility providing stub implementations of
  various parts of the alias analysis result concept, notably in several
  cases in terms of other more general parts of the interface. This can
  be used to implement only a narrow part of the interface rather than
  the entire interface. This isn't really ideal, this logic should be
  hoisted into FunctionAAResults as currently it will cause
  a significant amount of redundant work, but it faithfully models the
  behavior of the prior infrastructure.

- All the alias analysis passes are ported to be wrapper passes for the
  legacy PM and new-style analysis passes for the new PM with a shared
  result object. In some cases (most notably CFL), this is an extremely
  naive approach that we should revisit when we can specialize for the
  new pass manager.

- BasicAA has been restructured to reflect that it is much more
  fundamentally a function analysis because it uses dominator trees and
  loop info that need to be constructed for each function.

All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.

The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.

This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.

Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.

One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.

Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.

Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.

Differential Revision: http://reviews.llvm.org/D12080

llvm-svn: 247167
2015-09-09 17:55:00 +00:00
Peter Collingbourne
38a322b063 Re-apply r247080 with order of evaluation fix.
llvm-svn: 247095
2015-09-08 22:49:35 +00:00
Peter Collingbourne
b98c01cec9 Revert r247080, "LowerBitSets: Extend pass to support functions as bitset
members." as it causes test failures on a number of bots.

llvm-svn: 247088
2015-09-08 22:33:23 +00:00
Peter Collingbourne
6e881fda02 LowerBitSets: Extend pass to support functions as bitset members.
This change extends the bitset lowering pass to support bitsets that may
contain either functions or global variables. A function bitset is lowered to
a jump table that is laid out before one of the functions in the bitset.

Also add support for non-string bitset identifier names. This allows for
distinct metadata nodes to stand in for names with internal linkage,
as done in D11857.

Differential Revision: http://reviews.llvm.org/D11856

llvm-svn: 247080
2015-09-08 21:57:45 +00:00
Yaron Keren
5f053b7016 Remove two unused includes and C++11 rangify for loops.
llvm-svn: 246865
2015-09-04 20:24:24 +00:00
JF Bastien
dd75963f9f [MergeFuncs] Efficiently defer functions on merge
Summary:
This patch introduces a side table in Merge Functions to
efficiently remove functions from the function set when functions
they refer to are merged. Previously these functions would need to
be compared lg(N) times to find the appropriate FunctionNode in the
tree to defer. With the recent determinism changes, this comparison
is more expensive. In addition, the removal function would not always
actually remove the function from the set (i.e. after remove(F),
there would sometimes still be a node in the tree which contains F).

With these changes, these functions are properly deferred, and so more
functions can be merged. In addition, when there are many merged
functions (and thus more deferred functions), there is a speedup:

chromium: 48678 merged -> 49380 merged; 6.58s -> 5.49s
libxul.so: 41004 merged -> 41030 merged; 8.02s -> 6.94s
mysqld: 1607 merged -> 1607 merged (same); 0.215s -> 0.212s (probably noise)

Author: jrkoenig
Reviewers: jfb, dschuff
Subscribers: llvm-commits, nlewycky
Differential revision: http://reviews.llvm.org/D12537

llvm-svn: 246735
2015-09-02 23:55:23 +00:00
Yaron Keren
56a458d0dd Move createEliminateAvailableExternallyPass earlier in the pass pipeline
to save running many ModulePasses on available external functions that
are thrown away anyhow.

llvm-svn: 246619
2015-09-02 06:34:11 +00:00
Hans Wennborg
4ed67f023f DeadArgElim: don't eliminate arguments from naked functions
Differential Revision: http://reviews.llvm.org/D12534

llvm-svn: 246564
2015-09-01 18:06:46 +00:00
Hans Wennborg
601f6f5c64 Fix Windows build by including raw_ostream.h
llvm-svn: 246486
2015-08-31 21:19:18 +00:00
Philip Reames
c1eca23247 [FunctionAttr] Infer nonnull attributes on returns
Teach FunctionAttr to infer the nonnull attribute on return values of functions which never return a potentially null value. This is done both via a conservative local analysis for the function itself and a optimistic per-SCC analysis. If no function in the SCC returns anything which could be null (other than values from other functions in the SCC), we can conclude no function returned a null pointer. Even if some function within the SCC returns a null pointer, we may be able to locally conclude that some don't.

Differential Revision: http://reviews.llvm.org/D9688

llvm-svn: 246476
2015-08-31 19:44:38 +00:00
JF Bastien
940c24167e Remove Merge Functions pointer comparisons
Summary:
This patch removes two remaining places where pointer value comparisons
are used to order functions: comparing range annotation metadata, and comparing
block address constants. (These are both rare cases, and so no actual
non-determinism was observed from either case).

The fix for range metadata is simple: the annotation always consists of a pair
of integers, so we just order by those integers.

The fix for block addresses is more subtle. Two constants are the same if they
are the same basic block in the same function, or if they refer to corresponding
basic blocks in each respective function. Note that in the first case, merging
is trivially correct. In the second, the correctness of merging relies on the
fact that the the values of block addresses cannot be compared. This change is
actually an enhancement, as these functions could not previously be merged (see
merge-block-address.ll).

There is still a problem with cross function block addresses, in that constants
pointing to a basic block in a merged function is not updated.

This also more robustly compares floating point constants by all fields of their
semantics, and fixes a dyn_cast/cast mixup.

Author: jrkoenig
Reviewers: dschuff, nlewycky, jfb
Subscribers llvm-commits
Differential revision: http://reviews.llvm.org/D12376

llvm-svn: 246305
2015-08-28 16:49:09 +00:00
Diego Novillo
1b494a160f Fix memory leak in sample profile pass.
The problem here were the function analyses invoked by the function pass
manager from the new IPO pass. I looked at other IPO passes needing
dominance information and the only one that requires it (partial
inliner) does not use the standard dependency mechanism.

This patch mimics what the partial inliner does to compute dominance,
post-dominance and loop info. One thing I like about this approach is
that I can delay the computation of all this until I actually need it.

This should bring the ASAN buildbot back to green. If there's a better
way to fix this, I'll do it in a follow-up patch.

llvm-svn: 246066
2015-08-26 20:00:27 +00:00
JF Bastien
c5d6ea096a Comparing operands should not require the same ValueID
Summary: When comparing basic blocks, there is an additional check that two Value*'s should have the same ID, which interferes with merging equivalent constants of different kinds (such as a ConstantInt and a ConstantPointerNull in the included testcase). The cmpValues function already ensures that the two values in each function are the same, so removing this check should not cause incorrect merging.

Also, the type comparison is redundant, based on reviewing the code and testing on the test suite and several large LTO bitcodes.

Author: jrkoenig
Reviewers: nlewycky, jfb, dschuff
Subscribers: llvm-commits
Differential revision: http://reviews.llvm.org/D12302

llvm-svn: 246001
2015-08-26 03:02:58 +00:00
NAKAMURA Takumi
48e91d8a5c Update libdeps in LLVMipo and LLVMScalarOpts, corresponding to r245940.
llvm-svn: 245957
2015-08-25 17:11:17 +00:00
Matthias Braun
194ba5f3f6 Fix dependencies/shared library build
llvm-svn: 245955
2015-08-25 17:07:40 +00:00
Diego Novillo
776683223a Convert SampleProfile pass into a Module pass.
Eventually, we will need sample profiles to be incorporated into the
inliner's cost models.  To do this, we need the sample profile pass to
be a module pass.

This patch makes no functional changes beyond the mechanical adjustments
needed to run SampleProfile as a module pass.

llvm-svn: 245940
2015-08-25 15:25:11 +00:00
Piotr Padlewski
9757df9cfe Assume intrinsic handling in global opt
It doesn't solve the problem, when for example we load something, and
then assume that it is the same as some constant value, because
globalopt will fail on unknown load instruction. The proposed solution
would be to skip some instructions that we can't evaluate and they are
safe to skip (f.e. load, assume and many others) and see if they are
required to perform optimization (f.e. we don't care about ephemeral
instructions that may appear using @llvm.assume())

http://reviews.llvm.org/D12266

llvm-svn: 245919
2015-08-25 01:34:15 +00:00
Mehdi Amini
490bf85c83 Require Dominator Tree For SROA, improve compile-time
TL-DR: SROA is followed by EarlyCSE which requires the DominatorTree.
There is no reason not to require it up-front for SROA.

Some history is necessary to understand why we ended-up here.

r123437 switched the second (Legacy)SROA in the optimizer pipeline to
use SSAUpdater in order to avoid recomputing the costly
DominanceFrontier. The purpose was to speed-up the compile-time.

Later r123609 removed the need for the DominanceFrontier in
(Legacy)SROA.

Right after, some cleanup was made in r123724 to remove any reference
to the DominanceFrontier. SROA existed in two flavors: SROA_SSAUp and
SROA_DT (the latter replacing SROA_DF).
The second argument of `createScalarReplAggregatesPass` was renamed
from `UseDomFrontier` to `UseDomTree`.
I believe this is were a mistake was made. The pipeline was not
updated and the call site was still:
    PM->add(createScalarReplAggregatesPass(-1, false));

At that time, SROA was immediately followed in the pipeline by
EarlyCSE which required alread the DominatorTree. Not requiring
the DominatorTree in SROA didn't save anything, but unfortunately
it was lost at this point.

When the new SROA Pass was introduced in r163965, I believe the goal
was to have an exact replacement of the existing SROA, this bug
slipped through.

You can see currently:

$ echo "" | clang -x c++  -O3 -c - -mllvm -debug-pass=Structure
...
...
      FunctionPass Manager
        SROA
        Dominator Tree Construction
        Early CSE

After this patch:

$ echo "" | clang -x c++  -O3 -c - -mllvm -debug-pass=Structure
...
...
      FunctionPass Manager
        Dominator Tree Construction
        SROA
        Early CSE

This improves the compile time from 88s to 23s for PR17855.
https://llvm.org/bugs/show_bug.cgi?id=17855

And from 113s to 12s for PR16756
https://llvm.org/bugs/show_bug.cgi?id=16756

Reviewers: chandlerc

Differential Revision: http://reviews.llvm.org/D12267

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 245820
2015-08-23 22:15:49 +00:00
JF Bastien
91f6300e91 Improve the determinism of MergeFunctions
Summary:

Merge functions previously relied on unsigned comparisons of pointer values to
order functions. This caused observable non-determinism in the compiler for
large bitcode programs. Basically, opt -mergefuncs program.bc | md5sum produces
different hashes when run repeatedly on the same machine. Differing output was
observed on three large bitcodes, but it was less frequent on the smallest file.
It is possible that this only manifests on the large inputs, hence remaining
undetected until now.

This patch fixes this by removing (almost, see below) all places where
comparisons between pointers are used to order functions. Most of these changes
are local, but the comparison of global values requires assigning an identifier
to each local in the order it is visited. This is very similar to the way the
comparison function identifies Value*'s defined within a function. Because the
order of visiting the functions and their subparts is deterministic, the
identifiers assigned to the globals will be as well, and the order of functions
will be deterministic.

With these changes, there is no more observed non-determinism. There is also
only minor slowdowns (negligible to 4%) compared to the baseline, which is
likely a result of the fact that global comparisons involve hash lookups and not
just pointer comparisons.

The one caveat so far is that programs containing BlockAddress constants can
still be non-deterministic. It is not clear what the right solution is here. In
particular, even if the global numbers are used to order by function, we still
need a way to order the BasicBlock*'s. Unfortunately, we cannot just bail out
and fail to order the functions or consider them equal, because we require a
total order over functions. Note that programs with BlockAddress constants are
relatively rare, so the impact of leaving this in is minor as long as this pass
is opt-in.

Author: jrkoenig

Reviewers: nlewycky, jfb, dschuff

Subscribers: jevinskie, llvm-commits, chapuni

Differential revision: http://reviews.llvm.org/D12168

llvm-svn: 245762
2015-08-21 23:27:24 +00:00
Chandler Carruth
bf271cc4e6 [PM/AA] Remove the last relics of the separate IPA library from LLVM,
folding the code into the main Analysis library.

There already wasn't much of a distinction between Analysis and IPA.
A number of the passes in Analysis are actually IPA passes, and there
doesn't seem to be any advantage to separating them.

Moreover, it makes it hard to have interactions between analyses that
are both local and interprocedural. In trying to make the Alias Analysis
infrastructure work with the new pass manager, it becomes particularly
awkward to navigate this split.

I've tried to find all the places where we referenced this, but I may
have missed some. I have also adjusted the C API to continue to be
equivalently functional after this change.

Differential Revision: http://reviews.llvm.org/D12075

llvm-svn: 245318
2015-08-18 17:51:53 +00:00
NAKAMURA Takumi
f382e9735d MergeFunc: Quick fix for r245140, Ignore second, aka Function*, in sorting.
Don't assume second would be ordered in the module.

llvm-svn: 245168
2015-08-16 02:41:23 +00:00
JF Bastien
fe4c9948ee Accelerate MergeFunctions with hashing
This patch makes the Merge Functions pass faster by calculating and comparing
a hash value which captures the essential structure of a function before
performing a full function comparison.

The hash is calculated by hashing the function signature, then walking the basic
blocks of the function in the same order as the main comparison function. The
opcode of each instruction is hashed in sequence, which means that different
functions according to the existing total order cannot have the same hash, as
the comparison requires the opcodes of the two functions to be the same order.

The hash function is a static member of the FunctionComparator class because it
is tightly coupled to the exact comparison function used. For example, functions
which are equivalent modulo a single variant callsite might be merged by a more
aggressive MergeFunctions, and the hash function would need to be insensitive to
these differences in order to exploit this.

The hashing function uses a utility class which accumulates the values into an
internal state using a standard bit-mixing function. Note that this is a different interface
than a regular hashing routine, because the values to be hashed are scattered
amongst the properties of a llvm::Function, not linear in memory. This scheme is
fast because only one word of state needs to be kept, and the mixing function is
a few instructions.

The main runOnModule function first computes the hash of each function, and only
further processes functions which do not have a unique function hash. The hash
is also used to order the sorted function set. If the hashes differ, their
values are used to order the functions, otherwise the full comparison is done.

Both of these are helpful in speeding up MergeFunctions. Together they result in
speedups of 9% for mysqld (a mostly C application with little redundancy), 46%
for libxul in Firefox, and 117% for Chromium. (These are all LTO builds.) In all
three cases, the new speed of MergeFunctions is about half that of the module
verifier, making it relatively inexpensive even for large LTO builds with
hundreds of thousands of functions. The same functions are merged, so this
change is free performance.

Author: jrkoenig

Reviewers: nlewycky, dschuff, jfb

Subscribers: llvm-commits, aemerson

Differential revision: http://reviews.llvm.org/D11923

llvm-svn: 245140
2015-08-15 01:18:18 +00:00
David Majnemer
10f2d9234b [IR] Add token types
This introduces the basic functionality to support "token types".
The motivation stems from the need to perform operations on a Value
whose provenance cannot be obscured.

There are several applications for such a type but my immediate
motivation stems from WinEH.  Our personality routine enforces a
single-entry - single-exit regime for cleanups.  After several rounds of
optimizations, we may be left with a terminator whose "cleanup-entry
block" is not entirely clear because control flow has merged two
cleanups together.  We have experimented with using labels as operands
inside of instructions which are not terminators to indicate where we
came from but found that LLVM does not expect such exotic uses of
BasicBlocks.

Instead, we can use this new type to clearly associate the "entry point"
and "exit point" of our cleanup.  This is done by having the cleanuppad
yield a Token and consuming it at the cleanupret.
The token type makes it impossible to obscure or otherwise hide the
Value, making it trivial to track the relationship between the two
points.

What is the burden to the optimizer?  Well, it turns out we have already
paid down this cost by accepting that there are certain calls that we
are not permitted to duplicate, optimizations have to watch out for
such instructions anyway.  There are additional places in the optimizer
that we will probably have to update but early examination has given me
the impression that this will not be heroic.

Differential Revision: http://reviews.llvm.org/D11861

llvm-svn: 245029
2015-08-14 05:09:07 +00:00
Chandler Carruth
5effbccc9f [PM/AA] Extract the interface for GlobalsModRef into a header along with
its creation function.

This required shifting a bunch of method definitions to be out-of-line
so that we could leave most of the implementation guts in the .cpp file.

llvm-svn: 245021
2015-08-14 03:48:20 +00:00
Chandler Carruth
ac11f6dc12 [PM/AA] Hoist the interface to TBAA into a dedicated header along with
its creation function. Update the relevant includes accordingly.

llvm-svn: 245019
2015-08-14 03:33:48 +00:00
Chandler Carruth
0e1aede735 [PM/AA] Hoist ScopedNoAliasAA's interface into a header and move the
creation function there.

Same basic refactoring as the other alias analyses. Nothing special
required this time around.

llvm-svn: 245012
2015-08-14 02:55:50 +00:00
Chandler Carruth
394485dd25 [PM/AA] Extract a minimal interface for CFLAA to its own header file.
I've used forward declarations and reorderd the source code some to make
this reasonably clean and keep as much of the code as possible in the
source file, including all the stratified set details. Just the basic AA
interface and the create function are in the header file, and the header
file is now included into the relevant locations.

llvm-svn: 245009
2015-08-14 02:42:20 +00:00
Teresa Johnson
a8fc4e7498 Enable EliminateAvailableExternally pass in the LTO pipeline.
Summary:
For LTO we need to enable this pass in the LTO pipeline,
as it is skipped during the "-flto -c" compile step (when PrepareForLTO is
set).

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11919

llvm-svn: 244622
2015-08-11 16:26:41 +00:00
Sanjay Patel
39f0213518 Variable names should start with an upper case letter; NFC
llvm-svn: 244618
2015-08-11 16:05:43 +00:00
Chandler Carruth
a0655c50ee [PM/AA] Hoist the interface for BasicAA into a header file.
This is the first mechanical step in preparation for making this and all
the other alias analysis passes available to the new pass manager. I'm
factoring out all the totally boring changes I can so I'm moving code
around here with no other changes. I've even minimized the formatting
churn.

I'll reformat and freshen comments on the interface now that its located
in the right place so that the substantive changes don't triger this.

llvm-svn: 244197
2015-08-06 07:33:15 +00:00
David Blaikie
4b09d11d27 -Wdeprecated cleanup: Make CallGraph movable by default by using unique_ptr members rather than raw pointers.
The only place that tries to return a CallGraph by value
(CallGraphAnalysis::run) doesn't seem to be used right now, but it's a
reasonable bit of cleanup anyway.

llvm-svn: 244122
2015-08-05 20:55:50 +00:00
Sanjay Patel
83e1c48540 wrap OptSize and MinSize attributes for easier and consistent access (NFCI)
Create wrapper methods in the Function class for the OptimizeForSize and MinSize
attributes. We want to hide the logic of "or'ing" them together when optimizing
just for size (-Os).

Currently, we are not consistent about this and rely on a front-end to always set
OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here
that should be added as follow-on patches with regression tests.

This patch is NFC-intended: it just replaces existing direct accesses of the attributes
by the equivalent wrapper call.

Differential Revision: http://reviews.llvm.org/D11734

llvm-svn: 243994
2015-08-04 15:49:57 +00:00
David Majnemer
cce4d2aeb3 Drive-by fixes for LandingPad -> EHPad
This change was done as an audit and is by inspection.  The new EH
system is still very much a work in progress.  NFC for the landingpad
case.

llvm-svn: 243965
2015-08-04 08:21:40 +00:00
Craig Topper
bbb2ce25cc De-constify pointers to Type since they can't be modified. NFC
This was already done in most places a while ago. This just fixes the ones that crept in over time.

llvm-svn: 243842
2015-08-01 22:20:21 +00:00
Peter Collingbourne
fa2563134a LowerBitSets: Add debugging output.
Differential Revision: http://reviews.llvm.org/D11583

llvm-svn: 243546
2015-07-29 18:12:36 +00:00
Diego Novillo
0a1bb40d4c Remove unused variable. NFC.
llvm-svn: 243145
2015-07-24 19:18:32 +00:00
Jingyue Wu
344082ead8 Remove the user-count threshold when analyzing read attributes
Summary:
This threshold limited FunctionAttrs ability to prove arguments to be read-only. 
In NVPTX, a specialized instruction ld.global.nc can be used to load memory
with non-coherent texture cache. We notice that in SHOC [1] benchmark, some
function arguments are not marked with readonly because FunctionAttrs reaches
a hardcoded threshold when analysis uses.

Removing this threshold won't cause significant regression in compilation time, because the worst-case time complexity of the algorithm is still O(# of instructions) for each parameter.

Patched by Xuetian Weng.  

[1] https://github.com/vetter/shoc

Reviewers: nlewycky, jingyue, nicholas

Subscribers: nicholas, test, llvm-commits

Differential Revision: http://reviews.llvm.org/D11311

llvm-svn: 243141
2015-07-24 19:05:53 +00:00
Pete Cooper
31257c8c3c Use foreach loops for StructType::elements(). NFC.
We had a few places where we did

for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) {

but those could instead do

for (auto *EltTy : STy->elements()) {

llvm-svn: 243136
2015-07-24 18:55:49 +00:00
Chandler Carruth
efdadcc65a [GMR] Add a late run of GlobalsModRef to the main pass pipeline behind
the general GMR-in-non-LTO flag.

Without this, we have the global information during the CGSCC pipeline
for GVN and such, but don't have it available during the late loop
optimizations such as the vectorizer. Moreover, after the CGSCC pipeline
has finished we have substantially more accurate and refined call graph
information, function annotations, etc, which will make GMR even more
powerful than it is early in the pipelien.

Note that we have to play silly games with preserving AliasAnalysis
(which is now trivially preserved) in order to let a module analysis
magically be preserved into the entire function pass pipeline.
Simultaneously we have to not make GMR an immutable pass in order to be
able to re-run it and collect fresh data on the final call graph.

llvm-svn: 242999
2015-07-23 09:34:01 +00:00
Chandler Carruth
2e896f4f08 [PM/AA] Extract the ModRef enums from the AliasAnalysis class in
preparation for de-coupling the AA implementations.

In order to do this, they had to become fake-scoped using the
traditional LLVM pattern of a leading initialism. These can't be actual
scoped enumerations because they're bitfields and thus inherently we use
them as integers.

I've also renamed the behavior enums that are specific to reasoning
about the mod/ref behavior of functions when called. This makes it more
clear that they have a very narrow domain of applicability.

I think there is a significantly cleaner API for all of this, but
I don't want to try to do really substantive changes for now, I just
want to refactor the things away from analysis groups so I'm preserving
the exact original design and just cleaning up the names, style, and
lifting out of the class.

Differential Revision: http://reviews.llvm.org/D10564

llvm-svn: 242963
2015-07-22 23:15:57 +00:00
Anthony Pesch
769ee8846e Revert "Improve merging of stores from static constructors in GlobalOpt"
This reverts commit 0a9dee959a30b81b9e7df64c9a58ff9898c24024.

llvm-svn: 242954
2015-07-22 22:26:54 +00:00
Anthony Pesch
acd0c70ff9 Revert "IPO: Avoid brace initialization of a map, some versions of libc++ don't like it"
This reverts commit fc2dad0c68f8d32273d3c2d790ed496961f829af.

llvm-svn: 242953
2015-07-22 22:26:52 +00:00
Justin Bogner
e1590d2f22 IPO: Avoid brace initialization of a map, some versions of libc++ don't like it
Should fix the build failure on these darwin bots:

http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_build/12427/
http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/10389/

llvm-svn: 242945
2015-07-22 21:41:12 +00:00
Anthony Pesch
7bfb3a910e Improve merging of stores from static constructors in GlobalOpt
Summary:
While working on a project I wound up generating a fairly large lookup table (10k entries) of callbacks inside of a static constructor. Clang was taking upwards of ~10 minutes to compile the lookup table. I generated a smaller test case (http://www.inolen.com/static_initializer_test.ll) that, after running with -ftime-report, pointed fingers at GlobalOpt and MemCpyOptimizer.

Running globalopt took around ~9 minutes. The slowdown came from how GlobalOpt merged stores from static constructors individually into the global initializer in EvaluateStaticConstructor. For each store it discovered and wanted to commit, it would copy the existing global initializer and then merge in the individual store. I changed this so that stores are now grouped by global, and sorted from most significant to least significant by their GEP indexes (e.g. a store to GEP 0, 0 comes before GEP 0, 0, 1). With this representation, the existing initializer can be copied and all new stores merged into it in a single pass.

With this patch and http://reviews.llvm.org/D11198, the lookup table that was taking ~10 minutes to compile now compiles in around 5 seconds. I've ran 'make check' and the test-suite, which all passed.

I'm not really sure who to tag as a reviewer, Lang mentioned that Chandler may be appropriate.

Reviewers: chandlerc, nlewycky

Subscribers: nlewycky, llvm-commits

Differential Revision: http://reviews.llvm.org/D11200

llvm-svn: 242935
2015-07-22 21:10:45 +00:00
Anthony Pesch
78b72fd3c8 Test commit, added blank line
llvm-svn: 242923
2015-07-22 18:50:10 +00:00
Chandler Carruth
1665e31f7b [GMR] Add a flag to enable GlobalsModRef in the normal compilation
pipeline.

Even before I started improving its runtime, it was already crazy fast
once the call graph exists, and if we can get it to be conservatively
correct, will still likely catch a lot of interesting and useful cases.
So it may well be useful to enable by default.

But more importantly for me, this should make it easier for me to test
that changes aren't breaking it in fundamental ways by enabling it for
normal builds.

llvm-svn: 242895
2015-07-22 11:57:28 +00:00
Chandler Carruth
cdb8301de0 [PM/AA] Remove the last of the legacy update API from AliasAnalysis as
part of simplifying its interface and usage in preparation for porting
to work with the new pass manager.

Note that this will likely expose that we have dead arguments, members,
and maybe even pass requirements for AA. I'll be cleaning those up in
seperate patches. This just zaps the actual update API.

Differential Revision: http://reviews.llvm.org/D11325

llvm-svn: 242881
2015-07-22 09:49:59 +00:00
Arnold Schwaighofer
dbca53379c MergeFunc: Transfer the callee's attributes when replacing a direct caller
We insert a bitcast which obfuscates the getCalledFunction for the utility
function which looks up attributes from the called function. Loosing ABI
changing parameter attributes is a bad thing.

rdar://21516488

llvm-svn: 242807
2015-07-21 17:07:07 +00:00
Arnold Schwaighofer
a066a47d7f Revert "MergeFuncs: Transfer the function parameter attributes to the call site"
It is okay to not transfer parameter attributes.

This reverts commit r242558.

llvm-svn: 242646
2015-07-19 19:30:43 +00:00
Yaron Keren
f08ef41c26 Narrow Callee scope, suggestion from David Blaikie.
llvm-svn: 242644
2015-07-19 15:48:07 +00:00
Yaron Keren
e53c5e66f9 De-duplicate CS.getCalledFunction() expression.
Not sure if the optimizer will save the call as getCalledFunction()
is not a trivial access function but the code is clearer this way.

llvm-svn: 242641
2015-07-19 11:52:02 +00:00
Yaron Keren
85b5639b0e Rangify for loops in GlobalDCE, NFC.
llvm-svn: 242619
2015-07-18 19:57:34 +00:00
Arnold Schwaighofer
f23a39e929 MergeFuncs: Transfer the function parameter attributes to the call site
rdar://21516488

llvm-svn: 242558
2015-07-17 18:59:08 +00:00
Peter Collingbourne
0d8bc759d0 Internalize: internalize comdat members as a group, and drop comdat on such members.
Internalizing an individual comdat group member without also internalizing
the other members of the comdat can break comdat semantics. For example,
if a module contains a reference to an internalized comdat member, and the
linker chooses a comdat group from a different object file, this will break
the reference to the internalized member.

This change causes the internalizer to only internalize comdat members if all
other members of the comdat are not externally visible. Once a comdat group
has been fully internalized, there is no need to apply comdat rules to its
members; later optimization passes (e.g. globaldce) can legally drop individual
members of the comdat. So we drop the comdat attribute from all comdat members.

Differential Revision: http://reviews.llvm.org/D10679

llvm-svn: 242423
2015-07-16 17:42:21 +00:00
Tobias Grosser
12abedf39e Add PM extension point EP_VectorizerStart
This extension point allows passes to be executed right before the vectorizer
and other highly target specific optimizations are run.

llvm-svn: 242389
2015-07-16 08:20:37 +00:00
JF Bastien
a8f830427c Fix mergefunc infinite loop
Self-referential constants containing references to a merged function
no longer cause the MergeFunctions pass to infinite loop. Also adds a
reproduction IR which would otherwise fail, which was isolated from a similar
issue in Chromium.

Author: jrkoenig
Reviewers: nlewycky, jfb
Subscribers: llvm-commits, nlewycky, jfb

Differential Revision: http://reviews.llvm.org/D11208

llvm-svn: 242337
2015-07-15 21:51:33 +00:00
Rafael Espindola
9cb6e1b395 Remove unused variable.
Sorry I missed it in the previous commit.

llvm-svn: 242032
2015-07-13 14:43:33 +00:00
Rafael Espindola
a3587e99c3 Aliases don't have available_externally linkage.
Allowing that is probably a good idea, but currently we don't, so
this is dead code.

llvm-svn: 242031
2015-07-13 14:39:02 +00:00
Rafael Espindola
5af0439872 Don't change the visibility when converting a definition to a declaration.
llvm-svn: 242030
2015-07-13 14:18:22 +00:00
Chandler Carruth
c98a5f7bff [PM/AA] Completely remove the AliasAnalysis::copyValue interface.
No in-tree alias analysis used this facility, and it was not called in
any particularly rigorous way, so it seems unlikely to be correct.

Note that one of the only stateful AA implementations in-tree,
GlobalsModRef is completely broken currently (and any AA passes like it
are equally broken) because Module AA passes are not effectively
invalidated when a function pass that fails to update the AA stack runs.

Ultimately, it doesn't seem like we know how we want to build stateful
AA, and until then trying to support and maintain correctness for an
untested API is essentially impossible. To that end, I'm planning to rip
out all of the update API. It can return if and when we need it and know
how to build it on top of the new pass manager and as part of *tested*
stateful AA implementations in the tree.

Differential Revision: http://reviews.llvm.org/D10889

llvm-svn: 241975
2015-07-11 04:39:00 +00:00
Alexey Bataev
d7edeff54f Disable loop re-rotation for -Oz (patch by Andrey Turetsky)
After changes in rL231820 loop re-rotation is performed even in -Oz mode. Since loop rotation is disabled for -Oz, it seems loop re-rotation should be disabled too.
Differential Revision: http://reviews.llvm.org/D10961

llvm-svn: 241897
2015-07-10 10:37:09 +00:00
Reid Kleckner
9e47912607 [llvm-extract] Drop comdats from declarations
The verifier rejects comdats on declarations.

llvm-svn: 241483
2015-07-06 18:48:02 +00:00
Teresa Johnson
edf145601a Resubmit "Add new EliminateAvailableExternally module pass" (r239480)
This change includes a fix for https://code.google.com/p/chromium/issues/detail?id=499508#c3,
which required updating the visibility for symbols with eliminated definitions.

--Original Commit Message--

Add new EliminateAvailableExternally module pass, which is performed in
O2 compiles just before GlobalDCE, unless we are preparing for LTO.

This pass eliminates available externally globals (turning them into
declarations), regardless of whether they are dead/unreferenced, since
we are guaranteed to have a copy available elsewhere at link time.
This enables additional opportunities for GlobalDCE.

If we are preparing for LTO (e.g. a -flto -c compile), the pass is not
included as we want to preserve available externally functions for possible
link time inlining. The FE indicates whether we are doing an -flto compile
via the new PrepareForLTO flag on the PassManagerBuilder.

llvm-svn: 241466
2015-07-06 16:22:42 +00:00
Peter Collingbourne
f49ef7d3ac IR: Do not consider available_externally linkage to be linker-weak.
From the linker's perspective, an available_externally global is equivalent
to an external declaration (per isDeclarationForLinker()), so it is incorrect
to consider it to be a weak definition.

Also clean up some logic in the dead argument elimination pass and clarify
its comments to better explain how its behavior depends on linkage,
introduce GlobalValue::isStrongDefinitionForLinker() and start using
it throughout the optimizers and backend.

Differential Revision: http://reviews.llvm.org/D10941

llvm-svn: 241413
2015-07-05 20:52:35 +00:00
Yaron Keren
024a437e42 Remove whitespace from start of line, NFC.
llvm-svn: 241268
2015-07-02 14:25:09 +00:00
David Majnemer
bce3d2c083 [PruneEH] A naked, noinline function can return via InlineAsm
The PruneEH pass tries to annotate functions as 'noreturn' if it doesn't
see a ReturnInst.  However, a naked function containing inline assembly
can contain control flow leaving the function.

This fixes PR23971.

llvm-svn: 240876
2015-06-27 07:52:53 +00:00
Peter Collingbourne
2603d36970 LowerBitSets: Ignore bitset entries that do not directly refer to a global.
It is possible for a global to be substituted with another global of a
different type or a different kind (i.e. an alias) at IR link time. One
example of this scenario is when a Microsoft ABI vtable is substituted with
an alias referring to a larger vtable containing an RTTI reference.

This will cause the global to be RAUW'd with a possibly bitcasted reference
to the other global. This will of course also affect any references to the
global in bitset metadata.

The right way to handle such metadata is simply to ignore it. This is sound
because the linked module should contain another copy of the bitset entries as
applied to the new global.

llvm-svn: 240866
2015-06-27 00:17:51 +00:00
Pete Cooper
c2ffa0891f Use foreach loop over constant operands. NFC.
A number of places had explicit loops over Constant::operands().
Just use foreach loops where possible.

llvm-svn: 240694
2015-06-25 20:51:38 +00:00
Yaron Keren
dce18afe3b Rangify for loop in Inliner.cpp. NFC.
llvm-svn: 240678
2015-06-25 19:28:24 +00:00
Alexander Kornienko
f993659b8f Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
Apparently, the style needs to be agreed upon first.

llvm-svn: 240390
2015-06-23 09:49:53 +00:00
Yaron Keren
a1bf88d988 Rangify for loops in Inliner::runOnSCC(), NFC.
llvm-svn: 240215
2015-06-20 07:12:33 +00:00
Alexander Kornienko
40cb19d802 Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:

tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
  -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
  llvm/lib/


Thanks to Eugene Kosov for the original patch!

llvm-svn: 240137
2015-06-19 15:57:42 +00:00
David Majnemer
c8b1f095a3 Move the personality function from LandingPadInst to Function
The personality routine currently lives in the LandingPadInst.

This isn't desirable because:
- All LandingPadInsts in the same function must have the same
  personality routine.  This means that each LandingPadInst beyond the
  first has an operand which produces no additional information.

- There is ongoing work to introduce EH IR constructs other than
  LandingPadInst.  Moving the personality routine off of any one
  particular Instruction and onto the parent function seems a lot better
  than have N different places a personality function can sneak onto an
  exceptional function.

Differential Revision: http://reviews.llvm.org/D10429

llvm-svn: 239940
2015-06-17 20:52:32 +00:00
Peter Collingbourne
e555c294f0 LowerBitSets: Do not assign names to aliases of unnamed bitset element objects.
The restriction on unnamed aliases was removed in r239921. Mostly reverts
r239590, but we keep the test.

llvm-svn: 239923
2015-06-17 18:31:02 +00:00
Chandler Carruth
aa98916d54 [PM/AA] Remove the UnknownSize static member from AliasAnalysis.
This is now living in MemoryLocation, which is what it pertains to. It
is also an enum there rather than a static data member which is left
never defined.

llvm-svn: 239886
2015-06-17 07:21:38 +00:00
Chandler Carruth
cc1aae13e7 [PM/AA] Remove the Location typedef from the AliasAnalysis class now
that it is its own entity in the form of MemoryLocation, and update all
the callers.

This is an entirely mechanical change. References to "Location" within
AA subclases become "MemoryLocation", and elsewhere
"AliasAnalysis::Location" becomes "MemoryLocation". Hope that helps
out-of-tree folks update.

llvm-svn: 239885
2015-06-17 07:18:54 +00:00
Peter Collingbourne
ea9bf98c05 Protection against stack-based memory corruption errors using SafeStack
This patch adds the safe stack instrumentation pass to LLVM, which separates
the program stack into a safe stack, which stores return addresses, register
spills, and local variables that are statically verified to be accessed
in a safe way, and the unsafe stack, which stores everything else. Such
separation makes it much harder for an attacker to corrupt objects on the
safe stack, including function pointers stored in spilled registers and
return addresses. You can find more information about the safe stack, as
well as other parts of or control-flow hijack protection technique in our
OSDI paper on code-pointer integrity (http://dslab.epfl.ch/pubs/cpi.pdf)
and our project website (http://levee.epfl.ch).

The overhead of our implementation of the safe stack is very close to zero
(0.01% on the Phoronix benchmarks). This is lower than the overhead of
stack cookies, which are supported by LLVM and are commonly used today,
yet the security guarantees of the safe stack are strictly stronger than
stack cookies. In some cases, the safe stack improves performance due to
better cache locality.

Our current implementation of the safe stack is stable and robust, we
used it to recompile multiple projects on Linux including Chromium, and
we also recompiled the entire FreeBSD user-space system and more than 100
packages. We ran unit tests on the FreeBSD system and many of the packages
and observed no errors caused by the safe stack. The safe stack is also fully
binary compatible with non-instrumented code and can be applied to parts of
a program selectively.

This patch is our implementation of the safe stack on top of LLVM. The
patches make the following changes:

- Add the safestack function attribute, similar to the ssp, sspstrong and
  sspreq attributes.

- Add the SafeStack instrumentation pass that applies the safe stack to all
  functions that have the safestack attribute. This pass moves all unsafe local
  variables to the unsafe stack with a separate stack pointer, whereas all
  safe variables remain on the regular stack that is managed by LLVM as usual.

- Invoke the pass as the last stage before code generation (at the same time
  the existing cookie-based stack protector pass is invoked).

- Add unit tests for the safe stack.

Original patch by Volodymyr Kuznetsov and others at the Dependable Systems
Lab at EPFL; updates and upstreaming by myself.

Differential Revision: http://reviews.llvm.org/D6094

llvm-svn: 239761
2015-06-15 21:07:11 +00:00
Peter Collingbourne
06ca544fd2 LowerBitSets: Give names to aliases of unnamed bitset element objects.
It is valid for globals to be unnamed, but aliases must have a name. To avoid
creating invalid IR, we need to assign names to any aliases we create that
point to unnamed objects that have been moved into combined globals.

llvm-svn: 239590
2015-06-12 03:25:05 +00:00
Teresa Johnson
a9e6ea6582 Revert commit r239480 as it causes https://code.google.com/p/chromium/issues/detail?id=499508#c3.
llvm-svn: 239589
2015-06-12 03:12:00 +00:00
Peter Collingbourne
1285efe0c1 ArgumentPromotion: Drop sret attribute on functions that are only called directly.
If the first argument to a function is a 'this' argument and the second
has the sret attribute, the ArgumentPromotion pass may promote the 'this'
argument to more than one argument, violating the IR constraint that 'sret'
may only be applied to the first or second argument.

Although this IR constraint is arguably unnecessary, it highlighted the fact
that ArgPromotion does not need to preserve this attribute. Dropping the
attribute reduces register pressure in the backend by avoiding the register
copy required by sret. Because sret implies noalias, we also replace the
former with the latter.

Differential Revision: http://reviews.llvm.org/D10353

llvm-svn: 239488
2015-06-10 21:14:34 +00:00
Teresa Johnson
3b11c8e6e3 Add new EliminateAvailableExternally module pass, which is performed in
O2 compiles just before GlobalDCE, unless we are preparing for LTO.

This pass eliminates available externally globals (turning them into
declarations), regardless of whether they are dead/unreferenced, since
we are guaranteed to have a copy available elsewhere at link time.
This enables additional opportunities for GlobalDCE.

If we are preparing for LTO (e.g. a -flto -c compile), the pass is not
included as we want to preserve available externally functions for possible
link time inlining. The FE indicates whether we are doing an -flto compile
via the new PrepareForLTO flag on the PassManagerBuilder.

llvm-svn: 239480
2015-06-10 17:49:28 +00:00
Akira Hatanaka
c42437a4f8 Remove DisableTailCalls from TargetOptions and the code in resetTargetOptions
that was resetting it.

Remove the uses of DisableTailCalls in subclasses of TargetLowering and use
the value of function attribute "disable-tail-calls" instead. Also,
unconditionally add pass TailCallElim to the pipeline and check the function
attribute at the start of runOnFunction to disable the pass on a per-function
basis. 
 
This is part of the work to remove TargetMachine::resetTargetOptions, and since
DisableTailCalls was the last non-fast-math option that was being reset in that
function, we should be able to remove the function entirely after the work to
propagate IR-level fast-math flags to DAG nodes is completed.

Out-of-tree users should remove the uses of DisableTailCalls and make changes
to attach attribute "disable-tail-calls"="true" or "false" to the functions in
the IR.

rdar://problem/13752163

Differential Revision: http://reviews.llvm.org/D10099

llvm-svn: 239427
2015-06-09 19:07:19 +00:00
Arnold Schwaighofer
215d830ec3 MergeFunctions: Don't replace a weak function use by another equivalent weak function
We don't know whether the weak functions definition is the definitive definition.

rdar://21303727

llvm-svn: 239422
2015-06-09 18:19:17 +00:00
Denis Protivensky
4582c79879 MergeFunctions: Fix gcc warning in condition
llvm-svn: 239391
2015-06-09 09:28:37 +00:00
Arnold Schwaighofer
359de6b293 Fix unused variable warning
llvm-svn: 239369
2015-06-09 00:17:40 +00:00
Arnold Schwaighofer
50df132702 MergeFunctions: Impose a total order on the replacement of functions
We don't want to replace function A by Function B in one module and Function B
by Function A in another module.

If these functions are marked with linkonce_odr we would end up with a function
stub calling B in one module and a function stub calling A in another module. If
the linker decides to pick these two we will have two stubs calling each other.

rdar://21265586

llvm-svn: 239367
2015-06-09 00:03:29 +00:00
Chandler Carruth
935258c90a [PM/AA] Start refactoring AliasAnalysis to remove the analysis group and
port it to the new pass manager.

All this does is extract the inner "location" class used by AA into its
own full fledged type. This seems *much* cleaner as MemoryDependence and
soon MemorySSA also use this heavily, and it doesn't make much sense
being inside the AA infrastructure.

This will also make it much easier to break apart the AA infrastructure
into something that stands on its own rather than using the analysis
group design.

There are a few places where this makes APIs not make sense -- they were
taking an AliasAnalysis pointer just to build locations. I'll try to
clean those up in follow-up commits.

Differential Revision: http://reviews.llvm.org/D10228

llvm-svn: 239003
2015-06-04 02:03:15 +00:00
Benjamin Kramer
0e31955b32 Replace push_back(Constructor(foo)) with emplace_back(foo) for non-trivial types
If the type isn't trivially moveable emplace can skip a potentially
expensive move. It also saves a couple of characters.


Call sites were found with the ASTMatcher + some semi-automated cleanup.

memberCallExpr(
    argumentCountIs(1), callee(methodDecl(hasName("push_back"))),
    on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))),
    hasArgument(0, bindTemporaryExpr(
                       hasType(recordDecl(hasNonTrivialDestructor())),
                       has(constructExpr()))),
    unless(isInTemplateInstantiation()))

No functional change intended.

llvm-svn: 238602
2015-05-29 19:43:39 +00:00
Benjamin Kramer
5a56e44e39 Don't call utostr in Twine/raw_ostream contexts.
Creating temporary std::strings there is unnecessary.

llvm-svn: 238412
2015-05-28 11:24:24 +00:00
Bjorn Steinbrink
8603eb4818 Remove conflicting attributes before adding deduced readonly/readnone
Summary:
In case of functions that have a pointer argument and only pass it to
each other, the function attributes pass deduces that the pointer should
get the readnone attribute, but fails to remove a readonly attribute
that may already have been present.

Reviewers: nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9995

llvm-svn: 238152
2015-05-25 19:46:38 +00:00
Wei Mi
cb1cc5a43c Remove the InstructionSimplifierPass immediately after InstructionCombiningPass.
InstructionCombiningPass was added after LoopUnrollPass in r237395. Because
InstructionCombiningPass is strictly more powerful than InstructionSimplifierPass,
remove the unnecessary InstructionSimplifierPass.

Differential Revision: http://reviews.llvm.org/D9838

llvm-svn: 237702
2015-05-19 16:09:11 +00:00
Jingyue Wu
e40608ff17 [NFC] remove an extra new line
llvm-svn: 237462
2015-05-15 18:32:21 +00:00
Jingyue Wu
d795ba5ad9 Add a speculative execution pass
Summary:
This is a pass for speculative execution of instructions for simple if-then (triangle) control flow. It's aimed at GPUs, but could perhaps be used in other contexts. Enabling this pass gives us a 1.0% geomean improvement on Google benchmark suites, with one benchmark improving 33%.

Credit goes to Jingyue Wu for writing an earlier version of this pass.

Patched by Bjarke Roune. 

Test Plan:
This patch adds a set of tests in test/Transforms/SpeculativeExecution/spec.ll
The pass is controlled by a flag which defaults to having the pass not run.

Reviewers: eliben, dberlin, meheff, jingyue, hfinkel

Reviewed By: jingyue, hfinkel

Subscribers: majnemer, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9360

llvm-svn: 237459
2015-05-15 17:54:48 +00:00
Wei Mi
ce82536353 Add another InstCombine pass after LoopUnroll.
This is to cleanup some redundency generated by LoopUnroll pass. Such redundency may not be cleaned up by existing passes after LoopUnroll.

Differential Revision: http://reviews.llvm.org/D9777

llvm-svn: 237395
2015-05-14 22:02:54 +00:00
Adam Nemet
b3422c24d4 New Loop Distribution pass
Summary:
This implements the initial version as was proposed earlier this year
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-January/080462.html).
Since then Loop Access Analysis was split out from the Loop Vectorizer
and was made into a separate analysis pass.  Loop Distribution becomes
the second user of this analysis.

The pass is off by default and can be enabled
with -enable-loop-distribution.  There is currently no notion of
profitability; if there is a loop with dependence cycles, the pass will
try to split them off from other memory operations into a separate loop.

I decided to remove the control-dependence calculation from this first
version.  This and the issues with the PDT are actively discussed so it
probably makes sense to treat it separately.  Right now I just mark all
terminator instruction required which keeps identical CFGs for each
distributed loop.  This seems to be working pretty well for 456.hmmer
where even though there is an empty if-then block in the distributed
loop initially, it gets completely removed.

The pass keeps DominatorTree and LoopInfo updated.  I've tested this
with -loop-distribute-verify with the testsuite where we distribute ~90
loops.  SimplifyLoop is violated in some cases and I have a FIXME
covering this.

Reviewers: hfinkel, nadav, aschwaighofer

Reviewed By: aschwaighofer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8831

llvm-svn: 237358
2015-05-14 12:05:18 +00:00
Arnold Schwaighofer
f4e875470e MergeFunctions: Two different sized allocas are *not* the same
llvm-svn: 237193
2015-05-12 21:42:22 +00:00
Pete Cooper
cd94898d6b Convert PHI getIncomingValue() to foreach over incoming_values(). NFC.
We already had a method to iterate over all the incoming values of a PHI.  This just changes all eligible code to use it.

Ineligible code included anything which cared about the index, or was also trying to get the i'th incoming BB.

llvm-svn: 237169
2015-05-12 20:05:31 +00:00
David Blaikie
bcf1246376 Recommit r236670: [opaque pointer type] Pass explicit pointer type through GEP constant folding""
Clang regressions were caused by more stringent assertion checking
introduced by this change. Small fix needed to clang has been committed
in r236751.

llvm-svn: 236752
2015-05-07 17:28:58 +00:00
David Blaikie
78bd95f7d4 Revert "[opaque pointer type] Pass explicit pointer type through GEP constant folding"
Causes regressions in Clang. Reverting while I investigate.

This reverts commit r236670.

llvm-svn: 236678
2015-05-06 23:56:21 +00:00
David Blaikie
70e0090e88 [opaque pointer type] Pass explicit pointer type through GEP constant folding
llvm-svn: 236670
2015-05-06 23:49:14 +00:00
Pete Cooper
762f70e897 Change typeIncompatible to return an AttrBuilder instead of new-ing an AttributeSet.
This makes use of the new API which can remove attributes from a set given a builder.

This is much faster than creating a temporary set and reduces llc time by about 0.3% which was all spent creating temporary attributes sets on the context.

llvm-svn: 236668
2015-05-06 23:19:56 +00:00
David Majnemer
06d0bf99d8 [Inliner] Discard empty COMDAT groups
COMDAT groups which have become rendered unused because of inline are
discardable if we can prove that we've made the group empty.

This fixes PR22285.

llvm-svn: 236539
2015-05-05 20:14:22 +00:00
David Blaikie
0f91b70796 [opaque pointer type] Pass GlobalAlias the actual pointer type rather than decomposing it into pointee type + address space
Many of the callers already have the pointer type anyway, and for the
couple of callers that don't it's pretty easy to call PointerType::get
on the pointee type and address space.

This avoids LLParser from using PointerType::getElementType when parsing
GlobalAliases from IR.

llvm-svn: 236160
2015-04-29 21:22:39 +00:00