mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00
2497 Commits

Author SHA1 Message Date
Justin Lebar
f84464712c Revert "[attrs] Handle convergent CallSites."
This reverts r261544, which was causing a test failure in
Transforms/FunctionAttrs/readattrs.ll.

llvm-svn: 261549
2016-02-22 18:24:43 +00:00
Justin Lebar
ca379cda9f [attrs] Handle convergent CallSites.
Summary:
Previously we had a notion of convergent functions but not of convergent
calls.  This is insufficient to correctly analyze calls where the target
is unknown, e.g. indirect calls.

Now a call is convergent if it targets a known-convergent function, or
if it's explicitly marked as convergent.  As usual, we can remove
convergent where we can prove that no convergent operations are
performed in the call.

Reviewers: chandlerc, jingyue

Subscribers: hfinkel, jhen, tra, llvm-commits

Differential Revision: http://reviews.llvm.org/D17317

llvm-svn: 261544
2016-02-22 17:51:35 +00:00
Duncan P. N. Exon Smith
d5e432aea7 ADT: Remove == and != comparisons between ilist iterators and pointers
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.

Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators.  (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)

There should be NFC here.

llvm-svn: 261498
2016-02-21 20:39:50 +00:00
Richard Trieu
5a759985de Remove uses of builtin comma operator.
Cleanup for upcoming Clang warning -Wcomma.  No functionality change intended.

llvm-svn: 261270
2016-02-18 22:09:30 +00:00
Chandler Carruth
d8a5b5b32e [PM] Port the PostOrderFunctionAttrs pass to the new pass manager and
convert one test to use this.

This is a particularly significant milestone because it required
a working per-function AA framework which can be queried over each
function from within a CGSCC transform pass (and additionally a module
analysis to be accessible). This is essentially *the* point of the
entire pass manager rewrite. A CGSCC transform is able to query for
multiple different functions' analysis results. It works. The whole
thing appears to actually work and accomplish the original goal. While
we were able to hack function attrs and basic-aa to "work" in the old
pass manager, this port doesn't use any of that, it directly leverages
the new fundamental functionality.

For this to work, the CGSCC framework also has to support SCC-based
behavior analysis, etc. The only parts of the CGSCC pass infrastructure
not sorted out at this point are the updates in the face of inlining and
running function passes that mutate the call graph.

The changes are pretty boring and boiler-plate. Most of the work was
factored into more focused preparatory patches. But this is what wires
it all together.

llvm-svn: 261203
2016-02-18 11:03:11 +00:00
Mehdi Amini
6f127aff71 Define the ThinLTO Pipeline (experimental)
Summary:
Contrary to Full LTO, ThinLTO can afford to shift compile time
from the frontend to the linker: both phases are parallel (even if
it is not totally "free": projects like clang reuse products
of the "compile phase" for multiple links, think about
libLLVMSupport reused for opt, llc, etc.).

This pipeline is based on the proposal in D13443 for full LTO. We
didn't move forward on this proposal because the LTO link was far too
long after that. We believe that we can afford it with ThinLTO.

The ThinLTO pipeline integrates in the regular O2/O3 flow:

 - The compile phase runs the inliner with a somewhat lighter
   function simplification. (TODO: tune the inliner thresholds here)
   This is intended to simplify the IR and get rid of obvious things
   like linkonce_odr that will be inlined.
 - The link phase will run the pipeline from the start, extended with
   some specific passes that leverage the augmented knowledge we have
   during LTO. Especially after the inliner is done, a sequence of
   globalDCE/globalOpt is performed, followed by another run of the
   "function simplification" passes. It is not clear if this part
   of the pipeline will stay as is, as the split model of ThinLTO
   does not allow the same benefit as FullLTO without added tricks.

The measurements on the public test suite as well as on our internal
suite show an overall net improvement. The binary size for the clang
executable is reduced by 5%. We're still tuning it with the bringup
of ThinLTO and it will evolve, but this should provide a good starting
point.

Reviewers: tejohnson

Differential Revision: http://reviews.llvm.org/D17115

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 261029
2016-02-16 23:02:29 +00:00
Mehdi Amini
639bf1e488 Refactor the PassManagerBuilder: extract a "addFunctionSimplificationPasses()" (NFC)
It is intended to contain the passes run over a function after the
inliner is done with a function and before it moves to its callers.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 261028
2016-02-16 22:54:27 +00:00
Benjamin Kramer
dceab201ec Use ArrayRef to hide SmallVector details, kill a useless vector copy along the way.
llvm-svn: 260824
2016-02-13 16:01:12 +00:00
Chandler Carruth
710380f52f [attrs] Move the norecurse deduction to operate on the node set rather
than the SCC object, and have it scan the instruction stream directly
rather than relying on call records.

This makes the behavior of this routine consistent between libc routines
and LLVM intrinsics for libc routines. We can go and start teaching it
about those being norecurse, but we should behave the same for the
intrinsic and the libc routine rather than differently. I chatted with
James Molloy and the inconsistency doesn't seem intentional and likely
is due to intrinsic calls not being modelled in the call graph analyses.

This also fixes a bug where we would deduce norecurse on optnone
functions, when generally we try to handle optnone functions as-if they
were replaceable and thus unanalyzable.

llvm-svn: 260813
2016-02-13 08:47:51 +00:00
Chandler Carruth
bcbc1b4b16 [attrs] Simplify the convergent removal to directly use the pre-built
node set rather than walking the SCC directly.

This directly exposes the functions and has already had null entries
filtered out. We also don't need to handle optnone as it has
already been handled in the caller -- we never try to remove convergent
when there are optnone functions in the SCC.

With this change, the code for removing convergent should work with the
new pass manager and a different SCC analysis.

llvm-svn: 260668
2016-02-12 09:47:49 +00:00
Chandler Carruth
3fa7bccaae [attrs] Consolidate the test for a non-SCC, non-convergent function call
with the test for a non-convergent intrinsic call.

While it is possible to use the call records to search for function
calls, we're going to do an instruction scan anyways to find the
intrinsics, we can handle both cases while scanning instructions. This
will also make the logic more amenable to the new pass manager which
doesn't use the same call graph structure.

My next patch will remove use of CallGraphNode entirely and allow this
code to work with both the old and new pass manager. Fortunately, it
should also get strictly simpler without changing functionality.

llvm-svn: 260666
2016-02-12 09:23:53 +00:00
Chandler Carruth
21873dcc8f [attrs] Run clang-format over a newly added routine in function-attrs
before I update it to be friendly with the new pass manager.

llvm-svn: 260653
2016-02-12 03:07:50 +00:00
Mehdi Amini
b04fa7a485 Revert "Refactor the PassManagerBuilder: extract a "addFunctionSimplificationPasses()""
This reverts commit r260603.
I didn't intend to push it :(

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 260607
2016-02-11 22:09:11 +00:00
Mehdi Amini
984140a546 Revert "Define the ThinLTO Pipeline"
This reverts commit r260604.
I didn't intend to push this now.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 260606
2016-02-11 22:09:07 +00:00
Mehdi Amini
72f4a4c810 Define the ThinLTO Pipeline
Summary:
Contrary to Full LTO, ThinLTO can afford to shift compile time
from the frontend to the linker: both phases are parallel.
This pipeline is based on the proposal in D13443 for full LTO. We
didn't move forward on this proposal because the link was far too long
after that.

This patch refactors the "function simplification" passes that are part
of the inliner loop into a helper function (this part is NFC and can be
committed separately to simplify the diff). The ThinLTO pipeline
integrates in the regular O2/O3 flow:

 - The compile phase runs the inliner with a somewhat lighter
   function simplification. (TODO: tune the inliner thresholds here)
   This is intended to simplify the IR and get rid of obvious things
   like linkonce_odr that will be inlined.
 - The link phase will run the pipeline from the start, extended with
   some specific passes that leverage the augmented knowledge we have
   during LTO. Especially after the inliner is done, a sequence of
   globalDCE/globalOpt is performed, followed by another run of the
   "function simplification" passes.

The measurements on the public test suite as well as on our internal
suite show an overall net improvement. The binary size for the clang
executable is reduced by 5%. We're still tuning it with the bringup
of ThinLTO but this should provide a good starting point.

Reviewers: tejohnson

Subscribers: joker.eph, llvm-commits, dexonsmith

Differential Revision: http://reviews.llvm.org/D17115

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 260604
2016-02-11 22:00:31 +00:00
Mehdi Amini
ce40bf909a Refactor the PassManagerBuilder: extract a "addFunctionSimplificationPasses()"
It is intended to contain the passes run over a function after the
inliner is done with a function and before it moves to its callers.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 260603
2016-02-11 22:00:25 +00:00
Ashutosh Nema
dfd5a90734 Fixed typo in comment & coding style for LoopVersioningLICM.
llvm-svn: 260504
2016-02-11 09:23:53 +00:00
Teresa Johnson
0f2dd60a58 Fix Windows bot failure in Transforms/FunctionImport/funcimport.ll
Make sure we split ":" from the end of the global function id (which
is <path>:<function> for local functions) instead of the beginning to
avoid splitting at the wrong place for Windows file paths that contain
a ":".

llvm-svn: 260469
2016-02-10 23:47:38 +00:00
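
The fix described in 0f2dd60a58 amounts to splitting at the last ":" rather than the first. A minimal sketch, assuming only the `<path>:<function>` layout given in the commit message; `splitGlobalId` is a hypothetical helper written for illustration, not the routine in the LLVM source:

```cpp
#include <string>
#include <utility>

// Hypothetical helper: split "<path>:<function>" at the LAST ':' so that
// a Windows drive prefix such as "C:" stays inside the path component.
// Returns {path, function}; the path is empty when no ':' is present.
std::pair<std::string, std::string> splitGlobalId(const std::string &Id) {
  const std::string::size_type Pos = Id.rfind(':');
  if (Pos == std::string::npos)
    return {std::string(), Id};
  return {Id.substr(0, Pos), Id.substr(Pos + 1)};
}
```

Searching from the front with `find(':')` would instead cut a path like `C:\src\a.ll` after the drive letter, which is the failure mode the commit message describes.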
Mehdi Amini
4f13675fcd FunctionImport: add a progressive heuristic to limit importing too deep in the callgraph
The current function importer will walk the callgraph, transitively
importing any callee that is below the threshold. This can lead to
importing very deep, which is costly in compile time and not
necessarily beneficial, as most of the inlining would happen in
imported functions and not necessarily in user code.

The actual factor has been carefully chosen by flipping a coin ;)
Some tuning needs to be done (just at the existing limiting threshold).

Reviewers: tejohnson

Differential Revision: http://reviews.llvm.org/D17082

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 260466
2016-02-10 23:31:45 +00:00
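
One way to picture the progressive heuristic from 4f13675fcd is a threshold that shrinks at every level of the walk. The sketch below is an illustration under stated assumptions, not the FunctionImport code: `FunctionInfo`, `CallGraph`, `collectImports`, and the `Factor` parameter are all hypothetical names, and the commit message does not state the factor's value.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call-graph node: an instruction count and a list of callees.
struct FunctionInfo {
  unsigned InstCount;
  std::vector<std::string> Callees;
};

using CallGraph = std::map<std::string, FunctionInfo>;

// Walk the call graph from Root, importing callees whose size is at or below
// the current threshold. Each level deeper multiplies the threshold by
// Factor (an assumed value in (0, 1]), so deep chains stop being imported.
void collectImports(const CallGraph &G, const std::string &Root,
                    double Threshold, double Factor,
                    std::set<std::string> &Imported) {
  auto It = G.find(Root);
  if (It == G.end())
    return;
  for (const std::string &Callee : It->second.Callees) {
    auto CalleeIt = G.find(Callee);
    if (CalleeIt == G.end() || Imported.count(Callee))
      continue; // unknown or already imported
    if (CalleeIt->second.InstCount > Threshold)
      continue; // too large at this depth
    Imported.insert(Callee);
    collectImports(G, Callee, Threshold * Factor, Factor, Imported);
  }
}
```

With `Factor` equal to 1.0 this degenerates to the fixed-threshold transitive walk the message describes as the previous behavior; any smaller value makes the limit tighter the further the walk gets from the importing module.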
Mehdi Amini
ba524073d1 Use a StringSet in Internalize, and allow to create the pass from an existing one (NFC)
There is no reason to pass an array of "char *" to rebuild a set if
the client already has one.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 260462
2016-02-10 23:24:31 +00:00
Teresa Johnson
c47e95e1b1 Restore "[ThinLTO] Use MD5 hash in function index." with fix
This restores commit r260408, along with a fix for a bot failure.

The bot failure was caused by dereferencing a unique_ptr in the same
call instruction parameter list where it was passed via std::move.
Apparently due to luck this was not exposed when I built the compiler
with clang, only with gcc.

llvm-svn: 260442
2016-02-10 21:55:02 +00:00
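
The bot failure described in c47e95e1b1 is an instance of a general C++ hazard: the order in which function arguments are evaluated is unspecified, so reading through a unique_ptr and moving it in the same argument list can behave differently across compilers. A minimal sketch of the safe pattern follows; `Module`, `registerModule`, and `registerSafely` are hypothetical names for illustration and do not come from the patch.

```cpp
#include <memory>
#include <string>
#include <utility>

struct Module {
  std::string Name;
};

// Takes ownership of M by value; constructing this parameter empties the
// caller's unique_ptr.
std::string registerModule(std::string Name, std::unique_ptr<Module> M) {
  return Name + "/" + M->Name;
}

std::string registerSafely(std::unique_ptr<Module> M) {
  // Risky form (not used here): registerModule(M->Name, std::move(M));
  // The two arguments may be evaluated in either order, so M may already
  // be empty when M->Name is read.
  std::string Name = M->Name; // read everything needed before the call
  return registerModule(std::move(Name), std::move(M));
}
```

Copying out the needed values into locals before the call removes the dependence on evaluation order.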
Teresa Johnson
02e161e44f Revert "[ThinLTO] Use MD5 hash in function index." due to bot failure
This reverts commit r260408. Bot failure that I need to investigate.

llvm-svn: 260412
2016-02-10 19:11:15 +00:00
Teresa Johnson
b1e839beb3 [ThinLTO] Use MD5 hash in function index.
Summary:
This patch uses the lower 64-bits of the MD5 hash of a function name as
a GUID in the function index, instead of storing function names. Any
local functions are first given a global name by prepending the original
source file name. This is the same naming scheme and GUID used by PGO in
the indexed profile format.

This change has a couple of benefits. The primary benefit is size
reduction in the combined index file, for example 483.xalancbmk's
combined index file was reduced by around 70%. It should also result in
memory savings for the index file in memory, as the in-memory map is
also indexed by the hash instead of the string.

Second, this enables integration with indirect call promotion, since the
indirect call profile targets are recorded using the same global naming
convention and hash. This will enable the function importer to easily
locate function summaries for indirect call profile targets to enable
their import and subsequent promotion.

The original source file name is recorded in the bitcode in a new
module-level record for use in the ThinLTO backend pipeline.

Reviewers: davidxl, joker.eph

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D17028

llvm-svn: 260408
2016-02-10 18:57:54 +00:00
Teresa Johnson
aef9d8ceab [ThinLTO] Move global processing from Linker to TransformUtils (NFC)
Summary:
As discussed on IRC, move the ThinLTOGlobalProcessing code out of
the linker, and into TransformUtils. The name of the class is changed
to FunctionImportGlobalProcessing.

Reviewers: joker.eph, rafael

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D17081

llvm-svn: 260395
2016-02-10 18:11:31 +00:00
Justin Lebar
f16a73d9fa Add convergent-removing bits to FunctionAttrs pass.
Summary:
Remove the convergent attribute on any functions which provably do not
contain or invoke any convergent functions.

After this change, we'll be able to modify clang to conservatively add
'convergent' to all functions when compiling CUDA.

Reviewers:  jingyue, joker.eph

Subscribers: llvm-commits, tra, jhen, hfinkel, resistor, chandlerc, arsenm

Differential Revision: http://reviews.llvm.org/D17013

llvm-svn: 260319
2016-02-09 23:03:22 +00:00
Peter Collingbourne
e87706e06b Fix GCC build.
llvm-svn: 260317
2016-02-09 23:01:38 +00:00
Peter Collingbourne
54e8749794 WholeProgramDevirt: introduce.
This pass implements whole program optimization of virtual calls in cases
where we know (via bitset information) that the list of callees is fixed. This
includes the following:

- Single implementation devirtualization: if a virtual call has a single
  possible callee, replace all calls with a direct call to that callee.

- Virtual constant propagation: if the virtual function's return type is an
  integer <=64 bits and all possible callees are readnone, for each class and
  each list of constant arguments: evaluate the function, store the return
  value alongside the virtual table, and rewrite each virtual call as a load
  from the virtual table.

- Uniform return value optimization: if the conditions for virtual constant
  propagation hold and each function returns the same constant value, replace
  each virtual call with that constant.

- Unique return value optimization for i1 return values: if the conditions
  for virtual constant propagation hold and a single vtable's function
  returns 0, or a single vtable's function returns 1, replace each virtual
  call with a comparison of the vptr against that vtable's address.

Differential Revision: http://reviews.llvm.org/D16795

llvm-svn: 260312
2016-02-09 22:50:34 +00:00
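
The last bullet of 54e8749794, unique return value optimization, can be illustrated in plain C++ with hand-rolled vtables. This is only a sketch of the idea under simplifying assumptions: the pass itself works on LLVM IR and bitset metadata, and every name below (`VTable`, `NodeVT`, `LeafVT`, `isLeafVirtual`, and so on) is hypothetical.

```cpp
// Hand-rolled "vtables" so the vptr comparison is visible in plain C++.
struct VTable {
  bool (*isLeaf)();
};

static bool returnsFalse() { return false; }
static bool returnsTrue() { return true; }

// Three hypothetical classes; only LeafVT's implementation returns true.
static const VTable NodeVT = {returnsFalse};
static const VTable GroupVT = {returnsFalse};
static const VTable LeafVT = {returnsTrue};

struct Object {
  const VTable *VPtr;
};

// Original form: an indirect call through the vtable.
bool isLeafVirtual(const Object &O) { return O.VPtr->isLeaf(); }

// Rewritten form: because exactly one vtable's function returns true, the
// call collapses to a comparison of the vptr against that vtable's address.
bool isLeafDevirtualized(const Object &O) { return O.VPtr == &LeafVT; }
```

Both functions agree for every object whose vptr is one of the three known tables, which is the "list of callees is fixed" precondition the commit message states.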
Sanjoy Das
0a3d627330 [FunctionAttrs] Fix SCC logic around operand bundles
FunctionAttrs does an "optimistic" analysis of SCCs as a unit, which
means normally it is able to disregard calls from an SCC into itself.
However, calls and invokes with operand bundles are allowed to have
memory effects not fully described by the memory effects on the call
target, so we can't be optimistic around operand-bundled calls from an
SCC into itself.

llvm-svn: 260244
2016-02-09 18:40:40 +00:00
Sanjoy Das
ca2c8b410a Add an "addUsedAAAnalyses" helper function
Summary:
Passes that call `getAnalysisIfAvailable<T>` also need to call
`addUsedIfAvailable<T>` in `getAnalysisUsage` to indicate to the
legacy pass manager that it uses `T`.  This contract was being
violated by passes that used `createLegacyPMAAResults`.  This change
fixes this by exposing a helper in AliasAnalysis.h,
`addUsedAAAnalyses`, that is complementary to createLegacyPMAAResults
and does the right thing when called from `getAnalysisUsage`.

Reviewers: chandlerc

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17010

llvm-svn: 260183
2016-02-09 01:21:57 +00:00
Ashutosh Nema
d6dcbf971a New Loop Versioning LICM Pass
Summary:
When alias analysis is uncertain about the aliasing between any two accesses,
it will return MayAlias. This uncertainty from alias analysis restricts LICM
from proceeding further. In cases where alias analysis is uncertain we might
use loop versioning as an alternative.

Loop Versioning will create a version of the loop with aggressive aliasing
assumptions in addition to the original with conservative (default) aliasing
assumptions. The version of the loop making aggressive aliasing assumptions
will have all the memory accesses marked as no-alias. These two versions of
loop will be preceded by a memory runtime check. This runtime check consists
of bound checks for all unique memory accessed in loop, and it ensures the
lack of memory aliasing. The result of the runtime check determines which of
the loop versions is executed: If the runtime check detects any memory
aliasing, then the original loop is executed. Otherwise, the version with
aggressive aliasing assumptions is used.

The pass is off by default and can be enabled with command line option 
-enable-loop-versioning-licm.

Reviewers: hfinkel, anemet, chatur01, reames

Subscribers: MatzeB, grosser, joker.eph, sanjoy, javed.absar, sbaranga,
             llvm-commits

Differential Revision: http://reviews.llvm.org/D9151

llvm-svn: 259986
2016-02-06 07:47:48 +00:00
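
The scheme described in d6dcbf971a can be written out by hand at the source level. The sketch below assumes a single invariant load and a single written range; `noOverlap` and `scaleInPlace` are hypothetical names, and the real pass emits the equivalent structure in IR rather than C++.

```cpp
#include <cstddef>
#include <cstdint>

// Runtime memory check: true when the written range [Dst, Dst + N) and the
// single read location Scale do not overlap.
static bool noOverlap(const int *Dst, std::size_t N, const int *Scale) {
  const auto Begin = reinterpret_cast<std::uintptr_t>(Dst);
  const auto End = reinterpret_cast<std::uintptr_t>(Dst + N);
  const auto S = reinterpret_cast<std::uintptr_t>(Scale);
  return S + sizeof(int) <= Begin || S >= End;
}

void scaleInPlace(int *Dst, std::size_t N, const int *Scale) {
  if (noOverlap(Dst, N, Scale)) {
    // Versioned loop: no aliasing, so the load of *Scale is hoisted.
    const int K = *Scale;
    for (std::size_t I = 0; I < N; ++I)
      Dst[I] *= K;
  } else {
    // Original loop: *Scale may be overwritten by a store to Dst[I], so it
    // is reloaded on every iteration.
    for (std::size_t I = 0; I < N; ++I)
      Dst[I] *= *Scale;
  }
}
```

When `Scale` points into `Dst`, the check fails and the conservative loop runs, preserving the original semantics; otherwise the hoisted version is taken.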
Peter Collingbourne
7f6faddfa9 LowerBitSets: Don't bother to do any work if the llvm.bitset.test intrinsic is unused.
llvm-svn: 259625
2016-02-03 03:48:46 +00:00
Peter Collingbourne
7a6e886fda Transforms: Move GlobalOpt's Evaluator to Utils where it can be reused.
llvm-svn: 259621
2016-02-03 02:51:00 +00:00
Craig Topper
ca667eb8a2 Convert int to Twine instead of using utostr since it was already being added to a Twine. NFC
llvm-svn: 259308
2016-01-31 00:15:35 +00:00
Matthias Braun
882ae69776 Avoid overly large SmallPtrSet/SmallSet
These sets perform linear searching in small mode so it is never a good
idea to use SmallSize/N bigger than 32.

llvm-svn: 259283
2016-01-30 01:24:31 +00:00
Chris Bieneman
1b8d4f74aa Remove autoconf support
Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html

"I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened."
- Obi Wan Kenobi

Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark

Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16471

llvm-svn: 258861
2016-01-26 21:29:08 +00:00
Evgeniy Stepanov
258db6665b [cfi] Cross-DSO CFI diagnostic mode (LLVM part).
* __cfi_check gets a 3rd argument: ubsan handler data
* Instead of trapping on failure, call __cfi_check_fail which must be
  present in the module (generated in the frontend).

llvm-svn: 258746
2016-01-25 23:35:03 +00:00
David Majnemer
f62478a34a [PruneEH] Don't try to insert a terminator after another terminator
LLVM's BasicBlock has a single terminator, it is not valid to have two.

llvm-svn: 258616
2016-01-23 06:00:44 +00:00
David Majnemer
7a3addc91c [PruneEH] FuncletPads must not have undef operands
Instead of RAUW with undef, replace the first non-token instruction with
unreachable.

This fixes PR26263.

llvm-svn: 258611
2016-01-23 05:41:29 +00:00
David Majnemer
09858a3961 [PruneEH] Unify invoke and call handling in DeleteBasicBlock
No functionality change is intended.

llvm-svn: 258610
2016-01-23 05:41:27 +00:00
David Majnemer
0728f4a41f [PruneEH] Reuse code from removeUnwindEdge
PruneEH had functionality identical to removeUnwindEdge.
Consolidate around removeUnwindEdge.
No functionality change is intended.

llvm-svn: 258609
2016-01-23 05:41:22 +00:00
Sergei Larin
7b219abac0 Make sure that any new and optimized objects created during GlobalOPT copy all the attributes from the base object.
Summary:
Make sure that any new and optimized objects created during GlobalOPT copy all the attributes from the base object.

A good example of improper behavior in the current implementation is section information associated with the GlobalObject. If a section was set for it, and GlobalOpt is creating/modifying a new object based on this one (often copying the original name), then without this change the new object will be placed in a default section, resulting in inappropriate properties of the new variable.
The argument here is that if the customer specified a section for a variable, any changes the compiler makes to it should not change that section allocation.
Moreover, any other properties worth representing in copyAttributesFrom() should also be propagated.

Reviewers: jmolloy, joker-eph, joker.eph

Subscribers: slarin, joker.eph, rafael, tobiasvk, llvm-commits

Differential Revision: http://reviews.llvm.org/D16074

llvm-svn: 258556
2016-01-22 21:18:20 +00:00
Teresa Johnson
2a387148a1 [ThinLTO] Do metadata linking during batch function importing
Summary:
Since we are currently not doing incremental importing there is
no need to link metadata as a postpass. The module linker will
only link in the imported subroutines due to the functionality
added by r256003.

(Note that the metadata postpass linking functionality is still
used by llvm-link, and may be needed here in the future if a more
incremental strategy is adopted.)

Reviewers: joker.eph

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16424

llvm-svn: 258458
2016-01-22 00:15:53 +00:00
Rong Xu
69b08ad25b [PGO] Passmanagerbuilder change that enable IR level PGO instrumentation
This patch includes the passmanagerbuilder change that enables IR level PGO instrumentation. It adds two passmanagerbuilder options: -profile-generate=<profile_filename> and -profile-use=<profile_filename>. The new options are primarily for debugging purposes.

Reviewers: davidxl, silvas

Differential Revision: http://reviews.llvm.org/D15828

llvm-svn: 258420
2016-01-21 18:28:59 +00:00
Eduard Burtescu
c55147fcdc [opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.
Summary:
GEPOperator: provide getResultElementType alongside getSourceElementType.
This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has.

GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16275

llvm-svn: 258145
2016-01-19 17:28:00 +00:00
Eduard Burtescu
313153c723 [opaque pointer types] Alloca: use getAllocatedType() instead of getType()->getPointerElementType().
Reviewers: mjacob

Subscribers: llvm-commits, dblaikie

Differential Revision: http://reviews.llvm.org/D16272

llvm-svn: 258028
2016-01-18 00:10:01 +00:00
Manuel Jacob
e6438acb66 GlobalValue: use getValueType() instead of getType()->getPointerElementType().
Reviewers: mjacob

Subscribers: jholewinski, arsenm, dsanders, dblaikie

Patch by Eduard Burtescu.

Differential Revision: http://reviews.llvm.org/D16260

llvm-svn: 257999
2016-01-16 20:30:46 +00:00
Easwaran Raman
9b73e2c66d Refactor threshold computation for inline cost analysis
Differential Revision: http://reviews.llvm.org/D15401

llvm-svn: 257832
2016-01-14 23:16:29 +00:00
Rui Ueyama
dca64dbccc Update to use new name alignTo().
llvm-svn: 257804
2016-01-14 21:06:47 +00:00
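
For readers unfamiliar with the helper named in dca64dbccc: alignTo() rounds a value up to the next multiple of an alignment. The stand-in below, `alignToSketch`, is a hypothetical reimplementation for illustration only and is not the LLVM definition; it assumes a non-zero alignment and no overflow.

```cpp
#include <cstdint>

// Hypothetical stand-in: round Value up to the next multiple of Align.
// Assumes Align is non-zero and Value + Align - 1 does not overflow.
constexpr std::uint64_t alignToSketch(std::uint64_t Value,
                                      std::uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}
```

A value that is already a multiple of the alignment is returned unchanged.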
James Molloy
e02efd6bb1 [LTO] Add a run of LoopUnroll
Loop trip counts can often be resolved during LTO. We should obviously be unrolling small loops once those trip counts have been resolved, but we weren't.

llvm-svn: 257767
2016-01-14 15:00:09 +00:00
Teresa Johnson
10e78a41c7 [ThinLTO] Handle an external call from an import to an alias in dest
The findExternalCalls routine ignores calls to functions already
defined in the dest module. This was not handling the case where
the definition in the current module is actually an alias to a
function call.

llvm-svn: 257493
2016-01-12 17:48:44 +00:00
Justin Bogner
879f86bb78 LoopInfo: Simplify ownership of Loop objects
It's strange that LoopInfo mostly owns the Loop objects, but that it
defers deleting them to the loop pass manager. Instead, change the
oddly named "updateUnloop" to "markAsRemoved" and have it queue the
Loop object for deletion. We can't delete the Loop immediately when we
remove it, since we need its pointer identity still, so we'll mark the
object as "invalid" so that clients can see what's going on.

llvm-svn: 257191
2016-01-08 19:08:53 +00:00
Teresa Johnson
675bf59b32 [ThinLTO] Use new in-place symbol changes for exporting module
Due to the new in-place ThinLTO symbol handling support added in
r257174, we now invoke renameModuleForThinLTO on the current
module from within the FunctionImport pass.

Additionally, renameModuleForThinLTO no longer needs to return the
Module as it is performing the renaming in place on the one provided.

This commit will be immediately preceded by a companion clang patch to
remove its invocation of renameModuleForThinLTO.

llvm-svn: 257181
2016-01-08 17:06:29 +00:00
Teresa Johnson
ce9de68594 [ThinLTO] Delay metadata materialization in function importer
The function importer was still materializing metadata when modules were
loaded for function importing. We only want to materialize it when we
are going to invoke the metadata linking postpass. Materializing it
before function importing is not only unnecessary, but also causes
metadata referenced by imported functions to be mapped in early, and
then not connected to the rest of the module level metadata when it is
ultimately linked in.

Augmented the test case to specifically check for the metadata being
properly connected, which it wasn't before this fix.

llvm-svn: 257171
2016-01-08 14:17:41 +00:00
Chandler Carruth
1b5532dd29 [attrs] Split the late-revisit pattern for deducing norecurse in
a top-down manner into a true top-down or RPO pass over the call graph.

There are specific patterns of function attributes, notably the
norecurse attribute, which are most effectively propagated top-down
because all they use is caller information.

The walk in RPO over the call graph SCCs takes the form of a module pass
run immediately after the CGSCC pass manager's postorder walk of the
SCCs, trying again to deduce norecurse for each singular SCC in the call
graph.

This removes a very legacy pass manager specific trick of using a lazy
revisit list traversed during finalization of the CGSCC pass. There is
no analogous finalization step in the new pass manager, and a lazy
revisit list is just trying to produce an RPO iteration of the call
graph. We can do that more directly if more expensively. It seems
unlikely that this will be the expensive part of any compilation though
as we never examine the function bodies here. Even in an LTO run over
a very large module, this should be a reasonably fast set of operations
over a reasonably small working set -- the function call graph itself.

In the future, if this really is a compile time performance issue, we
can look at building support for both post order and RPO traversals
directly into a pass manager that builds and maintains the PO list of
SCCs.

Differential Revision: http://reviews.llvm.org/D15785

llvm-svn: 257163
2016-01-08 10:55:52 +00:00
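
The top-down propagation described in 1b5532dd29 can be sketched on a toy call graph. This is an illustration under simplifying assumptions, not the pass itself: the graph is assumed acyclic, every caller of a function is assumed to be known, and `Node` and `propagateNoRecurse` are hypothetical names.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical summary of one function in an acyclic call graph.
struct Node {
  std::vector<std::string> Callers; // all callers are assumed to be known
  bool NoRecurse;                   // already deduced bottom-up, or not
};

// Visit functions in a top-down order (callers before callees) and mark a
// function norecurse when it has callers and every caller is norecurse.
// Order must list each function after all of its callers, as an RPO walk of
// an acyclic call graph would.
void propagateNoRecurse(std::map<std::string, Node> &G,
                        const std::vector<std::string> &Order) {
  for (const std::string &Name : Order) {
    Node &N = G[Name];
    if (N.NoRecurse || N.Callers.empty())
      continue;
    bool AllCallersNoRecurse = true;
    for (const std::string &Caller : N.Callers)
      AllCallersNoRecurse = AllCallersNoRecurse && G[Caller].NoRecurse;
    N.NoRecurse = AllCallersNoRecurse;
  }
}
```

Because callers are visited first, a single pass is enough for the attribute to flow down a whole chain, and no function body is examined, which matches the cost argument in the commit message.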
Weiming Zhao
d8aec406ad Fix option desc in FunctionAttrs; NFC
Summary: The example in desc should match with actual option name

Reviewers: jmolloy

Differential Revision: http://reviews.llvm.org/D15800

llvm-svn: 256951
2016-01-06 18:18:16 +00:00
Philip Reames
780b59a41c [BasicAA] Remove special casing of memset_pattern16 in favor of generic attribute inference
Most of the properties of memset_pattern16 can be now covered by the generic attributes and inferred by InferFunctionAttrs.  The only exceptions are:
- We don't yet have a writeonly attribute for the first argument.
- We don't have an attribute for modeling the access size facts encoded in MemoryLocation.cpp.  

Differential Revision: http://reviews.llvm.org/D15879

llvm-svn: 256911
2016-01-06 04:53:16 +00:00
Philip Reames
a43feccb31 [MemoryBuiltins] Remove isOperatorNewLike by consolidating non-null inference handling
This patch removes the isOperatorNewLike predicate since it was only being used to establish a non-null return value and we have attributes specifically for that purpose with generic handling. To keep approximately the same behaviour for existing frontends, I added the various operator-new-like functions (i.e. instances of operator new) to InferFunctionAttrs. It's not really clear to me why this isn't handled in Clang, but I didn't want to break existing code and any subtle assumptions it might have.

Once this patch is in, I'm going to start separating the isAllocLike family of predicates. These appear to be being used for a mixture of things which should be more clearly separated and documented. Today, they're being used to indicate (at least) aliasing facts, CSE-ability, and default values from an allocation site.

Differential Revision: http://reviews.llvm.org/D15820

llvm-svn: 256787
2016-01-04 22:49:23 +00:00
Easwaran Raman
efb03dbc75 Refactor inline costs analysis by removing the InlineCostAnalysis class
InlineCostAnalysis is an analysis pass without any need for it to be one.
Once it stops being an analysis pass, it doesn't maintain any useful state
and the member functions inside can be made free functions. NFC.

Differential Revision: http://reviews.llvm.org/D15701

llvm-svn: 256521
2015-12-28 20:28:19 +00:00
Chandler Carruth
8beb86a806 [attrs] Extract the pure inference of function attributes into
a standalone pass.

There is no call graph or even interesting analysis for this part of
function attributes -- it is literally inferring attributes based on the
target library identification. As such, we can do it using a much
simpler module pass that just walks the declarations. This can also
happen much earlier in the pass pipeline which has benefits for any
number of other passes.

In the process, I've cleaned up one particular aspect of the logic which
was necessary in order to separate the two passes cleanly. It now counts
inferred attributes independently rather than just counting all the
inferred attributes as one, and the counts are more clearly explained.

The two test cases we had for this code path are both ... woefully
inadequate and copies of each other. I've kept the superset test and
updated it. We need more testing here, but I had to pick somewhere to
stop fixing everything broken I saw here.

Differential Revision: http://reviews.llvm.org/D15676

llvm-svn: 256466
2015-12-27 08:41:34 +00:00
Chandler Carruth
cf6f5436f5 [attrs] Split off the forced attributes utility into its own pass that
is (by default) run much earlier than FunctionAttrs proper.

This allows forcing optnone or other widely impactful attributes. It is
also a bit simpler as the force attribute behavior needs no specific
iteration order.

I've added the pass into the default module pass pipeline and LTO pass
pipeline which mirrors where function attrs itself was being run.

Differential Revision: http://reviews.llvm.org/D15668

llvm-svn: 256465
2015-12-27 08:13:45 +00:00
Benjamin Kramer
b2614d51ce [FunctionImport] Move pass into anonymous namespace.
No functional change.

llvm-svn: 256374
2015-12-24 10:03:35 +00:00
David Majnemer
22cb4eb850 [OperandBundles] Have DeadArgElim play nice with operand bundles
A call site's use of a Value might not correspond to an argument
operand but to a bundle operand.

llvm-svn: 256326
2015-12-23 09:58:36 +00:00
Akira Hatanaka
6a7dbf68a2 Provide a way to specify inliner's attribute compatibility and merging.
This reapplies r256277 with two changes:

- In emitFnAttrCompatCheck, change FuncName's type to std::string to fix
  a use-after-free bug.
- Remove an unnecessary install-local target in lib/IR/Makefile. 

Original commit message for r252949:

Provide a way to specify inliner's attribute compatibility and merging
rules using table-gen. NFC.

This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.

rdar://problem/19836465

llvm-svn: 256304
2015-12-22 23:57:37 +00:00
Rafael Espindola
067bfb9e99 Also add unnamed_addr to functions.
llvm-svn: 256281
2015-12-22 20:43:30 +00:00
Akira Hatanaka
dfd76e927a Revert r256277 and r256279.
Some of the bots failed again.

llvm-svn: 256280
2015-12-22 20:29:09 +00:00
Akira Hatanaka
fa235f0243 Provide a way to specify inliner's attribute compatibility and merging.
This reapplies r252990 and r252949. I've added member function getKind
to the Attr classes which returns the enum or string of the attribute.

Original commit message for r252949:

Provide a way to specify inliner's attribute compatibility and merging
rules using table-gen. NFC.

This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.

rdar://problem/19836465

llvm-svn: 256277
2015-12-22 20:00:05 +00:00
Rafael Espindola
f795707790 Delete dead GlobalAliases.
llvm-svn: 256276
2015-12-22 19:50:22 +00:00
Rafael Espindola
2af3ff098d Merge duplicated code.
The code for deleting dead global variables and functions was
duplicated.

This is in preparation for also deleting dead global aliases.

llvm-svn: 256274
2015-12-22 19:38:07 +00:00
Rafael Espindola
7f6f7bca5d Use early continue to reduce indentation.
llvm-svn: 256272
2015-12-22 19:26:18 +00:00
Rafael Espindola
611a6e336d Simplify iterator management. NFC.
Not passing an iterator to processGlobal will allow it to work with
other GlobalValues.

llvm-svn: 256271
2015-12-22 19:16:50 +00:00
Easwaran Raman
66e5fa28c2 Determine callee's hotness and adjust threshold based on that. NFC.
This uses the same criteria used in CFE's CodeGenPGO to identify hot and cold
callees and uses values of inlinehint-threshold and inlinecold-threshold
respectively as the thresholds for such callees.

Differential Revision: http://reviews.llvm.org/D15245

llvm-svn: 256222
2015-12-22 00:32:35 +00:00
Evgeniy Stepanov
7ed9f33690 [cfi] Fix LowerBitSets on 32-bit targets.
This code attempts to truncate IntPtrTy to i32, which may be the same
type.

llvm-svn: 256205
2015-12-21 22:14:04 +00:00
Vedant Kumar
358f3ea995 Re-reapply "[IR] Move optional data in llvm::Function into a hungoff uselist"
Make personality functions, prefix data, and prologue data hungoff
operands of Function.

This is based on the email thread "[RFC] Clean up the way we store
optional Function data" on llvm-dev.

Thanks to sanjoyd, majnemer, rnk, loladiro, and dexonsmith for feedback!

Includes a fix to scrub value subclass data in dropAllReferences. Does not
use binary literals.

Differential Revision: http://reviews.llvm.org/D13829

llvm-svn: 256095
2015-12-19 08:52:49 +00:00
Vedant Kumar
2e1a683bae Revert "Reapply "[IR] Move optional data in llvm::Function into a hungoff uselist""
This reverts commit r256093.

This broke lld-x86_64-win7 because of -Werror,-Wc++1y-extensions.

llvm-svn: 256094
2015-12-19 08:48:43 +00:00
Vedant Kumar
c33a34516e Reapply "[IR] Move optional data in llvm::Function into a hungoff uselist"
Make personality functions, prefix data, and prologue data hungoff
operands of Function.

This is based on the email thread "[RFC] Clean up the way we store
optional Function data" on llvm-dev.

Thanks to sanjoyd, majnemer, rnk, loladiro, and dexonsmith for feedback!

Includes a fix to scrub value subclass data in dropAllReferences.

Differential Revision: http://reviews.llvm.org/D13829

llvm-svn: 256093
2015-12-19 08:29:51 +00:00
Vedant Kumar
6843b30188 Revert "[IR] Move optional data in llvm::Function into a hungoff uselist"
This reverts commit r256090.

This broke llvm-clang-lld-x86_64-debian-fast.

llvm-svn: 256091
2015-12-19 07:30:44 +00:00
Vedant Kumar
46b3967fa2 [IR] Move optional data in llvm::Function into a hungoff uselist
Make personality functions, prefix data, and prologue data hungoff
operands of Function.

This is based on the email thread "[RFC] Clean up the way we store
optional Function data" on llvm-dev.

Thanks to sanjoyd, majnemer, rnk, loladiro, and dexonsmith for feedback!

Differential Revision: http://reviews.llvm.org/D13829

llvm-svn: 256090
2015-12-19 07:08:56 +00:00
Teresa Johnson
0dce8d436c [ThinLTO] Metadata linking for imported functions
Summary:
Second patch split out from http://reviews.llvm.org/D14752.

Maps metadata as a post-pass from each module when importing is complete,
suturing up final metadata to the temporary metadata left on the
imported instructions.

This entails saving the mapping from bitcode value id to temporary
metadata in the importing pass, and from bitcode value id to final
metadata during the metadata linking postpass.

Depends on D14825.

Reviewers: dexonsmith, joker.eph

Subscribers: davidxl, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D14838

llvm-svn: 255909
2015-12-17 17:14:09 +00:00
Rafael Espindola
f7a0054c75 Change linkInModule to take a std::unique_ptr.
Passing in a std::unique_ptr should help find errors when the module
is used after being linked into another module.

llvm-svn: 255842
2015-12-16 23:16:33 +00:00
Justin Bogner
58647df890 LPM: Make callers of LPM.deleteLoopFromQueue update LoopInfo directly. NFC
As of r255720, the loop pass manager will DTRT when passes update the
loop info for removed loops, so they no longer need to reach into
LPPassManager APIs to do this kind of transformation. This change very
nearly removes the need for the LPPassManager to even be passed into
loop passes - the only remaining pass that uses the LPM argument is
LoopUnswitch.

llvm-svn: 255797
2015-12-16 18:40:20 +00:00
Richard Trieu
9af860a927 Remove one of the void casts used to suppress unused variable warning.
llvm-svn: 255709
2015-12-15 23:47:17 +00:00
Evgeniy Stepanov
493b24312f Suppress unused variable warning in the no-asserts build.
llvm-svn: 255706
2015-12-15 23:30:29 +00:00
Richard Trieu
bf34638e13 Cast variable to void to resolve unused variable warning in non-asserts builds.
llvm-svn: 255704
2015-12-15 23:25:34 +00:00
Evgeniy Stepanov
39e538e166 Cross-DSO control flow integrity (LLVM part).
An LTO pass that generates a __cfi_check() function that validates a
call based on a hash of the call-site-known type and the target
pointer.

llvm-svn: 255693
2015-12-15 23:00:08 +00:00
James Molloy
fb86086405 [PassManagerBuilder] Add a few more scalar optimization passes
This patch does two things:
  1. mem2reg is now run immediately after globalopt. Now that globalopt
     can localize variables more aggressively, it makes sense to lower
     them to SSA form earlier rather than later so they can benefit from
     the full set of optimization passes.

  2. More scalar optimizations are run after the loop optimizations in
     LTO mode. The loop optimizations (especially indvars) can clean up
     scalar code sufficiently to make it worthwhile running more scalar
     passes. I've particularly added SCCP here as it isn't run anywhere
     else in the LTO pass pipeline.

Mem2reg is super cheap and shouldn't affect compilation time at all. The
rest of the added passes are in the LTO pipeline only, so they don't affect
the vast majority of compilations, just the link step.

llvm-svn: 255634
2015-12-15 09:24:01 +00:00
Rafael Espindola
2d1739bf50 A better attempt to add a missing include
llvm-svn: 255578
2015-12-14 23:34:35 +00:00
Rafael Espindola
5f225be291 Trying to fix the build in a bot.
llvm-svn: 255577
2015-12-14 23:31:08 +00:00
Rafael Espindola
5b397256de Use diagnostic handler in the LLVMContext
This patch converts code that has access to an LLVMContext to not take a
diagnostic handler.

This has a few advantages:

* It is easier to use a consistent diagnostic handler in a single program.
* Less clutter since we are not passing a handler around.

It does make it a bit awkward to implement some C APIs that return a
diagnostic string. I will propose new versions of these APIs and
deprecate the current ones.

llvm-svn: 255571
2015-12-14 23:17:03 +00:00
Sanjoy Das
987d70ed26 [MergeFunctions] Use II instead of CI for InvokeInst; NFC
Using `CI` is slightly misleading.

llvm-svn: 255529
2015-12-14 19:11:45 +00:00
Sanjoy Das
97417780af Teach MergeFunctions about operand bundles
llvm-svn: 255528
2015-12-14 19:11:40 +00:00
Diego Novillo
9bbd13f9a0 SamplePGO - Reduce memory utilization by 10x.
DenseMap is the wrong data structure to use for sample records and call
sites.  The keys are too large, causing massive core memory growth when
reading profiles.

Before this patch, a 21MB input profile was causing the compiler to grow
to 3GB in memory.  By switching to std::map, the compiler now grows to
300MB in memory.

There still are some opportunities for memory footprint reduction. I'll
be looking at those next.

llvm-svn: 255389
2015-12-11 23:21:38 +00:00
Artur Pilipenko
8b6635b2f6 PruneEH pass incorrectly reports that a change was made
Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D14097

llvm-svn: 255343
2015-12-11 16:30:26 +00:00
Teresa Johnson
087bc3b677 [ThinLTO] Debug message cleanup (NFC)
Added some missing spaces between the module identifier and the start of
the debug message. Also added a ":" after the module identifier to make
this look a little nicer.

llvm-svn: 255259
2015-12-10 16:39:07 +00:00
Sanjoy Das
d85ded90d0 Add arg_begin() and arg_end() to CallInst and InvokeInst; NFCI
- This simplifies the CallSite class, arg_begin / arg_end are now
   simple wrapper getters.

 - In several places, we were creating CallSite instances solely to call
   arg_begin and arg_end.  With this change, that's no longer required.

llvm-svn: 255226
2015-12-10 06:39:02 +00:00
Rafael Espindola
2e47184a91 Don't assign a temporary string to a StringRef.
Should fix the windows debug and asan bots.

llvm-svn: 255149
2015-12-09 20:41:10 +00:00
Teresa Johnson
83a7df21b2 [ThinLTO] FunctionImport pass can take a const index pointer (NFC)
llvm-svn: 255140
2015-12-09 19:39:47 +00:00
Mehdi Amini
b282e7bd00 The current importing scheme is processing one function at a time,
loading the source Module, linking the function in the destination
module, and destroying the source Module before repeating with the
next function to import (potentially from the same Module).

Ideally we would keep the source Module alive and import the next
Function needed from this Module. Unfortunately this is not possible
because the linker does not leave it in a usable state.

However we can do better by first computing the list of all candidates
per Module, and only then loading the source Module and importing all
the functions we need from it.

The trick to processing callees is to materialize functions in the source
module when building the list of functions to import, and inspect them
in their source module, collecting the list of callees for each
callee.

When we move to the actual import, we will import from each source
module exactly once. Each source module is loaded exactly once.
The only drawback is that it requires having all the lazy-loaded
source Modules in memory at the same time.
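
The batching idea above can be sketched in plain C++. This is only an
illustration under assumptions: ImportCandidate and groupByModule are
hypothetical names, and modules and functions are represented by strings
rather than by LLVM's own types.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in: (source module path, function name).
typedef std::pair<std::string, std::string> ImportCandidate;

// Group all import candidates by their source module, so that each module
// can later be loaded once and every function needed from it imported in a
// single step.
std::map<std::string, std::vector<std::string>>
groupByModule(const std::vector<ImportCandidate> &Candidates) {
  std::map<std::string, std::vector<std::string>> PerModule;
  for (const ImportCandidate &C : Candidates)
    PerModule[C.first].push_back(C.second);
  return PerModule;
}
```

With this grouping, the number of module loads equals the number of map
entries rather than the number of imported functions.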

Currently, this patch already improves the link time considerably;
a multithreaded link of llvm-dis on my laptop was:

  real  1m12.175s  user  6m32.430s sys  0m10.529s

and is now:

  real  0m40.697s  user  2m10.237s sys  0m4.375s

Note: this is the full link time (linker+Import+Optimizer+CodeGen)

Differential Revision: http://reviews.llvm.org/D15178

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255100
2015-12-09 08:17:35 +00:00
Sanjoy Das
cb770fbcb6 [OperandBundles] Have PruneEH work correctly with operand bundles.
For an invoke with operand bundles, the [op_begin(), op_end()-3] range
can contain things other than invoke arguments.  This change teaches
PruneEH to use arg_begin() and arg_end() explicitly.

llvm-svn: 255073
2015-12-08 23:16:52 +00:00
Mehdi Amini
5d4cc87b91 Fix/Improve Debug print in FunctionImport pass
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255071
2015-12-08 23:04:19 +00:00
Mehdi Amini
ba2c064383 Remove caching in FunctionImport: a Module can't be reused after being linked from
The Linker destroys the source module (API change coming to make it explicit)

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255064
2015-12-08 22:39:40 +00:00
Teresa Johnson
1fb89d62fb [ThinLTO] Support for specifying function index from pass manager
Summary:
Add a field on the PassManagerBuilder that clang or gold can use to pass
down a pointer to the function index in memory to use for importing when
the ThinLTO backend is triggered. Add support to supply this to the
function import pass.

Reviewers: joker.eph, dexonsmith

Subscribers: davidxl, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D15024

llvm-svn: 254926
2015-12-07 19:21:11 +00:00
Mehdi Amini
e0e1d33bec clang-format FunctionImport after refactoring (NFC)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 254585
2015-12-03 02:58:14 +00:00
Mehdi Amini
30d3bd787c Refactor FunctionImporter::importFunctions with a helper function to process the Worklist (NFC)
This precedes some more functional changes to perform bulk imports.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 254583
2015-12-03 02:37:33 +00:00
David Majnemer
56dee65385 Move EH-specific helper functions to a more appropriate place
No functionality change is intended.

llvm-svn: 254562
2015-12-02 23:06:39 +00:00
Mehdi Amini
34766825ef Change ModuleLinker to take a set of GlobalValues to import instead of a single one
For efficiency reasons, when importing multiple functions from the same Module,
we can avoid reparsing it every time.

Differential Revision: http://reviews.llvm.org/D15102

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 254486
2015-12-02 04:34:28 +00:00
Mehdi Amini
1909ee53b2 Modify FunctionImport to take a callback to load modules
When linking a static archive, there are no individual module files to
load. Instead they can be mmap'ed and could be initialized from a
buffer directly. The callback provides flexibility to override the
scheme for loading a module from the summary.

Differential Revision: http://reviews.llvm.org/D15101

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 254479
2015-12-02 02:00:29 +00:00
Rafael Espindola
e3fda2ca99 Use references now that it is natural to do so.
The linker never takes ownership of a module or changes which module it
is referring to, making it natural to use references.

llvm-svn: 254449
2015-12-01 19:50:54 +00:00
Teresa Johnson
0bc8d23948 [ThinLTO] Wrap dbgs() output in DEBUG macro
Missed in a couple places.

llvm-svn: 254422
2015-12-01 17:12:10 +00:00
Teresa Johnson
0e55603ffe [ThinLTO] Remove stale comment (NFC)
Stale as of r254036 which added basic profitability check.

llvm-svn: 254421
2015-12-01 16:45:23 +00:00
Diego Novillo
d74fdde81f SamplePGO - Do not use std::to_string in diagnostics.
This fixes buildbots on systems where std::to_string is not present. It
also tidies the output of the diagnostic to render doubles a bit better
(thanks Ben Kramer for help with string streams and format).

llvm-svn: 254261
2015-11-29 18:23:26 +00:00
Diego Novillo
d08de97276 SamplePGO - Add initial support for inliner annotations.
This adds two thresholds to the sample profiler to affect inlining
decisions: the concept of global hotness and coldness.

Functions that have accumulated more than a certain fraction of samples at
runtime are annotated with the InlineHint attribute. Conversely,
functions that accumulate less than a certain fraction of samples are
annotated with the Cold attribute.
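
A minimal sketch of that classification, assuming the two cutoffs are
passed in as fractions of all collected samples. The names
InlineAnnotation and classifyFunction are hypothetical, and the commit
does not state the actual cutoff values, so none are assumed here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type; the real pass sets LLVM function attributes.
enum class InlineAnnotation { None, InlineHint, Cold };

// Classify a function by the share of all profile samples it accumulated.
// HotFraction and ColdFraction are caller-supplied cutoffs in [0, 1].
InlineAnnotation classifyFunction(std::uint64_t FunctionSamples,
                                  std::uint64_t TotalSamples,
                                  double HotFraction, double ColdFraction) {
  assert(ColdFraction <= HotFraction && "cold cutoff must not exceed hot cutoff");
  if (TotalSamples == 0)
    return InlineAnnotation::None; // assumption: no profile data, no hint
  const double Fraction = static_cast<double>(FunctionSamples) /
                          static_cast<double>(TotalSamples);
  if (Fraction > HotFraction)
    return InlineAnnotation::InlineHint;
  if (Fraction < ColdFraction)
    return InlineAnnotation::Cold;
  return InlineAnnotation::None;
}
```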

This is very similar to the hints emitted by Clang when using
instrumentation profiles.

Notice that this is a very blunt instrument. A function may have
globally collected a significant fraction of samples, but that does not
necessarily mean that every callsite for that function is hot.

Ideally, we would annotate each callsite with the samples collected at
that callsite. This way, the inliner can incorporate all these weights
into its cost model.

Once the inliner offers this functionality, we can change the hints
emitted here to a more precise per-callsite annotation. For now, this is
providing some measure of speedups with our internal benchmarks. I've
observed speedups of up to 23% (though the geo mean is about 3%). I expect
these numbers to improve as the inliner gets better annotations.

llvm-svn: 254212
2015-11-27 23:14:51 +00:00
Diego Novillo
c52a667205 SamplePGO - Fix default threshold for hot callsites.
Based on testing of internal benchmarks, I'm lowering this threshold to
a value of 0.1%.  This means that SamplePGO will respect 99.9% of the
original inline decisions when following a profile.

The performance difference is noticeable in some tests. With the
previous threshold, the speedup over baseline -O2 was about 0.63%. With
the new default, the speedups are around 3% on average.

The point of this threshold is not to do more aggressive inlining. When
an inlined callsite crosses this threshold, SamplePGO will redo the
inline decision so that it can better apply the input profile.

By respecting most original inline decisions, we can apply more of the
input profile because the shape of the code follows the profile more
closely.

In the next series, I'll be looking at adding some inline hints for the
cold callsites and for toplevel functions that are hot/cold as well.

llvm-svn: 254211
2015-11-27 23:14:49 +00:00
Rafael Espindola
d215bba299 Disallow aliases to available_externally.
They are as much trouble as aliases to declarations. They are requiring
the code generator to define a symbol with the same value as another
symbol, but the second symbol is undefined.

If representing this is important for some optimization, we could add
support for available_externally aliases. They would be *required* to
point to a declaration (or available_externally definition).

llvm-svn: 254170
2015-11-26 19:22:59 +00:00
Rong Xu
c4f897c441 [PGO] Revert revision r254021,r254028,r254035
Revert the above revision due to multiple issues.

llvm-svn: 254040
2015-11-24 23:49:08 +00:00
Teresa Johnson
cbf6e0bf1b [ThinLTO] Add option to limit importing based on instruction count
Add a simple initial heuristic to control importing based on the number
of instructions recorded in the function's summary. Add option to
control the limit, and test using option.

llvm-svn: 254036
2015-11-24 22:55:46 +00:00
Diego Novillo
2b7c3c54ab SamplePGO - Add test for hot/cold inlined functions.
When the original binary is executed and sampled, the resulting profile
contains information on the original inline stack. We currently follow
the original inline plan if we notice that the inlined callsite has more
than 0 samples to it.

A better way is to determine whether the callsite is actually worth
inlining. If the callsite accumulates a small fraction of the samples
spent in the parent function, then we don't want to bother inlining it
(as it means that the callsite is actually cold).

This patch introduces a threshold expressed in percentage of samples
in relation to the parent function.  If the callsite uses less than N%
of the total samples used by its parent, the original inline decision is
not re-applied.

I've set the threshold to the very arbitrary value of 5%. I have yet to do
any actual experiments to see what's a good value. I wanted to separate
the basic mechanism from the tuning.
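
The check described above can be sketched as follows. The function name
shouldReapplyInline is hypothetical, and treating a parent with zero
samples as cold is an assumption of this sketch; the 5% default is the
value named in this message.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when the original inline decision for a callsite should be
// re-applied, i.e. when the callsite accounts for at least ThresholdPercent
// of the samples collected in its parent function.
bool shouldReapplyInline(std::uint64_t CallsiteSamples,
                         std::uint64_t ParentSamples,
                         double ThresholdPercent = 5.0) {
  assert(ThresholdPercent >= 0.0 && "threshold is a percentage");
  if (ParentSamples == 0)
    return false; // assumption: nothing sampled in the parent, treat as cold
  const double Percent = 100.0 * static_cast<double>(CallsiteSamples) /
                         static_cast<double>(ParentSamples);
  return Percent >= ThresholdPercent;
}
```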

llvm-svn: 254034
2015-11-24 22:38:37 +00:00
Rong Xu
025bf7be0c [PGO] MST based PGO instrumentation infrastructure
This patch implements a minimum spanning tree (MST) based instrumentation for
PGO. The use of MST guarantees that the minimum number of CFG edges get
instrumented. An additional optimization is to instrument the less executed
edges to further reduce the instrumentation overhead. The patch contains both the
instrumentation and the use of the profile to set the branch weights.
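
As a rough illustration of the idea (not the code in this patch), the
sketch below builds a maximum-weight spanning tree with Kruskal's
algorithm and reports the edges left out of the tree as the ones needing
counters. CFGEdge, findRoot and selectInstrumentedEdges are hypothetical
names, and Weight is assumed to be an estimated execution frequency.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical CFG edge between basic blocks numbered 0..NumBlocks-1.
struct CFGEdge {
  int Src;
  int Dst;
  std::uint64_t Weight; // assumed: estimated execution frequency
};

// Union-find lookup with path halving.
int findRoot(std::vector<int> &Parent, int X) {
  while (Parent[X] != X) {
    Parent[X] = Parent[Parent[X]];
    X = Parent[X];
  }
  return X;
}

// Returns the indices (ascending) of the edges that are NOT on the
// maximum-weight spanning tree. Only those edges need counters; heavy
// edges are preferred for the tree, so counters land on colder edges.
std::vector<std::size_t>
selectInstrumentedEdges(int NumBlocks, const std::vector<CFGEdge> &Edges) {
  std::vector<std::size_t> Order(Edges.size());
  std::iota(Order.begin(), Order.end(), std::size_t(0));
  // Visit the heaviest edges first so they are the ones kept in the tree.
  std::stable_sort(Order.begin(), Order.end(),
                   [&Edges](std::size_t A, std::size_t B) {
                     return Edges[A].Weight > Edges[B].Weight;
                   });

  std::vector<int> Parent(NumBlocks);
  std::iota(Parent.begin(), Parent.end(), 0);

  std::vector<std::size_t> Instrumented;
  for (std::size_t Idx : Order) {
    assert(Edges[Idx].Src >= 0 && Edges[Idx].Src < NumBlocks);
    assert(Edges[Idx].Dst >= 0 && Edges[Idx].Dst < NumBlocks);
    const int A = findRoot(Parent, Edges[Idx].Src);
    const int B = findRoot(Parent, Edges[Idx].Dst);
    if (A == B)
      Instrumented.push_back(Idx); // closes a cycle: needs its own counter
    else
      Parent[A] = B; // tree edge: left without a counter
  }
  std::sort(Instrumented.begin(), Instrumented.end());
  return Instrumented;
}
```

For a connected CFG with V blocks and E edges, this leaves E - (V - 1)
edges to instrument.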

Differential Revision: http://reviews.llvm.org/D12781

llvm-svn: 254021
2015-11-24 21:31:25 +00:00
Teresa Johnson
697f6bcd05 [ThinLTO] Refactor function body scan during importing into helper (NFC)
llvm-svn: 254020
2015-11-24 21:15:19 +00:00
Teresa Johnson
a3214913e6 [ThinLTO] Enable iterative importing in FunctionImport pass
Analyze imported function bodies and add any new external calls to
the worklist for importing. There are currently no controls on the
importing, so this will end up importing everything possible in the call
tree below the importing module. Basic profitability checks are coming next.
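
The worklist scheme can be sketched with standard containers. This is an
illustration under assumptions: CalleeMap and importTransitively are
hypothetical names, and functions are identified by plain strings instead
of the function index used by the real pass.

```cpp
#include <deque>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical summary: every importable function mapped to the external
// functions its body calls.
typedef std::map<std::string, std::vector<std::string>> CalleeMap;

// Start from the external calls of the importing module and keep importing
// until no new importable callee turns up. No profitability limit is
// applied, matching the "no controls" state described above.
std::set<std::string>
importTransitively(const std::vector<std::string> &InitialCalls,
                   const CalleeMap &Importable) {
  std::set<std::string> Imported;
  std::deque<std::string> Worklist(InitialCalls.begin(), InitialCalls.end());
  while (!Worklist.empty()) {
    const std::string Name = Worklist.front();
    Worklist.pop_front();
    CalleeMap::const_iterator It = Importable.find(Name);
    if (It == Importable.end())
      continue; // not available for import
    if (!Imported.insert(Name).second)
      continue; // already imported earlier
    // Scan the newly imported body and queue its external calls.
    for (const std::string &Callee : It->second)
      Worklist.push_back(Callee);
  }
  return Imported;
}
```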

Update test to check for iteratively inlined functions.

llvm-svn: 254011
2015-11-24 19:55:04 +00:00
Teresa Johnson
9c0a1779ce [ThinLTO] Fix FunctionImport alias checking and test
Skip imports for weak_any aliases as well. Fix the test to check
non-import of weak aliases and functions, and import of normal alias.

llvm-svn: 253991
2015-11-24 16:10:43 +00:00
Ismail Donmez
266a7da4e3 Fix build after r253954
llvm-svn: 253969
2015-11-24 09:48:09 +00:00
Mehdi Amini
2fe02188ef Add a FunctionImporter helper to perform summary-based cross-module function importing
Summary:
This is a helper to perform cross-module import for ThinLTO. Right now
it naively imports every possible called function.

Reviewers: tejohnson

Subscribers: dexonsmith, llvm-commits

Differential Revision: http://reviews.llvm.org/D14914

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 253954
2015-11-24 06:07:49 +00:00
Diego Novillo
d28d079aa7 SamplePGO - Add coverage tracking for samples.
The existing coverage tracker counts the number of records that were used
from the input profile. An alternative view of coverage is to check how
many available samples were applied.

This way, if the profile contains several records with few samples, it
doesn't really matter much that they were not applied. The more
interesting records to apply are the ones that contribute many samples.
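
The two views of coverage can be compared with a small sketch.
ProfileRecord, CoverageReport and computeCoverage are hypothetical names
used only for illustration; reporting 0% for an empty profile is an
assumption.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical profile record: how many samples it carries and whether it
// was applied to the IR.
struct ProfileRecord {
  std::uint64_t Samples;
  bool Applied;
};

struct CoverageReport {
  double RecordPercent; // share of records that were applied
  double SamplePercent; // share of samples carried by the applied records
};

CoverageReport computeCoverage(const std::vector<ProfileRecord> &Records) {
  std::uint64_t UsedRecords = 0, UsedSamples = 0, TotalSamples = 0;
  for (const ProfileRecord &R : Records) {
    TotalSamples += R.Samples;
    if (R.Applied) {
      ++UsedRecords;
      UsedSamples += R.Samples;
    }
  }
  CoverageReport Report = {0.0, 0.0};
  if (!Records.empty())
    Report.RecordPercent = 100.0 * UsedRecords / Records.size();
  if (TotalSamples != 0)
    Report.SamplePercent = 100.0 * UsedSamples / TotalSamples;
  return Report;
}
```

A profile with one applied record of 1000 samples and three unapplied
records of 1 sample each has low record coverage but high sample
coverage, which is the case this message describes.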

llvm-svn: 253912
2015-11-23 20:12:21 +00:00
Diego Novillo
e85b89498c SamplePGO - Clear coverage tracking when clearing per-function data.
llvm-svn: 253877
2015-11-23 16:30:17 +00:00
Diego Novillo
f27e33d714 SamplePGO - Use newly introduced local variable. NFC.
llvm-svn: 253868
2015-11-23 15:24:13 +00:00
Diego Novillo
372bf7dc64 SamplePGO - Do not count never-executed inlined functions when computing coverage.
If a function was originally inlined but not actually hot at runtime,
its samples will not be counted inside the parent function. This throws
off the coverage calculation because it expects to find more used
records than it should.

Fixed by ignoring functions that will not be inlined into the parent.
Currently, this is inlined functions with 0 samples.  In subsequent
patches, I'll change this to mean "cold" functions.

llvm-svn: 253716
2015-11-20 21:46:38 +00:00
Tilmann Scheller
a99f5d534e Revert "[FunctionAttrs] Remove redundant assignment."
This reverts r253661.

Turns out that the assignment is not redundant (despite the Clang static analyzer claiming the opposite).

The variable is being used by the lambda function AddUsersToWorklistIfCapturing().

llvm-svn: 253696
2015-11-20 19:17:10 +00:00
Diego Novillo
4255fabac8 SamplePGO - Add line offset and discriminator information to sample reports.
While debugging some sampling coverage problems, I found this useful:
When applying samples from a profile, it helps to also know what line
offset and discriminator the sample belongs to. This makes it easy to
correlate against the input profile.

llvm-svn: 253670
2015-11-20 15:39:42 +00:00
Tilmann Scheller
baba8378a4 [FunctionAttrs] Remove redundant assignment.
Identified by the Clang static analyzer.

llvm-svn: 253661
2015-11-20 12:51:58 +00:00
James Molloy
2208ca52dd [GlobalOpt] Localize some globals that have non-instruction users
We currently bail out of global localization if the global has non-instruction users. However, often these can be simple bitcasts or constant-GEPs, which we can easily turn into instructions before localizing. Be a bit more aggressive.

llvm-svn: 253584
2015-11-19 18:04:33 +00:00
James Molloy
b585b0aee8 [FunctionAttrs] Provide a mechanism for adding function attributes from the command line
This provides a way to force a function to have certain attributes from the command line. This can be useful when debugging or doing workload exploration, where manually editing IR is tedious or not possible (due to build systems etc).

The syntax is -force-attribute=function_name:attribute_name

All function attributes are parsed except alignstack as it requires an argument.
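
For illustration, one way to split such an option value is shown below.
This is a sketch, not the code in this patch: ForcedAttr and
parseForceAttribute are hypothetical names, and splitting at the first
colon and rejecting empty halves are assumptions.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for illustration; not an LLVM data structure.
struct ForcedAttr {
  bool Valid;
  std::string Function;
  std::string Attribute;
};

// Split "function_name:attribute_name" at the first ':'.
// Valid is false when the colon is missing or either side is empty.
ForcedAttr parseForceAttribute(const std::string &Spec) {
  ForcedAttr Result = {false, "", ""};
  const std::size_t Colon = Spec.find(':');
  if (Colon == std::string::npos || Colon == 0 || Colon + 1 == Spec.size())
    return Result;
  Result.Valid = true;
  Result.Function = Spec.substr(0, Colon);
  Result.Attribute = Spec.substr(Colon + 1);
  return Result;
}
```

For example, -force-attribute=foo:noinline would yield the function name
"foo" and the attribute name "noinline".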

llvm-svn: 253550
2015-11-19 08:49:57 +00:00
James Molloy
b92ba28077 [LTO] Add an early run of functionattrs
Because we internalize early, we can potentially mark a bunch of functions as norecurse. Do this before globalopt.

llvm-svn: 253451
2015-11-18 11:24:42 +00:00
Elena Demikhovsky
0600baa03e Vector of pointers in function attributes calculation
While setting function attributes we check all instructions that may access memory. For a call instruction we check all arguments. The special check is required for pointers.
I added vector-of-pointers to the call argument types that should be checked.

Differential Revision: http://reviews.llvm.org/D14693

llvm-svn: 253363
2015-11-17 19:30:51 +00:00
James Molloy
63f33470bb [GlobalOpt] Address post-commit review comments on r253168
Address Duncan Exon Smith's comments on D14148, which were added after the patch had been LGTM'd and committed:
  * clang-format one area where whitespace diffs occurred.
  * Add a threshold to limit the store/load dominance checks as they are quadratic.

llvm-svn: 253192
2015-11-16 10:16:22 +00:00
Benjamin Kramer
c51f76e89b Move helper classes into anonymous namespaces. NFC.
llvm-svn: 253189
2015-11-16 09:01:28 +00:00
James Molloy
85bd37fc58 [GlobalOpt] Demote globals to locals more aggressively
Global to local demotion can speed up programs that use globals a lot. It is particularly useful with LTO, when the entire call graph is known and most functions have been internalized.

For a global to be demoted, it must only be accessed by one function and that function:
  1. Must never recurse directly or indirectly, else the GV would be clobbered.
  2. Must never rely on the value in GV at the start of the function (apart from the initializer).

GlobalOpt can already do this, but it is hamstrung and only ever tries to demote globals inside "main", because C++ gives extra guarantees about how main is called - once and only once.

In LTO mode, we can often prove the first property (if the function is internal by this point, we know enough about the callgraph to determine if it could possibly recurse). FunctionAttrs now infers the "norecurse" attribute for this reason.

The second property can be proven for a subset of functions by proving that all loads from GV are dominated by a store to GV. This is conservative in the name of compile time - this only requires a DominatorTree which is fairly cheap in the grand scheme of things. We could do more fancy stuff with MemoryDependenceAnalysis too to catch more cases but this appears to catch most of the useful ones in my testing.

llvm-svn: 253168
2015-11-15 14:21:37 +00:00
James Molloy
7d379efeea [GlobalOpt] Make sure all debug lines end with '\n'
GlobalVariable::print() used to emit a newline. It hasn't for a while now, but these debug lines weren't updated.

llvm-svn: 253030
2015-11-13 11:05:13 +00:00
James Molloy
e22291bd4d [GlobalOpt] Coding style - remove function names from doxygen comments
Suggested by Mehdi in the review of D14148.

llvm-svn: 253029
2015-11-13 11:05:07 +00:00
Akira Hatanaka
a41bf2744e Revert r252990.
Some of the buildbots are still failing.

llvm-svn: 252999
2015-11-13 01:44:32 +00:00
Akira Hatanaka
5df31fbb8f Provide a way to specify inliner's attribute compatibility and merging.
This reapplies r252949. I've changed the type of FuncName to be
std::string instead of StringRef in emitFnAttrCompatCheck.

Original commit message for r252949:

Provide a way to specify inliner's attribute compatibility and merging
rules using table-gen. NFC.

This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.

rdar://problem/19836465

llvm-svn: 252990
2015-11-13 01:23:11 +00:00
Akira Hatanaka
8642e85c17 Revert r252949.
It broke some of the bots including clang-x64-ninja-win7.

llvm-svn: 252951
2015-11-12 21:19:18 +00:00
Akira Hatanaka
ca7dc7a319 Provide a way to specify inliner's attribute compatibility and merging
rules using table-gen. NFC.

This commit adds new classes CompatRule and MergeRule to Attributes.td,
which are used to generate code to check attribute compatibility and
merge attributes of the caller and callee.

rdar://problem/19836465

llvm-svn: 252949
2015-11-12 20:59:43 +00:00
James Molloy
64e756bd07 Revert "Revert "[FunctionAttrs] Identify norecurse functions""
This reapplies this patch, with test fixes.

llvm-svn: 252871
2015-11-12 10:55:20 +00:00
James Molloy
9cd723ec63 Revert "[FunctionAttrs] Identify norecurse functions"
This reverts commit r252862. This introduced test failures and I'm reverting while I investigate how this happened.

llvm-svn: 252863
2015-11-12 09:05:43 +00:00
James Molloy
4da975f03a [FunctionAttrs] Identify norecurse functions
A function can be marked as norecurse if:
  * The SCC to which it belongs has cardinality 1; and either
    a) It does not call any non-norecurse function. This includes self-recursion; or
    b) It only has one callsite and the function that callsite is within is marked norecurse.

a) is best propagated bottom-up and b) is best propagated top-down.

We build up the norecurse attributes bottom-up using the existing SCC pass, and mark functions with no obvious recursion (but not provably norecurse) to sweep later, top-down.
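
Rule a) can be sketched as a fixed-point iteration over a toy call graph.
This is not the SCC-based implementation from the patch: FuncNode and
inferNoRecurse are hypothetical names, and treating any indirect or
external call as "not provably norecurse" is an assumption. Rule b), the
top-down sweep, is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical call-graph node for illustration; not an LLVM type.
struct FuncNode {
  std::vector<int> Callees; // indices of directly called, known functions
  bool CallsUnknown;        // true if it makes indirect or external calls
};

// A function is marked norecurse once every callee is already marked and it
// never calls itself. Functions on a call cycle are never marked, because
// each one waits on another member of the cycle, so the "SCC of
// cardinality 1" condition holds implicitly.
std::vector<bool> inferNoRecurse(const std::vector<FuncNode> &Graph) {
  std::vector<bool> NoRecurse(Graph.size(), false);
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (std::size_t F = 0; F < Graph.size(); ++F) {
      if (NoRecurse[F] || Graph[F].CallsUnknown)
        continue;
      bool AllCalleesSafe = true;
      for (int Callee : Graph[F].Callees) {
        assert(Callee >= 0 &&
               static_cast<std::size_t>(Callee) < Graph.size());
        if (static_cast<std::size_t>(Callee) == F || !NoRecurse[Callee]) {
          AllCalleesSafe = false;
          break;
        }
      }
      if (AllCalleesSafe) {
        NoRecurse[F] = true;
        Changed = true;
      }
    }
  }
  return NoRecurse;
}
```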

llvm-svn: 252862
2015-11-12 08:53:04 +00:00
David Majnemer
3deb8be573 [IR] Add support for empty tokens
When working with tokens, it is often the case that one has instructions
which consume a token and produce a new token.  Currently, we have no
mechanism to represent an initial token state.

Instead, we can create a notional "empty token" by inventing a new
constant which captures the semantics we would like.  This new constant
is called ConstantTokenNone and is written textually as "token none".

Differential Revision: http://reviews.llvm.org/D14581

llvm-svn: 252811
2015-11-11 21:57:16 +00:00
Oliver Stannard
989496fc9c GlobalOpt should maintain externally_initialized when splitting aggregates
When GlobalOpt splits an internal, global variable with an aggregate type, it
should propagate the externally_initialized flag to the newly created globals.

This makes the pass safe for our downstream use of this flag, while still
allowing some useful optimisations (such as removing dead parts of the split
aggregate) to be performed.

Differential Revision: http://reviews.llvm.org/D13382

llvm-svn: 252490
2015-11-09 16:47:16 +00:00
Sanjoy Das
62bf9f3dd6 Unbreak the build
My code clashed with some ilist iterator changes upstream.  Fix by
adding an explicit "&*" coercion.

llvm-svn: 252392
2015-11-07 02:26:53 +00:00
Sanjoy Das
c550b2c217 [FunctionAttrs] Add comment and clarify assertion message; NFC
llvm-svn: 252389
2015-11-07 01:56:07 +00:00
Sanjoy Das
6b6a5c9388 [FunctionAttrs] Add handling for operand bundles
Summary:
Teach the FunctionAttrs to do the right thing for IR with operand
bundles.

Reviewers: reames, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14408

llvm-svn: 252387
2015-11-07 01:56:00 +00:00