Summary:
Also fixup rL371928 for cases that occur on our out-of-tree backend
There were still quite a few intermediate APInts and this caused the
compile time of MCCodeEmitter for our target to jump from 16s up to
~5m40s. This patch, brings it back down to ~17s by eliminating pretty
much all of them using two new APInt functions (extractBitsAsZExtValue(),
insertBits() but with a uint64_t). The exact conditions for eliminating
them is that the field extracted/inserted must be <=64-bit which is
almost always true.
Note: The two new APInt API's assume that APInt::WordSize is at least
64-bit because that means they touch at most 2 APInt words. They
statically assert that's true. It seems very unlikely that someone
is patching it to be smaller so this should be fine.
Reviewers: jmolloy
Reviewed By: jmolloy
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67686
llvm-svn: 372243
Summary:
This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper:
D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS.
This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges.
The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored.
The algorithm for building the graph involves the following steps in order:
1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph.
2. For each node in the graph establish def-use edges to/from other nodes in the graph.
3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it.
Authored By: bmahjour
Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert
Reviewed By: Meinersbur, fhahn, myhsu
Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto
Tag: #llvm
Differential Revision: https://reviews.llvm.org/D65350
llvm-svn: 372238
is enabled.
We can save memory and reduce binary size significantly by enabling
ProfileSampleAccurate. However when ProfileSampleAccurate is true,
function without sample will be regarded as cold and this could
potentially cause performance regression.
To minimize the potential negative performance impact, we want to be
a little conservative here saying if a function shows up in the profile,
no matter as outline instance, inline instance or call targets, treat
the function as not being cold. This will handle the cases such as most
callsites of a function are inlined in sampled binary (thus outline copy
don't get any sample) but not inlined in current build (because of source
code drift, imprecise debug information, or the callsites are all cold
individually but not cold accumulatively...), so that the outline function
showing up as cold in sampled binary will actually not be cold after current
build. After the change, such function will be treated as not cold even
profile-sample-accurate is enabled.
At the same time we lower the hot criteria of callsiteIsHot check when
profile-sample-accurate is enabled. callsiteIsHot is used to determined
whether a callsite is hot and qualified for early inlining. When
profile-sample-accurate is enabled, functions without profile will be
regarded as cold and much less inlining will happen in CGSCC inlining pass,
so we can worry less about size increase and be aggressive to allow more
early inlining to happen for warm callsites and it is helpful for performance
overall.
Differential Revision: https://reviews.llvm.org/D67561
llvm-svn: 372232
This patch fixes a bug exposed by D65653 where a subsequent invocation
of `determineCalleeSaves` ends up with a different size for the callee
save area, leading to different frame-offsets in debug information.
In the invocation by PEI, `determineCalleeSaves` tries to determine
whether it needs to spill an extra callee-saved register to get an
emergency spill slot. To do this, it calls 'estimateStackSize' and
manually adds the size of the callee-saves to this. PEI then allocates
the spill objects for the callee saves and the remaining frame layout
is calculated accordingly.
A second invocation in LiveDebugValues causes estimateStackSize to return
the size of the stack frame including the callee-saves. Given that the
size of the callee-saves is added to this, these callee-saves are counted
twice, which leads `determineCalleeSaves` to believe the stack has
become big enough to require spilling an extra callee-save as emergency
spillslot. It then updates CalleeSavedStackSize with a larger value.
Since CalleeSavedStackSize is used in the calculation of the frame
offset in getFrameIndexReference, this leads to incorrect offsets for
variables/locals when this information is recalculated after PEI.
Reviewers: omjavaid, eli.friedman, thegameg, efriedma
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D66935
llvm-svn: 372204
Summary:
The latter is slightly more efficient and communicates the intent of the
API: writeFileAtomically does not own or copy the callback, it merely
calls it at some point.
Reviewers: jkorous
Reviewed By: jkorous
Subscribers: hiraditya, dexonsmith, jfb, llvm-commits, cfe-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67584
llvm-svn: 372201
The filename in the RemarkStreamer should be optional to allow clients
to stream remarks to memory or to existing streams.
This introduces a new overload of `setupOptimizationRemarks`, and avoids
enforcing the presence of a filename at different places.
llvm-svn: 372195
Summary:
This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper:
D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS.
This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges.
The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored.
The algorithm for building the graph involves the following steps in order:
1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph.
2. For each node in the graph establish def-use edges to/from other nodes in the graph.
3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it.
Authored By: bmahjour
Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert
Reviewed By: Meinersbur, fhahn, myhsu
Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto
Tag: #llvm
Differential Revision: https://reviews.llvm.org/D65350
llvm-svn: 372162
This patch adds the llvm::gsym::Header class which appears at the start of a stand alone GSYM file, or in the first bytes of the GSYM data in a GSYM section within a file. Added encode and decode methods with full error handling and full tests.
Differential Revision: https://reviews.llvm.org/D67666
llvm-svn: 372149
This patch adds encoding and decoding of the FunctionInfo objects along with full error handling and tests. Full details of the FunctionInfo encoding format appear in the FunctionInfo.h header file.
Differential Revision: https://reviews.llvm.org/D67506
llvm-svn: 372135
Summary:
There were segfaults as we modified and iterated the instruction maps in
the cache at the same time. This was happening because we created new
instructions while we populated the cache. This fix changes the order
in which we perform these actions. First, the caches for the whole
module are created, then we start to create abstract attributes.
I don't have a unit test but the LLVM test suite exposes this problem.
Reviewers: uenoku, sstefan1
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67232
llvm-svn: 372105
* Reordered MVT simple types to group scalable vector types
together.
* New range functions in MachineValueType.h to only iterate over
the fixed-length int/fp vector types.
* Stopped backends which don't support scalable vector types from
iterating over scalable types.
Reviewers: sdesmalen, greened
Reviewed By: greened
Differential Revision: https://reviews.llvm.org/D66339
llvm-svn: 372099
It's a straightforward refactoring that allows to simplify and encapsulate the code.
Differential revision: https://reviews.llvm.org/D67624
llvm-svn: 372083
Summary: This patch introduces a helper struct `AnalysisGetter` to put together analysis getters. In this patch, a getter for `AAResult` is also added for `noalias`.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67603
llvm-svn: 372072
It should be allowed to take a string table in case all the strings in
the remarks point there, but it shouldn't use it during serialization.
llvm-svn: 372042
Speed up queries for coverage info in a file by reducing the amount of
time spent determining whether a function record corresponds to a file.
This gives a 36% speedup when generating a coverage report for `llc`.
The reduction is entirely in user time.
rdar://54758110
Differential Revision: https://reviews.llvm.org/D67575
llvm-svn: 372025
Summary:
This is needed to implemented the same approach as lld (implemented in r338434)
for how to handling symbols that can be generated by LTO code generator
but not present in the symbol table for linker that uses legacy C APIs.
libLTO is in charge of providing the list of symbols. Linker is in
charge of implementing the eager loading from static libraries using
the list of symbols.
rdar://problem/52853974
Reviewers: tejohnson, bd1976llvm, deadalnix, espindola
Reviewed By: tejohnson
Subscribers: emaste, arichardson, hiraditya, MaskRay, dang, kledzik, mehdi_amini, inglorion, jkorous, dexonsmith, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67568
llvm-svn: 372021
D53362 gives a prototype heap-to-stack conversion pass. With addition of new attributes in the attributor, this can now be revisted and improved. This will place it in the Attributor to make it easier to use new attributes (eg. nofree, nosync, willreturn, etc.) and other attributor features.
Reviewers: jdoerfert, uenoku, hfinkel, efriedma
Subscribers: lebedev.ri, xbolva00, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D65408
llvm-svn: 371942
This adds a reproducer dump commands which makes it possible to inspect
a reproducer from inside LLDB. Currently it supports the Files, Commands
and Version providers. I'm planning to add support for the GDB Remote
provider in a follow-up patch.
Differential revision: https://reviews.llvm.org/D67474
llvm-svn: 371909
Because memory intrinsics are handled differently than other calls, we need to
check them for tail call eligiblity in the legalizer. This allows us to still
inline them when it's beneficial to do so, but also tail call when possible.
This adds simple tail calling support for when the intrinsic is followed by a
return.
It ports the attribute checks from `TargetLowering::isInTailCallPosition` into
a similarly-named function in LegalizerHelper.cpp. The target-specific
`isUsedByReturnOnly` hook is not ported here.
Update tailcall-mem-intrinsics.ll to show that GlobalISel can now tail call
memory intrinsics.
Update legalize-memcpy-et-al.mir to have a case where we don't tail call.
Differential Revision: https://reviews.llvm.org/D67566
llvm-svn: 371893
This is a continuation of the YAML library error reporting
refactoring/improvement and the idea by itself was mentioned
in the following thread:
https://reviews.llvm.org/D67182?id=218714#inline-603404
This performs a cleanup of all object emitters in the library.
It allows using the custom one provided by the caller.
One of the nice things is that each tool can now print its tool name,
e.g: "yaml2obj: error: <text>"
Also, the code became a bit simpler.
Differential revision: https://reviews.llvm.org/D67445
llvm-svn: 371865
This unlocks some goodies like sized deletion and gets the alignment
right on platforms that chose to provide a lower default new alignment.
llvm-svn: 371846
Summary:
This patch introduces the skeleton of the constexpr interpreter,
capable of evaluating a simple constexpr functions consisting of
if statements. The interpreter is described in more detail in the
RFC. Further patches will add more features.
Reviewers: Bigcheese, jfb, rsmith
Subscribers: bruno, uenoku, ldionne, Tyker, thegameg, tschuett, dexonsmith, mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64146
llvm-svn: 371834
This patch adds vecreduce_smax, vecredude_umax, vecreduce_smin, vecreduce_umin and selection for vmaxv and minv.
Differential Revision: https://reviews.llvm.org/D66413
llvm-svn: 371827