This form is like DW_FORM_strp, but points to .debug_line_str instead
of .debug_str as the string section. It's intended to be used from
the line-table header, and allows string-pooling of directory and
filenames across compilation units.
Differential Revision: https://reviews.llvm.org/D42553
llvm-svn: 323476
Summary:
The intent of this is to allow the code to be used with ThinLTO. In
Thinlink phase, a traditional Callgraph can not be computed even though
all the necessary information (nodes and edges of a call graph) is
available. This is due to the fact that CallGraph class is closely tied
to the IR. This patch first extends GraphTraits to add a CallGraphTraits
graph. This is then used to implement a version of counts propagation
on a generic callgraph.
Reviewers: davidxl
Subscribers: mehdi_amini, tejohnson, llvm-commits
Differential Revision: https://reviews.llvm.org/D42311
llvm-svn: 323475
This patch enables aggressive FMA by default on T99, and provides a -mllvm
option to enable the same on other AArch64 micro-arch's (-mllvm
-aarch64-enable-aggressive-fma).
Test case demonstrating the effects on T99 is included.
Patch by: steleman (Stefan Teleman)
Differential Revision: https://reviews.llvm.org/D40696
llvm-svn: 323474
This patch is an enhancement to propagate dbg.value information when
Phis are created on behalf of LCSSA. I noticed a case where a value
carried across a loop was reported as <optimized out>.
Specifically this case:
int bar(int x, int y) {
return x + y;
}
int foo(int size) {
int val = 0;
for (int i = 0; i < size; ++i) {
val = bar(val, i); // Both val and i are correct
}
return val; // <optimized out>
}
In the above case, after all of the interesting computation completes
our value is reported as "optimized out." This change will add a
dbg.value to correct this.
This patch also moves the dbg.value insertion routine from
LoopRotation.cpp into Local.cpp, so that we can share it in both places
(LoopRotation and LCSSA).
Patch by Matt Davis!
Differential Revision: https://reviews.llvm.org/D42551
llvm-svn: 323472
Right now clang uses "_n" suffix for some user space callbacks and "N" for the matching kernel ones. There's no need for this and it actually breaks kernel build with inline instrumentation. Use the same callback names for user space and the kernel (and also make them consistent with the names GCC uses).
Patch by Andrey Konovalov.
Differential Revision: https://reviews.llvm.org/D42423
llvm-svn: 323470
The asm parser puts the lock prefix in the MCInst flags so we need to check that in addition to TSFlags. This matches what the ATT printer does.
llvm-svn: 323469
It was reverted after buildbot regressions.
Original commit message:
This allows relative block frequency of call edges to be passed
to the thinlink stage where it will be used to compute synthetic
entry counts of functions.
llvm-svn: 323460
It looks like this hasn't been updated since bugzilla moved.
Patch by Colden Cullen.
Differential Revision: https://reviews.llvm.org/D42496
llvm-svn: 323457
This brings it in line with std::optional. My recent changes to
make Optional of trivial types trivially copyable introduced
diverging behavior depending on the type, which is bad. Now all
types have the same moving behavior.
llvm-svn: 323445
Summary:
If the same value is going to be vectorized several times in the same
tree entry, this entry is considered to be a gather entry and cost of
this gather is counter as cost of InsertElementInstrs for each gathered
value. But we can consider these elements as ShuffleInstr with
SK_PermuteSingle shuffle kind.
Reviewers: spatel, RKSimon, mkuper, hfinkel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38697
llvm-svn: 323441
This is guarded by shouldChangeType(), so the tests show that
we don't do the fold if the narrower type is not legal. Note
that there is a proposal (D42424) that would change the results
for the specific cases shown in these tests. That difference is
also discussed in PR35792:
https://bugs.llvm.org/show_bug.cgi?id=35792
Alive proofs for the cases handled here as well as the bitwise
logic binops that we should already do better on:
https://rise4fun.com/Alive/c97https://rise4fun.com/Alive/Lc5Ehttps://rise4fun.com/Alive/kdf
llvm-svn: 323437
Summary:
If the same value is going to be vectorized several times in the same
tree entry, this entry is considered to be a gather entry and cost of
this gather is counter as cost of InsertElementInstrs for each gathered
value. But we can consider these elements as ShuffleInstr with
SK_PermuteSingle shuffle kind.
Reviewers: spatel, RKSimon, mkuper, hfinkel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38697
llvm-svn: 323430
computeDeadSymbols accessed isLive() which was not public
before. It does not make much sence to keep isLive() private
because flags are available via flags() public member anyways.
llvm-svn: 323415
This patch extends the atom types used by the Apple accelerator tables
with two dsymutil extensions:
- DW_ATOM_type_type_flags
- DW_ATOM_qual_name_hash
llvm-svn: 323414
Summary:
When creating the debug fragments for a SRA'd struct, use the fields'
offsets, taken from the struct layout, as the offsets for the resulting
fragments. This fixes an issue where GlobalOpt would emit fragments with
incorrect offsets for padded fields.
This should solve PR36016.
Patch by David Stenberg.
Reviewers: aprantl
Reviewed By: aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42489
llvm-svn: 323411
The regular expressions and the imul names caused some instructions to be matched by multiple regexs creating unpredictable results.
This changes them all to use explicit instrs instead.
While doing this I also found that some instructions in Skylake were missing load latency so I fixed that too.
llvm-svn: 323406
The IMUL instruction names mixed with the prefix matching of the instregex lead to some strange matches. The worst being that several memory instructions are using the register form latency.
I don't know what the right answer is, so I've left TODOs and will try to work with the AMD folks to get this cleaned up.
llvm-svn: 323405
Set cmake policy CMP0068=NEW, if available, and set
"CMAKE_BUILD_WITH_INSTALL_NAME_DIR=On" globally to
maintain current behavior.
This is needed to suppress warnings on OSX starting with cmake version
3.9.6.
Differential Revision: https://reviews.llvm.org/D42463
llvm-svn: 323404
MMX instrutions all start with MMX_ so the 64 isn't needed for disambigutation.
SSE/AVX1 instructions are assumed 128-bit so we don't need to say 128.
AVX2 instructions should use a Y to indicate 256-bits.
llvm-svn: 323402
These were treated as optional suffixes, but the regular expressions are already prefix matches so this is unnecessary. It breaks the binary search optimization in tablegen due to the top level question mark.
llvm-svn: 323401
first argument.
This makes lookupFlags more consistent with lookup (which takes the query as the
first argument) and composes better in practice, since lookups are usually
linearly chained: Each lookupFlags can populate the result map based on the
symbols not found in the previous lookup. (If the maps were returned rather than
passed by reference there would have to be a merge step at the end).
llvm-svn: 323398
https://reviews.llvm.org/D41373
The various components are
GICombinerHelper contains transformations that are common to all
targets. Targets can pick and choose which transformations (at
function/opcode granularity) each pass uses via configuring a
GICombinerInfo.
GICombiner contains some common code and it does the traversal,
driving of combines, worklist management and iterating until
convergence.
GICombinerInfo is an interface with a virtual method called combine.
The combiner info will allow targets to pick and choose (or
implement their own specific combines). CombineInfos can make
use of available combines in GICombineHelper to configure the
transformations for a particular pass. Currently this approach allows
cherry picking transformations from helpers (at function/opcode
granularity) and also allows early returning on specific
transformations. Targets also get to prioritize whether target specific
combines run before/after the opt-in generic combines. Ideally we would
like this part to be configured by both C++ and Tablegen. The
CombinerInfo also has a field which indicates how to deal with
IllegalOps (ie - should we allow to create them/or legalize them?).
A CombinerPass would configure a CombinerInfo, create the GICombiner
with the Info, and call
GICombiner::combineMachineInstrs(MachineFunction&).
This organization is very similar to the GISelLegalizer.
llvm-svn: 323392
Collected statistics for the number of patterns emitted can be
incorrect because rules can be grouped if OptimizeMatchTable
is enabled. Increase the counter in RuleMatcher::emit(...)
to avoid that.
llvm-svn: 323391
functions/methods that return JITSymbols.
lookupFlagsWithLegacyFn takes a SymbolNameSet and a legacy lookup function and
returns a LookupFlagsResult. It uses the legacy lookup function to search for
each symbol. If found, getFlags is called on the symbol and the flags added to
the SymbolFlags map. If not found, the symbol is added to the SymbolsNotFound
set.
lookupWithLegacyFn takes an AsynchronousSymbolQuery, a SymbolNameSet and a
legacy lookup function. Each symbol in the SymbolNameSet is searched for via the
legacy lookup function. If it is found, its getAddress function is called
(triggering materialization if it has not happened already) and the resulting
mapping stored in the query. If it is not found the symbol is added to the
unresolved symbols set which is returned at the end of the function. If an
error occurs during legacy lookup or materialization it is passed to the
query via setFailed and the function returns immediately.
llvm-svn: 323388