Teach LLVM optimize to more precisely in the presence of "deopt" operand
bundles. "deopt" operand bundles imply that the call they're attached
to is at least `readonly` (i.e. they don't imply clobber semantics), and
they don't capture their bundle operands.
llvm-svn: 254118
Summary:
This returns a pointer to the dispatch packet, which can be used to load
information about the kernel dispach.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D14898
llvm-svn: 254116
This is one of the many steps to commonize value profiling support between profile
runtime and compiler/llvm tools.
After this change, profiler runtime now can share the same C APIs to do VP
serialization/deseriazation with LLVM host tools (and produces value data
in identical format between indexed and raw profile).
It is not yet enabled in profiler runtime yet.
Also added a unit test case to test runtime profile data serialization/deserialization
interfaces implemented using common closure code.
llvm-svn: 254110
Summary:
Many target lowerings copy-paste the code to test SDValues for known constants.
This code can instead be shared in SelectionDAG.cpp, and reused in the targets.
Reviewers: MatzeB, andreadb, tstellarAMD
Subscribers: arsenm, jyknight, llvm-commits
Differential Revision: http://reviews.llvm.org/D14945
llvm-svn: 254085
to a simple type when lowering a truncating store of a vector type. In this
case for an EVT we'll return Expand as we should in all of the cases anyhow.
The testcase triggered at the one in VectorLegalizer::LegalizeOp, inspection
found the rest.
llvm-svn: 254061
1. Convert serialization methods using InstrProfRecord as source into C (impl)
interfaces using Closure.
2. Reimplement InstrProfRecord serialization method to use new C interface
as dummy wrapper.
Now it is ready to implement wrapper for runtime value profile data.
(The new code need better source location -- but not changed in this patch to
minimize diffs. )
llvm-svn: 254057
This patch implements a minimum spanning tree (MST) based instrumentation for
PGO. The use of MST guarantees minimum number of CFG edges getting
instrumented. An addition optimization is to instrument the less executed
edges to further reduce the instrumentation overhead. The patch contains both the
instrumentation and the use of the profile to set the branch weights.
Differential Revision: http://reviews.llvm.org/D12781
llvm-svn: 254021
This patch fixes the following issues:
1. Fix the return type of X86psadbw: it should not be the same type of inputs.
For vNi8 inputs the output should be vMi64, where M = N/8.
2. Fix the return type of int_x86_avx512_psad_bw_512 accordingly.
3. Fix the definiton of PSADBW, VPSADBW, and VPSADBWY accordingly.
4. Adjust the return type when building a DAG node of X86ISD::PSADBW type.
5. Update related tests.
Differential revision: http://reviews.llvm.org/D14897
llvm-svn: 254010
The closure is designed to abstact away two types of value profile
data:
- InstrProfRecord which is the primary data structure used to
represent profile data in host tools (reader, writer, and profile-use)
- value profile runtime data structure suitable to be used by C
runtime library.
Both sources of data need to serialize to disk/memory-buffer in common
format: ValueProfData.
The abstraction allows compiler-rt's raw profiler writer to share
the same code with indexed profile writer.
llvm-svn: 254008
Convert two C++ static member functions to be C APIs. This
is one of the many steps to get ready to share VP writer code
with profiler runtime.
llvm-svn: 253999
The patch in http://reviews.llvm.org/D13745 is broken into four parts:
1. New interfaces without functional changes.
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights.
3. Use new interfaces in all other passes.
4. Remove old interfaces.
This the second patch above. In this patch SelectionDAG starts to use
probability-based interfaces in MBB to add successors but other MC passes are
still using weight-based interfaces. Therefore, we need to maintain correct
weight list in MBB even when probability-based interfaces are used. This is
done by updating weight list in probability-based interfaces by treating the
numerator of probabilities as weights. This change affects many test cases
that check successor weight values. I will update those test cases once this
patch looks good to you.
Differential revision: http://reviews.llvm.org/D14361
llvm-svn: 253965
Summary:
This is a helper to perform cross-module import for ThinLTO. Right now
it is importing naively every possible called functions.
Reviewers: tejohnson
Subscribers: dexonsmith, llvm-commits
Differential Revision: http://reviews.llvm.org/D14914
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 253954
Summary:
This allows to query for a function in the map without creating an
entry, allowing to use a const FunctionInfoIndex.
Reviewers: tejohnson
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14912
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 253953
Summary: Adds the ability for callers to detect when saturation occurred on the result of saturating addition/multiplication.
Reviewers: davidxl, silvas, rsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14931
llvm-svn: 253921
The new option is similar to the SampleProfile dump option.
- dump raw/indexed format into text profile format
- merge the profile and output into text profile format.
Note that Value Profiling data text format is not yet designed.
That functionality will be added later.
Differential Revision: http://reviews.llvm.org/D14894
llvm-svn: 253913
Add a shared helper routine to read the function index from a file
and create/return the function index object. Use it in llvm-link and
llvm-lto.
llvm-svn: 253903
Summary:
This change fixes the SaturatingMultiply<T>() function template to not cause undefined behavior with T=uint16_t.
Thanks to Richard Smith's contribution, it also no longer requires an integer division.
Patch by Richard Smith.
Reviewers: silvas, davidxl
Subscribers: rsmith, davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D14845
llvm-svn: 253870
1. move const qualifier out of raw header field type as runtime use of the header
needs to initialze the fields
2. use C style casting for integer types.
llvm-svn: 253844
Duplicate a few common definitions between DFAPacketizer.cpp and
DFAPacketizerEmitter.cpp to avoid including files from CodeGen
in TableGen.
llvm-svn: 253820
In profile runtime implementation for Darwin, Linux and FreeBSD, the
names of sections holding profile control/counter/naming data need
to be known by the runtime in order to locate the start/end of the
data. Moving the name definitions to the common file to specify the
connection.
llvm-svn: 253814
FIXME: This can be reverted several hours later.
r253790 introduced cyclic deps around llvm-tblgen and it was affecting after reverting.
ninja: error: dependency cycle: include/llvm/IR/Attributes.inc -> include/llvm/IR/Attributes.inc.tmp -> bin/llvm-tblgen -> utils/TableGen/CMakeFiles/obj.llvm-tblgen.dir/DFAPacketizerEmitter.cpp.o -> include/llvm/IR/Attributes.inc
It may be a ninja's bug.
FYI, renaming DFAPacketizerEmitter.cpp would be useless.
llvm-svn: 253810
Add more complete description of the content and structure
of the template file. Made the comment in C style to be
shared by C runtime. Also enhance the file structure so
that it can included as standalone header for common
definitions.
llvm-svn: 253807
We had two code paths. One would create names like "foo.1" and the other
names like "foo1".
For globals it is important to use "foo.1" to help C++ name demangling.
For locals there is no strong reason to go one way or the other so I
kept the most common mangling (foo1).
llvm-svn: 253804
Summary:
Several fixes to the handling of bitcode files without function summary
sections so that they are skipped during ThinLTO processing in llvm-lto
and the gold plugin when appropriate instead of aborting.
1 Don't assert when trying to add a FunctionInfo that doesn't have
a summary attached.
2 Skip FunctionInfo structures that don't have attached function summary
sections when trying to create the combined function summary.
3 In both llvm-lto and gold-plugin, check whether a bitcode file has
a function summary section before trying to parse the index, and skip
the bitcode file if it does not.
4 Fix hasFunctionSummaryInMemBuffer in BitcodeReader, which had a bug
where we returned to early while looking for the summary section.
Also added llvm-lto and gold-plugin based tests for cases where we
don't have function summaries in the bitcode file. I verified that
either the first couple fixes described above are enough to avoid the
crashes, or fixes 1,3,4. But have combined them all here for added
robustness.
Reviewers: joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D14903
llvm-svn: 253796
MachineInstrBuilder::addDisp can already add an immediate or global address MO with an adjusted offset, this patch adds support for constant pool indices as well.
All remaining MO types still assert - there are a number of other types that could support adjusted offsets but I have no test cases at this time.
Required to fix a regression in D13988 found by Mikael Holmén during stress testing (test case attached).
Differential Revision: http://reviews.llvm.org/D14867
llvm-svn: 253795
Extended DFA tablegen to:
- added "-debug-only dfa-emitter" support to llvm-tblgen
- defined CVI_PIPE* resources for the V60 vector coprocessor
- allow specification of multiple required resources
- supports ANDs of ORs
- e.g. [SLOT2, SLOT3], [CVI_MPY0, CVI_MPY1] means:
(SLOT2 OR SLOT3) AND (CVI_MPY0 OR CVI_MPY1)
- added support for combo resources
- allows specifying ORs of ANDs
- e.g. [CVI_XLSHF, CVI_MPY01] means:
(CVI_XLANE AND CVI_SHIFT) OR (CVI_MPY0 AND CVI_MPY1)
- increased DFA input size from 32-bit to 64-bit
- allows for a maximum of 4 AND'ed terms of 16 resources
- supported expressions now include:
expression => term [AND term] [AND term] [AND term]
term => resource [OR resource]*
resource => one_resource | combo_resource
combo_resource => (one_resource [AND one_resource]*)
Author: Dan Palermo <dpalermo@codeaurora.org>
kparzysz: Verified AMDGPU codegen to be unchanged on all llc
tests, except those dealing with instruction encodings.
Reapply the previous patch, this time without circular dependencies.
llvm-svn: 253793
Extended DFA tablegen to:
- added "-debug-only dfa-emitter" support to llvm-tblgen
- defined CVI_PIPE* resources for the V60 vector coprocessor
- allow specification of multiple required resources
- supports ANDs of ORs
- e.g. [SLOT2, SLOT3], [CVI_MPY0, CVI_MPY1] means:
(SLOT2 OR SLOT3) AND (CVI_MPY0 OR CVI_MPY1)
- added support for combo resources
- allows specifying ORs of ANDs
- e.g. [CVI_XLSHF, CVI_MPY01] means:
(CVI_XLANE AND CVI_SHIFT) OR (CVI_MPY0 AND CVI_MPY1)
- increased DFA input size from 32-bit to 64-bit
- allows for a maximum of 4 AND'ed terms of 16 resources
- supported expressions now include:
expression => term [AND term] [AND term] [AND term]
term => resource [OR resource]*
resource => one_resource | combo_resource
combo_resource => (one_resource [AND one_resource]*)
Author: Dan Palermo <dpalermo@codeaurora.org>
kparzysz: Verified AMDGPU codegen to be unchanged on all llc
tests, except those dealing with instruction encodings.
llvm-svn: 253790
Summary:
This change refactors two aspects of InstrProfRecord:
1) Add a merge() method to InstrProfRecord (previously InstrProfWriter combineInstrProfRecords()) in order to better encapsulate this functionality and to make the InstrProfRecord and SampleRecord APIs more consistent.
2) Make InstrProfRecord mergeValueProfData() a private method since it is only ever called internally by merge().
Reviewers: dnovillo, bogner, davidxl
Subscribers: silvas, vsk, llvm-commits
Differential Revision: http://reviews.llvm.org/D14786
llvm-svn: 253695
Summary:
This follows D14577 to treat ARMv6-J as an alias for ARMv6,
instead of an architecture in its own right.
The functional change is that the default CPU when targeting ARMv6-J
changes from arm1136j-s to arm1136jf-s, which is currently used as
the default CPU for ARMv6; both are, in fact, ARMv6-J CPUs.
The J-bit (Jazelle support) is irrelevant to LLVM, and it doesn't
affect code generation, attributes, optimizations, or anything else,
apart from selecting the default CPU.
Reviewers: rengolin, logan, compnerd
Subscribers: aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D14755
llvm-svn: 253675
Summary:
This is split out from the ThinLTO metadata mapping patch
http://reviews.llvm.org/D14752.
To avoid needing to parse the module level metadata during function
importing, a new module-level record is added which holds the
number of module-level metadata values. This is required because
metadata value ids are assigned implicitly during parsing, and the
function-level metadata ids start after the module-level metadata ids.
I made a change to this version of the code compared to D14752
in order to add more consistent and thorough assertion checking of the
new record value. We now unconditionally use the record value to
initialize the MDValueList size, and handle it the same in parseMetadata
for all module level metadata cases (lazy loading or not).
Reviewers: dexonsmith, joker.eph
Subscribers: davidxl, llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D14825
llvm-svn: 253668
It caused link errors of the form:
InstrProfiling.c:(.text.__llvm_profile_instrument_target+0x1c0): undefined reference to `__sync_fetch_and_add_8'
We had a network outage at the time of the commit so the first build to show a
problem is http://lab.llvm.org:8011/builders/clang-cmake-mips/builds/10827
llvm-svn: 253656
This adds a new API, LTOCodeGenerator::setFileType, to choose the output file
format for LTO CodeGen. A corresponding change to use this new API from
llvm-lto and a test case is coming in a separate commit.
Differential Revision: http://reviews.llvm.org/D14554
llvm-svn: 253622
Fix avoids gratuitous warnings from gcc for "_MSC_VER" not being defined.
Differential Revision: http://reviews.llvm.org/D14598
Patch by Tony Kelman <tony@kelman.net>
llvm-svn: 253614
When dumping function samples or writing them out as text format, it
helps if the samples are emitted sorted by source location. The sorting
of the maps is a bit slow, so we only do it on demand.
llvm-svn: 253568
The LLVMContext was only used for Diagnostic. Pass a DiagnosticHandler
instead.
Differential Revision: http://reviews.llvm.org/D14794
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 253540
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
These intrinsics currently have an explicit alignment argument which is
required to be a constant integer. It represents the alignment of the
source and dest, and so must be the minimum of those.
This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments. The alignment
argument itself is removed.
There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe. For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.
For example, code which used to read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)
For out of tree owners, I was able to strip alignment from calls using sed by replacing:
(call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
$1i1 false)
and similarly for memmove and memcpy.
I then added back in alignment to test cases which needed it.
A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.
In IRBuilder itself, a new argument was added. Instead of calling:
CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)
There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool. This is to prevent isVolatile here from passing its default
parameter to the source alignment.
Note, changes in future can now be made to codegen. I didn't change anything here, but this
change should enable better memcpy code sequences.
Reviewed by Hal Finkel.
llvm-svn: 253511
Summary:
This change adds MathExtras helper functions for handling unsigned, saturating addition and multiplication. It also updates the instrumentation and sample profile merge implementations to use them.
Reviewers: dnovillo, bogner, davidxl
Subscribers: davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D14720
llvm-svn: 253497
This change introduces an instrumentation intrinsic instruction for
value profiling purposes, the lowering of the instrumentation intrinsic
and raw reader updates. The raw profile data files for llvm-profdata
testing are updated.
llvm-svn: 253484
Starting on an input stream that is not at offset 0 would trigger the
assert in WinCOFFObjectWriter.cpp:1065:
assert(getStream().tell() <= (*i)->Header.PointerToRawData &&
"Section::PointerToRawData is insane!");
llvm-svn: 253464
We use to have an odd difference among MapVector and SetVector. The map
used a DenseMop, but the set used a SmallSet, which in turn uses a
std::set.
I have changed SetVector to use a DenseSet. If you were depending on the
old behaviour you can pass an explicit set type or use SmallSetVector.
The common cases for needing to do it are:
* Optimizing for small sets.
* Sets for types not supported by DenseSet.
llvm-svn: 253439
Summary:
This change teaches LLVM's inliner to track and suitably adjust
deoptimization state (tracked via deoptimization operand bundles) as it
inlines through call sites. The operation is described in more detail
in the LangRef changes.
Reviewers: reames, majnemer, chandlerc, dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14552
llvm-svn: 253438
If a section is rw, it is irrelevant if the dynamic linker will write to
it or not.
It looks like llvm implemented this because gcc was doing it. It looks
like gcc implemented this in the hope that it would put all the
relocated items close together and speed up the dynamic linker.
There are two problem with this:
* It doesn't work. Both bfd and gold will map .data.rel to .data and
concatenate the input sections in the order they are seen.
* If we want a feature like that, it can be implemented directly in the
linker since it knowns where the dynamic relocations are.
llvm-svn: 253436
This commit is for a later patch that is depend on it. The sum of two
branch probabilities can be greater than 1 due to rounding. It is safer
to saturate the results of sum and subtraction.
llvm-svn: 253421
Summary:
This change adds MathExtras helper functions for handling unsigned, saturating addition and multiplication. It also updates the instrumentation and sample profile merge implementations to use them.
No functional changes.
Reviewers: dnovillo, bogner, davidxl
Subscribers: davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D14720
llvm-svn: 253412
While still allowing CodeGen/AsmPrinter in llvm to own them using a bump
ptr allocator. (might be nice to replace the pointers there with
something that at least automatically calls their dtors, if that's
necessary/useful, rather than having it done explicitly (I think a typed
BumpPtrAllocator already does this, or maybe a unique_ptr with a custom
deleter, etc))
llvm-svn: 253409
Verified that this was at least /an/ issue, if not the only one, by
initializing NumBuckets to 1 (previously it was uninitialized, so if
this change made a difference, which it did (causing a bunch of tests to
crash) it demonstrates use-of-uninitialized memory). Initializing then
removes the crashes.
Thanks Reid for the debugging assistance
llvm-svn: 253395
Summary:
Now that there is a one-to-one mapping from MachineFunction to
WinEHFuncInfo, we don't need to use a DenseMap to select the right
WinEHFuncInfo for the current funclet.
The main challenge here is that X86WinEHStatePass is an IR pass that
doesn't have access to the MachineFunction. I gave it its own
WinEHFuncInfo object that it uses to calculate state numbers, which it
then throws away. As long as nobody creates or removes EH pads between
this pass and SDAG construction, we will get the same state numbers.
The other thing X86WinEHStatePass does is to mark the EH registration
node. Instead of communicating which alloca was the registration through
WinEHFuncInfo, I added the llvm.x86.seh.ehregnode intrinsic. This
intrinsic generates no code and simply marks the alloca in use.
Reviewers: JCTremoulet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14668
llvm-svn: 253378
This patch removes the std::string& argument from a number of C++ LTO API calls
and instead makes them use the installed diagnostic handler. This would also
improve consistency of diagnostic handling infrastructure: if an LTO client used
lto_codegen_set_diagnostic_handler() to install a custom error handler, we do
not want some error messages to go through the custom error handler, and some
other error messages to go into sLastErrorString.
llvm-svn: 253367
Currently, if the assembler encounters an error after parsing (such as an
out-of-range fixup), it reports this as a fatal error, and so stops after the
first error. However, for most of these there is an obvious way to recover
after emitting the error, such as emitting the fixup with a value of zero. This
means that we can report on all of the errors in a file, not just the first
one. MCContext::reportError records the fact that an error was encountered, so
we won't actually emit an object file with the incorrect contents.
Differential Revision: http://reviews.llvm.org/D14717
llvm-svn: 253328
This adds reportError to MCContext, which can be used as an alternative to
reportFatalError when the assembler wants to try to continue processing the
rest of the file after the error is reported, so that all of the errors ina
file can be reported. It records the fact that an error was encountered, so we
can avoid emitting an object file if any errors occurred.
This patch doesn't add any uses of this function (a later patch will convert
most uses of reportFatalError to use it), but there is a small functional
change: we use the SourceManager to print the error message, even if we have a
null SMLoc. This means that we get a SourceManager-style message, with the file
and line information shown as <unknown>, rather than the "LLVM ERROR" style
used by report_fatal_error.
llvm-svn: 253327