Also compacted the checkpoints (variables) to one file (plus the index).
This reduces the binary model files to just the variables and their
index. The index is very small. The variables are serialized float
arrays. When updated through training, the changes are very likely
unlocalized, so there's very little value in them being anything else
than binary.
This was emitting the raw value for the reg class ID with a comment
for the actual class name. Switch to emitting the qualified enum name
instead, which obviates the need for the comment and also helps keep
the lit tests on the emitter output more stable.
The GlobalISelEmitter is stricter about matching timm instruction
outputs to timm inputs (although in an accidental sort of way that
doesn't hit a proper import failure error). Also, apparently no
intrinsic patterns were importing since the ID enum declaration was
missing.
This patch generalizes the APIs for writing re-entry blocks, trampolines and
stubs to allow their final linked address to differ from the address of
their initial working memory. This will allow these routines to be used with
JITLinkMemoryManagers, which will in turn allow for unification of code paths
for in-process and cross-process lazy JITing.
This will be used by upcoming patches that implement indirection utils
(reentry, reentry trampolines, and stubs) on top of
JITLinkMemoryManager to unify in-process and cross-process lazy
compilation support.
Summary:
This is an experimental ML-based native size estimator, necessary for
computing partial rewards during -Oz inliner policy training. Data
extraction for model training will be provided in a separate patch.
RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html
Reviewers: davidxl, jdoerfert
Subscribers: mgorny, hiraditya, mgrang, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82817
We have this generic transform in IR (instcombine),
but as shown in PR41098:
http://bugs.llvm.org/PR41098
...the pattern may emerge in codegen too.
x86 has a potential refinement/reversal opportunity here,
but that should come later or needs a target hook to
avoid the transform. Converting to bswap is the more
specific form, so we should use it if it is available.
Summary:
The test fails on AIX as it allows open() and read() on a directory. This patch adds `# UNSUPPORTED:
system-aix` to the test to prevent it from running on AIX.
Reviewers: sameerarora101, daltenty, ShuhongL, hubert.reinterpretcast, MaskRay, smeenai, alexshap
Reviewed By: sameerarora101, hubert.reinterpretcast, MaskRay, smeenai
Subscribers: MaskRay, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83579
Summary:
Add support for MASM STRUCT casting field accessors: (<TYPE> PTR <value>).<field>
Since these are operands, we add them to X86AsmParser. If/when we extend MASM support to other architectures (e.g., ARM), we will need similar changes there as well.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D83346
We should also pass down the LLVM_LIT_ARGS in runtime build mode,
so that the runtime tests can be well controlled as well.
We actually passed this down in clang/runtime/CMakeLists.txt
But not for calls from llvm/runtime/CMakeLists.txt.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D83565
Teaches the SLPVectorizer to use vectorized library functions for
non-intrinsic calls.
This already worked for intrinsics that have vectorized library
functions, thanks to D75878, but schedules with library functions with a
vector variant were being rejected early.
- assume that there are no load/store dependencies between lib
functions with a vector variant; this would otherwise prevent the
bundle from becoming "ready"
- check during legalization that the vector variant can be used
- fix-up where we previously assumed that a call would be an intrinsic
Differential Revision: https://reviews.llvm.org/D82550
This refines the test to use macros. It is needed for
a follow-up change that adds a functionality to
override more fields.
Also, it is just cleaner to test each key separately.
Differential revision: https://reviews.llvm.org/D83481
This carves out an exception for a pair of consecutive loads that are
reversed from the consecutive order of a pair of stores. All of the
existing profitability/legality checks for the memops remain between
the 2 altered hunks of code.
This should give us the same x86 base-case asm that gcc gets in
PR41098 and PR44895:
http://bugs.llvm.org/PR41098http://bugs.llvm.org/PR44895
I think we are missing a potential subsequent conversion to use "movbe"
if the target supports that. That might be similar to what AArch64
would use to get "rev16".
Differential Revision: https://reviews.llvm.org/D83567
This carves out an exception for a pair of consecutive loads that are
reversed from the consecutive order of a pair of stores. All of the
existing profitability/legality checks for the memops remain between
the 2 altered hunks of code.
This should give us the same x86 base-case asm that gcc gets in
PR41098 and PR44895:i
http://bugs.llvm.org/PR41098http://bugs.llvm.org/PR44895
I think we are missing a potential subsequent conversion to use "movbe"
if the target supports that. That might be similar to what AArch64
would use to get "rev16".
Differential Revision:
This refactors option -disable-mve-tail-predication to take different arguments
so that we have 1 option to control tail-predication rather than several
different ones.
This is also a prep step for D82953, in which we want to reject reductions
unless that is requested with this option.
Differential Revision: https://reviews.llvm.org/D83133
We have an issue currently: --dyn-relocations always prints the following
relocation header when dumping `DynPLTRelRegion`:
"Offset Info Type Symbol's Value Symbol's Name + Addend"
I.e. even for an empty object, --dyn-relocations still prints this.
It is a easy to fix bug, but we have no dedicated test case for this option.
(we have a dynamic-reloc-no-section-headers.test, which has a slightly different purpose).
This patch adds a test and fixes the behavior.
Differential revision: https://reviews.llvm.org/D83387
Summary:
Helper used when splitting load & store operations to calculate
the pointer + offset for the high half of the split
Reviewers: efriedma, sdesmalen, david-arm
Reviewed By: efriedma
Subscribers: tschuett, hiraditya, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83577
Check that input size matches size of destination reg class.
Attempt to extend input size when needed.
Differential Revision: https://reviews.llvm.org/D83384
We can try to replace select with a Phi not in its parent block alone,
but also in blocks of its arguments. We benefit from it when select's
argument is a Phi.
Differential Revision: https://reviews.llvm.org/D83284
Reviewed By: nikic
This patch adds support for constrained int/fp conversion between
signed/unsigned i32 and f32/f64.
Reviewed By: jhibbits
Differential Revision: https://reviews.llvm.org/D82747
This implements the default(firstprivate) clause as defined in OpenMP
Technical Report 8 (2.22.4).
Reviewed By: jdoerfert, ABataev
Differential Revision: https://reviews.llvm.org/D75591
Remove _COMPAT. Drop the ARCHNAME. Remove the non-COMPAT versions
that are no longer needed.
We now only use these macros in places where we need compatibility
with libgcc/compiler-rt. So we don't need to call out _COMPAT
specifically.
For model 6 CPUs, we have a fallback detection method based on
available features. That mechanism should be enough to detect
these early family 6 CPUs as they only differ in the features
used by the detection anyway.