We currently use an unsigned value for our CostTblEntry and TypeConversionCostTblEntry cost tables which is limiting depending on how the target wishes to handle various CostKinds etc.
For instance, targets might wish to store separate instruction count, latency or throughput values etc. On D46276 we have been investigating storing a code snippet to improve latency/throughput cost calculations.
There is a slight problem: template argument deduction struggled to match the now-templatized Costs[] tables in an ArrayRef constructor. I've added helper wrappers for CostTableLookup/ConvertCostTableLookup, which avoid having to update all existing calls with a template hint.
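A minimal sketch of the wrapper idea, not the actual LLVM code: the entry struct, the pointer/length lookup (standing in for the ArrayRef overload), and the example tables below are all simplified, hypothetical stand-ins. The array-reference wrapper lets both the cost type and the table size be deduced at the call site.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-in for the templatized table entry.
template <typename CostType> struct CostTblEntryT {
  int ISD;       // opcode identifier (simplified to int)
  int Type;      // value type identifier (simplified to int)
  CostType Cost; // arbitrary per-target cost payload
};
using CostTblEntry = CostTblEntryT<unsigned>;

// Core lookup over a pointer/length pair (stand-in for an ArrayRef overload).
template <typename CostType>
const CostTblEntryT<CostType> *
CostTableLookupImpl(const CostTblEntryT<CostType> *Tbl, std::size_t N, int ISD,
                    int Ty) {
  for (std::size_t I = 0; I != N; ++I)
    if (Tbl[I].ISD == ISD && Tbl[I].Type == Ty)
      return &Tbl[I];
  return nullptr; // no matching entry
}

// Helper wrapper: CostType and N are deduced from the C array, so existing
// call sites need no explicit template hint.
template <typename CostType, std::size_t N>
const CostTblEntryT<CostType> *
CostTableLookup(const CostTblEntryT<CostType> (&Tbl)[N], int ISD, int Ty) {
  return CostTableLookupImpl(Tbl, N, ISD, Ty);
}

// Hypothetical example tables: one with plain unsigned costs, one with a
// richer payload holding separate latency and throughput values.
struct LatencyThroughput {
  unsigned Latency;
  unsigned Throughput;
};
static const CostTblEntry ExampleCostTbl[] = {{1, 32, 2}, {2, 64, 5}};
static const CostTblEntryT<LatencyThroughput> ExampleLatencyTbl[] = {
    {1, 32, {3, 1}}};
```

A call such as `CostTableLookup(ExampleCostTbl, 2, 64)` then compiles unchanged for either table type.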
Differential Revision: https://reviews.llvm.org/D106351
This does ensure `InformationCache::getPotentiallyReachable` will not
crash/assert on instructions from different functions but simply return
that one is reachable, which is conservatively correct.
This patch introduces AAPointerInfo which tracks the uses of a pointer
and places them in "bins" based on their offset from the base and access
size.
As with other AAs, any pointer can be tracked but it is up to the user
to make sense of the results. The users in this patch are AAValueSimplify
and AAPotentialValues, which both utilize AAPointerInfo to determine the
value of a load. For now, this is restricted to loads of allocas and
internal globals. Through the use of AAPointerInfo and the "bins" we can
track struct members separately. The users also know that storing only
zeros (at unknown indices) will result in loading only 0 (from unknown
indices). Other than that, the users are flow and context insensitive
(for now).
To deal with the "bins" more easily, AAPointerInfo provides a
forallInterfearingAccesses that applies a callback on all accesses
that might interfere with a given load or store.
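The "bins" idea can be sketched roughly as follows. This is a simplified model, not the real AAPointerInfo: the class, the `Access` struct, and the method names are hypothetical, and interference is approximated as plain byte-range overlap.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a recorded memory access.
struct Access {
  int Id;       // stand-in for the accessing instruction
  bool IsWrite; // true for stores, false for loads
};

class PointerInfoSketch {
  // Bin key: (offset from the base pointer, access size in bytes).
  std::map<std::pair<int64_t, int64_t>, std::vector<Access>> Bins;

public:
  void addAccess(int64_t Offset, int64_t Size, Access A) {
    Bins[std::make_pair(Offset, Size)].push_back(A);
  }

  // Calls CB on every recorded access whose byte range overlaps
  // [Offset, Offset + Size). Stops early and returns false if CB does.
  bool forEachOverlappingAccess(
      int64_t Offset, int64_t Size,
      const std::function<bool(const Access &)> &CB) const {
    for (const auto &Bin : Bins) {
      int64_t BinOff = Bin.first.first, BinSize = Bin.first.second;
      bool Overlaps = BinOff < Offset + Size && Offset < BinOff + BinSize;
      if (!Overlaps)
        continue;
      for (const Access &A : Bin.second)
        if (!CB(A))
          return false;
    }
    return true;
  }
};

// Example: a struct of two 4-byte fields a and b. A 4-byte load of field b
// overlaps the store to b and the whole-struct write, but not the store to a.
std::vector<int> accessesInterferingWithFieldB() {
  PointerInfoSketch PI;
  PI.addAccess(0, 4, {1, true}); // store to field a
  PI.addAccess(4, 4, {2, true}); // store to field b
  PI.addAccess(0, 8, {3, true}); // write covering the whole struct
  std::vector<int> Ids;
  PI.forEachOverlappingAccess(4, 4, [&](const Access &A) -> bool {
    Ids.push_back(A.Id);
    return true;
  });
  return Ids;
}
```

Keying the bins by offset and size is what allows struct members to be tracked separately in this model.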
Differential Revision: https://reviews.llvm.org/D104432
As a first step to simplify loads we only handle `null` and `undef`
underlying objects, as well as objects that have the load as a single user.
Loads of those values can be replaced by the initializer, if any.
Proper reasoning is introduced in a follow-up patch.
Differential Revision: https://reviews.llvm.org/D103862
This reverts commit 4ae575b9997e0903d1c2ec01a43e3f3f2db5df16 and 9b965b37c75d626c01951184088314590e38d299.
There is a use-of-uninitialized-value bug in the `else` branch in ImportSection::addImport.
Debug info sections need R_WASM_FUNCTION_OFFSET_I32 relocs (with FK_Data_4 fixup
kinds) to refer to functions (instead of R_WASM_TABLE_INDEX as is used in data
sections). Usually this is done in a convoluted way, with unnamed temp data
symbols which target the start of the function, in which case
WasmObjectWriter::recordRelocation converts it to use the section symbol
instead. However in some cases the function can actually be undefined; in this
case the dwarf generator uses the function symbol (a named undefined function
symbol) instead. In that case the section-symbol transform doesn't work and we
need to generate the correct reloc type a different way. In this change
WebAssemblyWasmObjectWriter::getRelocType takes the fixup section type into
account to choose the correct reloc type.
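The selection described above might look roughly like this. It is only a sketch: the enums and the function are hypothetical simplifications, not the real WebAssemblyWasmObjectWriter interface, and the non-function fallback is an assumption added to make the example complete.

```cpp
// Hypothetical classification of the section that contains the fixup.
enum class SectionKind { Data, Debug };

// Hypothetical subset of relocation kinds for a 4-byte data fixup.
enum class RelocKind { TableIndexI32, FunctionOffsetI32, MemoryAddrI32 };

// Chooses the relocation for a 4-byte data fixup. For function targets the
// choice depends on the section holding the fixup: debug info sections need
// a function offset, data sections need a table index.
RelocKind relocForData4(bool TargetIsFunction, SectionKind FixupSection) {
  if (TargetIsFunction)
    return FixupSection == SectionKind::Debug ? RelocKind::FunctionOffsetI32
                                              : RelocKind::TableIndexI32;
  // Assumption for this sketch: everything else is a plain memory address.
  return RelocKind::MemoryAddrI32;
}
```

Because the decision is keyed on the fixup's section rather than on the symbol, it also covers the undefined-function case where no section symbol is available.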
Fixes PR50408
Differential Revision: https://reviews.llvm.org/D103557
We need the compiler generated variable to override the weak symbol of
the same name inside the profile runtime, but using LinkOnceODRLinkage
results in a weak symbol being emitted, in which case the symbol selected
by the linker depends on the order of inputs, which can be
fragile.
This change replaces the use of weak definition inside the runtime with
a weak alias. We place the compiler generated symbol inside a COMDAT
group so dead definitions can be garbage collected by the linker.
We also disable the use of runtime counter relocation on Darwin since
Mach-O doesn't support weak external references, but Darwin already uses
a different continuous mode that relies on overmapping so runtime counter
relocation isn't needed there.
Differential Revision: https://reviews.llvm.org/D105176
The patch does not depend on the availability of the library functions for
memcpy/memset as it operates on LLVM intrinsics. The optimizations are useful
on the targets that have these functions disabled (e.g. NVPTX & AMDGPU).
Differential Revision: https://reviews.llvm.org/D104801
This diff changes llvm-ifs to use unified IFS file format
and perform other renaming changes in preparation for the
merging between elfabi/ifs.
Differential Revision: https://reviews.llvm.org/D99810
This change implements unified text stub format and command line
interface proposed in the elfabi/ifs merge plan.
Differential Revision: https://reviews.llvm.org/D99399
Although this combine checks that there are no load folding barriers between
the loads that it's trying to merge, it was inserting the load at the
MIRBuilder's default insertion point, which is the G_OR use inst.
This was causing a miscompile in the test suite's
SingleSource/Regression/C/gcc-c-torture/execute/GCC-C-execute-bswap-2.
Differential Revision: https://reviews.llvm.org/D106251
Some template functions were missing '&&' in function arguments,
therefore these were always taken by value after template instantiation.
This patch adds the double ampersand to introduce proper perfect
forwarding.
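To illustrate the difference, here is a small self-contained sketch (the `Tracked`, `Sink`, and helper names are made up for this example and do not come from the patch). It counts copy and move constructions so the extra operation caused by the by-value parameter becomes observable.

```cpp
#include <utility>

// Records how many copy/move constructions led to this object.
struct Tracked {
  int Copies = 0;
  int Moves = 0;
  Tracked() = default;
  Tracked(const Tracked &O) : Copies(O.Copies + 1), Moves(O.Moves) {}
  Tracked(Tracked &&O) noexcept : Copies(O.Copies), Moves(O.Moves + 1) {}
};

// Final destination of the forwarded argument.
struct Sink {
  Tracked Value;
  explicit Sink(const Tracked &T) : Value(T) {}
  explicit Sink(Tracked &&T) : Value(std::move(T)) {}
};

struct Counts {
  int Copies;
  int Moves;
};

// Missing '&&': Arg is a by-value parameter, so the caller's argument is
// first copied or moved into Arg and only then moved into the sink.
template <typename T> Counts viaValue(T Arg) {
  Sink S{std::forward<T>(Arg)};
  return Counts{S.Value.Copies, S.Value.Moves};
}

// With '&&': Arg is a forwarding reference, so the argument reaches the
// sink with its original value category and no intermediate object.
template <typename T> Counts viaForwarding(T &&Arg) {
  Sink S{std::forward<T>(Arg)};
  return Counts{S.Value.Copies, S.Value.Moves};
}
```

For an lvalue argument, `viaValue` performs one copy plus one move, while `viaForwarding` performs a single copy. For an argument passed with `std::move`, the counts are two moves versus one.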
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D106148
As encountered on D106053, we need to be very explicit that the Assertion nodes don't hold true for a poison value (or for specific poisoned vector elements).
Differential Revision: https://reviews.llvm.org/D106257
This patch adds a new pass called LNICM which is a LoopNest version of LICM and a test case to show how LNICM works.
Basically, LNICM only hoists invariants out of the loop nest as a whole (not out of an individual loop) in order to keep or make the loop nest perfect. This enables later optimizations that require a perfect loop nest.
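The effect can be pictured at source level with the hand-written sketch below. It is an illustration of the description above under my reading of it, not output of the pass; the function names and the example computation are invented. `Bias` is invariant in the whole nest, whereas `RowScale` is invariant only in the inner loop.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<long>>;

// Original perfect loop nest: both expressions sit in the innermost body.
long sumOriginal(const Matrix &M, long A, long B) {
  long Sum = 0;
  for (std::size_t I = 0; I < M.size(); ++I)
    for (std::size_t J = 0; J < M[I].size(); ++J) {
      long Bias = A * B;                        // invariant in the whole nest
      long RowScale = static_cast<long>(I) * A; // invariant in inner loop only
      Sum += M[I][J] * RowScale + Bias;
    }
  return Sum;
}

// Per-loop hoisting: RowScale lands between the two loops, so the nest is
// no longer perfect.
long sumPerLoopHoisting(const Matrix &M, long A, long B) {
  long Sum = 0;
  long Bias = A * B;
  for (std::size_t I = 0; I < M.size(); ++I) {
    long RowScale = static_cast<long>(I) * A; // code between the loops
    for (std::size_t J = 0; J < M[I].size(); ++J)
      Sum += M[I][J] * RowScale + Bias;
  }
  return Sum;
}

// Loop-nest hoisting: only Bias leaves the nest; RowScale stays in the
// innermost body, so the nest remains perfect.
long sumLoopNestHoisting(const Matrix &M, long A, long B) {
  long Sum = 0;
  long Bias = A * B;
  for (std::size_t I = 0; I < M.size(); ++I)
    for (std::size_t J = 0; J < M[I].size(); ++J) {
      long RowScale = static_cast<long>(I) * A;
      Sum += M[I][J] * RowScale + Bias;
    }
  return Sum;
}
```

All three functions compute the same value; they differ only in whether any statement sits between the outer and inner loop headers.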
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D104180
Avoid a crash when using instruction referencing if x87 floating point
instructions are used. These instructions are significantly mutated when
they're rewritten from referring to registers, to referring to
floating-point-stack positions. As a result, their operands are re-ordered,
and (InstrRef) LiveDebugValues asserts when it sees a DBG_INSTR_REF
referring to a non-reg non-def register operand.
To fix this, drop the instruction numbers, and thus variable locations.
This patch adds a helper utility to do that.
Dropping the variable locations is sub-optimal, but DBG_VALUEs that refer to
$fp0 and similar registers are dropped on emission too. It seems we've
never done well at describing variables that live in x87 registers.
Differential Revision: https://reviews.llvm.org/D105657
The last use of isNON_TRUNCStore was removed on Oct 10, 2018 in commit
07acc992dc39edfccc5a4b773c3dcf8a5bf6d893.
isTRUNCStore seems to be unused for at least 10 years.
Adds support for MachO static initializers/deinitializers and eh-frame
registration via the ORC runtime.
This commit introduces cooperative support code into the ORC runtime and ORC
LLVM libraries (especially the MachOPlatform class) to support macho runtime
features for JIT'd code. This commit introduces support for static
initializers, static destructors (via cxa_atexit interposition), and eh-frame
registration. Near-future commits will add support for MachO native
thread-local variables, and language runtime registration (e.g. for Objective-C
and Swift).
The llvm-jitlink tool is updated to use the ORC runtime where available, and
regression tests for the new MachOPlatform support are added to compiler-rt.
Notable changes on the ORC runtime side:
1. The new macho_platform.h / macho_platform.cpp files contain the bulk of the
runtime-side support. This includes eh-frame registration; jit versions of
dlopen, dlsym, and dlclose; a cxa_atexit interpose to record static destructors,
and an '__orc_rt_macho_run_program' function that defines running a JIT'd MachO
program in terms of the jit- dlopen/dlsym/dlclose functions.
2. Replaces JITTargetAddress (and casting operations) with ExecutorAddress
(copied from LLVM) to improve type-safety of address management.
3. Adds serialization support for ExecutorAddress and unordered_map types to
the runtime-side Simple Packed Serialization code.
4. Adds orc-runtime regression tests to ensure that static initializers and
cxa-atexit interposes work as expected.
Notable changes on the LLVM side:
1. The MachOPlatform class is updated to:
1.1. Load the ORC runtime into the ExecutionSession.
1.2. Set up standard aliases for macho-specific runtime functions. E.g.
___cxa_atexit -> ___orc_rt_macho_cxa_atexit.
1.3. Install the MachOPlatformPlugin to scrape LinkGraphs for information
needed to support MachO features (e.g. eh-frames, mod-inits), and
communicate this information to the runtime.
1.4. Provide entry-points that the runtime can call to request initializers,
perform symbol lookup, and request deinitializers (the latter is
implemented as an empty placeholder as macho object deinits are rarely
used).
1.5. Create a MachO header object for each JITDylib (defining the __mh_header
and __dso_handle symbols).
2. The llvm-jitlink tool (and llvm-jitlink-executor) are updated to use the
runtime when available.
3. A `lookupInitSymbolsAsync` method is added to the Platform base class. This
can be used to issue an async lookup for initializer symbols. The existing
`lookupInitSymbols` method is retained (the GenericIRPlatform code is still
using it), but is deprecated and will be removed soon.
4. JIT-dispatch support code is added to ExecutorProcessControl.
The JIT-dispatch system allows handlers in the JIT process to be associated with
'tag' symbols in the executor, and allows the executor to make remote procedure
calls back to the JIT process (via __orc_rt_jit_dispatch) using those tags.
The primary use case is ORC runtime code that needs to call back to handlers in
orc::Platform subclasses. E.g. __orc_rt_macho_jit_dlopen calling back to
MachOPlatform::rt_getInitializers using __orc_rt_macho_get_initializers_tag.
(The system is generic however, and could be used by non-runtime code).
The new ExecutorProcessControl::JITDispatchInfo struct provides the address
(in the executor) of the jit-dispatch function and a jit-dispatch context
object, and implementations of the dispatch function are added to
SelfExecutorProcessControl and OrcRPCExecutorProcessControl.
5. OrcRPCTPCServer is updated to support JIT-dispatch calls over ORC-RPC.
6. Serialization support for StringMap is added to the LLVM-side Simple Packed
Serialization code.
7. A JITLink::allocateBuffer operation is introduced to allocate writable memory
attached to the graph. This is used by the MachO header synthesis code, and will
be generically useful for other clients who want to create new graph content
from scratch.
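The tag-based JIT-dispatch mechanism from item 4 can be modelled in a few lines. The sketch below is a single-process toy, not the ORC API: `DispatcherSketch`, its methods, and the two tag variables are hypothetical, and serialized argument/result buffers are represented as plain strings.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// A tag is identified purely by its address, like a symbol in the executor.
using TagAddr = const void *;
// Handlers take serialized argument bytes and return serialized result bytes.
using Handler = std::function<std::string(const std::string &)>;

class DispatcherSketch {
  std::unordered_map<TagAddr, Handler> Handlers;

public:
  // JIT-process side: associate a handler with a tag symbol.
  void associate(TagAddr Tag, Handler H) { Handlers[Tag] = std::move(H); }

  // Stand-in for the dispatch function called from the executor side.
  // Returns false if no handler is associated with the tag.
  bool dispatch(TagAddr Tag, const std::string &ArgBytes,
                std::string &ResultBytes) const {
    auto It = Handlers.find(Tag);
    if (It == Handlers.end())
      return false;
    ResultBytes = It->second(ArgBytes);
    return true;
  }
};

// Hypothetical tag symbols; only their addresses matter.
static char ExampleGetInitializersTag;
static char ExampleUnregisteredTag;
```

In the real system the call crosses a process boundary and the payloads use Simple Packed Serialization; here both sides live in one object to keep the example self-contained.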
This reverts commit 2a419a0b9957ebac9e11e4b43bc9fbe42a9207df.
The result of a shufflevector must not propagate poison from any element
other than the one noted in the shuffle mask.
The regressions outside of fptoui-may-overflow.ll can probably be
recovered some other way; for example, using isGuaranteedNotToBePoison.
See discussion on https://reviews.llvm.org/D106053 for more background.
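The shufflevector rule stated above can be modelled per lane. This is a toy model with invented names, assuming every mask index is in range for the concatenation of the two inputs; it is not LLVM code.

```cpp
#include <cstddef>
#include <vector>

// One vector element; IsPoison models a poison lane.
struct Lane {
  bool IsPoison;
  int Value;
};

// Each result lane is taken from exactly the input lane named by the mask,
// so poison in any other input lane does not reach that result lane.
// Mask indices address the concatenation of A and B.
std::vector<Lane> shuffleSketch(const std::vector<Lane> &A,
                                const std::vector<Lane> &B,
                                const std::vector<int> &Mask) {
  std::vector<Lane> Result;
  for (int Idx : Mask) {
    std::size_t I = static_cast<std::size_t>(Idx);
    Result.push_back(I < A.size() ? A[I] : B[I - A.size()]);
  }
  return Result;
}
```

With a poison element in lane 1 of the first input, a mask that selects only lanes 0 and 2 yields a fully defined result, whereas a mask that selects lane 1 makes only that one result lane poison.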
Differential Revision: https://reviews.llvm.org/D106222
This patch relands https://reviews.llvm.org/D104799, but fixes the
memory handling causing leak sanitizer failures.
This reverts commit a56fe117e04f7d4b953a4226af412dad59425fb5.
At most these use the StringRef/Twine wrappers and don't have any implicit uses of std::string.
Move the include down to any cpp implementation where std::string is actually used.
This API is incompatible with opaque pointers and deprecated in
favor of the version that accepts an explicit element type.
Also remove the separate overload for a single index, as this is
already covered by the ArrayRef overload.
At most these use the StringRef/Twine wrappers and don't have any implicit uses of std::string.
Move the include down to any cpp implementation where std::string is actually used.
Use the elementtype attribute introduced in D105407 for the
llvm.preserve.array/struct.index intrinsics. It carries the
element type of the GEP these intrinsics effectively encode.
This patch:
* Adds a verifier check that the attribute is required.
* Adds it in the IRBuilder methods for these intrinsics.
* Autoupgrades old bitcode without the attribute.
* Updates the lowering code to use the attribute rather than
the pointer element type.
* Updates lots of tests to specify the attribute.
* Adds -force-opaque-pointers to the intrinsic-array.ll test
to demonstrate they work now.
Differential Revision: https://reviews.llvm.org/D106184