It is possible to merge reuse and reorder shuffles and reduce the total
cost of the vectorization tree/number of final instructions.
Differential Revision: https://reviews.llvm.org/D94992
This prepares codegen for a change that will remove the identical
folds from IR because they are not poison-safe. See
D93065 / D97360
for details.
We already generically support scalar types, and there are various
target-specific transforms that overlap the vector folds. For example,
x86 recognizes the and patterns, but not or. We can end up with 1
extra instruction there, but I think that is still preferred over the
blendv alternative that loads a constant vector.
If this is not optimal, then it should be fixed with a later transform
(this change is not expected to result in any regressions because
InstCombine currently does the same thing).
Removing custom code and supporting undefs in constant-pattern-matching
can be follow-up changes.
Differential Revision: https://reviews.llvm.org/D97730
lli aims to provide both, RuntimeDyld and JITLink, as the dynamic linkers/loaders for it's JIT implementations. And they both offer debugging via the GDB JIT interface, which builds on the two well-known symbol names `__jit_debug_descriptor` and `__jit_debug_register_code`. As these symbols must be unique accross the linked executable, we can only define them in one of the libraries and make the other depend on it. OrcTargetProcess is a minimal stub for embedding a JIT client in remote executors. For the moment it seems reasonable to have the definition there and let ExecutionEngine depend on it, until we find a better solution.
This is the second commit for the reviewed patch.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97339
Add a new ObjectLinkingLayer plugin `DebugObjectManagerPlugin` and infrastructure to handle creation of `DebugObject`s as well as their registration in OrcTargetProcess. The current implementation only covers ELF on x86-64, but the infrastructure is not limited to that.
The journey starts with a new `LinkGraph` / `JITLinkContext` pair being created for a `MaterializationResponsibility` in ORC's `ObjectLinkingLayer`. It sends a `notifyMaterializing()` notification, which is forwarded to all registered plugins. The `DebugObjectManagerPlugin` aims to create a `DebugObject` form the provided target triple and object buffer. (Future implementations might create `DebugObject`s from a `LinkGraph` in other ways.) On success it will track it as the pending `DebugObject` for the `MaterializationResponsibility`.
This patch only implements the `ELFDebugObject` for `x86-64` targets. It follows the RuntimeDyld approach for debug object setup: it captures a copy of the input object, parses all section headers and prepares to patch their load-address fields with their final addresses in target memory. It instructs the plugin to report the section load-addresses once they are available. The plugin overrides `modifyPassConfig()` and installs a JITLink post-allocation pass to capture them.
Once JITLink emitted the finalized executable, the plugin emits and registers the `DebugObject`. For emission it requests a new `JITLinkMemoryManager::Allocation` with a single read-only segment, copies the object with patched section load-addresses over to working memory and triggers finalization to target memory. For registration, it notifies the `DebugObjectRegistrar` provided in the constructor and stores the previously pending`DebugObject` as registered for the corresponding MaterializationResponsibility.
The `DebugObjectRegistrar` registers the `DebugObject` with the target process. `llvm-jitlink` uses the `TPCDebugObjectRegistrar`, which calls `llvm_orc_registerJITLoaderGDBWrapper()` in the target process via `TargetProcessControl` to emit a `jit_code_entry` compatible with the GDB JIT interface [1]. So far the implementation only supports registration and no removal. It appears to me that it wouldn't raise any new design questions, so I left this as an addition for the near future.
[1] https://sourceware.org/gdb/current/onlinedocs/gdb/JIT-Interface.html
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97335
The argument value determines the dynamic linker to use (`default`, `rtdyld` or `jitlink`). The JITLink implementation only supports in-process JITing for now. This is the first commit for the reviewed patch.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97339
AsmParser may have no LLVMContext attached to it, which means after
5de2d189e6ad466a1f0616195e8c524a4eb3cbc0 everything goes to stderr.
Restore the old behavior.
The default expansion of CONCAT_VECTORS goes through the stack. This
patch avoids that penalty by custom-lowering CONCAT_VECTORS to a series
of INSERT_SUBVECTOR nodes. Futher optimizations are possible, but this
is a good start.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D97692
When lli runs the below IR, it emits in-memory debug objects and registers them with the GDB JIT interface. The tests dump and check the registered information. IR has limited ability to produce complex output in a portable way. Instead the tests rely on built-in functions implemented in lli. They use a new command line flag `-generate=function-name` to instruct the ORC JIT to expose the built-in function with the given name to the JITed program.
`debug-descriptor-elf-minimal.ll` calls `__dump_jit_debug_descriptor()` to reflect the list of debug entries issued for itself after emitting the main module. The output is textual and can be checked straight away.
`debug-objects-elf-minimal.ll` calls `__dump_jit_debug_objects()`, which instructs lli to walk through the list of debug entries and append the encountered in-memory objects to the program output. We feed this output into llvm-dwarfdump to parse the DWARF in each file and dump their structures.
We can do the same for JITLink once D97335 has landed.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97694
-amdgpu-inline-max-bb option could lead to a suboptimal
codegen preventing inlining of really simple functions
including pure wrapper calls. Relax the cutoff by allowing
to call a function with a single block on the grounds
that it will not increase total number of blocks after
inlining.
Differential Revision: https://reviews.llvm.org/D97744
Currently ARM backend validates the range of branch targets before the
layout of fragments is finalized. This causes build failure if symbolic
expressions are used, with the exception of a single symbolic value.
For example, "b.w ." works but "b.w . + 2" currently fails to
assemble. This fixes the issue by delaying this check (in
ARMAsmParser::validateInstruction) of b.w instructions until the symbol
expressions are resolved (in ARMAsmBackend::adjustFixupValue).
Link:
https://github.com/ClangBuiltLinux/linux/issues/1286
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D97568
The situation with inline asm/MC error reporting is kind of messy at the
moment. The errors from MC layout are not reliably propagated and users
have to specify an inlineasm handler separately to get inlineasm
diagnose. The latter issue is not a correctness issue but could be improved.
* Kill LLVMContext inlineasm diagnose handler and migrate it to use
DiagnoseInfo/DiagnoseHandler.
* Introduce `DiagnoseInfoSrcMgr` to diagnose SourceMgr backed errors. This
covers use cases like inlineasm, MC, and any clients using SourceMgr.
* Move AsmPrinter::SrcMgrDiagInfo and its instance to MCContext. The next step
is to combine MCContext::SrcMgr and MCContext::InlineSrcMgr because in all
use cases, only one of them is used.
* If LLVMContext is available, let MCContext uses LLVMContext's diagnose
handler; if LLVMContext is not available, MCContext uses its own default
diagnose handler which just prints SMDiagnostic.
* Change a few clients(Clang, llc, lldb) to use the new way of reporting.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D97449
The current narrowing code for G_PHI can only handle the case
where the size is a multiple of the narrow size. If this is not
the case, fall back to SDAG instead of asserting.
Original patch by shepmaster.
Differential Revision: https://reviews.llvm.org/D92446
Generic code should probably not introduce G_INSERT/G_EXTRACT. The
mirror unpackRegs should also be removed, but AMDGPU still has a use
remaining which needs to be fixed.
Remove a rule which allows larger scalar types than the destination vector
element type.
This appears to be irrelevant now that we have G_BUILD_VECTOR_TRUNC. Plus,
making a G_BUILD_VECTOR which satisfies this introduces a verifier failure
anyway.
Differential Revision: https://reviews.llvm.org/D97727
def of the adrp before the ldr.
Apparently this pass used to have liveness analysis but it was removed for
scompile time reasons. This workaround prevents the LOH from being emitted
unless the ADD and LDR are adjacent.
Fixes https://github.com/JuliaLang/julia/issues/39820
Differential Revision: https://reviews.llvm.org/D97571
`__llvm_prf_vnodes` and `__llvm_prf_names` are used by runtime but not
referenced via relocation in the translation unit.
With `-z start-stop-gc` (D96914 https://sourceware.org/bugzilla/show_bug.cgi?id=27451),
the linker no longer lets `__start_/__stop_` references retain them.
Place `__llvm_prf_vnodes` and `__llvm_prf_names` in `llvm.used` to make
them retained by the linker.
This patch changes most existing `UsedVars` cases to `CompilerUsedVars`
to reflect the ideal state - if the binary format properly supports
section based GC (dead stripping), `llvm.compiler.used` should be sufficient.
`__llvm_prf_vnodes` and `__llvm_prf_names` are switched to `UsedVars`
since we want them to be unconditionally retained by both compiler and linker.
Behaviors on other COFF/Mach-O are not affected.
Differential Revision: https://reviews.llvm.org/D97649
- This patch adds in the distinction between jg[*] and jl[*] pc-relative
mnemonics based on the variant/dialect.
- Under the hlasm variant, we use the jl[*] family of mnemonics and under
the att (GNU as) variant, we use the jg[*] family of mnemonics.
- jgnop which was added in https://reviews.llvm.org/D92185, is now restricted
to att variant. jlnop is introduced and restricted to hlasm variant.
- The br[*]l additional mnemonics are mapped to either jl[*]/jg[*] based on
the variant.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D97581
defusechain_iterator has an operator*() and operator->() that return
references to a MachineOperand, but its "reference" and "pointer"
typedefs are set as if the iterator returns a MachineInstr reference.
This causes compilation errors when defusechain_iterator is used in
generic code that uses the "reference" and "pointer" typedefs.
This patch fixes this by updating the typedefs to use MachineOperand
instead of MachineInstr.
Reviewed By: mkitzan
Differential Revision: https://reviews.llvm.org/D97522