This partially reverts 838490de7ed, which broke some Solaris bots. Apparently
Solaris defines int8_t as char rather than signed char, which made the
SerializationTypeName<char> specialization a redefinition.
This partial revert isolates use of uint8_t buffers to ORC-RPC handling of
wrapper functions only. The TargetProcessControl::runWrapper method will
continue to use char buffers.
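The redefinition described above can be illustrated with a minimal sketch. `TypeName` is a hypothetical stand-in for a per-type name trait, not the actual ORC `SerializationTypeName`; the sketch only shows why specializing on `int8_t` is fragile when the platform defines it as plain `char`, not how this commit fixes it.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical per-type name trait (illustration only, not the ORC template).
template <typename T> struct TypeName;

template <> struct TypeName<char> {
  static std::string get() { return "char"; }
};

// Specializing on `signed char` is always distinct from `char`. Specializing
// on `std::int8_t` instead would redefine TypeName<char> on any platform where
// int8_t is an alias for plain char.
template <> struct TypeName<signed char> {
  static std::string get() { return "signed char"; }
};

// True on platforms where int8_t is plain char (the case described above).
constexpr bool Int8IsPlainChar = std::is_same<std::int8_t, char>::value;

// Resolves to whichever specialization int8_t aliases on this platform.
inline std::string nameOfInt8() { return TypeName<std::int8_t>::get(); }
```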
Provides ObjectTransformLayer APIs, a getter to access the
ObjectTransformLayer member of LLJIT, and the DumpObjects utility
to make construction of a dump-to-disk transform easy.
An example showing how the new APIs can be used has been added in
llvm/examples/OrcV2Examples/OrcV2CBindingsDumpObjects.
This patch handles one particular case of one-iteration loops for which SCEV
cannot straightforwardly prove BECount = 1. The idea of the optimization is to
symbolically execute conditional branches on the first iteration, moving in topological
order and visiting only blocks that may be reached on the first iteration. If we find
that the header is never reached via the latch, the backedge can be broken.
This implementation uses InstSimplify. A SCEV-based version was rejected due to its
high compile-time impact.
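The idea can be sketched on a toy CFG. This is not the pass's implementation: `Block`, `Cond`, and `backedgeTakenOnFirstIteration` are hypothetical names, the per-branch first-iteration values are assumed to be given (the real patch derives them with InstSimplify), and blocks are assumed to be listed in topological order with block 0 as the header.

```cpp
#include <cstddef>
#include <vector>

// Value of a branch condition on the first iteration, if it could be computed.
enum class Cond { True, False, Unknown };

// Hypothetical toy block: two successors by index, -1 means "leaves the loop".
// An edge to block 0 (the header) is the backedge.
struct Block {
  int TrueSucc;
  int FalseSucc;
  Cond FirstIter;
};

// Walks the blocks in topological order, visiting only blocks that may be
// reached on the first iteration, and reports whether the backedge may be taken.
inline bool backedgeTakenOnFirstIteration(const std::vector<Block> &Blocks) {
  if (Blocks.empty())
    return false;
  std::vector<bool> Live(Blocks.size(), false);
  Live[0] = true; // the header is always entered
  bool Backedge = false;
  auto Visit = [&](int Succ) {
    if (Succ < 0)
      return; // loop exit
    if (Succ == 0) {
      Backedge = true; // header reached via the latch
      return;
    }
    Live[static_cast<std::size_t>(Succ)] = true;
  };
  for (std::size_t I = 0; I < Blocks.size(); ++I) {
    if (!Live[I])
      continue; // not reachable on the first iteration
    const Block &B = Blocks[I];
    if (B.FirstIter != Cond::False)
      Visit(B.TrueSucc);
    if (B.FirstIter != Cond::True)
      Visit(B.FalseSucc);
  }
  return Backedge;
}
```

If the function returns false, every path through the first iteration leaves the loop, so the backedge is dead.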
Differential Revision: https://reviews.llvm.org/D102615
Reviewed By: nikic
InstCombine didn't perform (sext bool X) * (sext bool X) --> zext (and X, X) which can result in just (zext X). The patch adds regression tests to check this transformation and adds a check for equality of mul's operands for that case.
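The identity behind this fold can be checked exhaustively over the two boolean values. The helper names below are illustrative only and model the i1 extensions with 32-bit integers.

```cpp
#include <cstdint>

// Sign extension of an i1: true becomes -1 (all ones), false becomes 0.
inline std::int32_t sextBool(bool X) { return X ? -1 : 0; }

// Zero extension of an i1: true becomes 1, false becomes 0.
inline std::int32_t zextBool(bool X) { return X ? 1 : 0; }

// (sext X) * (sext X) == zext (and X, X) == zext X for a given X.
inline bool foldHolds(bool X) {
  return sextBool(X) * sextBool(X) == zextBool(X && X) &&
         zextBool(X && X) == zextBool(X);
}
```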
Differential Revision: https://reviews.llvm.org/D104193
This patch adds an optional PriorityInlineOrder, which uses a heap to order inlining.
Call sites with a smaller size get higher priority.
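A heap-based ordering of this kind might look like the following sketch. `CallSiteEntry` and `SizePriorityOrder` are hypothetical names used for illustration; the actual PriorityInlineOrder interface and its notion of size are not shown in this message.

```cpp
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical call site record: a label plus the size used as the priority key.
struct CallSiteEntry {
  std::string Name;
  std::size_t Size;
};

// std::priority_queue is a max-heap with respect to the comparator, so ordering
// by "greater size" puts the smallest call site on top.
struct LargerSizeLast {
  bool operator()(const CallSiteEntry &A, const CallSiteEntry &B) const {
    return A.Size > B.Size;
  }
};

class SizePriorityOrder {
  std::priority_queue<CallSiteEntry, std::vector<CallSiteEntry>, LargerSizeLast>
      Heap;

public:
  void push(CallSiteEntry E) { Heap.push(std::move(E)); }
  bool empty() const { return Heap.empty(); }
  // Returns and removes the smallest remaining call site.
  CallSiteEntry pop() {
    CallSiteEntry E = Heap.top();
    Heap.pop();
    return E;
  }
};
```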
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D104028
This does not look like a practical pattern in our codebase (it could fail
in some sandbox environments).
Instead we print it to standard output, controlled by
-attributor-print-call-graph; this follows a pattern similar to attributor-print-dep.
Allow mangled names to include an arbitrary dot suffix, akin to the
vendor-specific suffix in Itanium mangling.
The primary motivation is support for symbols renamed during ThinLTO
import / promotion (ThinLTO is the default configuration for optimized
builds in rustc).
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D104358
This reverts commit 76d0747e0807307780ba84cbd7e5c80b20c26bd7.
If a group has `__llvm_prf_vals` due to static value profiler counters
(`NS!=0`), we cannot make `__llvm_prf_data` private, because a prevailing text
section may reference `__llvm_prf_data` and will cause a `relocation refers to a
discarded section` linker error.
Note: while a `__profc_` group is non-prevailing, it may be referenced by a
prevailing text section due to inlining.
```
group section [ 66] `.group' [__profc__ZN5clang20EmitClangDeclContextERN4llvm12RecordKeeperERNS0_11raw_ostreamE] contains 4 sections:
[Index] Name
[ 67] __llvm_prf_cnts
[ 68] __llvm_prf_vals
[ 69] __llvm_prf_data
[ 70] .rela__llvm_prf_data
```
This should fix PR50683. The wrong assumption was that we
could always know what the callee is when we replace a call site
argument with undef. We wanted to know that to remove the `noundef`
that might be attached to the argument. Since no callee means we
did the propagation on the caller site, there is no need to remove
an attribute. It is only needed if we replace all uses and therefore
pass `undef` instead of the value that was passed in otherwise.
Users might want to run initialize for a set of AAs without an
intermediate update step. Running update eagerly is not a requirement
anyway so we make it optional.
To allow outside AAs that simplify values we need to ensure all value
simplification goes through the Attributor, not AAValueSimplify (or any
of the other AAs we have already like AAPotentialValues). This patch
also introduces an interface for the outside AAs to register
simplification callbacks for an IRPosition. To make this work as
expected we have to pass IRPositions instead of Values in
AAValueSimplify, which makes sense by itself.
If we simplify values we sometimes end up with type mismatches. If the
value is a constant we can often cast it though to still allow
propagation. The logic is now put into a helper and it replaces some
ad hoc things we did before.
This also introduces the AA namespace for abstract attribute related
functions and types.
Differential Revision: https://reviews.llvm.org/D103856
If the target stack is not accessible between different running
"threads" we have to make sure not to create allocas for mallocs
that might be used by multiple "threads". The "use check" is
sufficient to prevent this but if we apply the "free check" we have
to make sure the pointer is not communicated to others before
the free is reached.
Differential Revision: https://reviews.llvm.org/D98608
The initial use for AAExecutionDomain was to determine if a single
thread executes a block. While this is sometimes informative, most
of the time, and for other reasons, we actually want to know if it
is the "initial thread", that is, the thread that started execution on
the current device. The deduction needs to be adjusted in a follow-up,
as the methods we use right now look for the OpenMP thread
id, which is reset whenever a thread enters a parallel region. What
we basically want is to look for `llvm.nvvm.read.ptx.sreg.ntid.x` and
equivalent functions.
We invalidated AAReachabilityImpl directly which is not helpful and
confusing as we still used it regardless. We now avoid invalidating it
(not needed anyway) and add checks for the state. This has by itself no
actual effect but prepares for later extensions.
The current naming scheme adds the `dfs$` prefix to all
DFSan-instrumented functions. This breaks mangling and prevents stack
trace printers and other tools from automatically demangling function
names.
This new naming scheme is mangling-compatible, with the `.dfsan`
suffix being a vendor-specific suffix:
https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-structure
With this fix, demangling utils would work out-of-the-box.
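A minimal sketch of why the suffix form is friendlier to tools than the prefix form. The helpers are hypothetical and rely on the assumption that the core of an Itanium mangled name contains no `.`, so everything from the first dot onward can be treated as the vendor suffix.

```cpp
#include <string>
#include <utility>

// Splits Name at the first '.', returning {core, suffix}. The suffix keeps its
// leading dot and is empty when Name contains no dot.
inline std::pair<std::string, std::string>
splitDotSuffix(const std::string &Name) {
  std::string::size_type Dot = Name.find('.');
  if (Dot == std::string::npos)
    return {Name, ""};
  return {Name.substr(0, Dot), Name.substr(Dot)};
}

// A tool can recognize an Itanium mangled name by its "_Z" prefix, which a
// leading "dfs$" hides but a trailing ".dfsan" leaves intact.
inline bool looksItaniumMangled(const std::string &Name) {
  return Name.compare(0, 2, "_Z") == 0;
}
```

With the suffix scheme, `_Z3foov.dfsan` still starts with `_Z`, and the core `_Z3foov` can be handed to a demangler unchanged.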
Reviewed By: stephan.yichao.zhao
Differential Revision: https://reviews.llvm.org/D104494
The patch https://reviews.llvm.org/D101469 is intended to enable loop unrolling,
not interleaved access vectorization. The method bool enableInterleavedAccessVectorization()
should not be implemented.
The instruction can be 16-bit aligned while targeting 32-bit aligned
code. To calculate the target address correctly, the address of the
instruction has to be adjusted.
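The adjustment can be sketched as rounding the instruction address down to the 32-bit boundary before adding the offset. The function names are hypothetical, and the sketch shows only the alignment step; any additional bias the real target applies to the instruction address is not modeled here.

```cpp
#include <cassert>
#include <cstdint>

// Rounds Addr down to a multiple of Align, which must be a power of two.
inline std::uint64_t alignDown(std::uint64_t Addr, std::uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "Align must be a power of two");
  return Addr & ~(Align - 1);
}

// Computes the target from an instruction that may sit on a 2-byte boundary
// while the offset is relative to a 4-byte aligned base.
inline std::uint64_t targetAddress(std::uint64_t InstrAddr, std::int64_t Offset) {
  return alignDown(InstrAddr, 4) + static_cast<std::uint64_t>(Offset);
}
```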
Differential Revision: https://reviews.llvm.org/D104446
This patch is to address https://bugs.llvm.org/show_bug.cgi?id=48857.
Previous attempts can be found in D104007 and D101980.
A lot of discussions can be found in those two patches.
To summarize the bug:
When Clang emits IR for coroutines, the first thing it does is to make a copy of every argument to the local stack, so that uses of the arguments in the function will all refer to the local copies instead of the arguments directly.
However, in some cases we find that arguments are still directly used:
When Clang emits IR for a function that has pass-by-value arguments, sometimes it emits an argument with the byval attribute. A byval argument is considered to be local to the function (just like an alloca), and hence it can be easily determined that it does not alias other values. If the IR contains a memcpy from a byval argument to a local alloca, and then from that local alloca to another alloca, MemCpyOpt will optimize out the first memcpy because the byval argument's content will not change. This causes issues because after a coroutine suspension, the byval argument may die outside of the function, and later uses will lead to a use-after-free.
This is only a problem for arguments with either byval attribute or noalias attribute, because only these two kinds are considered local. Arguments without these two attributes will be considered to alias coro_suspend and hence we won't have this problem. So we need to be able to deal with these two attributes in coroutines properly.
For noalias arguments, since coro_suspend may potentially change the value of any argument outside of the function, we simply shouldn't mark any argument in a coroutine as noalias. This can be taken care of in the CoroEarly pass.
For byval arguments, if such an argument needs to live across suspensions, we have to copy its contents to the frame, not just the pointer.
Differential Revision: https://reviews.llvm.org/D104184
This attribute computes the optimistic live call edges using the attributor
liveness information. This attribute will be used for deriving an
inter-procedural function reachability attribute.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D104059
For ELF, all counters/data are in a section group (either `comdat any` or
`comdat noduplicates`), and the signature for `comdat any` is `__profc_`, so the
D1003372 optimization prerequisite (linker GC cannot discard data variables
while the text section is retained) is always satisfied. We can therefore make
`__profd_` unconditionally private.
Reviewed By: davidxl, rnk
Differential Revision: https://reviews.llvm.org/D103717
This pass emits a floating point compare and a conditional branch,
but if strictfp is enabled we don't emit a constrained compare
intrinsic.
The backend also won't expand the readonly sqrt call this pass inserts
to a sqrt instruction under strictfp. So we end up with 2 libcalls, as
seen here: https://godbolt.org/z/oax5zMEWd
Fix these things by disabling the pass.
Differential Revision: https://reviews.llvm.org/D104479
These other platforms are unsupported and untested.
They could be re-added later based on MSan code.
Reviewed By: gbalats, stephan.yichao.zhao
Differential Revision: https://reviews.llvm.org/D104481
The old version of this code would blindly perform arithmetic without
paying attention to whether the types involved were pointers or
integers. This could lead to weird expressions like negating a pointer.
Explicitly handle simple cases involving pointers, like "x < y ? x : y".
In all other cases, coerce the operands of the comparison to integer
types. This avoids the weird cases, while handling most of the
interesting cases.
Differential Revision: https://reviews.llvm.org/D103660
The target specific expression handling was slightly regressed by
bbea64250f65480d787e1c5ff45c4de3ec2dcda8. This restores the proper
sub-expression evaluation to allow for constant folding within the
expression. We explicitly discard the layout and assembler when
evaluating the expression to avoid any symbolic computation, and instead
use `evaluateAsRelocatable` to canonicalise and constant fold
only.
We can also simplify the expression handling - none of the target
variants support symbolic difference. This simplifies the logic for
that and adds additional tests to ensure that we do not accidentally
regress here in the future.
Reviewed By: maskray
Differential Revision: https://reviews.llvm.org/D104473
This fixes a GISEL vs SDAG regression that showed up at -Os in 256.bzip2.
In `_getAndMoveToFrontDecode`:
gisel:
```
and w9, w0, #0xff
orr w9, w9, w8, lsl #8
```
sdag:
```
bfi w0, w8, #8, #24
```
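The equivalence of the two sequences can be checked with a small model of the bit operations. The function names are illustrative; `bfi` models a generic bitfield insert with the same operand order as the instruction above (destination, source, lsb, width).

```cpp
#include <cstdint>

// Models the GISel sequence: keep the low byte of W0, OR in W8 shifted left by 8.
inline std::uint32_t andOrr(std::uint32_t W0, std::uint32_t W8) {
  return (W0 & 0xffu) | (W8 << 8);
}

// Models a bitfield insert: copies the low Width bits of Src into Dst starting
// at bit Lsb and leaves the other bits of Dst unchanged.
inline std::uint32_t bfi(std::uint32_t Dst, std::uint32_t Src, unsigned Lsb,
                         unsigned Width) {
  std::uint32_t LowMask = Width >= 32 ? ~0u : ((1u << Width) - 1u);
  std::uint32_t FieldMask = LowMask << Lsb;
  return (Dst & ~FieldMask) | ((Src << Lsb) & FieldMask);
}
```

For lsb 8 and width 24 the field covers bits 8 through 31, so only the low byte of the destination survives, which is exactly what the `and`/`orr` pair computes.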
Differential revision: https://reviews.llvm.org/D103291
Fold all exits based on known trip count/multiple information from
SCEV. Previously only the latch exit or the single exit were folded.
This doesn't yet eliminate ULO.TripCount and ULO.TripMultiple
entirely: They're still used to a) decide whether runtime unrolling
should be performed and b) for ORE remarks. However, the core
unrolling logic is independent of them now.
Differential Revision: https://reviews.llvm.org/D104203
This patch will allow developers to remove unwanted instruction Defs (most likely from within a target specific InstrPostProcess) by setting that Def's RegisterID to 0.
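The convention can be sketched as follows. `Def`, `markRemoved`, and `liveDefs` are hypothetical names for illustration and are not the llvm-mca types; the sketch only shows a RegisterID of 0 acting as a "removed" marker that later processing skips.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a register definition record.
struct Def {
  unsigned RegisterID;
};

// Marks a definition as unwanted by zeroing its register ID.
inline void markRemoved(Def &D) { D.RegisterID = 0; }

// Returns only the definitions that have not been marked as removed.
inline std::vector<Def> liveDefs(const std::vector<Def> &Defs) {
  std::vector<Def> Result;
  std::copy_if(Defs.begin(), Defs.end(), std::back_inserter(Result),
               [](const Def &D) { return D.RegisterID != 0; });
  return Result;
}
```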
Differential Revision: https://reviews.llvm.org/D104433
This really isn't talking about vectors in general,
but only about either fixed or scalable vectors,
and it's pretty confusing to see it state
that there aren't any vectors :)