The swap of the operands can affect later transforms that
are expecting a constant as operand 1. I don't think we
can trigger a bug with the current code, but I hit that
problem while drafting a new transform for min/max intrinsics.
Clang spends a decent amount of time in the LineOffsetMapping::get(...)
function. This function used to be vectorized (through SSE2) then the
optimization got dropped because the sequential version was on-par performance
wise.
This provides an optimization of the sequential version that works on a word at
a time, using (documented) bithacks to provide a portable vectorization.
When preprocessing the sqlite amalgamation, this yields a sweet 3% speedup.
Differential Revision: https://reviews.llvm.org/D99409
I do not see any bit-width restriction from the point of the
LLVM Lang Ref - Operand Bundles on the types of the deopt bundle
operands. Statepoint Lowering seems to be able to work with any
types.
This patch relaxes the two related assertions and adds a new test
for this change.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D100006
This reverts commit a547b4e26b311e417cd51100e379693f51a3f448,
relanding commit 31d219d2997fed1b7dc97e0adf170d5aaf65883e,
which was reverted because there was a conflicting inverse transform,
which was causing an endless combine loop, which has now been adjusted.
Original commit message:
https://alive2.llvm.org/ce/z/67w-wQ
We prefer `add`s over `sub`, and this particular xform
allows further folds to happen:
Fixes https://bugs.llvm.org/show_bug.cgi?id=49858
I.e., if any/all of the consants is an expression, don't do it.
Since those constants won't reduce into an immediate,
but would be left as an constant expression, they could cause
endless combine loops after 31d219d2997fed1b7dc97e0adf170d5aaf65883e
added an inverse transformation.
LLVM test CodeGen/PowerPC/ppc-disable-non-volatile-cr.ll tries to check
for the absence of a sequence of instructions with several CHECK-NOT
with one of those directives using a variable defined in another.
However CHECK-NOT are checked independently so that is using a variable
defined in a pattern that should not occur in the input.
This commit changes occurence of the variable for the regex used in its
definition, thereby making each CHECK-NOT independent.
Reviewed By: NeHuang, nemanjai
Differential Revision: https://reviews.llvm.org/D99880
LLVM test Transforms/Coroutine/coro-split-sink-lifetime-O2.ll tries to
check for the absence of a sequence of instructions with several
CHECK-NOT with one of those directives using a variable defined in
another. However CHECK-NOT are checked independently so that is using a
variable defined in a pattern that should not occur in the input.
This commit simplifies the CHECK-NOT block to only check for the
presence of any lifetime start marker since that is effectively what
the test was testing at the moment.
Reviewed By: junparser
Differential Revision: https://reviews.llvm.org/D99856
The test case added in 258f055ed936 was lacking two important details for the test infrastructure. ae217bf1f327 added the executable to LLVM_TEST_DEPENDS in CMake to make sure the exectubale gets built before we run the test suite. This patch adds a ToolSubst for the executable in LIT, which replaces the tool invokation in the RUN line with an absolute path. It makes sure we don't run accidentally run some other tool from the user's PATH. The test works without it in case LLVM's main binary directory happens to be the working directory (which is default apparently). Configurations that don't build the examples ignore failures for this ToolSubst (and won't run the test).
Reviewed By: echristo
Differential Revision: https://reviews.llvm.org/D99931
D88631 introduced a set of knobs to tweak how the stack protector is codegen'd for x86 targets, including the offset from the base register where the stack cookie is located. The `StackProtectorGuardOffset` field in `TargetOptions` was left uninitialized instead of being reset to its neutral value -1, making it possible to emit nonsensical code if the frontend doesn't change the field value at all before feeding the `TargetOptions` to the target machine initializer.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D99952
A value from reachable block may come to a Phi node as its input from
unreachable block. This may confuse matchSimpleRecurrence which
has no access to DomTree and can falsely recognize something as a recurrency
because of this effect, as the attached test shows.
Patch `ae7b1e` deals with half of this problem, but it only accounts from
the case when an unreachable instruction comes to Phi as an input.
This patch provides a generalization by checking that no Phi block's
predecessor is unreachable (no matter what the input is).
Differential Revision: https://reviews.llvm.org/D99929
Reviewed By: reames
Consider the .debug_pubnames and .debug_pubtypes their own kind of
accelerator and stop emitting them together with the Apple-style
accelerator tables. The only reason we were still emitting both was for
(byte-for-byte) compatibility with dsymutil-classic.
- This patch adds a new accelerator table kind "Pub" which can be
specified with --accelerator=Pub.
- This patch removes the ability to emit both pubnames/types and apple
style accelerator tables. I don't think anyone is relying on that but
it's worth pointing out.
- This patch removes the --minimize option and makes this behavior the
default. Specifying the flag will result in a warning but won't abort
the program.
Differential revision: https://reviews.llvm.org/D99907
Looking at the Doxygen-generated documentation for the llvm namespace
currently shows all sorts of random comments from different parts of the
codebase. These are mostly caused by:
- File doc comments that aren't marked with \file, so they're attached to
the next declaration, which is usually "namespace llvm {".
- Class doc comments placed before the namespace rather than before the
class.
- Code comments before the namespace that (in my opinion) shouldn't be
extracted by doxygen at all.
This commit fixes these comments. The generated doxygen documentation now
has proper docs for several classes and files, and the docs for the llvm
and llvm::detail namespaces are now empty.
Reviewed By: thakis, mizvekov
Differential Revision: https://reviews.llvm.org/D96736
We encountered a hang in our internal code base. I'm having trouble
creating a test case because the test that hit it was testing some
code that is not upstream.
Summary:
The function SplitCriticalEdge (called by SplitEdge) can return a nullptr in
cases where the edge is a critical. SplitEdge uses SplitCriticalEdge assuming it
can always split all critical edges, which is an incorrect assumption.
The three cases where the function SplitCriticalEdge will return a nullptr is:
1. DestBB is an exception block
2. Options.IgnoreUnreachableDests is set to true and
isa(DestBB->getFirstNonPHIOrDbgOrLifetime()) is not equal to a nullptr
3. LoopSimplify form must be preserved (Options.PreserveLoopSimplify is true)
and it cannot be maintained for a loop due to indirect branches
For each of these situations they are handled in the following way:
1. Modified the function ehAwareSplitEdge originally from
llvm/lib/Transforms/Coroutines/CoroFrame.cpp to handle the cases when the DestBB
is an exception block. This function is called directly in SplitEdge.
SplitEdge does not call SplitCriticalEdge in this case
2. Options.IgnoreUnreachableDests is set to false by default, so this situation
does not apply.
3. Return a nullptr in this situation since the SplitCriticalEdge also returned
nullptr. Nothing we can do in this case.
Reviewed By: asbirlea
Differential Revision:https://reviews.llvm.org/D94619
Follow up to a6d2a8d6f5. These were found by simply grepping for "::assume", and are the subset of that result which looked cleaner to me using the isa/dyn_cast patterns.
LLVM test CodeGen/AArch64/speculation-hardening.ll tries to check for
the absence of a sequence of instructions with several CHECK-NOT with
one of those directives using a variable defined in another. However
CHECK-NOT are checked independently so that is using a variable defined
in a pattern that should not occur in the input.
This commit removes the dependency between those CHECK-NOT by replacing
single occurence of the undefined variable by a regex match, and
multiple occurences by a definition followed by a use.
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D99866
Follow up to a6d2a8d6f5. This covers all the public interfaces of the bundle related code. I tried to cleanup the internals where the changes were obvious, but there's definitely more room for improvement.
Fixes the ASan RISC-V memory mapping (originally introduced by D87580 and
D87581). This should be an improvement both in terms of first principles
soundness and observed test failures --- test failures would occur
non-deterministically depending on the ASLR random offset.
On RISC-V Linux (64-bit), `TASK_UNMAPPED_BASE` is currently defined as
`PAGE_ALIGN(TASK_SIZE / 3)`. The non-power-of-two divisor makes the result
be the not very round number 0x1555556000. That address had to be further
rounded to ensure page alignment after the shadow scale shifting is applied.
Still, that value explains why the mapping table may look less regular than
expected.
Further cleanups:
- Moved the mapping table comment, to ensure that the two Linux/AArch64
tables stayed together;
- Removed mention of Sv48. Neither the original mapping nor this one are
compatible with an actual Linux Sv48 address space (mainline Linux still
operates Sv48 in Sv39 mode). A future patch can improve this;
- Removed the additional comments, for consistency.
Differential Revision: https://reviews.llvm.org/D97646
Previously, 34-bit constants were materialized in selectI64Imm(), and we relied
on td pattern matching to instead produce a pli. This becomes problematic as
there is no guarantee that the 34-bit constant will reach the td pattern
selection for pli. It is also possible for other transformations (such as complex
bit permutations) to also produce and utilize the 34-bit constant materialized
through selectI64Imm().
This patch instead produces pli on Power10 directly whenever the constant fits
within 34-bits.
Differential Revision: https://reviews.llvm.org/D99906
Add the subclass, update a few places which check for the intrinsic to use idiomatic dyn_cast, and update the public interface of AssumptionCache to use the new class. A follow up change will do the same for the newer assumption query/bundle mechanisms.
performScalarPREInsertion() inserts instructions into blocks that we
need to tell ImplicitControlFlowTracking about, otherwise the ICF cache
may be invalid.
Fixes PR49193.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D99909
The current code does not properly handle vector indices unless they are
the first index.
At the moment LangRef gives the impression that the vector index must be
the one and only index (https://llvm.org/docs/LangRef.html#getelementptr-instruction).
But vector indices can appear at any position and according to the
verifier there may be multiple vector indices. If that's the case, the
number of elements must match.
This patch updates SimplifyGEPInst to properly handle those additional
cases.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D99961
Many of the operands are handled the same or in the same order
for all these intrinsics. Factor out the code for selecting and
pushing them into the Operands vector.
Differential Revision: https://reviews.llvm.org/D99923
This allows frontend and backend diagnostic files to all go into the
same place. Have it control the Windows (mini-)dump location.
Differential Revision: https://reviews.llvm.org/D99199
Commit 22ce5eb051591b828b1ce4238624b6e95d334a5b, changed checks in
GVN/big-endian.ll into CHECK-NOT. The intent was to check that a
succession of lines does not occur but each CHECK-NOT is checked
independently. In other word, one CHECK-NOT uses a variable defined
in a pattern (the one defining the variable) that should not occur in
the input. The bug was then copied over in NewGVN/big-endian.ll.
This commit only checks for the absence of i16 load which rules out the
presence of the whole sequence and does not involve an undefined
variable.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D99581
This patch adds support for TLS variables to the XCOFF object writer:
- Add TData and TBSS sections
- Add CsectGroups for the mapping classes XCOFF::XMC_TL and XCOFF::XMC_UL
- Add XMC_UL in the enum entry of CsectStorageMapping class to print the string
while reading the symbol properties for TLS variables
- Fix the starting address of TData and TBSS sections
Reviewed by: hubert.reinterpretcast, DiggerLin
Differential Revision: https://reviews.llvm.org/D98946