This patch makes FoldBranchToCommonDest merge branch conditions into `select i1` rather than `and/or i1` when it is called by SimplifyCFG.
It is known that merging conditions into and/or is poison-unsafe, and this is towards making things *more* correct by removing possible miscompilations.
Currently, InstCombine simply consumes these selects into and/or of i1 (which is also unsafe), so the visible effect would be very small. The unsafe select -> and/or transformation will be removed in the future.
There has been efforts for updating optimizations to support the select form as well, and they are linked to D93065.
The safe transformation is fired when it is called by SimplifyCFG only. This is done by setting the new `PoisonSafe` argument as true.
Another place that calls FoldBranchToCommonDest is LoopSimplify. `PoisonSafe` flag is set to false in this case because enabling it has a nontrivial impact in performance because SCEV is more conservative with select form and InductiveRangeCheckElimination isn't aware of select form of and/or i1.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D95026
Hello all,
I'm trying to fix unsafe propagation of poison values in and/or conditions by using
equivalent select forms (`select i1 A, i1 B, i1 false` and `select i1 A, i1 true, i1 false`)
instead.
D93065 has links to patches for this.
This patch allows unswitch to happen if the condition is in this form as well.
`collectHomogenousInstGraphLoopInvariants` is updated to keep traversal if
Root and the visiting I matches both m_LogicalOr()/m_LogicalAnd().
Other than this, the remaining changes are almost straightforward and simply replaces
Instruction::And/Or check with match(m_LogicalOr()/m_LogicalAnd()).
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D97756
In "DWARF Extensions For Heterogeneous Debugging" document that the
DWARF generic type has a target architecture defined endianity.
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D98126
For many directives, the following diagnostics
* `error: unexpected token`
* `error: unexpected token in '.abort' directive"`
are replaced with `error: expected newline`.
`unexpected token` may make the user think a different token is needed.
`expected newline` is clearer about the expected token.
For `in '...' directive`, the directive name is not useful because the next line
replicates the error line which includes the directive.
As a resolution to https://sourceware.org/bugzilla/show_bug.cgi?id=25295 , GNU as
from binutils 2.35 supports the optional third argument for the .symver directive.
'remove' for a non-default version is useful:
`.symver def_v1, def@v1, remove` => def_v1 is not retained in the symbol table.
Previously the user has to strip the original symbol or specify a `local:`
version node in a version script to localize the symbol.
`.symver def, def@@v1, remove` and `.symver def, def@@@v1, remove` are supported
as well, though they are identical to `.symver def, def@@@v1`.
local/hidden are not useful so this patch does not implement them.
When materializing an available load value, do not explicitly
materialize the undef values from dead blocks. Doing so will
will force creation of a phi with an undef operand, even if there
is a dominating definition. The phi will be folded away on
subsequent GVN iterations, but by then we may have already
poisoned MDA cache slots.
Simply don't register these values in the first place, and let
SSAUpdater do its thing.
What this test illustrates is that GVN inserts an unnecessary
phi node initially, which prevents alias analysis from establishing
NoAlias, and MDA caches that result. We would be able to fully fold
this after another -gvn run with clean MDA.
Add support to widen call instructions in VPlan native path by using a correct recipe when such instructions are encountered. This is already used by inner loop vectorizer.
Previously call instructions got handled by wrong recipes and resulted in unreachable instruction errors like this one: https://bugs.llvm.org/show_bug.cgi?id=48139.
Patch by Mauri Mustonen <mauri.mustonen@tuni.fi>
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D97278
It's just a wrong thing to do.
We introduce inttoptr where there were none, which results in
loosing all provenance information because we no longer have a GEP{i,},
and pessimize all future optimizations,
because we are basically not allowed to look past `inttoptr`.
(gep i8* X, -(ptrtoint Y)) *is* the canonical form.
So just drop this fold.
Noticed while reviewing D98120.
These intrinsics, not the icmp+select are the canonical form nowadays,
so we might as well directly emit them.
This should not cause any regressions, but if it does,
then then they would needed to be fixed regardless.
Note that this doesn't deal with `SCEVExpander::isHighCostExpansion()`,
but that is a pessimization, not a correctness issue.
Additionally, the non-intrinsic form has issues with undef,
see https://reviews.llvm.org/D88287#2587863
Patch adds support for passing vector call operands to variadic
functions. Arguments which are fixed shadow GPRs and stack space even
when they are passed in vector registers, while arguments passed through
ellipses are passed in properly aligned GPRs if available and on the
stack once all GPR arguments registers are consumed.
Differential Revision: https://reviews.llvm.org/D97956
We have the `enable-loopinterchange` option in legacy pass manager but not in NPM.
Add `LoopInterchange` pass to the optimization pipeline (at the same position as before)
when `enable-loopinterchange` is turned on.
Reviewed By: aeubanks, fhahn
Differential Revision: https://reviews.llvm.org/D98116
GVN basically doesn't handle phi nodes at all. This is for a reason - we can't value number their inputs since the predecessor blocks have probably not been visited yet.
However, it also creates a significant pass ordering problem. As it stands, instcombine and simplifycfg ends up implementing CSE of phi nodes. This means that for any series of CSE opportunities intermixed with phi nodes, we end up having to alternate instcombine/simplifycfg and gvn to make progress.
This patch handles the simplest case by simply preprocessing the phi instructions in a block, and CSEing them if they are syntactically identical. This turns out to be powerful enough to handle many cases in a single invocation of GVN since blocks which use the cse'd phi results are visited after the block containing the phi. If there's a CSE opportunity in one the phi predecessors required to recognize the phi CSE opportunity, that will require a second iteration on the function. (Still within a single run of gvn though.)
Compile time wise, this could go either way. On one hand, we're potentially causing GVN to iterate over the function more. On the other, we're cutting down on iterations between two passes and potentially shrinking the IR aggressively. So, a bit unclear what to expect.
Note that this does still rely on instcombine to canonicalize block order of the phis, but that's a one time transformation independent of the values incoming to the phi.
Differential Revision: https://reviews.llvm.org/D98080
As a pragmatic tradeoff, the ease of updating the tests outweighs the slightly easier to understand test conditions. Where revevant, debug output was converted to comments to help human understanding.
That review is extracted from D69372.
It fixes https://bugs.llvm.org/show_bug.cgi?id=42219 bug.
For the noimplicitfloat mode, the compiler mustn't generate
floating-point code if it was not asked directly to do so.
This rule does not work with variable function arguments currently.
Though compiler correctly guards block of code, which copies xmm vararg
parameters with a check for %al, it does not protect spills for xmm registers.
Thus, such spills are generated in non-protected areas and could break code,
which does not expect floating-point data. The problem happens in -O0
optimization mode. With this optimization level there is used
FastRegisterAllocator, which spills virtual registers at basic block boundaries.
Register Allocator does not protect spills with additional control-flow modifications.
Thus to resolve that problem, it is suggested to not copy incoming physical
registers into virtual registers. Instead, store incoming physical xmm registers
into the memory from scratch.
Differential Revision: https://reviews.llvm.org/D80163
There seems to be an impedance mismatch between what the type
system considers an aggregate (structs and arrays) and what
constants consider an aggregate (structs, arrays and vectors).
Rather than adjusting the type check, simply drop it entirely,
as getAggregateElement() is well-defined for non-aggregates: It
simply returns null in that case.
Instead of handling a number of special cases for selects, handle
this generally when inferring ranges from conditions. We already
infer ranges from `x + C pred C2` to `x`, so doing the same for
`x pred C2` to `x + C` is straightforward.
It moved the logic for CMake target arguments into llvm_ExternalProject_Add().
No handling was added for CMAKE_CROSSCOMPILING, which has a separate set of compiler_args.
This broke crosscompiling, as now the runtimes builds defaulted to the compiler's default.
I've also added passing of CMAKE_ASM_COMPILER, which was missing before although we were passing the triple for it.
Reviewed By: zero9178
Differential Revision: https://reviews.llvm.org/D97855
These tests didn't test the pattern they were supposed to, because
%a instead of %add was used in the select, which turned this into
a normal min/max).
Noticed this when commenting out the clamp handling code did not
result in any test failures...
This reverts commit e58d68fcd06ddc7743e0419c0b364df3d44121b6.
This reinstates commit fc28f600e558c1344618bda149a068d6162b6f0b
with a fix to initialize HasShaderCyclesRegister. See
https://reviews.llvm.org/D97928.