Previously ConstantFoldExtractElementInstruction() would only work with
insertelement instructions, not contants. This properly handles
insertelement constants as well.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D85865
This recommits the following patches now that D85684 has landed
1cf6f210a2e [IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison.
469da663f2d [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison
122b0640fc9 [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison
ac0af12ed2f [InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison
9b1e95329af [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms
I think this is the last remaining translation of an existing
instcombine transform for the corresponding cmp+sel idiom.
This interpretation is more general though - we can remove
mismatched signed/unsigned combinations in addition to the
more obvious cases.
min/max(X, Y) must produce X or Y as the result, so this is
just another clause in the existing transform that was already
matching a min/max of min/max.
I skimmed the existing users of these matchers and don't see any problems
(eg, the caller assumes the matched value was a select instruction without checking).
So I think we can generalize the matching to allow the new intrinsics or the cmp+select idioms.
I did not find any unit tests for the matchers, so added some basics there. The instsimplify
tests are adapted from existing tests for the cmp+select pattern and cover the folds in
simplifyICmpWithMinMax().
Differential Revision: https://reviews.llvm.org/D85230
https://rise4fun.com/Alive/pZEr
Name: mul nuw with icmp eq
Pre: (C2 %u C1) != 0
%a = mul nuw i8 %x, C1
%r = icmp eq i8 %a, C2
=>
%r = false
Name: mul nuw with icmp ne
Pre: (C2 %u C1) != 0
%a = mul nuw i8 %x, C1
%r = icmp ne i8 %a, C2
=>
%r = true
There are potentially several other transforms we need to add based on:
D51625
...but it doesn't look like there was follow-up to that patch.
This revision adds the following peephole optimization
and it's negation:
%a = urem i64 %x, %y
%b = icmp ule i64 %a, %x
====>
%b = true
With John Regehr's help this optimization was checked with Alive2
which suggests it should be valid.
This pattern occurs in the bound checks of Rust code, the program
const N: usize = 3;
const T = u8;
pub fn split_mutiple(slice: &[T]) -> (&[T], &[T]) {
let len = slice.len() / N;
slice.split_at(len * N)
}
the method call slice.split_at will check that len * N is within
the bounds of slice, this bounds check is after some transformations
turned into the urem seen above and then LLVM fails to optimize it
any further. Adding this optimization would cause this bounds check
to be fully optimized away.
ref: https://github.com/rust-lang/rust/issues/74938
Differential Revision: https://reviews.llvm.org/D85092
This is based on the existing code for the non-intrinsic idioms
in InstCombine.
The vector constant constraint is non-obvious: undefs should be
ok in the outer call, but they can't propagate safely from the
inner call in all cases. Example:
https://alive2.llvm.org/ce/z/-2bVbM
define <2 x i8> @src(<2 x i8> %x) {
%0:
%m = umin <2 x i8> %x, { 7, undef }
%m2 = umin <2 x i8> { 9, 9 }, %m
ret <2 x i8> %m2
}
=>
define <2 x i8> @tgt(<2 x i8> %x) {
%0:
%m = umin <2 x i8> %x, { 7, undef }
ret <2 x i8> %m
}
Transformation doesn't verify!
ERROR: Value mismatch
Example:
<2 x i8> %x = < undef, undef >
Source:
<2 x i8> %m = < #x00 (0) [based on undef value], #x00 (0) >
<2 x i8> %m2 = < #x00 (0), #x00 (0) >
Target:
<2 x i8> %m = < #x07 (7), #x10 (16) >
Source value: < #x00 (0), #x00 (0) >
Target value: < #x07 (7), #x10 (16) >
It's always safe to pick the earlier abs regardless of the nsw flag. We'll just lose it if it is on the outer abs but not the inner abs.
Differential Revision: https://reviews.llvm.org/D85053
abs() should be rare enough that using value tracking is not going
to be a compile-time cost burden, so use it to reduce a variety of
potential patterns. We do this in DAGCombiner too.
Differential Revision: https://reviews.llvm.org/D85043
This is the main icmp simplification shortcoming seen in D84655.
Alive2 agrees that the basic examples are correct at least:
define <2 x i1> @src(<2 x i8> %x) {
%0:
%r = icmp sle <2 x i8> { undef, 128 }, %x
ret <2 x i1> %r
}
=>
define <2 x i1> @tgt(<2 x i8> %x) {
%0:
ret <2 x i1> { 1, 1 }
}
Transformation seems to be correct!
define <2 x i1> @src(<2 x i32> %X) {
%0:
%A = or <2 x i32> %X, { 63, 63 }
%B = icmp ult <2 x i32> %A, { undef, 50 }
ret <2 x i1> %B
}
=>
define <2 x i1> @tgt(<2 x i32> %X) {
%0:
ret <2 x i1> { 0, 0 }
}
Transformation seems to be correct!
https://alive2.llvm.org/ce/z/omt2eehttps://alive2.llvm.org/ce/z/GW4nP_
Differential Revision: https://reviews.llvm.org/D84762
This is a simple patch that makes canCreateUndefOrPoison use
Instruction::isBinaryOp because BinaryOperator inherits Instruction.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D84596
.. in isGuaranteedNotToBeUndefOrPoison.
This caused early exit of isGuaranteedNotToBeUndefOrPoison, making it return
imprecise result.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D84251
This is a step towards trying to remove unnecessary FP compares
with infinity when compiling with -ffinite-math-only or similar.
I'm intentionally not checking FMF on the fcmp itself because
I'm assuming that will go away eventually.
The analysis part of this was added with rGcd481136 for use with
isKnownNeverNaN. Similarly, that could be an enhancement here to
get predicates like 'one' and 'ueq'.
Differential Revision: https://reviews.llvm.org/D84035