This transform was added with rL351346, and we had
an escape for shufps, but we also want one for
unpckps vs. vpermps because vpermps doesn't take
an immediate shuffle index operand.
llvm-svn: 352333
Although this is longer code, this is no-functional-change-intended.
The goal is to untangle the conditions under which we bail out, so
that's easier to adjust.
llvm-svn: 352320
While i have no intention of actually commiting regeneration
of the check lines in these test files with update_llc_test_checks,
lack of that whitespace breaks that util, which is mildly inconvenient.
llvm-svn: 352318
Add generic costs calculation for SADDSAT/SSUBSAT intrinsics, this uses generic costs for sadd_with_overflow/ssub_with_overflow, an extra sign comparison + a selects based on the sign/overflow.
This completes PR40316
Differential Revision: https://reviews.llvm.org/D57239
llvm-svn: 352315
This fixes loads like 's1 = load %p (load 1 from %p)' being combined with an
extend into an illegal 's8 = g_extload %p (load 1 from %p)' which doesn't do any
extension, by avoiding touching those < s8 size loads.
This bug was uncovered by a verifier update r351584, which I reverted it to keep
the bots green.
llvm-svn: 352311
The add+and sequence followed by a branch can
happen e.g. when looping over the set bits of an integer:
```
while (x != 0) {
func(x & ~x);
x &= x - 1;
}
```
Reviewed By: ctopper
Differential Revision: https://reviews.llvm.org/D57296
llvm-svn: 352306
def32 here means the producing instruction zeroed bits 63:32. We already do this for zext, but it looks like we can get an and+anyext sometimes.
Spotted in the diffs from D33587.
llvm-svn: 352303
Bitcast and certain Ptr2Int/Int2Ptr instructions will not alter the
value of their operand and can therefore be looked through when we
determine non-nullness.
Differential Revision: https://reviews.llvm.org/D54956
llvm-svn: 352293
Fix issue noted in D57281 that only tested the one use for the SDValue (the result flag), not the entire SUB.
I've added the getNode() to make it clearer what is intended than just the -> redirection.
llvm-svn: 352291
As discussed on PR24545, we should try to commute X86::COND_A 'icmp ugt' cases to X86::COND_B 'icmp ult' to more optimally bind the carry flag output to a SBB instruction.
Differential Revision: https://reviews.llvm.org/D57281
llvm-svn: 352289
We often generate X86ISD::SBB(X, 0) for carry flag arithmetic.
I had tried to create test cases for the ADC equivalent (which often uses the same pattern) but haven't managed to find anything yet.
Differential Revision: https://reviews.llvm.org/D57169
llvm-svn: 352288
This reduces a bit of duplication between the combining and
lowering places that use it, but the primary motivation is
to make it easier to rearrange the lowering logic and solve
PR40434:
https://bugs.llvm.org/show_bug.cgi?id=40434
llvm-svn: 352280