llvm-mirror/test/CodeGen/X86/pr22338.ll
Sanjay Patel 011ef76da2 [x86] use more shift or LEA for select-of-constants (2nd try)
The previous rev (r310208) failed to account for overflow when subtracting the
constants to see if they're suitable for shift/lea. This version adds a check
for that, and more tests were added in r310490.

We can convert any select-of-constants to math ops:
http://rise4fun.com/Alive/d7d
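
For example, a minimal sketch of the pattern the proof covers (%c and the
constants are illustrative, not taken from this patch):

  %sel = select i1 %c, i32 6, i32 2
; ...can be rewritten as...
  %z = zext i1 %c to i32
  %m = shl i32 %z, 2        ; %z * (6 - 2)
  %sel = add i32 %m, 2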

For this patch, I'm enhancing an existing x86 transform that uses fake multiplies
(they always become shl/lea) to avoid cmov or branching. The current code misses
cases where we have a negative constant and a positive constant, so this is just
trying to plug that hole.
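
An illustrative instance of that hole (a sketch; the constants are chosen for
exposition, not taken from the tests):

  %sel = select i1 %c, i32 -1, i32 1
; ...can become...
  %z = zext i1 %c to i32
  %m = mul i32 %z, -2       ; -1 - 1; a cheap shift/negate, not a real multiply
  %sel = add i32 %m, 1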

The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start
with a select in IR, create a select DAG node, convert it into a sext, convert it
back into a select, and then lower it to sext machine code.
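
The equivalence that enables that round trip is simply (a minimal sketch):

  %x = sext i1 %c to i32
; ...is the same value as...
  %x = select i1 %c, i32 -1, i32 0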

Some notes about the test diffs:

1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR.
2. memcmp.ll - Choosing -1 or 1 is the case that got me looking at this again. We could avoid the
   push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a 
   post-DAG problem though.
3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if
   that's a regression, but those would always be nearly equivalent.
4. pr22338.ll and sext-i1.ll - These tests have undef operands, so we don't actually care about these diffs.
5. sbb.ll - This shows a win for what is likely a common case: choose -1 or 0 (see the sketch after this list).
6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again.
7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops.
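
For note 5, the classic sbb idiom for choosing -1 or 0 looks like this (a
sketch with illustrative registers, not output copied from sbb.ll):

  cmpl %esi, %edi       # sets CF when %edi is below %esi (unsigned)
  sbbl %eax, %eax       # %eax = %eax - %eax - CF = 0 or -1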

Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass.
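
A sketch of the two forms (registers are illustrative; the xor has to precede
the compare because it clobbers EFLAGS):

  # movzbl form: setne writes only %al, so the zero-extend reads a partial reg
  cmpl %ecx, %edx
  setne %al
  movzbl %al, %eax

  # xor form: zero the full register before the flags are set
  xorl %eax, %eax
  cmpl %ecx, %edx
  setne %al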

Differential Revision: https://reviews.llvm.org/D35340

llvm-svn: 310717
2017-08-11 15:44:14 +00:00


; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=i686-unknown-linux-gnu | FileCheck %s --check-prefix=X86
; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu | FileCheck %s --check-prefix=X64

define i32 @fn() {
; X86-LABEL: fn:
; X86:       # BB#0: # %entry
; X86-NEXT:    xorl %eax, %eax
; X86-NEXT:    cmpl $1, %eax
; X86-NEXT:    setne %al
; X86-NEXT:    sete %cl
; X86-NEXT:    negl %eax
; X86-NEXT:    addb %cl, %cl
; X86-NEXT:    shll %cl, %eax
; X86-NEXT:    .p2align 4, 0x90
; X86-NEXT:  .LBB0_1: # %bb1
; X86-NEXT:    # =>This Inner Loop Header: Depth=1
; X86-NEXT:    testl %eax, %eax
; X86-NEXT:    je .LBB0_1
; X86-NEXT:  # BB#2: # %bb2
; X86-NEXT:    retl
;
; X64-LABEL: fn:
; X64:       # BB#0: # %entry
; X64-NEXT:    xorl %eax, %eax
; X64-NEXT:    cmpl $1, %eax
; X64-NEXT:    setne %al
; X64-NEXT:    sete %cl
; X64-NEXT:    negl %eax
; X64-NEXT:    addb %cl, %cl
; X64-NEXT:    shll %cl, %eax
; X64-NEXT:    .p2align 4, 0x90
; X64-NEXT:  .LBB0_1: # %bb1
; X64-NEXT:    # =>This Inner Loop Header: Depth=1
; X64-NEXT:    testl %eax, %eax
; X64-NEXT:    je .LBB0_1
; X64-NEXT:  # BB#2: # %bb2
; X64-NEXT:    retq
entry:
  %cmp1 = icmp ne i32 undef, 1
  %cmp2 = icmp eq i32 undef, 1
  %sel1 = select i1 %cmp1, i32 0, i32 2
  %sel2 = select i1 %cmp2, i32 2, i32 0
  %sext = sext i1 %cmp1 to i32
  %shl1 = shl i32 %sext, %sel1
  %shl2 = shl i32 %sext, %sel2
  %tobool = icmp eq i32 %shl1, 0
  br label %bb1

bb1:                                              ; preds = %bb1, %entry
  br i1 %tobool, label %bb1, label %bb2

bb2:                                              ; preds = %bb1
  ret i32 %shl2
}