mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 20:43:44 +02:00
llvm-mirror/lib
Jim Grosbach 7a17678ea4 X86: Constant fold converting vector setcc results to float.
Since each lane of a vector SETCC result on X86 is either 0 or -1 (all
bits set), we can move unary operations, in this case [su]int_to_fp,
through the mask operation and constant fold the operation away.
Generally speaking:
  UNARYOP(AND(VECTOR_CMP(x,y), constant))
      --> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).

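This implements the transform where UNARYOP is [su]int_to_fp. The fold
is valid there because converting integer 0 yields +0.0, whose bit
pattern is also all zeros, so lanes cleared by the mask are unchanged.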
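
The identity can be checked one lane at a time. The following is a minimal
Python sketch, not the LLVM implementation: the helper names (sitofp_bits,
unfolded, folded) are hypothetical, and each 32-bit lane is modeled as an
unsigned Python integer.

```python
import struct

ALL_ONES = 0xFFFFFFFF  # lane value produced by a true vector compare


def sitofp_bits(lane: int) -> int:
    """Read a 32-bit lane as a signed integer, convert it to a
    single-precision float, and return that float's raw bits."""
    signed = lane - (1 << 32) if lane & 0x80000000 else lane
    return struct.unpack("<I", struct.pack("<f", float(signed)))[0]


def unfolded(mask: int, constant: int) -> int:
    """UNARYOP(AND(VECTOR_CMP(x,y), constant)) for one lane."""
    return sitofp_bits(mask & constant)


def folded(mask: int, constant: int) -> int:
    """AND(VECTOR_CMP(x,y), constant2), with constant2 = UNARYOP(constant)."""
    constant2 = sitofp_bits(constant)  # computed once, at compile time
    return mask & constant2
```

Both forms agree whenever the mask lane is 0 or all ones, which is exactly
what the compare guarantees. For any other mask value the two would differ,
so the sketch only models compare results.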

For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result
}
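
As a reference for what @foo computes, here is a lane-wise model in Python.
It is only a sketch under stated assumptions: lanes are Python floats
(doubles, not single precision), and the function name simply mirrors the
IR above.

```python
def foo(val, test):
    """Lane-wise model of @foo: 1.0 where lanes compare ordered-equal,
    0.0 everywhere else."""
    result = []
    for v, t in zip(val, test):
        cmp = v == t               # fcmp oeq: false if either side is NaN
        ext = 1 if cmp else 0      # zext <4 x i1> to <4 x i32>
        result.append(float(ext))  # sitofp <4 x i32> to <4 x float>
    return result
```

Every output lane is therefore either 0.0 or 1.0, which is why the
conversion can be replaced by a mask against a vector of 1.0f constants.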

Before this change, the generated SSE code is:
LCPI0_0:
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _foo
  .align  4, 0x90
_foo:                                   ## @foo
  cmpeqps %xmm1, %xmm0
  andps LCPI0_0(%rip), %xmm0
  cvtdq2ps  %xmm0, %xmm0
  retq

After this change, the code improves to:
LCPI0_0:
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _foo
  .align  4, 0x90
_foo:                                   ## @foo
  cmpeqps %xmm1, %xmm0
  andps LCPI0_0(%rip), %xmm0
  retq

The cvtdq2ps has been constant folded away, and the floating-point 1.0f
vector lanes are materialized directly via the ModRM operand of andps.
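
The two constant pools can be decoded to confirm this. The snippet below is
an illustrative sketch; as_float, BEFORE and AFTER are hypothetical names,
and the values are the .long entries from the listings above.

```python
import struct


def as_float(word: int) -> float:
    """Reinterpret a 32-bit word as an IEEE-754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", word))[0]


BEFORE = 1           # old pool: integer 1, still needs cvtdq2ps
AFTER = 1065353216   # new pool: 0x3F800000, already the bits of 1.0f

ALL_ONES = 0xFFFFFFFF
true_lane = as_float(ALL_ONES & AFTER)  # andps on a lane where the compare held
false_lane = as_float(0 & AFTER)        # andps on a lane where it did not
```

Here true_lane is 1.0 and false_lane is 0.0, so no conversion instruction
is needed after the andps.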

llvm-svn: 213342
2014-07-18 00:40:56 +00:00
Analysis Rectify r213231. Use proper version of 'ComputeNumSignBits'. 2014-07-17 19:07:00 +00:00
AsmParser Update the MemoryBuffer API to use ErrorOr. 2014-07-06 17:43:13 +00:00
Bitcode Roundtrip the inalloca bit on allocas through bitcode 2014-07-16 01:34:27 +00:00
CodeGen AArch64: Constant fold converting vector setcc results to float. 2014-07-18 00:40:52 +00:00
DebugInfo Revert "Introduce a string_ostream string builder facilty" 2014-06-26 22:52:05 +00:00
ExecutionEngine [MCJIT] Fix the alignment requirements for ARM and AArch64 which were mistakenly 2014-07-17 23:11:30 +00:00
IR Remove unnecessary/redundant std::move 2014-07-16 17:09:21 +00:00
IRReader Update the MemoryBuffer API to use ErrorOr. 2014-07-06 17:43:13 +00:00
LineEditor [CMake] Use LINK_LIBS instead of target_link_libraries(). 2014-02-26 06:41:29 +00:00
Linker Include <tuple> to make buildbots happy 2014-06-27 18:38:12 +00:00
LTO Prune Redundant libdeps in CMake's target_link_libraries and LLVMBuild.txt. 2014-07-15 11:37:03 +00:00
MC ms inline asm: Don't add x86 segment registers to the clobber list. 2014-07-17 20:24:55 +00:00
Object [RuntimeDyld] Revert r211652 - MachO object GDB registration support. 2014-07-15 19:35:22 +00:00
Option Generic: add range-adapter for option parsing. 2014-07-09 13:03:37 +00:00
ProfileData Update the MemoryBuffer API to use ErrorOr. 2014-07-06 17:43:13 +00:00
Support Drop the udis86 wrapper from llvm::sys 2014-07-17 20:05:29 +00:00
TableGen [TableGen] Allow shift operators to take bits<n> 2014-07-17 17:04:27 +00:00
Target X86: Constant fold converting vector setcc results to float. 2014-07-18 00:40:56 +00:00
Transforms [ASan] Don't instrument load/stores with !nosanitize metadata. 2014-07-17 18:48:12 +00:00
CMakeLists.txt ProfileData: Introduce the InstrProfReader interface and a text reader 2014-03-21 17:24:48 +00:00
LLVMBuild.txt ProfileData: Introduce the InstrProfReader interface and a text reader 2014-03-21 17:24:48 +00:00
Makefile ProfileData: Introduce the InstrProfReader interface and a text reader 2014-03-21 17:24:48 +00:00