1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00
Go to file
Sanjay Patel 981e90bafa [InstCombine] allow more than one use for vector bitcast folding with selects
The motivating example for this transform is similar to D20774 where bitcasts interfere
with a single cmp/select sequence, but in this case we have 2 uses of each bitcast to 
produce min and max ops:

define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) {
  %cmp = fcmp olt <4 x float> %a, %b
  %bc1 = bitcast <4 x float> %a to <4 x i32>
  %bc2 = bitcast <4 x float> %b to <4 x i32>
  %sel1 = select <4 x i1> %cmp, <4 x i32> %bc1, <4 x i32> %bc2
  %sel2 = select <4 x i1> %cmp, <4 x i32> %bc2, <4 x i32> %bc1
  %bc3 = bitcast <4 x float>* %ptr1 to <4 x i32>*
  store <4 x i32> %sel1, <4 x i32>* %bc3
  %bc4 = bitcast <4 x float>* %ptr2 to <4 x i32>*
  store <4 x i32> %sel2, <4 x i32>* %bc4
  ret void
}

With this patch, we move the selects up to use the input args which allows getting rid of
all of the bitcasts:

define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) {
  %cmp = fcmp olt <4 x float> %a, %b
  %sel1.v = select <4 x i1> %cmp, <4 x float> %a, <4 x float> %b
  %sel2.v = select <4 x i1> %cmp, <4 x float> %b, <4 x float> %a
  store <4 x float> %sel1.v, <4 x float>* %ptr1, align 16
  store <4 x float> %sel2.v, <4 x float>* %ptr2, align 16
  ret void
}

The asm for x86 SSE then improves from:

movaps  %xmm0, %xmm2
cmpltps %xmm1, %xmm2
movaps  %xmm2, %xmm3
andnps  %xmm1, %xmm3
movaps  %xmm2, %xmm4
andnps  %xmm0, %xmm4
andps %xmm2, %xmm0
orps  %xmm3, %xmm0
andps %xmm1, %xmm2
orps  %xmm4, %xmm2
movaps  %xmm0, (%rdi)
movaps  %xmm2, (%rsi)

To:

movaps  %xmm0, %xmm2
minps %xmm1, %xmm2
maxps %xmm0, %xmm1
movaps  %xmm2, (%rdi)
movaps  %xmm1, (%rsi)

The TODO comments show that we're limiting this transform only to vectors and only to bitcasts
because we need to improve other transforms or risk creating worse codegen.

Differential Revision: http://reviews.llvm.org/D21190

llvm-svn: 273011
2016-06-17 16:46:50 +00:00
bindings Remove the ScalarReplAggregates pass 2016-06-15 00:19:09 +00:00
cmake Add support for collating profiles for use with code coverage 2016-06-13 23:33:48 +00:00
docs LangRef: Note expectations when loading with extra alignment 2016-06-16 16:33:41 +00:00
examples [MCJIT] Update MCJIT and get the fibonacci example working again. 2016-06-11 05:47:04 +00:00
include Refactor and cleanup Assembly Parsing / Lexing 2016-06-17 16:06:17 +00:00
lib [InstCombine] allow more than one use for vector bitcast folding with selects 2016-06-17 16:46:50 +00:00
projects Remove autoconf support 2016-01-26 21:29:08 +00:00
resources In MSVC builds embed a VERSIONINFO resource in our exe and DLL files. 2015-06-12 15:58:29 +00:00
test [InstCombine] allow more than one use for vector bitcast folding with selects 2016-06-17 16:46:50 +00:00
tools Resubmit "[pdb] Change type visitor pattern to be dynamic." 2016-06-16 18:22:27 +00:00
unittests [PM] Run clang-format over various parts of the new pass manager code 2016-06-17 07:15:29 +00:00
utils Remove the ScalarReplAggregates pass 2016-06-15 00:19:09 +00:00
.arcconfig
.clang-format
.clang-tidy Don't use misc-unused-parameters check on LLVM. 2016-04-13 08:58:52 +00:00
.gitignore Minor updates to gitignore so that symlinks are ignored in the projects dir. 2015-07-07 20:24:58 +00:00
CMakeLists.txt Add support for collating profiles for use with code coverage 2016-06-13 23:33:48 +00:00
CODE_OWNERS.TXT Sort my entry in CODE_OWNERS.TXT 2016-05-26 23:10:37 +00:00
configure Remove autoconf support 2016-01-26 21:29:08 +00:00
CREDITS.TXT Update my email address. 2016-05-10 16:23:54 +00:00
LICENSE.TXT Update copyright year to 2016. 2016-03-30 22:41:06 +00:00
llvm.spec.in [Sparc] Implement i64 load/store support for 32-bit sparc. 2015-08-10 19:11:39 +00:00
LLVMBuild.txt
README.txt Revert previous test commit. 2016-01-04 19:13:29 +00:00

Low Level Virtual Machine (LLVM)
================================

This directory and its subdirectories contain source code for LLVM,
a toolkit for the construction of highly optimized compilers,
optimizers, and runtime environments.

LLVM is open source software. You may freely distribute it under the terms of
the license agreement found in LICENSE.txt.

Please see the documentation provided in docs/ for further
assistance with LLVM, and in particular docs/GettingStarted.rst for getting
started with LLVM and docs/README.txt for an overview of LLVM's
documentation setup.

If you are writing a package for LLVM, see docs/Packaging.rst for our
suggestions.