llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00

Sanjay Patel b4b4a9aeb1 [InstCombine] transform more extract/insert pairs into shuffles (PR2109)

This is an extension of the shuffle combining from r203229:
http://reviews.llvm.org/rL203229

The idea is to widen a short input vector with undef elements so the existing shuffle transform for extract/insert can kick in.

The motivation is to finally solve PR2109:
https://llvm.org/bugs/show_bug.cgi?id=2109

For that example, the IR becomes:

    %1 = bitcast <2 x i32>* %P to <2 x float>*
    %ld1 = load <2 x float>, <2 x float>* %1, align 8
    %2 = shufflevector <2 x float> %ld1, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
    %i2 = shufflevector <4 x float> %A, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
    ret <4 x float> %i2

And x86 SSE output improves from:

    movq (%rdi), %xmm1 ## xmm1 = mem[0],zero
    movdqa %xmm1, %xmm2
    shufps $229, %xmm2, %xmm2 ## xmm2 = xmm2[1,1,2,3]
    shufps $48, %xmm0, %xmm1 ## xmm1 = xmm1[0,0],xmm0[3,0]
    shufps $132, %xmm1, %xmm0 ## xmm0 = xmm0[0,1],xmm1[0,2]
    shufps $32, %xmm0, %xmm2 ## xmm2 = xmm2[0,0],xmm0[2,0]
    shufps $36, %xmm2, %xmm0 ## xmm0 = xmm0[0,1],xmm2[2,0]
    retq

To the almost optimal:

    movhpd (%rdi), %xmm0

Note: There's a tension in the existing transform related to generating arbitrary shufflevector masks. We avoid that in other places in InstCombine because we're scared that codegen can't handle strange masks, but it looks like we're ok with producing those here. I purposely chose weird insert/extract indexes for the regression tests to see the effect in these cases. For PowerPC+Altivec, AArch64, and X86+SSE/AVX, I think the codegen is equal or better for these examples.

Differential Revision: http://reviews.llvm.org/D15096

llvm-svn: 256394

2015-12-24 21:17:56 +00:00
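
The widening idea in the commit message above can be checked with a small model of `shufflevector` semantics. The sketch below is illustrative Python, not LLVM code: the helper names (`shufflevector`, `insert_extract_pairs`, `widened_shuffle`) and the use of `None` for undef lanes are assumptions made for this example. The extract/insert form is inferred from the final shuffle mask `<0, 1, 4, 5>` in the message and is not quoted from PR2109.

```python
UNDEF = None  # stand-in for an LLVM undef lane (assumption of this sketch)


def shufflevector(a, b, mask):
    """Model shufflevector: mask index i < len(a) selects a[i],
    otherwise b[i - len(a)]; an undef mask lane yields undef."""
    assert len(a) == len(b), "both inputs must have the same width"
    joined = list(a) + list(b)
    return [UNDEF if m is UNDEF else joined[m] for m in mask]


def insert_extract_pairs(A, ld1):
    """Hypothetical pre-transform pattern: extract each lane of the
    2-wide load and insert it into lanes 2 and 3 of the 4-wide vector."""
    out = list(A)
    out[2] = ld1[0]  # insertelement(A, extractelement(ld1, 0), 2)
    out[3] = ld1[1]  # insertelement(.., extractelement(ld1, 1), 3)
    return out


def widened_shuffle(A, ld1):
    """Post-transform pattern from the commit message: widen the short
    vector with undef lanes, then do one 4-wide two-input shuffle."""
    wide = shufflevector(ld1, [UNDEF, UNDEF], [0, 1, UNDEF, UNDEF])
    return shufflevector(A, wide, [0, 1, 4, 5])


A = [1.0, 2.0, 3.0, 4.0]
ld1 = [10.0, 20.0]
print(widened_shuffle(A, ld1))  # [1.0, 2.0, 10.0, 20.0]
```

In this model the undef lanes introduced by widening are never selected by the final mask, which is why the two forms agree on every result lane.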
CMakeLists.txt
InstCombineAddSub.cpp	[InstCombine] Fix indentation. NFC.	2015-12-21 01:02:28 +00:00
InstCombineAndOrXor.cpp	getParent() ^ 3 == getModule() ; NFCI	2015-12-14 17:24:23 +00:00
InstCombineCalls.cpp	getParent() ^ 3 == getModule() ; NFCI	2015-12-14 17:24:23 +00:00
InstCombineCasts.cpp	[InstCombine] Adding "\n" to debug output. NFC.	2015-12-17 19:53:41 +00:00
InstCombineCompares.cpp	getParent() ^ 3 == getModule() ; NFCI	2015-12-14 17:24:23 +00:00
InstCombineInternal.h	[InstCombine] Make MatchBSwap also match bit reversals	2015-12-11 10:04:51 +00:00
InstCombineLoadStoreAlloca.cpp	[OperandBundles] Have InstCombine play nice with operand bundles	2015-12-23 09:58:41 +00:00
InstCombineMulDivRem.cpp	InstCombine: Remove ilist iterator implicit conversions, NFC	2015-10-13 16:59:33 +00:00
InstCombinePHI.cpp	[InstCombine] Teach FoldPHIArgZextsIntoPHI about EHPads	2015-11-07 00:52:53 +00:00
InstCombineSelect.cpp	[InstCombine] Call getCmpPredicateForMinMax only with a valid SPF	2015-12-05 23:44:22 +00:00
InstCombineShifts.cpp	don't repeat function names in comments; NFC	2015-11-02 22:34:55 +00:00
InstCombineSimplifyDemanded.cpp	[InstCombine] Teach SimplifyDemandedVectorElts how to handle ConstantVector select masks with ConstantExpr elements (PR24922)	2015-10-06 10:34:53 +00:00
InstCombineVectorOps.cpp	[InstCombine] transform more extract/insert pairs into shuffles (PR2109)	2015-12-24 21:17:56 +00:00
InstructionCombining.cpp	Instcombine: destructor loads of structs that do not contains padding	2015-12-15 01:44:07 +00:00
LLVMBuild.txt
Makefile