1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 05:23:45 +02:00
llvm-mirror/test/CodeGen
Andrea Di Biagio d2869df8d3 [X86][DAG] Disable target specific combine on INSERTPS dag nodes at -O0.
This patch disables target specific combine on X86ISD::INSERTPS dag nodes
if optlevel is CodeGenOpt::None.

The backend currently implements a target specific combine rule that converts
a vector load used by an INSERTPS dag node into a scalar load plus a
scalar_to_vector. This allows ISel to select a single INSERTPSrm instead of
two instructions (i.e. a vector load plus INSERTPSrr).

However, the existing target combine rule on INSERTPS nodes only works under
the assumption that ISel will always be able to match an INSERTPSrm. This is
not true in general at -O0, since the backend only allows folding a load into
the memory operand of an instruction if the optimization level is not
CodeGenOpt::None.

In the example below:

//
__m128 test(__m128 a, __m128 *b) {
  __m128 c = _mm_insert_ps(a, *b, 1 << 6);
  return c;
}
//

Before this patch, at -O0, the backend would have canonicalized the load to 'b'
into a scalar load plus scalar_to_vector. Later on, ISel would have selected an
INSERTPSrr leaving the insertps mask in an inconsistent state:

  movss 4(%rdi), %xmm1
  insertps  $64, %xmm1, %xmm0 # xmm0 = xmm1[1],xmm0[1,2,3].

With this patch, the backend avoids folding the vector load into the operand of
the INSERTPS. The new codegen at -O0 is:

  movaps (%rdi), %xmm1
  insertps  $64, %xmm1, %xmm0 # %xmm1[1],xmm0[1,2,3].

llvm-svn: 226277
2015-01-16 14:55:26 +00:00
..
AArch64 IR: Move MDLocation into place 2015-01-14 22:27:36 +00:00
ARM Revert r226242 - Revert Revert Don't create new comdats in CodeGen 2015-01-16 08:38:45 +00:00
CPP
Generic getMangledTypeStr: clarify how it mangles types, and add tests 2015-01-14 23:05:17 +00:00
Hexagon [Hexagon] Updating indexed load-extend patterns and changing test to new expected output. 2015-01-15 21:07:52 +00:00
Inputs IR: Move MDLocation into place 2015-01-14 22:27:36 +00:00
Mips [mips] Fix a typo in the compare patterns for MIPS32r6/MIPS64r6. 2015-01-15 15:41:03 +00:00
MSP430
NVPTX Check that the TLI callback enableAggressiveFMAFusion has the desired effect on FMA folding. 2015-01-14 15:36:28 +00:00
PowerPC [PowerPC] Adjust PatchPoints for ppc64le 2015-01-16 04:40:58 +00:00
R600 R600/SI: Add patterns for v_cvt_{flr|rpi}_i32_f32 2015-01-15 23:58:35 +00:00
SPARC Use the integrated assembler by default on SPARC. 2015-01-14 07:53:39 +00:00
SystemZ Use the integrated assembler as default on SystemZ 2015-01-13 19:45:16 +00:00
Thumb IR: Move MDLocation into place 2015-01-14 22:27:36 +00:00
Thumb2 [ARM] Fix a bug in constant island pass that was triggering an assertion. 2015-01-08 20:44:50 +00:00
X86 [X86][DAG] Disable target specific combine on INSERTPS dag nodes at -O0. 2015-01-16 14:55:26 +00:00
XCore IR: Move MDLocation into place 2015-01-14 22:27:36 +00:00