Use scalar BFE with constant shift and offset when possible.
This is complicated by the fact that the scalar version packs
the two operands of the vector version into one.
llvm-svn: 206558
Having i128 as a legal type complicates the legalization phase. v4i32
is already a legal type, so we will use that instead.
This fixes several piglit tests.
llvm-svn: 206500
Print in decimal for inline immediates, and hex otherwise. Use hex
always for offsets in addressing offsets.
This approximately matches what the shader compiler does.
llvm-svn: 206335
The TargetLowering::expandMUL() helper contains lowering code extracted
from the DAGTypeLegalizer and allows the SelectionDAGLegalizer to expand more
ISD::MUL patterns without having to use a library call.
llvm-svn: 206037
Moving these patterns from TableGen files to PerformDAGCombine()
should allow us to generate better code by eliminating unnecessary
shifts and extensions earlier.
This also fixes a bug where the MAD pattern was calling
SimplifyDemandedBits with a 24-bit mask on the first operand
even when the full pattern wasn't being matched. This occasionally
resulted in some instructions being incorrectly deleted from the
program.
v2:
- Fix bug with 64-bit mul
llvm-svn: 205731
Acording to AMD documentation, the correct opcode for
BFE_INT is 0x5, not 0x4
Fixes Arithm/Absdiff.Mat/3 OpenCV test
Patch by: Bruno Jiménez
llvm-svn: 205562
Try to match scalar and first like the other instructions.
Expand 64-bit ands to a pair of 32-bit ands since that is not
available on the VALU.
llvm-svn: 204660
It isn't actually used now, and probably never will be, plus it makes
tests less annoying. I also think SC prints GDS instructions as a
separate instruction name.
llvm-svn: 204270
The type of the immediates should not matter as long as the encoding is
equivalent to the encoding of one of the legal inline constants.
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 204056
This instructions writes to an 32-bit SGPR. This change required adding
the 32-bit VCC_LO and VCC_HI registers, because the full VCC register
is 64 bits.
This fixes verifier errors on several of the indirect addressing piglit
tests.
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 204055
LDS instructions are pseudo instructions which model
the OQAP defs and uses within a single instruction.
This fixes a hang in the opencv MedianFilter tests.
llvm-svn: 203818
These are sometimes created by the shrink to boolean optimization in the
globalopt pass.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 203280