1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 13:11:39 +01:00

12 Commits

Author SHA1 Message Date
Alexander Timofeev
58f63f7a9c [AMDGPU] Preliminary patch for divergence driven instruction selection. Immediate selection predicate changed
Differential revision: https://reviews.llvm.org/D51734
Reviewers: rampitec

llvm-svn: 341928
2018-09-11 11:56:50 +00:00
Sanjay Patel
b6a12c439f [DAGCombiner] extend(ifpositive(X)) -> shift-right (not X)
This is almost the same as an existing IR canonicalization in instcombine, 
so I'm assuming this is a good early generic DAG combine too.

The motivation comes from reduced bit-hacking for select-of-constants in IR 
after rL331486. We want to restore that functionality in the DAG as noted in
the commit comments for that change and the llvm-dev discussion here:
http://lists.llvm.org/pipermail/llvm-dev/2018-July/124433.html

The PPC and AArch tests show that those targets are already doing something 
similar. x86 will be neutral in the minimal case and generally better when 
this pattern is extended with other ops as shown in the signbit-shift.ll tests.

Note the asymmetry: we don't include the (extend (ifneg X)) transform because 
it already exists in SimplifySelectCC(), and that is verified in the later 
unchanged tests in the signbit-shift.ll files. Without the 'not' op, the 
general transform to use a shift is always a win because that's a single 
instruction.

Alive proofs:
https://rise4fun.com/Alive/ysli

Name: if pos, get -1
  %c = icmp sgt i16 %x, -1
  %r = sext i1 %c to i16
  =>
  %n = xor i16 %x, -1
  %r = ashr i16 %n, 15

Name: if pos, get 1
  %c = icmp sgt i16 %x, -1
  %r = zext i1 %c to i16
  =>
  %n = xor i16 %x, -1
  %r = lshr i16 %n, 15

Differential Revision: https://reviews.llvm.org/D48970

llvm-svn: 337130
2018-07-15 16:27:07 +00:00
Yaxun Liu
512b9cf38e [AMDGPU] Switch to the new addr space mapping by default
This requires corresponding clang change.

Differential Revision: https://reviews.llvm.org/D40955

llvm-svn: 324101
2018-02-02 16:07:16 +00:00
Stanislav Mekhanoshin
4633c07b46 [AMDGPU] Preserve operand order in SIFoldOperands
SIFoldOperands can commute operands even if no folding was done.
This change is to preserve IR is no folding was done.

Differential Revision: https://reviews.llvm.org/D33802

llvm-svn: 304625
2017-06-03 00:41:52 +00:00
Stanislav Mekhanoshin
e23fa40f7f [AMDGPU] Allow SDWA in instructions with immediates and SGPRs
An encoding does not allow to use SDWA in an instruction with
scalar operands, either literals or SGPRs. That is however possible
to copy these operands into a VGPR first.

Several copies of the value are produced if multiple SDWA conversions
were done. To cleanup MachineLICM (to hoist copies out of loops),
MachineCSE (to remove duplicate copies) and SIFoldOperands (to replace
SGPR to VGPR copy with immediate copy right to the VGPR) runs are added
after the SDWA pass.

Differential Revision: https://reviews.llvm.org/D33583

llvm-svn: 304219
2017-05-30 16:49:24 +00:00
Matt Arsenault
dd9ab77318 AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.

Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).

llvm-svn: 298444
2017-03-21 21:39:51 +00:00
Matt Arsenault
d47230b13f AMDGPU: Always allocate emergency stack slot at offset 0
This allows us to ensure that 0 is never a valid pointer
to a user object, and ensures that the offset is always legal
without needing a register to access it. This comes at the cost
of usable offsets and wasted stack space.

llvm-svn: 295877
2017-02-22 21:05:25 +00:00
Matt Arsenault
c2c2a10170 AMDGPU: Fix handling of 16-bit immediates
Since 32-bit instructions with 32-bit input immediate behavior
are used to materialize 16-bit constants in 32-bit registers
for 16-bit instructions, determining the legality based
on the size is incorrect. Change operands to have the size
specified in the type.

Also adds a workaround for a disassembler bug that
produces an immediate MCOperand for an operand that
is supposed to be OPERAND_REGISTER.

The assembler appears to accept out of bounds immediates and
truncates them, but this seems to be an issue for 32-bit
already.

llvm-svn: 289306
2016-12-10 00:39:12 +00:00
Matt Arsenault
3a9a1ac61b AMDGPU: Use unsigned compare for eq/ne
For some reason there are both of these available, except
for scalar 64-bit compares which only has u64. I'm not sure
why there are both (I'm guessing it's for the one bit inputs we
don't use), but for consistency always using the
unsigned one.

llvm-svn: 282832
2016-09-30 01:50:20 +00:00
Matt Arsenault
90cfd1ccda AMDGPU: Support commuting a FrameIndex operand
llvm-svn: 281369
2016-09-13 19:03:12 +00:00
Matt Arsenault
e676b40286 AMDGPU: Remove some old intrinsic uses from tests
llvm-svn: 260493
2016-02-11 06:02:01 +00:00
Tom Stellard
3f1708598e R600 -> AMDGPU rename
llvm-svn: 239657
2015-06-13 03:28:10 +00:00