1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 13:02:52 +02:00
Commit Graph

22 Commits

Author SHA1 Message Date
Matt Arsenault
cd69621a21 AMDGPU: More bits of frame index are known to be zero
The maximum private allocation for the whole GPU is 4G,
so the maximum possible index for a single workitem is the
maximum size divided by the smallest granularity for a dispatch.

This increases the number of known zero high bits, which
enables more offset folding. The maximum private size per
workitem with this is 128M but may be smaller still.

llvm-svn: 262153
2016-02-27 20:26:57 +00:00
Matt Arsenault
6ecfce9ba2 AMDGPU: Split vi-insts subtarget feature
This will be more useful for marking builtins acceptable for which
subtargets.

llvm-svn: 262121
2016-02-27 08:53:55 +00:00
Matt Arsenault
1bcd37150d AMDGPU: Implement readcyclecounter
This matches the behavior of the HSAIL clock instruction.
s_realmemtime is used if the subtarget supports it, and falls
back to s_memtime if not.

Also introduces new intrinsics for each of s_memtime / s_memrealtime.

llvm-svn: 262119
2016-02-27 08:53:46 +00:00
Matt Arsenault
628b2818b6 AMDGPU: Set element_size in private resource descriptor
Introduce a subtarget feature for this, and leave the default with
the current behavior which assumes up to 16-byte loads/stores can
be used. The field also seems to have the ability to be set to 2 bytes,
but I'm not sure what that would be used for.

llvm-svn: 260651
2016-02-12 02:40:47 +00:00
Matt Arsenault
28a667546a AMDGPU: Match some med3 patterns
llvm-svn: 259089
2016-01-28 20:53:42 +00:00
Marek Olsak
0cb3416583 AMDGPU/SI: Stoney has only 16 LDS banks
Summary:
This is a candidate for stable, along with all patches that add the "stoney"
processor.

Reviewers: tstellarAMD

Subscribers: arsenm

Differential Revision: http://reviews.llvm.org/D16485

llvm-svn: 258922
2016-01-27 11:19:45 +00:00
Matt Arsenault
f5c703425b AMDGPU: Tidy minor td file issues
Make comments and indentation more consistent.

Rearrange a few things to be in a more consistent order,
such as organizing subtarget features from those describing
an actual device property, and those used as options.

llvm-svn: 258789
2016-01-26 04:49:22 +00:00
Matt Arsenault
c3ec12b749 AMDGPU: Remove Feature64BitPtr
This is a leftover from AMDIL that doesn't do anything
and doesn't belong here.

llvm-svn: 258606
2016-01-23 05:32:14 +00:00
Tom Stellard
a693ddcf2d AMDGPU/SI: Pass whether to use the SI scheduler via Target Attribute
Summary:
Currently the SI scheduler can be selected via command line option,
but it turned out it would be better if it was selectable via a Target Attribute.

This patch adds "si-scheduler" attribute to the backend.

Reviewers: tstellarAMD, echristo

Subscribers: echristo, arsenm

Differential Revision: http://reviews.llvm.org/D16192

llvm-svn: 258386
2016-01-21 04:28:34 +00:00
Matt Arsenault
1478312f09 AMDGPU: Add subtarget feature for instruction rates
llvm-svn: 258085
2016-01-18 21:13:50 +00:00
Changpeng Fang
a20f9886e2 AMDGPU/SI: Update ISA version for FIJI
llvm-svn: 257666
2016-01-13 20:39:25 +00:00
Nicolai Haehnle
7d4706e2a1 AMDGPU: add +xnack feature
Summary:
Enabling this feature will account for the two SGPRs used by the hardware
to store the XNACK_MASK physically.

The hardware only requires this reservation when the XNACK feature is
explicitly enabled. At some point, HSA will probably want to do that, but
it does increase SGPR register pressure, so leave it disabled by default
for now (but do add a small test).

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15869

llvm-svn: 256794
2016-01-04 23:35:53 +00:00
Tom Stellard
9454729563 AMDGPU/SI: Fix encoding of flat instructions on VI
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15735

llvm-svn: 256360
2015-12-24 03:18:18 +00:00
Changpeng Fang
8727650a31 AMDGPU/SI: Use flat for global load/store when targeting HSA
Summary:
  For some reason doing executing an MUBUF instruction with the addr64
  bit set and a zero base pointer in the resource descriptor causes
  the memory operation to be dropped when the shader is executed using
  the HSA runtime.

  This kind of MUBUF instruction is commonly used when the pointer is
  stored in VGPRs.  The base pointer field in the resource descriptor
  is set to zero and and the pointer is stored in the vaddr field.

  This patch resolves the issue by only using flat instructions for
  global memory operations when targeting HSA. This is an overly
  conservative fix as all other configurations of MUBUF instructions
  appear to work.

  NOTE: re-commit by fixing a failure in Codegen/AMDGPU/llvm.dbg.value.ll

Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15543

llvm-svn: 256282
2015-12-22 20:55:23 +00:00
Rafael Espindola
6880ff7f5d Revert "AMDGPU/SI: Use flat for global load/store when targeting HSA"
This reverts commit r256273.

It broke CodeGen/AMDGPU/llvm.dbg.value.ll

llvm-svn: 256275
2015-12-22 19:46:44 +00:00
Changpeng Fang
36fd3caea4 AMDGPU/SI: Use flat for global load/store when targeting HSA
Summary:
  For some reason doing executing an MUBUF instruction with the addr64
  bit set and a zero base pointer in the resource descriptor causes
  the memory operation to be dropped when the shader is executed using
  the HSA runtime.

  This kind of MUBUF instruction is commonly used when the pointer is
  stored in VGPRs.  The base pointer field in the resource descriptor
  is set to zero and and the pointer is stored in the vaddr field.

  This patch resolves the issue by only using flat instructions for
  global memory operations when targeting HSA. This is an overly
  conservative fix as all other configurations of MUBUF instructions
  appear to work.

Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15543

llvm-svn: 256273
2015-12-22 19:32:28 +00:00
Tom Stellard
718e3450df AMDPGU/SI: Use AssertZext node to mask high bit for scratch offsets
Summary:
We can safely assume that the high bit of scratch offsets will never
be set, because this would require at least 128 GB of GPU memory.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11225

llvm-svn: 242433
2015-07-16 19:40:07 +00:00
Matt Arsenault
36294af135 AMDGPU/SI: Add debugging subtarget feature for DS offsets
We don't have a good way to detect most situations where
DS offsets are usable on SI, so add an option to force using
them even if unsafe for debugging performance problems.

llvm-svn: 241462
2015-07-06 16:01:58 +00:00
Tom Stellard
daced4c4cc AMDGPU/SI: Add hsa code object directives
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10757

llvm-svn: 240831
2015-06-26 21:15:07 +00:00
Tom Stellard
3f1708598e R600 -> AMDGPU rename
llvm-svn: 239657
2015-06-13 03:28:10 +00:00
Tom Stellard
39f7e52397 Revert "AMDGPU: Add core backend files for R600/SI codegen v6"
This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea.

llvm-svn: 160303
2012-07-16 18:19:53 +00:00
Tom Stellard
9f326179fc AMDGPU: Add core backend files for R600/SI codegen v6
llvm-svn: 160270
2012-07-16 14:17:08 +00:00