Matt Arsenault
630768ab44
AMDGPU: Add max-mix-insts subtarget feature
...
llvm-svn: 316553
2017-10-25 07:00:51 +00:00
Konstantin Zhuravlyov
c9bc3fd2a9
AMDGPU: Do not emit deprecated notes for code object v3
...
Differential Revision: https://reviews.llvm.org/D38749
llvm-svn: 315810
2017-10-14 15:59:07 +00:00
Matt Arsenault
4fa0c47019
AMDGPU: Fix incorrect selection of pseudo-branches
...
These should only be used if the machine structurizer is enabled.
llvm-svn: 315357
2017-10-10 20:22:07 +00:00
Matt Arsenault
13035c4903
AMDGPU: Remove global isGCN predicates
...
These are problematic because they apply to everything,
and can easily clobber whatever more specific predicate
you are trying to add to a function.
Currently instructions use SubtargetPredicate/PredicateControl
to apply this to patterns applied to an instruction definition,
but not to free standing Pats. Add a wrapper around Pat
so the special PredicateControls requirements can be appended
to the final predicate list like how Mips does it.
llvm-svn: 314742
2017-10-03 00:06:41 +00:00
Matt Arsenault
886adf43d0
AMDGPU: Fix typos
...
llvm-svn: 314715
2017-10-02 20:31:18 +00:00
Matt Arsenault
a0c03c6e92
AMDGPU: Start selecting v_mad_mix_f32
...
llvm-svn: 312732
2017-09-07 18:05:07 +00:00
Matt Arsenault
960a469e7e
AMDGPU: Add ds_{read|write}_addtid_b32 definitions
...
llvm-svn: 312349
2017-09-01 18:38:02 +00:00
Matt Arsenault
d48237c09b
AMDGPU: Add most d16 load/store instruction definitions
...
Doesn't include the tied operand necessary for the loads,
but is enough for the assembler to work.
llvm-svn: 312347
2017-09-01 18:36:06 +00:00
Konstantin Zhuravlyov
7d9fe6e6ce
AMDGPU: Fix gfx801 features
...
gfx801 has 1/2 rate F64, Fast F32 FMA
Differential Revision: https://reviews.llvm.org/D36981
llvm-svn: 311694
2017-08-24 20:03:07 +00:00
Dmitry Preobrazhensky
be2eb2d0a8
[AMDGPU][MC][GFX9] Added integer clamping support for VOP3 opcodes
...
See Bug 34152: https://bugs.llvm.org//show_bug.cgi?id=34152
Reviewers: SamWot, artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D36674
llvm-svn: 311006
2017-08-16 13:51:56 +00:00
Matt Arsenault
5dfc642fbd
AMDGPU: Cleanup subtarget features
...
Try to avoid mutually exclusive features. Don't use
a real default GPU, and use a fake "generic". The goal
is to make it easier to see which set of features are
incompatible between feature strings.
Most of the test changes are due to random scheduling changes
from not having a default fullspeed model.
llvm-svn: 310258
2017-08-07 14:58:04 +00:00
Matt Arsenault
6a3de519fd
AMDGPU: Fix typo in feature description
...
llvm-svn: 310217
2017-08-06 18:13:23 +00:00
Matt Arsenault
06dfe9929d
AMDGPU: Add instruction definitions for some scratch_* instructions
...
Omit atomics for now since they probably aren't useful.
llvm-svn: 308747
2017-07-21 15:36:16 +00:00
Matt Arsenault
4165e37f67
AMDGPU: Add encoding for carryless add/sub instructions
...
llvm-svn: 308639
2017-07-20 17:42:47 +00:00
Sam Kolton
48e96ee80f
[AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions
...
Summary:
1. Instruction V_CVT_U32_F32 allows an omod operand (see SIInstrInfo.td:1435), although this operand shouldn't be allowed here. This fix checks whether the SDWA pseudo instruction has an OMod operand and copies it only if it does.
2. There were several problems with the support of VOPC instructions in the SDWA peephole pass.
Reviewers: tstellar, arsenm, vpykhtin, airlied, kzhuravl
Subscribers: wdng, nhaehnle, yaxunl, dstuttard, tpr, sarnex, t-tye
Differential Revision: https://reviews.llvm.org/D34626
llvm-svn: 306413
2017-06-27 15:02:23 +00:00
Matt Arsenault
06d9b51c75
AMDGPU: Whitespace fixes
...
llvm-svn: 306265
2017-06-26 03:01:36 +00:00
Sam Kolton
076a1edc25
[AMDGPU] SDWA: add support for GFX9 in peephole pass
...
Summary:
Added support based on the merged SDWA pseudo instructions. The peephole pass now allows one scalar operand as well as the omod and clamp modifiers.
Added several subtarget features for GFX9 SDWA.
This diff also contains changes from D34026.
Depends D34026
Reviewers: vpykhtin, rampitec, arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D34241
llvm-svn: 305986
2017-06-22 06:26:41 +00:00
Matt Arsenault
8ecc22a003
AMDGPU: Start adding global_* instructions
...
llvm-svn: 305838
2017-06-20 19:54:14 +00:00
Wei Ding
26ab347e05
AMDGPU : Fix ISA Version Definitions.
...
Differential Revision: http://reviews.llvm.org/D28531
llvm-svn: 305137
2017-06-10 03:53:19 +00:00
Konstantin Zhuravlyov
55508a871a
AMDGPU: Make auto waitcnt before barrier a feature
...
Differential Revision: https://reviews.llvm.org/D33793
llvm-svn: 304571
2017-06-02 17:40:26 +00:00
Sam Kolton
82a8c72e68
[AMDGPU] SDWA: Add assembler support for GFX9
...
Summary:
Added separate pseudo and real instructions for GFX9 SDWA instructions.
Currently supported only in the assembler.
Depends D32493
Reviewers: vpykhtin, artem.tamazov
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D33132
llvm-svn: 303620
2017-05-23 10:08:55 +00:00
Matt Arsenault
d3375cc250
AMDGPU: Add new subtarget features for gfx9 flat instructions
...
Flat instructions gain an immediate offset, and 2 new
sets of segment specific flat instructions are added.
llvm-svn: 302729
2017-05-10 21:19:05 +00:00
Sam Kolton
587bdc31c1
[AMDGPU] DPP: add support for GFX9
...
Reviewers: artem.tamazov
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D32588
llvm-svn: 301551
2017-04-27 15:42:38 +00:00
Konstantin Zhuravlyov
11c939320c
AMDGPU/GFX9: Enable FastFMAF32
...
Differential Revision: https://reviews.llvm.org/D32363
llvm-svn: 301029
2017-04-21 19:57:53 +00:00
Marek Olsak
4bf1e53d20
AMDGPU: Always use VGPR indexing on GFX9
...
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr
Differential Revision: https://reviews.llvm.org/D31157
llvm-svn: 298396
2017-03-21 17:00:32 +00:00
Matt Arsenault
96b9e12990
AMDGPU: Add VOP3P instruction format
...
Add a few instructions that are not VOP3P but are related to packed operations.
Includes a hack with dummy operands for the benefit of the assembler.
llvm-svn: 296368
2017-02-27 18:49:11 +00:00
Matt Arsenault
73b8eb1cc6
AMDGPU: Redefine clamp node as clamp 0.0-1.0
...
Change implementation to use max instead of add.
min/max/med3 do not flush denormals regardless of the mode,
so it is OK to use it whether or not they are enabled.
Also allow using clamp with f16, and use knowledge
of dx10_clamp.
llvm-svn: 295788
2017-02-21 23:35:48 +00:00
Matt Arsenault
1f7b67f9b4
AMDGPU: Fix assembler subtarget predicate for gfx9
...
This was accepting GFX9 instructions on VI.
llvm-svn: 295557
2017-02-18 19:12:26 +00:00
Matt Arsenault
a207e31c14
AMDGPU: Merge initial gfx9 support
...
llvm-svn: 295554
2017-02-18 18:29:53 +00:00
Wei Ding
3609e1230f
AMDGPU : Add trap handler support.
...
Differential Revision: http://reviews.llvm.org/D26010
llvm-svn: 294692
2017-02-10 02:15:29 +00:00
Tom Stellard
f2ec17e0e6
Re-commit AMDGPU/GlobalISel: Add support for simple shaders
...
Fix build when global-isel is disabled and fix a warning.
Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.
Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm
Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris
Differential Revision: https://reviews.llvm.org/D26730
llvm-svn: 293551
2017-01-30 21:56:46 +00:00
Tom Stellard
d839aa304c
Revert "AMDGPU/GlobalISel: Add support for simple shaders"
...
This reverts commit r293503.
Revert while I investigate some of the buildbot failures.
llvm-svn: 293509
2017-01-30 17:42:41 +00:00
Tom Stellard
ca8f087f31
AMDGPU/GlobalISel: Add support for simple shaders
...
Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.
Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm
Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris
Differential Revision: https://reviews.llvm.org/D26730
llvm-svn: 293503
2017-01-30 17:09:15 +00:00
Matt Arsenault
9317a1de75
AMDGPU: Enable FeatureFlatForGlobal on Volcanic Islands
...
Accomplishes what r292982 was supposed to, which ended up
only really making the necessary test changes.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@miletic.net>
llvm-svn: 293310
2017-01-27 17:42:26 +00:00
Matt Arsenault
81a9bfe915
Enable FeatureFlatForGlobal on Volcanic Islands
...
This switches to the workaround that HSA defaults to
for the mesa path.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@miletic.net>
llvm-svn: 292982
2017-01-24 22:02:15 +00:00
Matt Arsenault
bd33194651
AMDGPU: Combine fp16/fp64 subtarget features
...
The same control register controls both, and they are set to
the same defaults. Keep the old names around as aliases.
llvm-svn: 292837
2017-01-23 22:31:03 +00:00
Sam Kolton
1310b4c7b3
[AMDGPU] Add subtarget features for SDWA/DPP
...
Reviewers: vpykhtin, artem.tamazov, tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D28900
llvm-svn: 292596
2017-01-20 10:01:25 +00:00
Marek Olsak
68f46f5c8e
AMDGPU/SI: Remove XNACK feature from CI
...
Summary: CI doesn't have XNACK.
Reviewers: tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27175
llvm-svn: 289263
2016-12-09 19:49:58 +00:00
Marek Olsak
191936bbf2
AMDGPU/SI: Don't reserve XNACK when it's disabled
...
Summary:
This frees 2 additional scalar registers.
These are results from all of my 3 patches combined:
Polaris:
Spilled SGPRs: 2231 -> 1517 (-32.00 %)
Tonga:
Spilled SGPRs: 3829 -> 2608 (-31.89 %)
Spilled VGPRs: 100 -> 84 (-16.00 %)
Tonga even spills SGPRs via VGPRs to scratch. That's a compute shader
limited to 64 VGPRs.
Reviewers: tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27151
llvm-svn: 289262
2016-12-09 19:49:54 +00:00
Konstantin Zhuravlyov
a5d550fe9d
[AMDGPU] Add f16 support (VI+)
...
Differential Revision: https://reviews.llvm.org/D25975
llvm-svn: 286753
2016-11-13 07:01:11 +00:00
Tom Stellard
fca8e2011d
AMDGPU: Add VI i16 support
...
Patch By: Wei Ding
Differential Revision: https://reviews.llvm.org/D18049
llvm-svn: 286464
2016-11-10 16:02:37 +00:00
Tom Stellard
8e72cd5271
Revert "AMDGPU: Add VI i16 support"
...
This reverts commit r285939 and r285948. These broke some conformance tests.
llvm-svn: 285995
2016-11-04 13:06:34 +00:00
Tom Stellard
1eb5b9fee5
AMDGPU: Add VI i16 support
...
Patch By: Wei Ding
Differential Revision: https://reviews.llvm.org/D18049
llvm-svn: 285939
2016-11-03 17:13:50 +00:00
Matt Arsenault
bb971d2e8b
AMDGPU: Whitespace fixes
...
llvm-svn: 285659
2016-11-01 00:55:14 +00:00
Matt Arsenault
3ee7b5cf1b
AMDGPU: Use 1/2pi inline imm on VI
...
I'm guessing at how it is supposed to be printed
llvm-svn: 285490
2016-10-29 04:05:06 +00:00
Matt Arsenault
a0090a0113
AMDGPU: Add definitions for scalar store instructions
...
Also add glc bit to the scalar loads since they exist on VI
and change the caching behavior.
This currently has an assembler bug where the glc bit is incorrectly
accepted on SI/CI which do not have it.
llvm-svn: 285463
2016-10-28 21:55:15 +00:00
Yaxun Liu
f73bd0c195
AMDGPU: Refactor processor definition to use ISA version features
...
Add missing ISA versions 7.0.2/8.0.4/8.1.0 to backend.
Refactor processor definition to use ISA version features.
Fixed ISA version for stoney.
Based on Laurent Morichetti's patch.
Differential Revision: https://reviews.llvm.org/D25919
llvm-svn: 285210
2016-10-26 16:37:56 +00:00
Tom Stellard
19c9255423
AMDGPU/SI: Don't allow unaligned scratch access
...
Summary: The hardware doesn't support this.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D25523
llvm-svn: 284257
2016-10-14 18:10:39 +00:00
Matt Arsenault
cb0c02c980
AMDGPU: Add instruction definitions for VGPR indexing
...
VI added a second method of indexing into VGPRs
besides using v_movrel*
llvm-svn: 284027
2016-10-12 18:00:51 +00:00
Changpeng Fang
1632d75b59
AMDGPU/SI: Update ISA version numbers for Tonga and Polaris10/11.
...
Differential Revision: http://reviews.llvm.org/D25454
Reviewers: tstellarAMD
llvm-svn: 283893
2016-10-11 16:00:47 +00:00