llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00

Author	SHA1	Message	Date
Chandler Carruth	ae65e281f3	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Neil Henning	c6adc5adda	[AMDGPU] Extend the SI Load/Store optimizer to combine more things. I've extended the load/store optimizer to be able to produce dwordx3 loads and stores, This change allows many more load/stores to be combined, and results in much more optimal code for our hardware. Differential Revision: https://reviews.llvm.org/D54042 llvm-svn: 348937	2018-12-12 16:15:21 +00:00
Matt Arsenault	a3f67ed4ba	AMDGPU: Remove llvm.SI.buffer.load.dword llvm-svn: 348616	2018-12-07 17:46:20 +00:00
Konstantin Zhuravlyov	d3b7f25801	AMDGPU/NFC: Split MUBUF_Pseudo_Atomics into RTN/NO_RTN multiclasses llvm-svn: 346357	2018-11-07 21:21:32 +00:00
Nicolai Haehnle	05511dfb1b	AMDGPU: Remove dead TableGen code Summary: Change-Id: Ic1f2c1d0cf9e90a0baa9fc6bacd0d3c386069fb0 Reviewers: tpr Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D53318 Change-Id: Ib4d143c898801e5cf6cb9999a495d62c91ae77fb llvm-svn: 344691	2018-10-17 12:14:26 +00:00
Matt Arsenault	14734dd899	AMDGPU: Remove remnants of old address space mapping llvm-svn: 341165	2018-08-31 05:49:54 +00:00
Tim Renouf	5d475b8d5b	[AMDGPU] Allow int types for MUBUF vdata Summary: Previously the new llvm.amdgcn.raw/struct.buffer.load/store intrinsics only allowed float types for the data to be loaded or stored, which sometimes meant the frontend needed to generate a bitcast. In this, the new intrinsics copied the old buffer intrinsics. This commit extends the new intrinsics to allow int types as well. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D50315 Change-Id: I8202af2d036455553681dcbb3d7d32ae273f8f85 llvm-svn: 340270	2018-08-21 11:08:12 +00:00
Tim Renouf	524551cd05	[AMDGPU] New buffer intrinsics Summary: This commit adds new intrinsics llvm.amdgcn.raw.buffer.load llvm.amdgcn.raw.buffer.load.format llvm.amdgcn.raw.buffer.load.format.d16 llvm.amdgcn.struct.buffer.load llvm.amdgcn.struct.buffer.load.format llvm.amdgcn.struct.buffer.load.format.d16 llvm.amdgcn.raw.buffer.store llvm.amdgcn.raw.buffer.store.format llvm.amdgcn.raw.buffer.store.format.d16 llvm.amdgcn.struct.buffer.store llvm.amdgcn.struct.buffer.store.format llvm.amdgcn.struct.buffer.store.format.d16 llvm.amdgcn.raw.buffer.atomic.* llvm.amdgcn.struct.buffer.atomic.* with the following changes from the llvm.amdgcn.buffer.* intrinsics: * there are separate raw and struct versions: raw does not have an index arg and sets idxen=0 in the instruction, and struct always sets idxen=1 in the instruction even if the index is 0, to allow for the fact that gfx9 does bounds checking differently depending on whether idxen is set; * there is a combined cachepolicy arg (glc+slc) * there are now only two offset args: one for the offset that is included in bounds checking and swizzling, to be split between the instruction's voffset and immoffset fields, and one for the offset that is excluded from bounds checking and swizzling, to go into the instruction's soffset field. The AMDISD::BUFFER_* SD nodes always have an index operand, all three offset operands, combined cachepolicy operand, and an extra idxen operand. The obsolescent llvm.amdgcn.buffer.* intrinsics continue to work. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D50306 Change-Id: If897ea7dc34fcbf4d5496e98cc99a934f62fc205 llvm-svn: 340269	2018-08-21 11:07:10 +00:00
Tim Renouf	9efa53b686	[AMDGPU] New tbuffer intrinsics Summary: This commit adds new intrinsics llvm.amdgcn.raw.tbuffer.load llvm.amdgcn.struct.tbuffer.load llvm.amdgcn.raw.tbuffer.store llvm.amdgcn.struct.tbuffer.store with the following changes from the llvm.amdgcn.tbuffer.* intrinsics: * there are separate raw and struct versions: raw does not have an index arg and sets idxen=0 in the instruction, and struct always sets idxen=1 in the instruction even if the index is 0, to allow for the fact that gfx9 does bounds checking differently depending on whether idxen is set; * there is a combined format arg (dfmt+nfmt) * there is a combined cachepolicy arg (glc+slc) * there are now only two offset args: one for the offset that is included in bounds checking and swizzling, to be split between the instruction's voffset and immoffset fields, and one for the offset that is excluded from bounds checking and swizzling, to go into the instruction's soffset field. The AMDISD::TBUFFER_* SD nodes always have an index operand, all three offset operands, combined format operand, combined cachepolicy operand, and an extra idxen operand. The tbuffer pseudo- and real instructions now also have a combined format operand. The obsolescent llvm.amdgcn.tbuffer.* and llvm.SI.tbuffer.store intrinsics continue to work. V2: Separate raw and struct intrinsics. V3: Moved extract_glc and extract_slc defs to a more sensible place. V4: Rebased on D49995. V5: Only two separate offset args instead of three. V6: Pseudo- and real instructions have joint format operand. V7: Restored optionality of dfmt and nfmt in assembler. V8: Addressed minor review comments. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D49026 Change-Id: If22ad77e349fac3a5d2f72dda53c010377d470d4 llvm-svn: 340268	2018-08-21 11:06:05 +00:00
Nicolai Haehnle	a859c18433	AMDGPU: Turn D16 for MIMG instructions into a regular operand Summary: This allows us to reduce the number of different machine instruction opcodes, which reduces the table sizes and helps flatten the TableGen multiclass hierarchies. We can do this because for each hardware MIMG opcode, we have a full set of IMAGE_xxx_Vn_Vm machine instructions for all required sizes of vdata and vaddr registers. Instead of having separate D16 machine instructions, a packed D16 instructions loading e.g. 4 components can simply use the same V2 opcode variant that non-D16 instructions use. We still require a TSFlag for D16 buffer instructions, because the D16-ness of buffer instructions is part of the opcode. Renaming the flag should help avoid future confusion. The one non-obvious code change is that for gather4 instructions, the disassembler can no longer automatically decide whether to use a V2 or a V4 variant. The existing logic which choose the correct variant for other MIMG instruction is extended to cover gather4 as well. As a bonus, some of the assembler error messages are now more helpful (e.g., complaining about a wrong data size instead of a non-existing instruction). While we're at it, delete a whole bunch of dead legacy TableGen code. Change-Id: I89b02c2841c06f95e662541433e597f5d4553978 Reviewers: arsenm, rampitec, kzhuravl, artem.tamazov, dp, rtaylor Subscribers: wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D47434 llvm-svn: 335222	2018-06-21 13:36:01 +00:00
Matt Arsenault	30c299fdc4	AMDGPU: Make v4i16/v4f16 legal Some image loads return these, and it's awkward working around them not being legal. llvm-svn: 334835	2018-06-15 15:15:46 +00:00
Dmitry Preobrazhensky	224cb6c72d	[AMDGPU][MC][GFX8][GFX9] Allow LDS direct reads for BUFFER_LOAD_DWORDX2/X3/X4 See bug 37653: https://bugs.llvm.org/show_bug.cgi?id=37653 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D47885 llvm-svn: 334609	2018-06-13 15:32:46 +00:00
Nicolai Haehnle	5a12112c99	AMDGPU: Make various NamedOperands upper case Summary: Avoid name clashes with the corresponding bit fields in the instruction encoding. Change-Id: Id1644e703e976e78f7af93788d9f44cb48c3251f Reviewers: arsenm, rampitec, kzhuravl Subscribers: wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D47433 llvm-svn: 333905	2018-06-04 14:45:20 +00:00
Nicolai Haehnle	9a21ccd3ad	TableGen: Streamline the semantics of NAME Summary: The new rules are straightforward. The main rules to keep in mind are: 1. NAME is an implicit template argument of class and multiclass, and will be substituted by the name of the instantiating def/defm. 2. The name of a def/defm in a multiclass must contain a reference to NAME. If such a reference is not present, it is automatically prepended. And for some additional subtleties, consider these: 3. defm with no name generates a unique name but has no special behavior otherwise. 4. def with no name generates an anonymous record, whose name is unique but undefined. In particular, the name won't contain a reference to NAME. Keeping rules 1&2 in mind should allow a predictable behavior of name resolution that is simple to follow. The old "rules" were rather surprising: sometimes (but not always), NAME would correspond to the name of the toplevel defm. They were also plain bonkers when you pushed them to their limits, as the old version of the TableGen test case shows. Having NAME correspond to the name of the toplevel defm introduces "spooky action at a distance" and breaks composability: refactoring the upper layers of a hierarchy of nested multiclass instantiations can cause unexpected breakage by changing the value of NAME at a lower level of the hierarchy. The new rules don't suffer from this problem. Some existing .td files have to be adjusted because they ended up depending on the details of the old implementation. Change-Id: I694095231565b30f563e6fd0417b41ee01a12589 Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm, javed.absar Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D47430 llvm-svn: 333900	2018-06-04 14:26:05 +00:00
Matt Arsenault	ee27f88e14	AMDGPU: Make v2i16/v2f16 legal on VI This usually results in better code. Fixes using inline asm with short2, and also fixes having a different ABI for function parameters between VI and gfx9. Partially cleans up the mess used for lowering of the d16 operations. Making v4f16 legal will help clean this up more, but this requires additional work. llvm-svn: 332953	2018-05-22 06:32:10 +00:00
Konstantin Zhuravlyov	b4c283b570	AMDGPU: Add D16 instructions preserve unused bits feature - Predicate D16 patterns on this new feature - Added this new feature to gfx900/2/4 Differential Revision: https://reviews.llvm.org/D46366 llvm-svn: 331551	2018-05-04 20:06:57 +00:00
Dmitry Preobrazhensky	c262a145db	[AMDGPU][MC] Enabled instruction TBUFFER_LOAD_FORMAT_XYZ for SI/CI See bug 36958: https://bugs.llvm.org/show_bug.cgi?id=36958 Differential Revision: https://reviews.llvm.org/D45099 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329197	2018-04-04 13:54:55 +00:00
Dmitry Preobrazhensky	336c64c5d0	[AMDGPU][MC][GFX9] Added buffer_*_format_d16_hi_x See bug 36835: https://bugs.llvm.org/show_bug.cgi?id=36835 Differential Revision: https://reviews.llvm.org/D44825 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328707	2018-03-28 14:53:13 +00:00
Dmitry Preobrazhensky	d45a36011a	[AMDGPU][MC][GFX8] Added BUFFER_STORE_LDS_DWORD Instruction See bug 36558: https://bugs.llvm.org/show_bug.cgi?id=36558 Differential Revision: https://reviews.llvm.org/D43950 Reviewers: artem.tamazov, arsenm llvm-svn: 327299	2018-03-12 17:29:24 +00:00
Dmitry Preobrazhensky	a6d039d869	[AMDGPU][MC] Added lds support for MUBUF instructions See bug 28234: https://bugs.llvm.org/show_bug.cgi?id=28234 Differential Revision: https://reviews.llvm.org/D43472 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 325676	2018-02-21 13:13:48 +00:00
Changpeng Fang	d15f8bd42f	AMDGPU/SI: Adjust the encoding family for D16 buffer instructions when the target has UnpackedD16VMem feature. Reviewers: Matt and Brian Differential Revision: https://reviews.llvm.org/D42548 llvm-svn: 323988	2018-02-01 18:41:33 +00:00
Changpeng Fang	1f9a550e2b	AMDGPU/SI: Fix typos in d16 support patch the buffer intrinsics. llvm-svn: 322906	2018-01-18 22:57:57 +00:00
Changpeng Fang	14b06e6060	AMDGPU/SI: Add d16 support for buffer intrinsics. Differential Revision: https://reviews.llvm.org/D38906 Reviewers: Matt and Brian. llvm-svn: 322402	2018-01-12 21:12:19 +00:00
Matt Arsenault	eaea634604	AMDGPU: Select d16 loads into low component of register llvm-svn: 318005	2017-11-13 00:22:09 +00:00
Marek Olsak	3170d60e41	AMDGPU: Lower buffer store and atomic intrinsics manually Summary: Without this, SIMemoryLegalizer inserts s_waitcnt vmcnt(0) before every buffer store and atomic instruction. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D39060 llvm-svn: 317754	2017-11-09 01:52:48 +00:00
Marek Olsak	5cb35b4bac	AMDGPU: Select s_buffer_load_dword with a non-constant SGPR offset Summary: Apps that benefit: - alien isolation - bioshock infinite - civilization: beyond earth - company of heroes 2 - dirt showdown - dota 2 - F1 2015 - grid autosport - hitman - legend of grimrock - serious sam 3: bfe - shadow warrior - talos principle - total war: warhammer - UE4 demos: effects cave, elemental, sun temple Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D38914 llvm-svn: 317038	2017-10-31 21:06:42 +00:00
Matt Arsenault	13035c4903	AMDGPU: Remove global isGCN predicates These are problematic because they apply to everything, and can easily clobber whatever more specific predicate you are trying to add to a function. Currently instructions use SubtargetPredicate/PredicateControl to apply this to patterns applied to an instruction definition, but not to free standing Pats. Add a wrapper around Pat so the special PredicateControls requirements can be appended to the final predicate list like how Mips does it. llvm-svn: 314742	2017-10-03 00:06:41 +00:00
Matt Arsenault	8057a50cc4	AMDGPU: Match load d16 hi instructions Also starts selecting global loads for constant address in some cases. Some end up selecting to mubuf still, which requires investigation. We still get sub-optimal regalloc and extra waitcnts inserted due to not really tracking the liveness of the separate register halves. llvm-svn: 313716	2017-09-20 05:01:53 +00:00
Matt Arsenault	0000361c39	AMDGPU: Cleanup load/store PatFrags Try to use a consistent naming scheme. llvm-svn: 313713	2017-09-20 03:43:35 +00:00
Matt Arsenault	23103260e3	AMDGPU: Match store d16_hi instructions llvm-svn: 313712	2017-09-20 03:20:09 +00:00
Matt Arsenault	40dcb3af13	AMDGPU: Don't legalize i16 extloads to i32 with legal i16 Keeping non-i16 extloads makes it easier to match some new gfx9 load instructions. llvm-svn: 312699	2017-09-07 05:37:34 +00:00
Matt Arsenault	d48237c09b	AMDGPU: Add most d16 load/store instruction definitions Doesn't include the tied operand necessary for the loads, but is enough for the assembler to work. llvm-svn: 312347	2017-09-01 18:36:06 +00:00
Konstantin Zhuravlyov	d6234de26c	AMDGPU: Implement memory model llvm-svn: 308781	2017-07-21 21:19:23 +00:00
Konstantin Zhuravlyov	c8a1b14e0b	AMDGPU: Introduce maybeAtomic instruction flag Testing is in the follow up change llvm-svn: 308779	2017-07-21 21:05:45 +00:00
Matt Arsenault	c6db825207	AMDGPU: Rename _RTN atomic instructions Move the _RTN to the end of the name. It reads better if the other addressing mode components line up with the non-RTN version. It is also more convenient to define saddr variants of FLAT atomics to have the RTN last, and it is good to have a consistent naming scheme. llvm-svn: 308674	2017-07-20 21:06:04 +00:00
David Stuttard	a19aa0bc03	[AMDGPU] Add intrinsics for tbuffer load and store Intrinsic already existed for llvm.SI.tbuffer.store Needed tbuffer.load and also re-implementing the intrinsic as llvm.amdgcn.tbuffer.* Added CodeGen tests for the 2 new variants added. Left the original llvm.SI.tbuffer.store implementation to avoid issues with existing code Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, tpr Differential Revision: https://reviews.llvm.org/D30687 llvm-svn: 306031	2017-06-22 16:29:22 +00:00
Matt Arsenault	f3cd945997	AMDGPU: Change mubuf soffset register when SP relative Check the MachinePointerInfo for whether the access is supposed to be relative to the stack pointer. No tests because this is used in later commits implementing calls. llvm-svn: 303301	2017-05-17 21:02:58 +00:00
Matt Arsenault	b7fb34ab2b	AMDGPU: Select scratch mubuf offsets when pointer is a constant In call sequence setups, there may not be a frame index base and the pointer is a constant offset from the frame pointer / scratch wave offset register. llvm-svn: 301230	2017-04-24 19:40:59 +00:00
Matt Arsenault	7c5896b3a9	AMDGPU: Remove llvm.SI.vs.load.input llvm-svn: 299391	2017-04-03 21:45:13 +00:00
Yaxun Liu	da52f0e643	[AMDGPU] Get address space mapping by target triple environment As we introduced target triple environment amdgiz and amdgizcl, the address space values are no longer enums. We have to decide the value by target triple. The basic idea is to use struct AMDGPUAS to represent address space values. For address space values which are not depend on target triple, use static const members, so that they don't occupy extra memory space and is equivalent to a compile time constant. Since the struct is lightweight and cheap, it can be created on the fly at the point of usage. Or it can be added as member to a pass and created at the beginning of the run* function. Differential Revision: https://reviews.llvm.org/D31284 llvm-svn: 298846	2017-03-27 14:04:01 +00:00
Tom Stellard	2c0dd4ec69	AMDGPU/SI: Add a MachineMemOperand when lowering llvm.amdgcn.buffer.load.* Reviewers: arsenm, nhaehnle, mareko Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27834 llvm-svn: 290184	2016-12-20 17:19:44 +00:00
Tom Stellard	fca8e2011d	AMDGPU: Add VI i16 support Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 286464	2016-11-10 16:02:37 +00:00
Tom Stellard	8e72cd5271	Revert "AMDGPU: Add VI i16 support" This reverts commit r285939 and r285948. These broke some conformance tests. llvm-svn: 285995	2016-11-04 13:06:34 +00:00
Tom Stellard	1eb5b9fee5	AMDGPU: Add VI i16 support Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 285939	2016-11-03 17:13:50 +00:00
Matt Arsenault	bb971d2e8b	AMDGPU: Whitespace fixes llvm-svn: 285659	2016-11-01 00:55:14 +00:00
Matt Arsenault	342b4f2c0e	AMDGPU: Rename glc operand type While trying to add the glc bit to SMEM instructions on VI with the new refactoring I ran into some kind of shadowing problem for the glc operand when using the pseudoinstruction as a multiclass parameter. Everywhere that currently uses it defines the operand to have the same name as its type, i.e. glc:$glc which works. For some reason now it conflicts, and its up evaluating to the wrong thing. For the real encoding classes, let Inst{16} = !if(ps.has_glc, glc, ?); was not being evaluated and still visible in the Inst initializer in the expanded td file. In other cases I got a a different error about an illegal operand where this was using { 0 } initializer from the bits<1> glc initializer instead of evaluating it as false in the if. For consistency all of the operand types should probably be captialized to avoid conflicting with the variable names unless somebody has a better idea of how to fix this. llvm-svn: 285462	2016-10-28 21:55:08 +00:00
Tom Stellard	6924e02442	AMDGPU/SI: Handle hazard with > 8 byte VMEM stores Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25577 llvm-svn: 285359	2016-10-27 23:05:31 +00:00
Artem Tamazov	d87d4169f2	[AMDGPU][mc] Add support for buffer_load_dwordx3, buffer_store_dwordx3. Partially fixes Bug 28232. Lit tests added. Differential Revision: https://reviews.llvm.org/D25367 llvm-svn: 283567	2016-10-07 15:53:16 +00:00
Peter Collingbourne	a067ba4392	Target: Remove unused patterns and transforms. NFC. llvm-svn: 283515	2016-10-07 00:30:49 +00:00
Matt Arsenault	8ed36a9e0a	AMDGPU: Partially fix reported code size for some instructions These ones need to have the size on the pseudo instruction set for getInstSizeInBytes to work correctly. These also have a statically known size. llvm-svn: 283437	2016-10-06 10:13:23 +00:00

1 2

52 Commits