mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 05:23:45 +02:00
llvm-mirror/test/CodeGen/AMDGPU
Nirav Dave 075ae0197d In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Retrying after fixing overly aggressive load-store forwarding optimization.

Simplify Consecutive Merge Store Candidate Search

Now that address aliasing is much less conservative, push through a
simplified store-merging search that only checks for parallel stores
through the chain subgraph. This is cleaner, as it separates
non-interfering loads/stores from the store-merging logic.

When merging stores, search up the chain through a single load, and
find all possible stores by looking down through a load and a
TokenFactor to all stores visited. This improves the quality of the
output SelectionDAG and generally the output CodeGen (with some
exceptions).
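
As a rough illustration of the search described above, here is a minimal
sketch on a toy chain graph. It is not the DAGCombiner code: Kind, Node,
collectStoresBelow and storeMergeCandidates are hypothetical names, nodes
are plain vector indices, and the downward walk is simplified to pass
through any number of loads and TokenFactors.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical toy chain graph; these are not LLVM's SelectionDAG classes.
enum class Kind { Entry, Load, Store, TokenFactor };

struct Node {
  Kind kind;
  std::vector<int> chains;  // indices of the nodes this node is chained to
};

// Collect every store chained to `from`, directly or through loads and
// TokenFactors. Stores are recorded but never walked through, so only
// stores that are parallel to each other are found.
static void collectStoresBelow(const std::vector<Node> &g, int from,
                               std::set<int> &visited, std::set<int> &stores) {
  for (std::size_t i = 0; i < g.size(); ++i) {
    bool usesFrom = false;
    for (int c : g[i].chains)
      usesFrom = usesFrom || (c == from);
    const int idx = static_cast<int>(i);
    if (!usesFrom || !visited.insert(idx).second)
      continue;
    if (g[i].kind == Kind::Store)
      stores.insert(idx);
    else if (g[i].kind == Kind::Load || g[i].kind == Kind::TokenFactor)
      collectStoresBelow(g, idx, visited, stores);
  }
}

// Candidate stores for merging with `store` (assumption: the first chain
// operand of a node is the one to follow upwards).
std::set<int> storeMergeCandidates(const std::vector<Node> &g, int store) {
  assert(g[store].kind == Kind::Store && "expected a store node");
  int root = g[store].chains.at(0);
  if (g[root].kind == Kind::Load)
    root = g[root].chains.at(0);  // search up through a single load
  std::set<int> visited, stores;
  collectStoresBelow(g, root, visited, stores);
  return stores;
}
```

In this toy model a store chained behind another store is not reported
as a candidate for the earlier one, which mirrors the "parallel stores
only" restriction.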

Additional Minor Changes:

   1. Finishes removing unused AliasLoad code
   2. Unifies the chain aggregation in the merged stores across
      code paths
   3. Re-adds the Store node to the worklist after calling
      SimplifyDemandedBits.
   4. Increases GatherAllAliasesMaxDepth from 6 to 18. That number is
      arbitrary, but seemed sufficient to not cause regressions in
      tests.
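
To show why the depth limit in item 4 matters, here is a minimal sketch
of a depth-capped walk up a chain. It is a toy model, not the actual
alias-gathering code: ChainNode and findBetterChainToy are hypothetical
names, each node has a single chain parent, and the assumed behavior on
hitting the cap is to fall back to the original chain.

```cpp
#include <vector>

// Hypothetical toy model: each node has one chain parent (-1 for the entry
// node) and a flag saying whether it may alias the access being queried.
struct ChainNode {
  int parent;
  bool mayAlias;
};

// Walk up from `start`, skipping nodes that cannot alias. Returns the index
// of the nearest node that may alias, -1 if the walk runs off the top, or
// `start` unchanged if more than `maxDepth` nodes would have to be skipped.
int findBetterChainToy(const std::vector<ChainNode> &chain, int start,
                       unsigned maxDepth) {
  int cur = start;
  for (unsigned depth = 0; cur != -1 && !chain[cur].mayAlias; ++depth) {
    if (depth >= maxDepth)
      return start;  // give up and keep the original, conservative chain
    cur = chain[cur].parent;
  }
  return cur;
}
```

With nine non-aliasing nodes above an aliasing entry node, a cap of 6
leaves the chain untouched, while a cap of 18 reaches the entry node.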

This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.

Many tests required some changes as memory operations are now
reorderable. Some tests relying on the order were changed to use
volatile memory operations.

Noteworthy tests:

    CodeGen/AArch64/argument-blocks.ll -
      It's not entirely clear what the test_varargs_stackalign test is
      supposed to be asserting, but the new code looks right.

    CodeGen/AArch64/arm64-memset-inline.ll -
    CodeGen/AArch64/arm64-stur.ll -
    CodeGen/ARM/memset-inline.ll -

      The backend now generates *worse* code due to store merging
      succeeding, as we do not do a 16-byte constant-zero store efficiently.

    CodeGen/AArch64/merge-store.ll -
      Improved, but there still seems to be an extraneous vector insert
      from an element to itself?

    CodeGen/PowerPC/ppc64-align-long-double.ll -
      Worse code emitted in this case, due to the improved store->load
      forwarding.

    CodeGen/X86/dag-merge-fast-accesses.ll -
    CodeGen/X86/MergeConsecutiveStores.ll -
    CodeGen/X86/stores-merging.ll -
    CodeGen/Mips/load-store-left-right.ll -
      Restored correct merging of non-aligned stores

    CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll -
      Improved. Correctly merges buffer_store_dword calls

    CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll -
      Improved. Sidesteps loading a stored value and
      merges two stores

    CodeGen/X86/pr18023.ll -
      This test has been removed, as it was asserting incorrect
      behavior. Non-volatile stores *CAN* be moved past volatile loads,
      and now are.

    CodeGen/X86/vector-idiv.ll -
    CodeGen/X86/vector-lzcnt-128.ll -
      It's basically impossible to tell what these tests are actually
      testing. But, looks like the code got better due to the memory
      operations being recognized as non-aliasing.

    CodeGen/X86/win32-eh.ll -
      Both loads of the securitycookie are now merged.

Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel

Differential Revision: https://reviews.llvm.org/D14834

llvm-svn: 289221
2016-12-09 16:15:12 +00:00
GlobalISel GlobalISel: move type information to MachineRegisterInfo. 2016-09-09 11:46:34 +00:00
32-bit-local-address-space.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
add_i64.ll
add_i128.ll AMDGPU: Select 64-bit {ADD,SUB}{C,E} nodes 2016-10-14 10:30:00 +00:00
add-debug.ll
add.i16.ll AMDGPU: Select i16 instructions to VOP3 forms 2016-12-09 06:19:12 +00:00
add.ll
addrspacecast-constantexpr.ll
addrspacecast.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
amdgcn.bitcast.ll AMDGPU/R600: EXTRACT_VECT_ELT should only bypass BUILD_VECTOR if the vectors have the same number of elements. 2016-09-02 20:13:19 +00:00
amdgcn.private-memory.ll
amdgpu-codegenprepare-fdiv.ll [AMDGPU] Promote uniform i16 ops to i32 ops for targets that have 16 bit instructions 2016-09-28 20:05:39 +00:00
amdgpu-codegenprepare-i16-to-i32.ll AMDGPU: Fix crash on i16 constant expression 2016-12-06 23:18:06 +00:00
amdgpu-shader-calling-convention.ll
amdgpu.private-memory.ll AMDGPU: Properly implement SIRegisterInfo::isFrameOffsetLegal and needsFrameBaseReg 2016-12-08 14:08:02 +00:00
amdgpu.work-item-intrinsics.deprecated.ll
and-gcn.ll
and.ll AMDGPU: Improve splitting 64-bit bit ops by constants 2016-09-14 15:19:03 +00:00
annotate-kernel-features-hsa.ll
annotate-kernel-features.ll
anonymous-gv.ll AMDGPU/SI: Don't crash on anonymous GlobalValues 2016-09-26 17:29:25 +00:00
anyext.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
array-ptr-calc-i32.ll [AMDGPU] Wave and register controls 2016-09-06 20:22:28 +00:00
array-ptr-calc-i64.ll
atomic_cmp_swap_local.ll
atomic_load_add.ll
atomic_load_sub.ll
attr-amdgpu-flat-work-group-size.ll [AMDGPU] Wave and register controls 2016-09-06 20:22:28 +00:00
attr-amdgpu-num-sgpr.ll RegisterCoalscer: Only coalesce complete reserved registers. 2016-12-01 22:39:51 +00:00
attr-amdgpu-num-vgpr.ll [AMDGPU] Wave and register controls 2016-09-06 20:22:28 +00:00
attr-amdgpu-waves-per-eu.ll [AMDGPU] Wave and register controls 2016-09-06 20:29:10 +00:00
attr-unparseable.ll [AMDGPU] Wave and register controls 2016-09-06 20:22:28 +00:00
basic-branch.ll AMDGPU: Don't required structured CFG 2016-12-06 01:02:51 +00:00
basic-loop.ll
bfe_uint.ll
bfi_int.ll
bfm.ll
big_alu.ll
bitcast-vector-extract.ll AMDGPU: Push bitcasts through build_vector 2016-09-17 15:44:16 +00:00
bitreverse-inline-immediates.ll AMDGPU: Use brev for materializing SGPR constants 2016-11-01 23:14:20 +00:00
bitreverse.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
br_cc.f16.ll AMDGPU: Make f16 ConstantFP legal 2016-12-08 20:14:46 +00:00
branch-condition-and.ll [AMDGPU] Fix multiple vreg definitions in si-lower-control-flow 2016-11-22 01:42:34 +00:00
branch-relax-spill.ll BranchRelaxation: Support expanding unconditional branches 2016-10-06 16:20:41 +00:00
branch-relaxation.ll AMDGPU: Don't required structured CFG 2016-12-06 01:02:51 +00:00
branch-uniformity.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
bswap.ll AMDGPU: Improve splitting 64-bit bit ops by constants 2016-09-14 15:19:03 +00:00
bug-vopc-commute.ll
build_vector.ll
call_fs.ll
call.ll
calling-conventions.ll
captured-frame-index.ll AMDGPU: Properly implement SIRegisterInfo::isFrameOffsetLegal and needsFrameBaseReg 2016-12-08 14:08:02 +00:00
cayman-loop-bug.ll
cf_end.ll
cf-loop-on-constant.ll AMDGPU/SI: Avoid moving PHIs to VALU when phi values are defined in scalar branches 2016-11-29 00:46:46 +00:00
cf-stack-bug.ll
cgp-addressing-modes-flat.ll
cgp-addressing-modes.ll Reapply "AMDGPU: Don't use offen if it is 0" 2016-10-26 15:08:16 +00:00
cgp-bitfield-extract.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
ci-use-flat-for-global.ll
cndmask-no-def-vcc.ll ScheduleDAGInstrs: Add condjump deps to addSchedBarrierDeps() 2016-11-11 01:34:21 +00:00
coalescer_distribute.ll
coalescer_remat.ll AMDGPU/SI: Avoid moving PHIs to VALU when phi values are defined in scalar branches 2016-11-29 00:46:46 +00:00
coalescer-subrange-crash.ll Do not consider subreg defs as reads when computing subrange liveness 2016-09-02 19:48:55 +00:00
codegen-prepare-addrmode-sext.ll
combine_vloads.ll
commute_modifiers.ll AMDGPU: Support commuting with immediate in src0 2016-09-08 17:19:29 +00:00
commute-compares.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
commute-shifts.ll
complex-folding.ll
concat_vectors.ll
constant-fold-mi-operands.ll AMDGPU: Improve splitting 64-bit bit ops by constants 2016-09-14 15:19:03 +00:00
control-flow-fastregalloc.ll AMDGPU/SI: Add back reverted SGPR spilling code, but disable it 2016-11-25 17:37:09 +00:00
convergent-inlineasm.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
copy-illegal-type.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
copy-to-reg.ll
ctlz_zero_undef.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
ctlz.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
ctpop64.ll AMDGPU: Improve splitting 64-bit bit ops by constants 2016-09-14 15:19:03 +00:00
ctpop.ll
cttz_zero_undef.ll
cube.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
cvt_f32_ubyte.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
cvt_flr_i32_f32.ll AMDGPU: Support commuting with immediate in src0 2016-09-08 17:19:29 +00:00
cvt_rpi_i32_f32.ll
dagcombine-reassociate-bug.ll
dagcombiner-bug-illegal-vec4-int-to-fp.ll
debug.ll
debugger-emit-prologue.ll
debugger-insert-nops.ll In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. 2016-12-09 16:15:12 +00:00
debugger-reserve-regs.ll
default-fp-mode.ll AMDGPU : Add a function to enable and disable IEEEBit for SC and shader 2016-10-19 22:34:49 +00:00
disconnected-predset-break-bug.ll
drop-mem-operand-move-smrd.ll
ds_read2_offset_order.ll AMDGPU: Run LoadStoreVectorizer pass by default 2016-09-09 22:29:28 +00:00
ds_read2_superreg.ll
ds_read2.ll [AMDGPU][CodeGen] To improve CGEMM performance: combine LDS reads. 2016-11-03 14:37:13 +00:00
ds_read2st64.ll AMDGPU/SI: Canonicalize offset order for merged DS instructions 2016-08-26 21:36:47 +00:00
ds_write2.ll AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler 2016-08-29 19:15:22 +00:00
ds_write2st64.ll
ds-negative-offset-addressing-mode-loop.ll
ds-sub-offset.ll
dynamic_stackalloc.ll
elf.ll
elf.r600.ll
else.ll AMDGPU: Split SILowerControlFlow into two pieces 2016-08-22 19:33:16 +00:00
empty-function.ll
endcf-loop-header.ll
exceed-max-sgprs.ll AMDGPU: Diagnose using too many SGPRs 2016-10-28 20:31:47 +00:00
extend-bit-ops-i16.ll AMDGPU/SI: Remove zero_extend patterns for i16 ops selected to 32-bit insts 2016-11-18 13:53:34 +00:00
extload-align.ll [DAG] Fix incorrect alignment of ext load. 2016-09-22 17:28:43 +00:00
extload-private.ll Reapply "AMDGPU: Don't use offen if it is 0" 2016-10-26 15:08:16 +00:00
extload.ll
extract_vector_elt-f64.ll
extract_vector_elt-i8.ll
extract_vector_elt-i16.ll
extract_vector_elt-i64.ll
extract-vector-elt-build-vector-combine.ll
extractelt-to-trunc.ll
fabs.f16.ll AMDGPU: Fix f16 fabs/fneg 2016-11-15 02:25:28 +00:00
fabs.f64.ll AMDGPU: Support commuting with immediate in src0 2016-09-08 17:19:29 +00:00
fabs.ll AMDGPU: Support commuting with immediate in src0 2016-09-08 17:19:29 +00:00
fadd64.ll
fadd.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
fadd.ll
fcanonicalize.ll
fceil64.ll AMDGPU: Support commuting with immediate in src0 2016-09-08 17:19:29 +00:00
fceil.ll
fcmp64.ll
fcmp-cnd.ll
fcmp-cnde-int-args.ll
fcmp.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
fcmp.ll
fconst64.ll
fcopysign.f32.ll AMDGPU: Use brev for materializing SGPR constants 2016-11-01 23:14:20 +00:00
fcopysign.f64.ll AMDGPU: Use brev for materializing SGPR constants 2016-11-01 23:14:20 +00:00
fdiv.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
fdiv.f64.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
fdiv.ll AMDGPU : Add S_SETREG instructions to fix fdiv precision issues. 2016-12-07 02:42:15 +00:00
fetch-limits.r600.ll
fetch-limits.r700+.ll
ffloor.f64.ll AMDGPU: Move cndmask pseudo to be isel pseudo 2016-08-27 01:00:37 +00:00
ffloor.ll
flat_atomics_i64.ll
flat_atomics.ll
flat-address-space.ll AMDGPU/SI: Don't emit multi-dword flat memory ops when they might access scratch 2016-10-26 14:38:47 +00:00
flat-scratch-reg.ll AMDGPU : Add XNACK feature to GPUs that support it. 2016-09-06 19:55:17 +00:00
floor.ll
fma-combine.ll [DAGCombiner] do not fold (fmul (fadd X, 1), Y) -> (fmad X, Y, Y) by default 2016-12-02 16:06:18 +00:00
fma.f64.ll
fma.ll
fmad.ll
fmax3.f64.ll AMDGPU/SI: Implement a custom MachineSchedStrategy 2016-08-29 19:42:52 +00:00
fmax3.ll
fmax_legacy.f64.ll
fmax_legacy.ll
fmax.ll
fmaxnum.f64.ll
fmaxnum.ll AMDGPU: Support commuting with immediate in src0 2016-09-08 17:19:29 +00:00
fmed3.ll
fmin3.ll
fmin_legacy.f64.ll
fmin_legacy.ll
fmin.ll
fminnum.f64.ll
fminnum.ll AMDGPU: Support commuting with immediate in src0 2016-09-08 17:19:29 +00:00
fmul64.ll
fmul-2-combine-multi-use.ll AMDGPU: Support commuting with immediate in src0 2016-09-08 17:19:29 +00:00
fmul.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
fmul.ll
fmuladd.ll
fnearbyint.ll
fneg-fabs.f16.ll AMDGPU: Fix f16 fabs/fneg 2016-11-15 02:25:28 +00:00
fneg-fabs.f64.ll AMDGPU: Support commuting with immediate in src0 2016-09-08 17:19:29 +00:00
fneg-fabs.ll AMDGPU: Use brev for materializing SGPR constants 2016-11-01 23:14:20 +00:00
fneg.f16.ll AMDGPU: Fix f16 fabs/fneg 2016-11-15 02:25:28 +00:00
fneg.f64.ll
fneg.ll AMDGPU: Fix f16 fabs/fneg 2016-11-15 02:25:28 +00:00
fp16_to_fp.ll
fp32_to_fp16.ll
fp_to_sint.f64.ll
fp_to_sint.ll [AMDGPU] Promote f16/i16 conversions to f32/i32 2016-11-17 04:00:46 +00:00
fp_to_uint.f64.ll
fp_to_uint.ll [AMDGPU] Promote f16/i16 conversions to f32/i32 2016-11-17 04:00:46 +00:00
fp-classify.ll
fpext.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
fpext.ll
fptosi.f16.ll [AMDGPU] Promote f16/i16 conversions to f32/i32 2016-11-17 04:00:46 +00:00
fptoui.f16.ll [AMDGPU] Promote f16/i16 conversions to f32/i32 2016-11-17 04:00:46 +00:00
fptrunc.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
fptrunc.ll [AMDGPU] Add missing test for rL287203 2016-11-17 04:33:20 +00:00
fract.f64.ll AMDGPU: Move cndmask pseudo to be isel pseudo 2016-08-27 01:00:37 +00:00
fract.ll
frem.ll
fsqrt.f64.ll
fsqrt.ll
fsub64.ll AMDGPU: Support commuting with immediate in src0 2016-09-08 17:19:29 +00:00
fsub.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
fsub.ll
ftrunc.f64.ll
ftrunc.ll
gep-address-space.ll
global_atomics_i64.ll
global_atomics.ll
global_smrd_cfg.ll [AMDGPU] Scalarization of global uniform loads. 2016-12-08 17:28:47 +00:00
global_smrd.ll [AMDGPU] Scalarization of global uniform loads. 2016-12-08 17:28:47 +00:00
global-constant.ll [AMDGPU] Emit constant address space data in .rodata section and use relocations instead of fixups (amdhsa only) 2016-10-20 18:12:38 +00:00
global-directive.ll
global-extload-i16.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
global-variable-relocs.ll [AMDGPU] Emit 32-bit lo/hi got and pc relative variant kinds for external and global address space variables 2016-10-14 04:37:34 +00:00
gv-const-addrspace.ll
gv-offset-folding.ll
half.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
hoist-cond.ll AMDGPU/SI: Avoid moving PHIs to VALU when phi values are defined in scalar branches 2016-11-29 00:46:46 +00:00
hsa-default-device.ll
hsa-fp-mode.ll AMDGPU : Add a function to enable and disable IEEEBit for SC and shader 2016-10-19 22:34:49 +00:00
hsa-func.ll
hsa-globals.ll [AMDGPU] Emit constant address space data in .rodata section and use relocations instead of fixups (amdhsa only) 2016-10-20 18:12:38 +00:00
hsa-group-segment.ll
hsa-note-no-func.ll AMDGPU: Refactor processor definition to use ISA version features 2016-10-26 16:37:56 +00:00
hsa.ll [AMDGPU] Mark .note section SHF_ALLOC so lld creates a segment for it 2016-10-17 22:40:15 +00:00
i1-copy-implicit-def.ll AMDGPU: Remove unnecessary and on conditional branch 2016-11-07 19:09:33 +00:00
i1-copy-phi.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
i8-to-double-to-float.ll
icmp64.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
icmp-select-sete-reverse-args.ll
image-attributes.ll
image-resource-id.ll
imm.ll AMDGPU: Fix formatting of 1/2pi immediate 2016-11-15 00:04:33 +00:00
indirect-addressing-si-noopt.ll Replace subregister uses when processing tied operands 2016-08-26 06:31:32 +00:00
indirect-addressing-si.ll AMDGPU: Don't required structured CFG 2016-12-06 01:02:51 +00:00
indirect-private-64.ll [AMDGPU] Wave and register controls 2016-09-06 20:22:28 +00:00
infinite-loop-evergreen.ll llvm/test/CodeGen/AMDGPU/infinite-loop-evergreen.ll REQUIRES +Asserts. 2016-09-12 04:27:28 +00:00
infinite-loop.ll
inline-asm.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
inline-calls.ll AMDGPU/SI: Handle aliases in AMDGPUAlwaysInlinePass 2016-08-31 11:18:33 +00:00
inline-constraints.ll AMDGPU/SI: Add back reverted SGPR spilling code, but disable it 2016-11-25 17:37:09 +00:00
inlineasm-illegal-type.ll AMDGPU: Fix crash on illegal type for inlineasm 2016-11-18 04:42:57 +00:00
input-mods.ll
insert_subreg.ll
insert_vector_elt.ll In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. 2016-12-09 16:15:12 +00:00
invalid-addrspacecast.ll
invalid-opencl-version-metadata1.ll AMDGPU: Emit runtime metadata as a note element in .note section 2016-11-10 21:18:49 +00:00
invalid-opencl-version-metadata2.ll AMDGPU: Emit runtime metadata as a note element in .note section 2016-11-10 21:18:49 +00:00
invalid-opencl-version-metadata3.ll AMDGPU: Emit runtime metadata as a note element in .note section 2016-11-10 21:18:49 +00:00
invariant-load-no-alias-store.ll AMDGPU: Run LoadStoreVectorizer pass by default 2016-09-09 22:29:28 +00:00
jump-address.ll
kcache-fold.ll
kernarg-stack-alignment.ll
kernel-args.ll AMDGPU/SI: Set correct value for amd_kernel_code_t::kernarg_segment_alignment 2016-12-06 21:53:10 +00:00
large-alloca-compute.ll [AMDGPU] Assembler: rename amd_kernel_code_t asm names according to spec 2016-09-09 10:08:02 +00:00
large-alloca-graphics.ll
large-constant-initializer.ll
large-work-group-promote-alloca.ll [AMDGPU] Wave and register controls 2016-09-06 20:22:28 +00:00
lds-alignment.ll
lds-initializer.ll
lds-m0-init-in-loop.ll AMDGPU: Don't required structured CFG 2016-12-06 01:02:51 +00:00
lds-oqap-crash.ll
lds-output-queue.ll
lds-size.ll
lds-zero-initializer.ll
legalizedag-bug-expand-setcc.ll
lit.local.cfg
literals.ll
llvm.amdgcn.atomic.dec.ll
llvm.amdgcn.atomic.inc.ll
llvm.amdgcn.buffer.atomic.ll
llvm.amdgcn.buffer.load.format.ll
llvm.amdgcn.buffer.load.ll
llvm.amdgcn.buffer.store.format.ll
llvm.amdgcn.buffer.store.ll
llvm.amdgcn.buffer.wbinvl1.ll
llvm.amdgcn.buffer.wbinvl1.sc.ll
llvm.amdgcn.buffer.wbinvl1.vol.ll AMDGPU/GCN: Exit early in hazard recognizer if there is no vreg argument 2016-11-15 23:55:15 +00:00
llvm.amdgcn.class.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.amdgcn.class.ll
llvm.amdgcn.cos.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.amdgcn.cos.ll
llvm.amdgcn.cubeid.ll
llvm.amdgcn.cubema.ll
llvm.amdgcn.cubesc.ll
llvm.amdgcn.cubetc.ll
llvm.amdgcn.dispatch.id.ll
llvm.amdgcn.dispatch.ptr.ll
llvm.amdgcn.div.fixup.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.amdgcn.div.fixup.ll
llvm.amdgcn.div.fmas.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
llvm.amdgcn.div.scale.ll
llvm.amdgcn.ds.bpermute.ll
llvm.amdgcn.ds.permute.ll
llvm.amdgcn.ds.swizzle.ll
llvm.amdgcn.fcmp.ll AMDGPU: Support commuting with immediate in src0 2016-09-08 17:19:29 +00:00
llvm.amdgcn.fdiv.fast.ll
llvm.amdgcn.fmul.legacy.ll AMDGPU: Support commuting with immediate in src0 2016-09-08 17:19:29 +00:00
llvm.amdgcn.fract.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.amdgcn.fract.ll
llvm.amdgcn.frexp.exp.f16.ll [AMDGPU] Change frexp.exp intrinsic to return i16 for f16 input 2016-11-18 22:31:08 +00:00
llvm.amdgcn.frexp.exp.ll [AMDGPU] Change frexp.exp intrinsic to return i16 for f16 input 2016-11-18 22:31:08 +00:00
llvm.amdgcn.frexp.mant.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.amdgcn.frexp.mant.ll
llvm.amdgcn.groupstaticsize.ll
llvm.amdgcn.icmp.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
llvm.amdgcn.image.atomic.ll
llvm.amdgcn.image.gather4.ll AMDGPU/SI: Support data types other than V4f32 in image intrinsics 2016-11-14 18:33:18 +00:00
llvm.amdgcn.image.getlod.ll
llvm.amdgcn.image.ll AMDGPU/SI: Support data types other than V4f32 in image intrinsics 2016-11-14 18:33:18 +00:00
llvm.amdgcn.image.sample.ll AMDGPU/SI: Support data types other than V4f32 in image intrinsics 2016-11-14 18:33:18 +00:00
llvm.amdgcn.image.sample.o.ll
llvm.amdgcn.interp.ll AMDGPU/SI: Don't mark VINTRP instructions as mayLoad 2016-12-09 15:57:15 +00:00
llvm.amdgcn.kernarg.segment.ptr.ll AMDGPU/SI: Include implicit arguments in kernarg_segment_byte_size 2016-09-23 01:33:26 +00:00
llvm.amdgcn.ldexp.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.amdgcn.ldexp.ll
llvm.amdgcn.lerp.ll
llvm.amdgcn.log.clamp.ll
llvm.amdgcn.mbcnt.ll
llvm.amdgcn.mov.dpp.ll
llvm.amdgcn.mqsad.pk.u16.u8.ll
llvm.amdgcn.mqsad.u32.u8.ll AMDGPU : Fix mqsad_u32_u8 instruction incorrect data type. 2016-09-09 19:31:51 +00:00
llvm.amdgcn.msad.u8.ll
llvm.amdgcn.ps.live.ll
llvm.amdgcn.qsad.pk.u16.u8.ll
llvm.amdgcn.queue.ptr.ll
llvm.amdgcn.rcp.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.amdgcn.rcp.legacy.ll
llvm.amdgcn.rcp.ll
llvm.amdgcn.readfirstlane.ll AMDGPU/SI: Add back reverted SGPR spilling code, but disable it 2016-11-25 17:37:09 +00:00
llvm.amdgcn.readlane.ll AMDGPU/SI: Add back reverted SGPR spilling code, but disable it 2016-11-25 17:37:09 +00:00
llvm.amdgcn.rsq.clamp.ll AMDGPU: Fix immediate folding logic when shrinking instructions 2016-09-09 23:32:53 +00:00
llvm.amdgcn.rsq.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.amdgcn.rsq.legacy.ll
llvm.amdgcn.rsq.ll
llvm.amdgcn.s.barrier.ll
llvm.amdgcn.s.dcache.inv.ll
llvm.amdgcn.s.dcache.inv.vol.ll
llvm.amdgcn.s.dcache.wb.ll
llvm.amdgcn.s.dcache.wb.vol.ll
llvm.amdgcn.s.decperflevel.ll
llvm.amdgcn.s.getreg.ll DAG: Ignore call site attributes when emitting target intrinsic 2016-11-21 22:56:42 +00:00
llvm.amdgcn.s.incperflevel.ll
llvm.amdgcn.s.memrealtime.ll
llvm.amdgcn.s.memtime.ll
llvm.amdgcn.s.sleep.ll
llvm.amdgcn.s.waitcnt.ll AMDGPU/SI: Change mimg intrinsic signatures 2016-10-12 16:35:29 +00:00
llvm.amdgcn.sad.hi.u8.ll
llvm.amdgcn.sad.u8.ll
llvm.amdgcn.sad.u16.ll
llvm.amdgcn.sffbh.ll
llvm.amdgcn.sin.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.amdgcn.sin.ll
llvm.amdgcn.trig.preop.ll
llvm.amdgcn.wave.barrier.ll [AMDGPU] Add wave barrier builtin 2016-11-15 19:00:15 +00:00
llvm.amdgcn.workgroup.id.ll AMDGPU/SI: Add support for triples with the mesa3d operating system 2016-09-16 21:34:26 +00:00
llvm.amdgcn.workitem.id.ll AMDGPU/SI: Add support for triples with the mesa3d operating system 2016-09-16 21:34:26 +00:00
llvm.AMDGPU.bfe.i32.ll
llvm.AMDGPU.bfe.u32.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
llvm.AMDGPU.clamp.ll AMDGPU: Support commuting with immediate in src0 2016-09-08 17:19:29 +00:00
llvm.AMDGPU.cube.ll
llvm.AMDGPU.kill.ll
llvm.amdgpu.kilp.ll
llvm.ceil.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.cos.f16.ll AMDGPU: Fix formatting of 1/2pi immediate 2016-11-15 00:04:33 +00:00
llvm.cos.ll
llvm.dbg.value.ll AMDGPU: Disallow exec as SMEM instruction operand 2016-11-29 19:39:53 +00:00
llvm.exp2.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.exp2.ll
llvm.floor.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.fma.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.fmuladd.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.log2.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.log2.ll
llvm.maxnum.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.memcpy.ll [AMDGPU] Emit constant address space data in .rodata section and use relocations instead of fixups (amdhsa only) 2016-10-20 18:12:38 +00:00
llvm.minnum.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.pow.ll
llvm.r600.dot4.ll
llvm.r600.group.barrier.ll
llvm.r600.read.local.size.ll
llvm.r600.recipsqrt.clamped.ll
llvm.r600.recipsqrt.ieee.ll
llvm.r600.tex.ll
llvm.rint.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.rint.f64.ll
llvm.rint.ll
llvm.round.f64.ll AMDGPU: Use brev for materializing SGPR constants 2016-11-01 23:14:20 +00:00
llvm.round.ll AMDGPU: Use brev for materializing SGPR constants 2016-11-01 23:14:20 +00:00
llvm.SI.export.ll AMDGPU: Change how exp is printed 2016-12-05 20:31:49 +00:00
llvm.SI.fs.interp.ll AMDGPU/SI: Don't mark VINTRP instructions as mayLoad 2016-12-09 15:57:15 +00:00
llvm.SI.gather4.ll
llvm.SI.getlod.ll
llvm.SI.image.ll
llvm.SI.image.sample-masked.ll
llvm.SI.image.sample.ll
llvm.SI.image.sample.o.ll
llvm.SI.load.dword.ll
llvm.SI.packf16.ll
llvm.SI.sendmsg-m0.ll
llvm.SI.sendmsg.ll
llvm.SI.tbuffer.store.ll
llvm.sin.f16.ll AMDGPU: Fix formatting of 1/2pi immediate 2016-11-15 00:04:33 +00:00
llvm.sin.ll
llvm.sqrt.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
llvm.trunc.f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
load-constant-f64.ll
load-constant-i1.ll
load-constant-i8.ll AMDGPU/R600: Enable Load combine 2016-08-27 19:09:43 +00:00
load-constant-i16.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
load-constant-i32.ll
load-constant-i64.ll
load-global-f32.ll
load-global-f64.ll
load-global-i1.ll
load-global-i8.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
load-global-i16.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
load-global-i32.ll AMDGPU/SI: Implement a custom MachineSchedStrategy 2016-08-29 19:42:52 +00:00
load-global-i64.ll
load-input-fold.ll
load-local-f32.ll
load-local-f64.ll
load-local-i1.ll
load-local-i8.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
load-local-i16.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
load-local-i32.ll AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler 2016-08-29 19:15:22 +00:00
load-local-i64.ll
load-weird-sizes.ll
local-64.ll AMDGPU/SI: Canonicalize offset order for merged DS instructions 2016-08-26 21:36:47 +00:00
local-atomics64.ll
local-atomics.ll
local-memory.amdgcn.ll AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler 2016-08-29 19:15:22 +00:00
local-memory.ll
local-memory.r600.ll
local-stack-slot-bug.ll AMDGPU: Properly implement SIRegisterInfo::isFrameOffsetLegal and needsFrameBaseReg 2016-12-08 14:08:02 +00:00
local-stack-slot-offset.ll AMDGPU: Properly implement SIRegisterInfo::isFrameOffsetLegal and needsFrameBaseReg 2016-12-08 14:08:02 +00:00
loop_break.ll AMDGPU/SI: Avoid moving PHIs to VALU when phi values are defined in scalar branches 2016-11-29 00:46:46 +00:00
loop-address.ll
loop-idiom.ll
lower-range-metadata-intrinsic-call.ll
lshl.ll
lshr.ll
mad24-get-global-id.ll
mad_int24.ll
mad_uint24.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
mad-combine.ll
mad-sub.ll
madak.ll AMDGPU: Support commuting with immediate in src0 2016-09-08 17:19:29 +00:00
madmk.ll AMDGPU: Support commuting with immediate in src0 2016-09-08 17:19:29 +00:00
max3.ll
max-literals.ll
max.i16.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
max.ll
mem-builtins.ll
merge-store-crash.ll
merge-store-usedef.ll AMDGPU: Fix SILoadStoreOptimizer when writes cannot be merged due register dependencies 2016-10-27 08:15:07 +00:00
merge-stores.ll In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. 2016-12-09 16:15:12 +00:00
min3.ll
min.ll
missing-store.ll AMDGPU/SI: Implement a custom MachineSchedStrategy 2016-08-29 19:42:52 +00:00
move-addr64-rsrc-dead-subreg-writes.ll AMDGPU/SI: Implement a custom MachineSchedStrategy 2016-08-29 19:42:52 +00:00
move-to-valu-atomicrmw.ll
movreld-bug.ll AMDGPU: Fix Two Address problems with v_movreld 2016-10-24 14:56:02 +00:00
mubuf-shader-vgpr.ll AMDGPU: Fix legalization of MUBUF instructions in shaders 2016-11-18 11:55:52 +00:00
mubuf.ll
mul_int24.ll AMDGPU/SI: Fix crash caused by r284267 2016-10-21 20:25:11 +00:00
mul_uint24-amdgcn.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
mul_uint24-r600.ll [AMDGPU] Promote uniform i16 ops to i32 ops for targets that have 16 bit instructions 2016-09-28 20:05:39 +00:00
mul.ll
mulhu.ll
multilevel-break.ll AMDGPU: Allow some control flow intrinsics to be CSEd 2016-09-16 22:11:18 +00:00
no-hsa-graphics-shaders.ll
no-initializer-constant-addrspace.ll
no-shrink-extloads.ll
opencl-image-metadata.ll
operand-folding.ll
operand-spacing.ll
or.ll Revert "AMDGPU: Enable ConstrainCopy DAG mutation" 2016-11-17 16:41:49 +00:00
over-max-lds-size.ll
packetizer.ll
parallelandifcollapse.ll
parallelorifcollapse.ll
partially-dead-super-register-immediate.ll
predicate-dp4.ll
predicates.ll
private-access-no-objects.ll AMDGPU: Fix using incorrect private resource with no allocation 2016-10-28 19:43:31 +00:00
private-element-size.ll In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. 2016-12-09 16:15:12 +00:00
private-memory-atomics.ll
private-memory-broken.ll
private-memory-r600.ll [AMDGPU] Wave and register controls 2016-09-06 20:22:28 +00:00
promote-alloca-array-allocation.ll
promote-alloca-bitcast-function.ll
promote-alloca-globals.ll
promote-alloca-invariant-markers.ll
promote-alloca-lifetime.ll
promote-alloca-mem-intrinsics.ll [AMDGPU] Wave and register controls 2016-09-06 20:22:28 +00:00
promote-alloca-no-opts.ll [AMDGPU] Wave and register controls 2016-09-06 20:22:28 +00:00
promote-alloca-padding-size-estimate.ll [AMDGPU] Wave and register controls 2016-09-06 20:22:28 +00:00
promote-alloca-shaders.ll
promote-alloca-stored-pointer-value.ll AMDGPU: Run LoadStoreVectorizer pass by default 2016-09-09 22:29:28 +00:00
promote-alloca-to-lds-icmp.ll [AMDGPU] Wave and register controls 2016-09-06 20:22:28 +00:00
promote-alloca-to-lds-phi.ll [AMDGPU] Wave and register controls 2016-09-06 20:22:28 +00:00
promote-alloca-to-lds-select.ll [AMDGPU] Wave and register controls 2016-09-06 20:22:28 +00:00
promote-alloca-unhandled-intrinsic.ll
promote-alloca-volatile.ll
pv-packing.ll
pv.ll
r600-constant-array-fixup.ll AMDGPU/R600: Fix fixups used for constant arrays 2016-08-29 19:01:48 +00:00
r600-encoding.ll
r600-export-fix.ll [DAG] Remove isVectorClearMaskLegal() check from vector_build dagcombine 2016-09-28 06:13:58 +00:00
r600-infinite-loop-bug-while-reorganizing-vector.ll
r600.bitcast.ll AMDGPU/R600: EXTRACT_VECT_ELT should only bypass BUILD_VECTOR if the vectors have the same number of elements. 2016-09-02 20:13:19 +00:00
r600.private-memory.ll
r600.work-item-intrinsics.ll
r600cfg.ll
rcp-pattern.ll AMDGPU/SI: Implement a custom MachineSchedStrategy 2016-08-29 19:42:52 +00:00
read_register.ll AMDGPU/SI: Add back reverted SGPR spilling code, but disable it 2016-11-25 17:37:09 +00:00
read-register-invalid-subtarget.ll
read-register-invalid-type-i32.ll
read-register-invalid-type-i64.ll
readcyclecounter.ll
README
reduce-load-width-alignment.ll
reduce-store-width-alignment.ll
reg-coalescer-sched-crash.ll
register-count-comments.ll
rename-disconnected-bug.ll
reorder-stores.ll
ret_jump.ll
ret.ll AMDGPU: Change how exp is printed 2016-12-05 20:31:49 +00:00
rotl.i64.ll
rotl.ll
rotr.i64.ll
rotr.ll
rsq.ll
runtime-metadata.ll AMDGPU: Emit runtime metadata as a note element in .note section 2016-11-10 21:18:49 +00:00
rv7x0_count3.ll
s_addk_i32.ll AMDGPU: Try to commute when selecting s_addk_i32/s_mulk_i32 2016-09-08 17:35:41 +00:00
s_movk_i32.ll AMDGPU: Fix immediate folding logic when shrinking instructions 2016-09-09 23:32:53 +00:00
s_mulk_i32.ll AMDGPU: Try to commute when selecting s_addk_i32/s_mulk_i32 2016-09-08 17:35:41 +00:00
sad.ll Revert "AMDGPU: Enable ConstrainCopy DAG mutation" 2016-11-17 16:41:49 +00:00
saddo.ll
salu-to-valu.ll AMDGPU/SI: Don't move copies of immediates to the VALU 2016-12-06 21:13:30 +00:00
sampler-resource-id.ll
scalar_to_vector.ll
schedule-fs-loop-nested-if.ll
schedule-fs-loop-nested.ll
schedule-fs-loop.ll
schedule-global-loads.ll AMDGPU: Run LoadStoreVectorizer pass by default 2016-09-09 22:29:28 +00:00
schedule-if-2.ll
schedule-if.ll
schedule-kernel-arg-loads.ll
schedule-vs-if-nested-loop-failure.ll
schedule-vs-if-nested-loop.ll
scheduler-subrange-crash.ll Do not consider subreg defs as reads when computing subrange liveness 2016-09-02 19:48:55 +00:00
scratch-buffer.ll AMDGPU/SI: Avoid creating unnecessary copies in the SIFixSGPRCopies pass 2016-11-16 18:42:17 +00:00
sdiv.ll [AMDGPU] Expand vector mulhu/mulhs 2016-11-01 10:26:48 +00:00
sdivrem24.ll
sdivrem64.ll
select64.ll
select-i1.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
select-vectors.ll Revert "AMDGPU: Enable ConstrainCopy DAG mutation" 2016-11-17 16:41:49 +00:00
select.f16.ll [AMDGPU] Handle f16 select{_cc} 2016-11-16 03:16:26 +00:00
select.ll
selectcc-cnd.ll
selectcc-cnde-int.ll
selectcc-icmp-select-float.ll
selectcc-opt.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
selectcc.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
selected-stack-object.ll
set-dx10.ll
setcc64.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
setcc-equivalent.ll
setcc-opt.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
setcc.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
seto.ll
setuo.ll
sext-eliminate.ll
sext-in-reg-failure-r600.ll AMDGPU: Fix introducing stack access on unaligned v16i8 2016-08-31 21:52:27 +00:00
sext-in-reg.ll
sgpr-control-flow.ll AMDGPU: Don't required structured CFG 2016-12-06 01:02:51 +00:00
sgpr-copy-duplicate-operand.ll
sgpr-copy.ll AMDGPU: Don't required structured CFG 2016-12-06 01:02:51 +00:00
shared-op-cycle.ll
shift-and-i64-ubfe.ll AMDGPU/SI: Implement a custom MachineSchedStrategy 2016-08-29 19:42:52 +00:00
shift-and-i128-ubfe.ll AMDGPU: Improve splitting 64-bit bit ops by constants 2016-09-14 15:19:03 +00:00
shift-i64-opts.ll
shl_add_constant.ll AMDGPU/SI: Improve register allocation hints for sopk instructions 2016-08-29 13:06:10 +00:00
shl_add_ptr.ll
shl.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
si-annotate-cf-noloop.ll AMDGPU: Remove unnecessary and on conditional branch 2016-11-07 19:09:33 +00:00
si-annotate-cf.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
si-annotate-cfg-loop-assert.ll
si-instr-info-correct-implicit-operands.ll AMDGPU/SI: Avoid creating unnecessary copies in the SIFixSGPRCopies pass 2016-11-16 18:42:17 +00:00
si-literal-folding.ll AMDGPU: Fix immediate folding logic when shrinking instructions 2016-09-09 23:32:53 +00:00
si-lod-bias.ll
si-lower-control-flow-unreachable-block.ll BranchRelaxation: Support expanding unconditional branches 2016-10-06 16:20:41 +00:00
si-scheduler.ll
si-sgpr-spill.ll AMDGPU: Fix using incorrect private resource with no allocation 2016-10-28 19:43:31 +00:00
si-spill-cf.ll
si-spill-sgpr-stack.ll AMDGPU: Use wider scalar spills for SGPR spilling 2016-12-02 00:54:45 +00:00
si-triv-disjoint-mem-access.ll In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. 2016-12-09 16:15:12 +00:00
si-vector-hang.ll
sign_extend.ll AMDGPU/SI: Fix pattern for i16 = sign_extend i1 2016-11-15 21:25:56 +00:00
sint_to_fp.f64.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
sint_to_fp.i64.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
sint_to_fp.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
sitofp.f16.ll [AMDGPU] Promote f16/i16 conversions to f32/i32 2016-11-17 04:00:46 +00:00
skip-if-dead.ll AMDGPU: Don't required structured CFG 2016-12-06 01:02:51 +00:00
smed3.ll
sminmax.ll
smrd-vccz-bug.ll AMDGPU: Remove unnecessary and on conditional branch 2016-11-07 19:09:33 +00:00
smrd.ll
sopk-compares.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
spill-alloc-sgpr-init-bug.ll AMDGPU: Fix using incorrect private resource with no allocation 2016-10-28 19:43:31 +00:00
spill-m0.ll AMDGPU: Use wider scalar spills for SGPR spilling 2016-12-02 00:54:45 +00:00
spill-scavenge-offset.ll
spill-wide-sgpr.ll AMDGPU: Use wider scalar spills for SGPR spilling 2016-12-02 00:54:45 +00:00
split-scalar-i64-add.ll
split-smrd.ll
split-vector-memoperand-offsets.ll AMDGPU: Cleanup some xfailed tests 2016-11-02 17:24:54 +00:00
sra.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
srem.ll
srl.ll
ssubo.ll
store_typed.ll
store-barrier.ll
store-global.ll AMDGPU: Cleanup some xfailed tests 2016-11-02 17:24:54 +00:00
store-local.ll AMDGPU/R600: Expand unaligned writes to local and global AS 2016-09-02 19:07:06 +00:00
store-v3i64.ll AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler 2016-08-29 19:15:22 +00:00
store-vector-ptrs.ll
structurize1.ll
structurize.ll
sub.i16.ll AMDGPU: Select i16 instructions to VOP3 forms 2016-12-09 06:19:12 +00:00
sub.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
subreg-coalescer-crash.ll
subreg-coalescer-undef-use.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
subreg-eliminate-dead.ll
swizzle-export.ll
target-cpu.ll [AMDGPU] Wave and register controls 2016-09-06 20:22:28 +00:00
tex-clause-antidep.ll
texture-input-merge.ll
trap.ll
trunc-bitcast-vector.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
trunc-cmp-constant.ll [AMDGPU] Promote uniform (i1, i16] operations to i32 2016-10-07 14:22:58 +00:00
trunc-store-f64-to-f16.ll AMDGPU: Implement expansion of f16 = FP_TO_FP16 f64 2016-11-01 16:31:48 +00:00
trunc-store-i1.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
trunc-store.ll
trunc-vector-store-assertion-failure.ll
trunc.ll Revert "AMDGPU: Enable ConstrainCopy DAG mutation" 2016-11-17 16:41:49 +00:00
tti-unroll-prefs.ll
uaddo.ll
udiv.ll [AMDGPU] Expand vector mulhu/mulhs 2016-11-01 10:26:48 +00:00
udivrem24.ll
udivrem64.ll
udivrem.ll AMDGPU/SI: Implement a custom MachineSchedStrategy 2016-08-29 19:42:52 +00:00
uint_to_fp.f64.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
uint_to_fp.i64.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
uint_to_fp.ll AMDGPU: Use unsigned compare for eq/ne 2016-09-30 01:50:20 +00:00
uitofp.f16.ll [AMDGPU] Promote f16/i16 conversions to f32/i32 2016-11-17 04:00:46 +00:00
umed3.ll
unaligned-load-store.ll AMDGPU: Fix introducing stack access on unaligned v16i8 2016-08-31 21:52:27 +00:00
undefined-subreg-liverange.ll
unhandled-loop-condition-assertion.ll
uniform-branch-intrinsic-cond.ll
uniform-cfg.ll AMDGPU: Don't required structured CFG 2016-12-06 01:02:51 +00:00
uniform-crash.ll
uniform-loop-inside-nonuniform.ll AMDGPU/SI: Avoid moving PHIs to VALU when phi values are defined in scalar branches 2016-11-29 00:46:46 +00:00
unify-metadata.ll [AMDGPU] Add amdgpu-unify-metadata pass 2016-12-08 19:46:04 +00:00
unigine-liveness-crash.ll Do not consider subreg defs as reads when computing subrange liveness 2016-09-02 19:48:55 +00:00
unknown-processor.ll
unroll.ll
unsupported-cc.ll
urecip.ll
urem.ll
use-sgpr-multiple-times.ll AMDGPU/SI: Implement a custom MachineSchedStrategy 2016-08-29 19:42:52 +00:00
usubo.ll
v1i64-kernel-arg.ll AMDGPU: Refactor kernel argument lowering 2016-09-16 21:53:00 +00:00
v_cndmask.ll
v_cvt_pk_u8_f32.ll
v_mac_f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
v_mac.ll
v_madak_f16.ll [AMDGPU] Add f16 support (VI+) 2016-11-13 07:01:11 +00:00
valu-i1.ll AMDGPU/SI: Avoid moving PHIs to VALU when phi values are defined in scalar branches 2016-11-29 00:46:46 +00:00
vector-alloca.ll
vector-extract-insert.ll
vertex-fetch-encoding.ll
vgpr-spill-emergency-stack-slot-compute.ll AMDGPU: Properly implement SIRegisterInfo::isFrameOffsetLegal and needsFrameBaseReg 2016-12-08 14:08:02 +00:00
vgpr-spill-emergency-stack-slot.ll AMDGPU/SI: Add back reverted SGPR spilling code, but disable it 2016-11-25 17:37:09 +00:00
vi-removed-intrinsics.ll
vop-shrink.ll
vselect64.ll
vselect.ll
vtx-fetch-branch.ll
vtx-schedule.ll
wait.ll
waitcnt-flat.ll
wqm.ll ScheduleDAGInstrs: Add condjump deps to addSchedBarrierDeps() 2016-11-11 01:34:21 +00:00
write_register.ll
write-register-vgpr-into-sgpr.ll
wrong-transalu-pos-fix.ll
xfail.r600.bitcast.ll AMDGPU/R600: EXTRACT_VECT_ELT should only bypass BUILD_VECTOR if the vectors have the same number of elements. 2016-09-02 20:13:19 +00:00
xor.ll Revert "AMDGPU: Enable ConstrainCopy DAG mutation" 2016-11-17 16:41:49 +00:00
zero_extend.ll AMDGPU: Add VI i16 support 2016-11-10 16:02:37 +00:00
zext-i64-bit-operand.ll

+==============================================================================+
| How to organize the lit tests                                                |
+==============================================================================+

- If you write a test for matching a single DAG opcode or intrinsic, it should
  go in a file called {opcode_name,intrinsic_name}.ll (e.g. fadd.ll).

- If you write a test that matches several DAG opcodes and checks for a single
  ISA instruction, then that test should go in a file called {ISA_name}.ll
  (e.g. bfi_int.ll).

- For all other tests, use your best judgement for organizing tests and naming
  the files.

+==============================================================================+
| Naming conventions                                                           |
+==============================================================================+

- Use dash '-' and not underscore '_' to separate words in file names, unless
  the file is named after a DAG opcode or ISA instruction that has an
  underscore '_' in its name.
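
The naming rule above can be checked mechanically. The following is a minimal
sketch, not part of the test suite: the function name violates_naming and the
allowlist UNDERSCORE_NAMES are hypothetical, and the allowlist holds only a few
illustrative opcode and instruction names rather than a complete set.

```python
# Hypothetical allowlist: DAG opcode / ISA instruction names that
# legitimately contain an underscore. Illustrative only, not exhaustive.
UNDERSCORE_NAMES = {"bfi_int", "s_addk_i32", "sint_to_fp"}


def violates_naming(filename, allowed=UNDERSCORE_NAMES):
    """Return True if filename uses '_' as a word separator without being
    named after an allowlisted opcode or instruction."""
    # Take everything before the first dot, so that "sint_to_fp.f64.ll"
    # is judged by the stem "sint_to_fp".
    stem = filename.split(".", 1)[0]
    if "_" not in stem:
        return False
    return stem not in allowed


def find_violations(filenames, allowed=UNDERSCORE_NAMES):
    """Return the file names from an iterable that break the rule."""
    return [name for name in filenames if violates_naming(name, allowed)]
```

For example, find_violations(["reorder-stores.ll", "s_addk_i32.ll",
"shl_add_constant.ll"]) would flag only shl_add_constant.ll under this
allowlist. Whether a flagged name is a real violation still depends on whether
the stem is an opcode or instruction name, which the sketch cannot determine on
its own.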