mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
llvm-mirror/lib/Target/AMDGPU
Nicolai Haehnle 8f90c2a416 AMDGPU/SI: Fold operands with sub-registers
Summary:
Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs,
increasing the code size and VGPR pressure. These moves are now folded away.

Note that this lack of operand folding was not a problem for VMEM loads,
because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register
coalescer.
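
To illustrate the kind of folding described above, here is a minimal toy sketch in Python. It is not the code in SIFoldOperands.cpp: the `Instr` type, the "s"/"v" register naming convention, and the one-scalar-source legality check are all assumptions made for illustration, and the real hardware operand restrictions are more detailed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical operand model: (register name, optional sub-register index),
# e.g. ("s0", "sub1") stands for the second dword of a multi-dword SGPR value.
Operand = Tuple[str, Optional[str]]


@dataclass
class Instr:
    opcode: str
    dst: Optional[str]
    srcs: List[Operand]


def is_sgpr(op: Operand) -> bool:
    # Toy convention: scalar register names start with "s", vector ones with "v".
    return op[0].startswith("s")


def fold_sgpr_copies(instrs: List[Instr]) -> List[Instr]:
    """Make users of `v = COPY s:sub` read `s:sub` directly, then drop dead copies.

    Note: source operands are rewritten in place on the given instructions.
    """
    copies = {i.dst: i.srcs[0] for i in instrs
              if i.opcode == "COPY" and is_sgpr(i.srcs[0])}
    for ins in instrs:
        if ins.opcode == "COPY":
            continue
        for n, op in enumerate(ins.srcs):
            src = copies.get(op[0])
            if src is None or op[1] is not None:
                continue
            # Assumed legality rule: at most one scalar source per instruction.
            others = [o for k, o in enumerate(ins.srcs) if k != n]
            if any(is_sgpr(o) for o in others):
                continue
            ins.srcs[n] = src
    # A copy is dead once no remaining instruction reads its destination.
    used = {op[0] for ins in instrs for op in ins.srcs}
    return [i for i in instrs
            if not (i.opcode == "COPY" and i.dst in copies and i.dst not in used)]


# A two-dword scalar load whose halves are copied into VGPRs before use.
prog = [
    Instr("S_LOAD_DWORDX2", "s0", [("sptr", None)]),
    Instr("COPY", "v1", [("s0", "sub0")]),
    Instr("COPY", "v2", [("s0", "sub1")]),
    Instr("V_MUL_F32", "v3", [("v1", None), ("v0", None)]),
    Instr("V_ADD_F32", "v4", [("v2", None), ("v3", None)]),
]
folded = fold_sgpr_copies(prog)
for ins in folded:
    print(ins.opcode, ins.dst, ins.srcs)
```

In this toy program both COPY instructions disappear, and the two vector instructions read `s0:sub0` and `s0:sub1` directly, so `v1` and `v2` are no longer needed.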

Some tests are updated; note that the fsub.ll test explicitly checks that
the move is elided.

With the IR generated by current Mesa, the changes are relatively minor:

7063 shaders in 3531 tests
Totals:
SGPRS: 351872 -> 352560 (0.20 %)
VGPRS: 199984 -> 200732 (0.37 %)
Code Size: 9876968 -> 9881112 (0.04 %) bytes
LDS: 91 -> 91 (0.00 %) blocks
Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave
Wait states: 295164 -> 295337 (0.06 %)

Totals from affected shaders:
SGPRS: 65784 -> 66472 (1.05 %)
VGPRS: 38064 -> 38812 (1.97 %)
Code Size: 1993828 -> 1997972 (0.21 %) bytes
LDS: 42 -> 42 (0.00 %) blocks
Scratch: 795648 -> 783360 (-1.54 %) bytes per wave
Wait states: 54026 -> 54199 (0.32 %)
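
The percentages in the two tables above can be rechecked from the raw before/after values. The following is a small sketch for doing so; the `pct_change` helper and the dictionary layout are illustrative choices, and only the numbers come from the commit message.

```python
def pct_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent, rounded to 2 places."""
    return round((after - before) / before * 100, 2)


# (before, after) pairs copied from the commit message.
totals = {
    "SGPRS": (351872, 352560),
    "VGPRS": (199984, 200732),
    "Code Size": (9876968, 9881112),
    "Scratch": (1779712, 1767424),
    "Wait states": (295164, 295337),
}
affected = {
    "SGPRS": (65784, 66472),
    "VGPRS": (38064, 38812),
    "Code Size": (1993828, 1997972),
    "Scratch": (795648, 783360),
    "Wait states": (54026, 54199),
}

for name in totals:
    t_before, t_after = totals[name]
    a_before, a_after = affected[name]
    print(f"{name}: totals {pct_change(t_before, t_after)} %, "
          f"affected {pct_change(a_before, a_after)} %, "
          f"delta {t_after - t_before} vs {a_after - a_before}")
```

The absolute deltas are identical in both tables (for example, +688 SGPRs and -12288 bytes of scratch), which is consistent with the unaffected shaders being unchanged.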

Reviewers: tstellarAMD, arsenm, mareko

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15875

llvm-svn: 257074
2016-01-07 17:10:29 +00:00
AsmParser AMDGPU/SI: Fix encoding for FLAT_SCRATCH registers on VI 2015-12-21 18:44:27 +00:00
InstPrinter Remove extra forward declarations and scrub includes for all in tree InstPrinters. NFC 2015-12-25 22:10:01 +00:00
MCTargetDesc AMDGPU/SI: Fix warning introduced by r255204 2015-12-10 03:10:46 +00:00
TargetInfo
Utils AMDGPU/SI: Fix encoding for FLAT_SCRATCH registers on VI 2015-12-21 18:44:27 +00:00
AMDGPU.h AMDGPU/SI: Select constant loads with non-uniform addresses to MUBUF instructions 2015-12-15 20:55:55 +00:00
AMDGPU.td AMDGPU: add +xnack feature 2016-01-04 23:35:53 +00:00
AMDGPUAlwaysInlinePass.cpp
AMDGPUAnnotateKernelFeatures.cpp AMDGPU: Add llvm.amdgcn.dispatch.ptr intrinsic 2015-11-26 00:43:29 +00:00
AMDGPUAnnotateUniformValues.cpp AMDGPU/SI: Select constant loads with non-uniform addresses to MUBUF instructions 2015-12-15 20:55:55 +00:00
AMDGPUAsmPrinter.cpp AMDGPU/SI: xnack_mask is always reserved on VI 2016-01-07 17:10:20 +00:00
AMDGPUAsmPrinter.h AMDGPU/SI: Emit constant arrays in the .text section 2015-12-10 02:13:01 +00:00
AMDGPUCallingConv.td
AMDGPUDiagnosticInfoUnsupported.cpp
AMDGPUDiagnosticInfoUnsupported.h
AMDGPUFrameLowering.cpp
AMDGPUFrameLowering.h AMDGPU: Create emergency stack slots during frame lowering 2015-11-06 18:17:45 +00:00
AMDGPUInstrInfo.cpp
AMDGPUInstrInfo.h
AMDGPUInstrInfo.td AMDGPU: Use generic bitreverse intrinsic 2015-12-14 17:25:38 +00:00
AMDGPUInstructions.td AMDGPU/SI: Consolidate FLAT patterns 2016-01-05 02:26:37 +00:00
AMDGPUIntrinsicInfo.cpp
AMDGPUIntrinsicInfo.h
AMDGPUIntrinsics.td AMDGPU: Switch barrier intrinsics to using convergent 2015-12-19 01:46:41 +00:00
AMDGPUISelDAGToDAG.cpp AMDGPU/SI: Use flat for global load/store when targeting HSA 2015-12-22 20:55:23 +00:00
AMDGPUISelLowering.cpp AMDGPU: Use generic bitreverse intrinsic 2015-12-14 17:25:38 +00:00
AMDGPUISelLowering.h AMDGPU: Use generic bitreverse intrinsic 2015-12-14 17:25:38 +00:00
AMDGPUMachineFunction.cpp AMDGPU/SI: Add getShaderType() function to Utils/ 2015-12-15 16:26:16 +00:00
AMDGPUMachineFunction.h AMDGPU/SI: Emit HSA kernels with symbol type STT_AMDGPU_HSA_KERNEL 2015-11-06 11:45:14 +00:00
AMDGPUMCInstLower.cpp AMDGPU/SI: Fix encoding for FLAT_SCRATCH registers on VI 2015-12-21 18:44:27 +00:00
AMDGPUMCInstLower.h
AMDGPUOpenCLImageTypeLoweringPass.cpp
AMDGPUPromoteAlloca.cpp Revert "Change memcpy/memset/memmove to have dest and source alignments." 2015-11-19 05:56:52 +00:00
AMDGPURegisterInfo.cpp
AMDGPURegisterInfo.h
AMDGPURegisterInfo.td
AMDGPUSubtarget.cpp AMDGPU: add +xnack feature 2016-01-04 23:35:53 +00:00
AMDGPUSubtarget.h AMDGPU: add +xnack feature 2016-01-04 23:35:53 +00:00
AMDGPUTargetMachine.cpp AMDGPU/SI: Select constant loads with non-uniform addresses to MUBUF instructions 2015-12-15 20:55:55 +00:00
AMDGPUTargetMachine.h
AMDGPUTargetObjectFile.cpp AMDGPU/SI: Emit constant variables in the .hsatext section when targeting HSA 2015-12-15 22:39:36 +00:00
AMDGPUTargetObjectFile.h AMDGPU/SI: Emit constant arrays in the .text section 2015-12-10 02:13:01 +00:00
AMDGPUTargetTransformInfo.cpp AMDGPU: Fix getRegisterBitWidth for vectors 2015-12-24 05:14:55 +00:00
AMDGPUTargetTransformInfo.h AMDGPU: Override getCFInstrCost 2015-12-16 18:37:19 +00:00
AMDILCFGStructurizer.cpp Normalize MBB's successors' probabilities in several locations. 2015-12-13 09:26:17 +00:00
AMDKernelCodeT.h
CaymanInstructions.td
CIInstructions.td AMDGPU/SI: Consolidate FLAT patterns 2016-01-05 02:26:37 +00:00
CMakeLists.txt AMDGPU/SI: Select constant loads with non-uniform addresses to MUBUF instructions 2015-12-15 20:55:55 +00:00
EvergreenInstructions.td
LLVMBuild.txt
Makefile
Processors.td AMDGPU: Add stony support 2015-11-13 17:06:32 +00:00
R600ClauseMergePass.cpp
R600ControlFlowFinalizer.cpp
R600Defines.h
R600EmitClauseMarkers.cpp
R600ExpandSpecialInstrs.cpp
R600InstrFormats.td
R600InstrInfo.cpp
R600InstrInfo.h
R600Instructions.td
R600Intrinsics.td
R600ISelLowering.cpp
R600ISelLowering.h
R600MachineFunctionInfo.cpp
R600MachineFunctionInfo.h
R600MachineScheduler.cpp
R600MachineScheduler.h
R600OptimizeVectorRegisters.cpp
R600Packetizer.cpp [Packetizer] Add AliasAnalysis as a parameter to the packetizer 2015-12-14 20:35:13 +00:00
R600RegisterInfo.cpp
R600RegisterInfo.h
R600RegisterInfo.td
R600Schedule.td
R600TextureIntrinsicsReplacer.cpp
R700Instructions.td
SIAnnotateControlFlow.cpp
SIDefines.h
SIFixControlFlowLiveIntervals.cpp
SIFixSGPRCopies.cpp AMDGPU/SI: Fold operands with sub-registers 2016-01-07 17:10:29 +00:00
SIFixSGPRLiveRanges.cpp
SIFoldOperands.cpp AMDGPU/SI: Fold operands with sub-registers 2016-01-07 17:10:29 +00:00
SIFrameLowering.cpp AMDGPU/SI: Do not move scratch resource register on Tonga & Iceland 2016-01-05 20:42:49 +00:00
SIFrameLowering.h AMDGPU: Remove SIPrepareScratchRegs 2015-11-30 21:15:53 +00:00
SIInsertWaits.cpp
SIInstrFormats.td
SIInstrInfo.cpp AMDGPU/SI: Fold operands with sub-registers 2016-01-07 17:10:29 +00:00
SIInstrInfo.h AMDGPU: Fix off-by-one in SIRegisterInfo::eliminateFrameIndex 2015-12-17 16:46:42 +00:00
SIInstrInfo.td AMDGPU/SI: Select non-uniform constant addrspace loads to flat instructions for HSA 2016-01-05 03:40:16 +00:00
SIInstructions.td AMDGPU: Remove redundant let mayLoad = 1 2016-01-05 04:50:28 +00:00
SIIntrinsics.td
SIISelLowering.cpp AMDGPU/SI: Select constant loads with non-uniform addresses to MUBUF instructions 2015-12-15 20:55:55 +00:00
SIISelLowering.h AMDGPU/SI: Select constant loads with non-uniform addresses to MUBUF instructions 2015-12-15 20:55:55 +00:00
SILoadStoreOptimizer.cpp
SILowerControlFlow.cpp
SILowerI1Copies.cpp
SIMachineFunctionInfo.cpp AMDGPU: Avoid assertions after SGPR spilling failed 2016-01-04 15:50:01 +00:00
SIMachineFunctionInfo.h AMDGPU: Rework how private buffer passed for HSA 2015-11-30 21:16:03 +00:00
SIRegisterInfo.cpp AMDGPU/SI: Fold operands with sub-registers 2016-01-07 17:10:29 +00:00
SIRegisterInfo.h AMDGPU: Optimize VOP2 operand legalization 2015-12-01 19:57:17 +00:00
SIRegisterInfo.td AMDGPU/SI: Fix encoding for FLAT_SCRATCH registers on VI 2015-12-21 18:44:27 +00:00
SISchedule.td
SIShrinkInstructions.cpp
SITypeRewriter.cpp AMDGPU/SI: Fix crash when inline assembly is used in a graphics shader 2016-01-06 22:01:04 +00:00
VIInstrFormats.td
VIInstructions.td AMDGPU/SI: Move VI SMEM pattern back into VIInstructions.td 2016-01-04 20:23:10 +00:00