mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2024-11-22 10:42:39 +01:00
e40bb79a12
Adds NVPTX builtins and intrinsics for the CUDA PTX `wmma.load`, `wmma.store`, `wmma.mma`, and `mma` instructions added in PTX 6.5 and 7.0. PTX ISA descriptions: `wmma.load`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-wmma-ld ; `wmma.store`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-wmma-st ; `wmma.mma`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-wmma-mma ; `mma`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-mma . Overview of the `wmma.mma` and `mma` matrix shape/type combinations added with specific PTX versions: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-shape . Authored-by: Steffen Larsen <steffen.larsen@codeplay.com>; Co-Authored-by: Stuart Adams <stuart.adams@codeplay.com>; Reviewed By: tra; Differential Revision: https://reviews.llvm.org/D104847 |
||
---|---|---|
ADT | ||
Analysis | ||
AsmParser | ||
BinaryFormat | ||
Bitcode | ||
Bitstream | ||
CodeGen | ||
Config | ||
DebugInfo | ||
Demangle | ||
DWARFLinker | ||
ExecutionEngine | ||
FileCheck | ||
Frontend | ||
FuzzMutate | ||
InterfaceStub | ||
IR | ||
IRReader | ||
LineEditor | ||
Linker | ||
LTO | ||
MC | ||
MCA | ||
Object | ||
ObjectYAML | ||
Option | ||
Passes | ||
ProfileData | ||
Remarks | ||
Support | ||
TableGen | ||
Target | ||
Testing/Support | ||
TextAPI | ||
ToolDrivers | ||
Transforms | ||
WindowsManifest | ||
WindowsResource | ||
XRay | ||
CMakeLists.txt | ||
InitializePasses.h | ||
LinkAllIR.h | ||
LinkAllPasses.h | ||
module.extern.modulemap | ||
module.install.modulemap | ||
module.modulemap | ||
module.modulemap.build | ||
Pass.h | ||
PassAnalysisSupport.h | ||
PassInfo.h | ||
PassRegistry.h | ||
PassSupport.h |