The .b8 operations in PTX are far more limiting than I first thought. The mov operation isn't even supported, so there's no way of converting a .pred value into a .b8 without going via .b16, which is
not sensible. An improved implementation needs to use the fact that loads and stores automatically extend and truncate to implement support for EXTLOAD and TRUNCSTORE in order to correctly support
boolean values.
llvm-svn: 133873
The i8 type is required for boolean values, but can only use ld, st and mov instructions. The i1 type continues to be used for predicates.
llvm-svn: 133814
parameters if SM >= 2.0
- Update test cases to be more robust against register allocation changes
- Bump up the number of registers to 128 per type
- Include Python script to re-generate register file with any number of
registers
llvm-svn: 133736
used by Clang. To help Clang integration, the PTX target has been split
into two targets: ptx32 and ptx64, depending on the desired pointer size.
- Add GCCBuiltin class to all intrinsics
- Split PTX target into ptx32 and ptx64
llvm-svn: 129851
- Emit mad instead of mad.rn for shader model 1.0
- Emit explicit mov.u32 instructions for reading global variables
- (most PTX instructions cannot take global variable immediates)
llvm-svn: 127895
- Allow i16, i32, i64, float, and double types, using the native .u16,
.u32, .u64, .f32, and .f64 PTX types.
- Allow loading/storing of all primitive types.
- Allow primitive types to be passed as parameters.
- Allow selection of PTX Version and Shader Model as sub-target attributes.
- Merge integer/floating-point test cases for load/store.
- Use .u32 instead of .s32 to conform to output from NVidia nvcc compiler.
Patch by Justin Holewinski
llvm-svn: 126824