The .b8 operations in PTX are far more limiting than I first thought. The mov operation isn't even supported, so there's no way of converting a .pred value into a .b8 without going via .b16, which is
not sensible. An improved implementation needs to use the fact that loads and stores automatically extend and truncate to implement support for EXTLOAD and TRUNCSTORE in order to correctly support
boolean values.
llvm-svn: 133873
Move the target-specific RecordRelocation logic out of the generic MC
MachObjectWriter and into the target-specific object writers. This allows
nuking quite a bit of target knowledge from the supposedly target-independent
bits in lib/MC.
llvm-svn: 133844
Sorry, this was a bad idea. Within clang these builtins are in a separate
"ARM" namespace, but the actual builtin names should clearly distinguish that
they are target specific.
llvm-svn: 133832
The fixup value comes in as the whole 32-bit value, so for the lo16 fixup,
the upper bits need to be masked off. Previously we assumed the masking had
already been done and asserted.
rdar://9635991
llvm-svn: 133818
The i8 type is required for boolean values, but can only use ld, st and mov instructions. The i1 type continues to be used for predicates.
llvm-svn: 133814
instructions can be used to match combinations of multiply/divide and VCVT
(between floating-point and integer, Advanced SIMD). Basically the VCVT
immediate operand that specifies the number of fraction bits corresponds to a
floating-point multiply or divide by the corresponding power of 2.
For example, VCVT (floating-point to fixed-point, Advanced SIMD) can replace a
combination of VMUL and VCVT (floating-point to integer) as follows:
Example (assume d17 = <float 8.000000e+00, float 8.000000e+00>):
vmul.f32 d16, d17, d16
vcvt.s32.f32 d16, d16
becomes:
vcvt.s32.f32 d16, d16, #3
Similarly, VCVT (fixed-point to floating-point, Advanced SIMD) can replace a
combinations of VCVT (integer to floating-point) and VDIV as follows:
Example (assume d17 = <float 8.000000e+00, float 8.000000e+00>):
vcvt.f32.s32 d16, d16
vdiv.f32 d16, d17, d16
becomes:
vcvt.f32.s32 d16, d16, #3
llvm-svn: 133813
.file and .loc directives.
Ideally, we would utilize the existing support in AsmPrinter for this, but
I cannot find a way to get .file and .loc directives to print without the
rest of the associated DWARF sections, which ptxas cannot handle.
llvm-svn: 133812
enables SelectionDAG::getLoad at MipsISelLowering.cpp:1914 to return a
pre-existing node instead of redundantly create a new node every time it is
called.
llvm-svn: 133811
This caused linker errors when linking both libLLVMX86Desc and libLLVMX86CodeGen
into a single binary (for example when building a monolithic libLLVM shared library).
llvm-svn: 133791