From c038c5cb7c37c715b33569c9e8c386b0f07729dc Mon Sep 17 00:00:00 2001 From: Justin Lebar Date: Mon, 21 Mar 2016 23:05:15 +0000 Subject: [PATCH] [CUDA] Add documentation explaining how to detect clang vs nvcc. llvm-svn: 264002 --- docs/CompileCudaWithLLVM.rst | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/docs/CompileCudaWithLLVM.rst b/docs/CompileCudaWithLLVM.rst index 159e0c0e5b3..bb1ce041e51 100644 --- a/docs/CompileCudaWithLLVM.rst +++ b/docs/CompileCudaWithLLVM.rst @@ -119,6 +119,34 @@ your GPU `_. For example, if you want to run your program on a GPU with compute capability of 3.5, you should specify ``--cuda-gpu-arch=sm_35``. +Detecting clang vs NVCC +======================= + +Although clang's CUDA implementation is largely compatible with NVCC's, you may +still want to detect when you're compiling CUDA code specifically with clang. + +This is tricky, because clang defines the ``__NVCC__`` macro, and because NVCC +may invoke clang as part of its own compilation process! For example, NVCC +uses the host compiler's preprocessor when compiling for device code, and that +host compiler may in fact be clang. + +When clang is actually compiling CUDA code -- rather than being used as a +subtool of NVCC's -- it defines the ``__CUDA__`` macro. ``__CUDA_ARCH__`` is +defined only in device mode (but will be defined if NVCC is using clang as a +preprocessor). So you can use the following incantations to detect clang CUDA +compilation, in host and device modes: + +.. code-block:: c++ + + #if defined(__clang__) && defined(__CUDA__) && !defined(__CUDA_ARCH__) + // clang compiling CUDA code, host mode. + #endif + + #if defined(__clang__) && defined(__CUDA__) && defined(__CUDA_ARCH__) + // clang compiling CUDA code, device mode. + #endif + + Optimizations =============