==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 4

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IR's", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language. There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.

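For contrast, the sketch below shows a well-formed counterpart of the
instruction above. The function name ``@increment`` and the value name
``%y`` are hypothetical, chosen only for illustration; the point is that
every operand is defined before it is used.

.. code-block:: llvm

    define i32 @increment(i32 %x) {
    entry:
      ; %x is a function argument, so its definition dominates this use.
      %y = add i32 1, %x
      ret i32 %y
    }
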
.. _identifiers:

Identifiers
===========

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][a-zA-Z$._][a-zA-Z$._0-9]*``'. Identifiers that require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a name value, even quotes themselves. The ``"\01"`` prefix
   can be used on global variables to suppress mangling.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.

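As an illustration of the identifier forms listed above, the following
sketch mixes plain, quoted, and unnamed identifiers. All names in it are
hypothetical.

.. code-block:: llvm

    @counter = global i32 0            ; plain named global
    @"two words" = global i32 0        ; quoted: the name contains a space
    @"say \22hi\22" = global i32 0     ; \22 is the hexadecimal escape for '"'

    define i32 @example(i32 %n) {
    entry:
      %"n+1" = add i32 %n, 1           ; quoted local name
      %0 = mul i32 %"n+1", 2           ; unnamed value
      ret i32 %0
    }
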
LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
'``i32``', etc...), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           ; yields i32:%1
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. Unnamed temporaries are numbered sequentially (using a per-function
   incrementing counter, starting with 0). Note that basic blocks and unnamed
   function parameters are included in this numbering. For example, if the
   entry basic block is not given a label name and all function parameters are
   named, then it will get number 0.

It also shows a convention that we follow in this document. When
demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of value produced.

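The numbering rule in the last list item can be seen in a minimal sketch
like the following, where the function name ``@twice`` is hypothetical. The
parameter is named and the entry block has no label, so the block takes
number 0 and the first unnamed temporary is ``%1``.

.. code-block:: llvm

    define i32 @twice(i32 %x) {
      ; The unlabeled entry block is implicitly numbered %0.
      %1 = add i32 %x, %x           ; yields i32:%1
      ret i32 %1
    }
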
High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(i8* nocapture) nounwind

    ; Definition of main function
    define i32 @main() {   ; i32()*
      ; Convert [13 x i8]* to i8 *...
      %cast210 = getelementptr [13 x i8]* @.str, i64 0, i64 0

      ; Call puts function to write out the string to stdout.
      call i32 @puts(i8* %cast210)
      ret i32 0
    }

    ; Named metadata
    !0 = metadata !{i32 42, null, metadata !"string"}
    !foo = !{!0}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. A
    private symbol doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted
    into the object file corresponding to the LLVM module. They exist to
    allow inlining and other optimizations to take place given knowledge
    of the definition of the global, which is known to be somewhere
    outside the module. Globals with ``available_externally`` linkage
    are allowed to be discarded at will, and are otherwise the same as
    ``linkonce_odr``. This linkage type is only allowed on definitions,
    not declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded. Note
    that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but they
    are used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.
``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if not linked, the symbol becomes null
    instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    Some languages allow differing globals to be merged, such as two
    functions with different semantics. Other languages, such as
    ``C++``, ensure that only equivalent globals are ever merged (the
    "one definition rule" --- "ODR"). Such languages can use the
    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
    global will only be merged with equivalent globals. These linkage
    types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

It is illegal for a function *declaration* to have any linkage type
other than ``external`` or ``extern_weak``.

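As a rough illustration of how some of the linkage keywords above appear in
a module, consider the following sketch. All global and function names are
hypothetical.

.. code-block:: llvm

    @scratch = private global i32 0         ; no symbol table entry
    @file_static = internal global i32 0    ; like a C 'static' variable
    @tentative = common global i32 0        ; like 'int tentative;' in C

    ; May be merged with equivalent definitions in other modules.
    define linkonce_odr i32 @get_one() {
      ret i32 1
    }

    ; Declarations may only be external (the default) or extern_weak.
    declare extern_weak void @optional_hook()
    declare void @required_hook()
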
.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers). This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the GHC or the HiPE convention is
    used. <CodeGenerator.html#id80>`_ This calling convention does not
    support varargs and requires the prototype of all callees to exactly
    match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore the inliner doesn't consider such function
    calls for inlining.
"``cc 10``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86 supports this convention and it has the following
    limitations:

    -  On *X86-32* only supports up to 4 bit type parameters. No
       floating point types are supported.
    -  On *X86-64* only supports up to 10 bit type parameters and 6
       floating point parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of the `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``webkit_jscc``" - WebKit's JavaScript calling convention
    This calling convention has been implemented for `WebKit FTL JIT
    <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
    stack right to left (as cdecl does), and returns a value in the
    platform's customary return register.
"``anyregcc``" - Dynamic calling convention for code patching
    This is a special convention that supports patching an arbitrary code
    sequence in place of a call site. This convention forces the call
    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    ``llvm.experimental.patchpoint`` because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
"``preserve_mostcc``" - The `PreserveMost` calling convention
    This calling convention attempts to make the code in the caller as little
    intrusive as possible. This calling convention behaves identically to the `C`
    calling convention on how arguments and return values are passed, but it
    uses a different set of caller/callee-saved registers. This alleviates the
    burden of saving and recovering a large register set before and after the
    call in the caller. If the arguments are passed in callee-saved registers,
    then they will be preserved by the callee across the call. This doesn't
    apply for values returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Floating-point registers
      (XMMs/YMMs) are not preserved and need to be saved by the caller.

    The idea behind this convention is to support calls to runtime functions
    that have a hot path and a cold path. The hot path is usually a small piece
    of code that doesn't use many registers. The cold path might need to call out to
    another function and therefore only needs to preserve the caller-saved
    registers, which haven't already been saved by the caller. The
    `PreserveMost` calling convention is very similar to the `cold` calling
    convention in terms of caller/callee-saved registers, but they are used for
    different types of function calls. `coldcc` is for function calls that are
    rarely executed, whereas `preserve_mostcc` function calls are intended to be
    on the hot path and executed frequently. Furthermore `preserve_mostcc`
    doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the Objective-C
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the Objective-C runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too. The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the Objective-C runtime and should be considered
    experimental at this time.
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.

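The requirement that caller and callee agree on the convention can be
sketched as follows. The function names are hypothetical, and the example
only shows where the keyword goes in a definition and at a call site.

.. code-block:: llvm

    define fastcc i32 @add_one(i32 %x) {
      %r = add i32 %x, 1
      ret i32 %r
    }

    define coldcc void @report_failure() {
      ret void
    }

    define i32 @caller() {
      ; The convention on each call must match the callee's convention.
      %v = call fastcc i32 @add_one(i32 41)
      call coldcc void @report_failure()
      ret i32 %v
    }
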
.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.

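A short sketch of the three visibility keywords, using hypothetical names,
might look like this:

.. code-block:: llvm

    @shared_state = default global i32 0     ; same as omitting the keyword
    @library_state = hidden global i32 0     ; usually kept out of the dynamic symbol table

    ; References from within the defining module bind to this definition.
    define protected i32 @get_version() {
      ret i32 1
    }
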
.. _dllstorageclass:

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    "``dllexport``" causes the compiler to provide a global pointer to a pointer
    in a DLL, so that it can be referenced with the ``dllimport`` attribute. On
    Microsoft Windows targets, the pointer name is formed by combining
    ``__imp_`` and the function or variable name. Since this storage class
    exists for defining a dll interface, the compiler, assembler and linker know
    it is externally referenced and must refrain from deleting the symbol.

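The two storage classes might be used as in the following sketch, where all
symbol names are hypothetical. The keyword follows the optional linkage and
visibility and precedes the rest of the declaration.

.. code-block:: llvm

    ; Symbols provided by another DLL.
    @imported_var = external dllimport global i32
    declare dllimport i32 @imported_fn()

    ; Symbols this module makes available to importers.
    @exported_var = dllexport global i32 0
    define dllexport i32 @exported_fn() {
      ret i32 0
    }
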
.. _tls_model:

Thread Local Storage Models
---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on the circumstances under which the different models may
be used. The target may choose a different TLS model if the specified
model is not supported, or if a better choice of model can be made.

A model can also be specified in an alias, but then it only governs how
the alias is accessed. It will not have any effect on the aliasee.

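For illustration, the sketch below declares thread-local globals with and
without an explicit model. The variable names are hypothetical.

.. code-block:: llvm

    @per_thread = thread_local global i32 0                 ; general dynamic model
    @lib_local = thread_local(localdynamic) global i32 0
    @exe_local = thread_local(localexec) global i32 0
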
.. _namedtypes:

Structure Types
---------------

LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
types <t_struct>`. Literal types are uniqued structurally, but identified types
are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
to forward declare a type that is not yet available.

An example of an identified structure specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
literal types are uniqued in recent versions of LLVM.

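The difference between identified, literal, and opaque structure types can
be sketched as follows. The type and variable names are hypothetical.

.. code-block:: llvm

    %point = type { i32, i32 }      ; identified structure type
    %handle = type opaque           ; forward declaration, body not yet available

    @origin = global %point { i32 0, i32 0 }
    @pair = global { i32, i32 } { i32 1, i32 2 }    ; literal structure type
    @current = global %handle* null
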
.. _globalvars:

Global Variables
----------------

Global variables define regions of memory allocated at compilation time
instead of run-time.

Global variable definitions must be initialized.

Global variables in other translation units can also be declared, in which
case they don't have an initializer.

Either global variable definitions or declarations may have an explicit section
to be placed in and may have an optional explicit alignment specified.

A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
optimization, allowing the global data to be placed in the read-only
section of an executable, etc). Note that variables that need runtime
initialization cannot be marked ``constant`` as there is a store to the
variable.

LLVM explicitly allows *declarations* of global variables to be marked
constant, even if the final definition of the global is not. This
capability can be used to enable slightly better optimization of the
program, but requires the language definition to guarantee that
optimizations based on the 'constantness' are valid for the translation
units that do not include the definition.

As SSA values, global variables define pointer values that are in scope
(i.e. they dominate) all basic blocks in the program. Global variables
always define a pointer to their "content" type because they describe a
region of memory, and all memory objects in LLVM are accessed through
pointers.

Global variables can be marked with ``unnamed_addr`` which indicates
that the address is not significant, only the content. Constants marked
like this can be merged with other constants if they have the same
initializer. Note that a constant with significant address *can* be
merged with an ``unnamed_addr`` constant, the result being a constant
whose address is significant.

A global variable may be declared to reside in a target-specific
numbered address space. For targets that support them, address spaces
may affect how optimizations are performed and/or what target
instructions are used to access the variable. The default address space
is zero. The address space qualifier must precede any other attributes.

LLVM allows an explicit section to be specified for globals. If the
target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the
necessary support.

By default, global initializers are optimized by assuming that global
variables defined within the module are not modified from their
initial values before the start of the global initializer. This is
true even for variables potentially accessible from outside the
module, including those with external linkage or appearing in
``@llvm.used`` or dllexported variables. This assumption may be suppressed
by marking the variable with ``externally_initialized``.

An explicit alignment may be specified for a global, which must be a
power of 2. If not present, or if the alignment is set to zero, the
alignment of the global is set by the target to whatever it feels
convenient. If an explicit alignment is specified, the global is forced
to have exactly that alignment. Targets and optimizers are not allowed
to over-align the global if the global has an assigned section. In this
case, the extra alignment could be observable: for example, code could
assume that the globals are densely packed in their section and try to
iterate over them as an array; alignment padding would break this
iteration. The maximum alignment is ``1 << 29``.

Globals can also have a :ref:`DLL storage class <dllstorageclass>`.

Variables and aliases can have a
:ref:`Thread Local Storage Model <tls_model>`.

Syntax::

    [@<GlobalVarName> =] [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal]
                         [unnamed_addr] [AddrSpace] [ExternallyInitialized]
                         <global | constant> <Type> [<InitializerConstant>]
                         [, section "name"] [, align <Alignment>]

For example, the following defines a global in a numbered address space
with an initializer, section, and alignment:

.. code-block:: llvm

    @G = addrspace(5) constant float 1.0, section "foo", align 4

The following example just declares a global variable:

.. code-block:: llvm

    @G = external global i32

The following example defines a thread-local global with the
``initialexec`` TLS model:

.. code-block:: llvm

    @G = thread_local(initialexec) global i32 0, align 4

.. _functionstructure:
|
|
|
|
Functions
|
|
---------
|
|
|
|
LLVM function definitions consist of the "``define``" keyword, an
|
|
optional :ref:`linkage type <linkage>`, an optional :ref:`visibility
|
|
style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
|
|
an optional :ref:`calling convention <callingconv>`,
|
|
an optional ``unnamed_addr`` attribute, a return type, an optional
|
|
:ref:`parameter attribute <paramattrs>` for the return type, a function
|
|
name, a (possibly empty) argument list (each with optional :ref:`parameter
|
|
attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
|
|
an optional section, an optional alignment,
|
|
an optional :ref:`comdat <langref_comdats>`,
|
|
an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`, an opening
|
|
curly brace, a list of basic blocks, and a closing curly brace.
|
|
|
|
LLVM function declarations consist of the "``declare``" keyword, an
|
|
optional :ref:`linkage type <linkage>`, an optional :ref:`visibility
|
|
style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
|
|
an optional :ref:`calling convention <callingconv>`,
|
|
an optional ``unnamed_addr`` attribute, a return type, an optional
|
|
:ref:`parameter attribute <paramattrs>` for the return type, a function
|
|
name, a possibly empty list of arguments, an optional alignment, an optional
|
|
:ref:`garbage collector name <gc>` and an optional :ref:`prefix <prefixdata>`.
|
|
|
|
A function definition contains a list of basic blocks, forming the CFG (Control
|
|
Flow Graph) for the function. Each basic block may optionally start with a label
|
|
(giving the basic block a symbol table entry), contains a list of instructions,
|
|
and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
|
|
function return). If an explicit label is not provided, a block is assigned an
|
|
implicit numbered label, using the next value from the same counter as used for
|
|
unnamed temporaries (:ref:`see above<identifiers>`). For example, if a function
|
|
entry block does not have an explicit label, it will be assigned label "%0",
|
|
then the first unnamed temporary in that block will be "%1", etc.
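
As a small illustration of this numbering (the function name ``@forty_two``
is hypothetical, chosen only for this sketch), the unlabeled entry block
below implicitly takes the name ``%0``, so the first unnamed temporary is
``%1``:

.. code-block:: llvm

    define i32 @forty_two() {
      ; No explicit label: this entry block is implicitly %0.
      %1 = add i32 40, 2
      ret i32 %1
    }

Numbering the temporary ``%0`` instead would be expected to fail to parse,
since that number has already been consumed by the block.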
|
|
|
|
The first basic block in a function is special in two ways: it is
|
|
immediately executed on entrance to the function, and it is not allowed
|
|
to have predecessor basic blocks (i.e. there can not be any branches to
|
|
the entry block of a function). Because the block can have no
|
|
predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.
|
|
|
|
LLVM allows an explicit section to be specified for functions. If the
target supports it, it will emit functions to the section specified.
Additionally, the function can be placed in a COMDAT.
|
|
|
|
An explicit alignment may be specified for a function. If not present,
or if the alignment is set to zero, the alignment of the function is
chosen by the target. If an explicit alignment is specified, the
function is forced to have at least that much alignment. All alignments
must be a power of 2.
|
|
|
|
If the ``unnamed_addr`` attribute is given, the address is known to not
be significant and two identical functions can be merged.
|
|
|
|
Syntax::

    define [linkage] [visibility] [DLLStorageClass]
           [cconv] [ret attrs]
           <ResultType> @<FunctionName> ([argument list])
           [unnamed_addr] [fn Attrs] [section "name"] [comdat $<ComdatName>]
           [align N] [gc] [prefix Constant] { ... }
|
|
|
|
The argument list is a comma-separated sequence of arguments, where each
argument is of the following form:

Syntax::

    <type> [parameter Attrs] [name]
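
The following sketch puts several of the optional pieces of a definition
together in the order given by the syntax above. The function name
``@clamp_to_byte`` and the section name ``.text.hot`` are hypothetical and
serve only as an illustration:

.. code-block:: llvm

    ; linkage, calling convention, return attribute, return type, name,
    ; argument list, unnamed_addr, function attribute, section, alignment.
    define internal fastcc zeroext i8 @clamp_to_byte(i32 %x)
        unnamed_addr nounwind section ".text.hot" align 16 {
    entry:
      %fits = icmp ult i32 %x, 255
      %capped = select i1 %fits, i32 %x, i32 255
      %byte = trunc i32 %capped to i8
      ret i8 %byte
    }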
|
|
|
|
|
|
.. _langref_aliases:

Aliases
-------
|
|
|
|
Aliases, unlike functions or variables, don't create any new data. They
are just a new symbol and metadata for an existing position.
|
|
|
|
Aliases have a name and an aliasee that is either a global value or a
|
|
constant expression.
|
|
|
|
Aliases may have an optional :ref:`linkage type <linkage>`, an optional
|
|
:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
|
|
<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.
|
|
|
|
Syntax::

    @<Name> = [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal] [unnamed_addr] alias <AliaseeTy> @<Aliasee>
|
|
|
|
The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
|
|
``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers
|
|
might not correctly handle dropping a weak symbol that is aliased.
|
|
|
|
Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
to the same content.
|
|
|
|
Since aliases are only a second name, some restrictions apply, of which
|
|
some can only be checked when producing an object file:
|
|
|
|
* The expression defining the aliasee must be computable at assembly
|
|
time. Since it is just a name, no relocations can be used.
|
|
|
|
* No alias in the expression can be weak as the possibility of the
|
|
intermediate alias being overridden cannot be represented in an
|
|
object file.
|
|
|
|
* No global value in the expression can be a declaration, since that
|
|
would require a relocation, which is not possible.
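
As a minimal sketch of the alias syntax (all global names here are
hypothetical), the following module defines one variable and two aliases
to it. The aliasee is a definition, not a declaration, and neither alias
is weak, so the restrictions listed above should be satisfied:

.. code-block:: llvm

    @counter = global i32 0

    ; A second name for @counter; both symbols share one address.
    @counter_alias = alias i32* @counter

    ; Only the contents are significant for this alias, not its address.
    @counter_view = internal unnamed_addr alias i32* @counter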
|
|
|
|
.. _langref_comdats:

Comdats
-------
|
|
|
|
Comdat IR provides access to COFF and ELF object file COMDAT functionality.
|
|
|
|
Comdats have a name which represents the COMDAT key. All global objects that
|
|
specify this key will only end up in the final object file if the linker chooses
|
|
that key over some other key. Aliases are placed in the same COMDAT that their
|
|
aliasee computes to, if any.
|
|
|
|
Comdats have a selection kind to provide input on how the linker should
|
|
choose between keys in two different object files.
|
|
|
|
Syntax::

    $<Name> = comdat SelectionKind
|
|
|
|
The selection kind must be one of the following:
|
|
|
|
``any``
    The linker may choose any COMDAT key; the choice is arbitrary.
``exactmatch``
    The linker may choose any COMDAT key but the sections must contain the
    same data.
``largest``
    The linker will choose the section containing the largest COMDAT key.
``noduplicates``
    The linker requires that only one section with this COMDAT key exist.
``samesize``
    The linker may choose any COMDAT key but the sections must contain the
    same amount of data.
|
|
|
|
Note that the Mach-O platform doesn't support COMDATs and ELF only supports
|
|
``any`` as a selection kind.
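
Because ``any`` is the one selection kind available on both COFF and ELF,
it is the natural choice for portable IR. A minimal sketch, assuming a
hypothetical function ``@get_answer`` that may be emitted by several
translation units:

.. code-block:: llvm

    $get_answer = comdat any

    ; Each object file may carry a copy; the linker keeps one of them.
    define linkonce_odr i32 @get_answer() comdat $get_answer {
      ret i32 42
    }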
|
|
|
|
Here is an example of a COMDAT group where a function will only be selected if
|
|
the COMDAT key's section is the largest:
|
|
|
|
.. code-block:: llvm

    $foo = comdat largest
    @foo = global i32 2, comdat $foo

    define void @bar() comdat $foo {
      ret void
    }
|
|
|
|
In a COFF object file, this will create a COMDAT section with selection kind
|
|
``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
|
|
and another COMDAT section with selection kind
|
|
``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
|
|
section and contains the contents of the ``@bar`` symbol.
|
|
|
|
There are some restrictions on the properties of the global object.
|
|
It, or an alias to it, must have the same name as the COMDAT group when
|
|
targeting COFF.
|
|
The contents and size of this object may be used during link-time to determine
|
|
which COMDAT groups get selected depending on the selection kind.
|
|
Because the name of the object must match the name of the COMDAT group, the
|
|
linkage of the global object must not be local; local symbols can get renamed
|
|
if a collision occurs in the symbol table.
|
|
|
|
The combined use of COMDATs and section attributes may yield surprising results.
For example:

.. code-block:: llvm

    $foo = comdat any
    $bar = comdat any
    @g1 = global i32 42, section "sec", comdat $foo
    @g2 = global i32 42, section "sec", comdat $bar
|
|
|
|
From the object file perspective, this requires the creation of two sections
|
|
with the same name. This is necessary because both globals belong to different
|
|
COMDAT groups and COMDATs, at the object file level, are represented by
|
|
sections.
|
|
|
|
Note that certain IR constructs like global variables and functions may create
COMDATs in the object file in addition to any which are specified using COMDAT
IR. This arises, for example, when a global variable has ``linkonce_odr`` linkage.
|
|
|
|
.. _namedmetadatastructure:

Named Metadata
--------------
|
|
|
|
Named metadata is a collection of metadata. :ref:`Metadata
|
|
nodes <metadata>` (but not metadata strings) are the only valid
|
|
operands for a named metadata.
|
|
|
|
Syntax::

    ; Some unnamed metadata nodes, which are referenced by the named metadata.
    !0 = metadata !{metadata !"zero"}
    !1 = metadata !{metadata !"one"}
    !2 = metadata !{metadata !"two"}
    ; A named metadata.
    !name = !{!0, !1, !2}
|
|
|
|
.. _paramattrs:

Parameter Attributes
--------------------
|
|
|
|
The return type and each parameter of a function type may have a set of
|
|
*parameter attributes* associated with them. Parameter attributes are
|
|
used to communicate additional information about the result or
|
|
parameters of a function. Parameter attributes are considered to be part
|
|
of the function, not of the function type, so functions with different
|
|
parameter attributes can have the same function type.
|
|
|
|
Parameter attributes are simple keywords that follow the type specified.
|
|
If multiple parameter attributes are needed, they are space separated.
|
|
For example:
|
|
|
|
.. code-block:: llvm

    declare i32 @printf(i8* noalias nocapture, ...)
    declare i32 @atoi(i8 zeroext)
    declare signext i8 @returns_signed_char()
|
|
|
|
Note that attributes that apply to the function as a whole (such as
``nounwind`` or ``readonly``) are :ref:`function attributes <fnattrs>` and
come immediately after the argument list.
|
|
|
|
Currently, only the following parameter attributes are defined:
|
|
|
|
``zeroext``
    This indicates to the code generator that the parameter or return
    value should be zero-extended to the extent required by the target's
    ABI (which is usually 32 bits, but is 8 bits for an ``i1`` on x86-64) by
    the caller (for a parameter) or the callee (for a return value).
``signext``
    This indicates to the code generator that the parameter or return
    value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
    the callee (for a return value).
``inreg``
    This indicates that this parameter or return value should be treated
    in a special target-dependent fashion while emitting code for
    a function call or return (usually, by putting it in a register as
    opposed to memory, though some targets use it to distinguish between
    two different kinds of registers). Use of this attribute is
    target-specific.
|
|
``byval``
    This indicates that the pointer parameter should really be passed by
    value to the function. The attribute implies that a hidden copy of
    the pointee is made between the caller and the callee, so the callee
    is unable to modify the value in the caller. This attribute is only
    valid on LLVM pointer arguments. It is generally used to pass
    structs and arrays by value, but is also valid on pointers to
    scalars. The copy is considered to belong to the caller, not the
    callee (for example, ``readonly`` functions should not write to
    ``byval`` parameters). This is not a valid attribute for return
    values.

    The ``byval`` attribute also supports specifying an alignment with the
    ``align`` attribute. It indicates the alignment of the stack slot to
    form and the known alignment of the pointer specified to the call
    site. If the alignment is not specified, then the code generator
    makes a target-specific assumption.
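
    A minimal sketch of passing a struct by value, assuming a hypothetical
    type ``%struct.Point`` and a hypothetical external function ``@consume``.
    The attribute appears both on the declaration and at the call site:

    .. code-block:: llvm

        %struct.Point = type { i32, i32 }

        declare void @consume(%struct.Point* byval align 4)

        define void @caller() {
          %p = alloca %struct.Point, align 4
          %x = getelementptr inbounds %struct.Point* %p, i32 0, i32 0
          store i32 1, i32* %x
          ; The callee works on a hidden copy, so %p is unchanged afterwards.
          call void @consume(%struct.Point* byval align 4 %p)
          ret void
        }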
|
|
|
|
.. _attr_inalloca:
|
|
|
|
``inalloca``
|
|
|
|
The ``inalloca`` argument attribute allows the caller to take the
|
|
address of outgoing stack arguments. An ``inalloca`` argument must
|
|
be a pointer to stack memory produced by an ``alloca`` instruction.
|
|
The alloca, or argument allocation, must also be tagged with the
|
|
inalloca keyword. Only the last argument may have the ``inalloca``
|
|
attribute, and that argument is guaranteed to be passed in memory.
|
|
|
|
An argument allocation may be used by a call at most once because
|
|
the call may deallocate it. The ``inalloca`` attribute cannot be
|
|
used in conjunction with other attributes that affect argument
|
|
storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
|
|
``inalloca`` attribute also disables LLVM's implicit lowering of
|
|
large aggregate return values, which means that frontend authors
|
|
must lower them with ``sret`` pointers.
|
|
|
|
When the call site is reached, the argument allocation must have
|
|
been the most recent stack allocation that is still live, or the
|
|
results are undefined. It is possible to allocate additional stack
|
|
space after an argument allocation and before its call site, but it
|
|
must be cleared off with :ref:`llvm.stackrestore
|
|
<int_stackrestore>`.
|
|
|
|
See :doc:`InAlloca` for more information on how to use this
|
|
attribute.
|
|
|
|
``sret``
    This indicates that the pointer parameter specifies the address of a
    structure that is the return value of the function in the source
    program. This pointer must be guaranteed by the caller to be valid:
    loads and stores to the structure may be assumed by the callee
    not to trap and to be properly aligned. This may only be applied to
    the first parameter. This is not a valid attribute for return
    values.
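
    As a sketch of how a frontend might lower a source-level function that
    returns a struct (the names ``%struct.Pair`` and ``@make_pair`` are
    hypothetical), the result is written through the first parameter and
    the IR function itself returns ``void``:

    .. code-block:: llvm

        %struct.Pair = type { i64, i64 }

        define void @make_pair(%struct.Pair* noalias sret %out, i64 %a, i64 %b) {
          %first = getelementptr inbounds %struct.Pair* %out, i32 0, i32 0
          %second = getelementptr inbounds %struct.Pair* %out, i32 0, i32 1
          store i64 %a, i64* %first
          store i64 %b, i64* %second
          ret void
        }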
|
|
|
|
``align <n>``
|
|
This indicates that the pointer value may be assumed by the optimizer to
|
|
have the specified alignment.
|
|
|
|
Note that this attribute has additional semantics when combined with the
|
|
``byval`` attribute.
|
|
|
|
.. _noalias:
|
|
|
|
``noalias``
    This indicates that pointer values :ref:`based <pointeraliasing>` on
    the argument or return value do not alias pointer values that are
    not *based* on it, ignoring certain "irrelevant" dependencies. For a
    call to the parent function, dependencies between memory references
    from before or after the call and from those during the call are
    "irrelevant" to the ``noalias`` keyword for the arguments and return
    value used in that call. The caller shares the responsibility with
    the callee for ensuring that these requirements are met. For further
    details, please see the discussion of the NoAlias response in :ref:`alias
    analysis <Must, May, or No>`.

    Note that this definition of ``noalias`` is intentionally similar
    to the definition of ``restrict`` in C99 for function arguments,
    though it is slightly weaker.

    For function return values, C99's ``restrict`` is not meaningful,
    while LLVM's ``noalias`` is.
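
    A minimal sketch, assuming a hypothetical function ``@copy_one`` that
    roughly corresponds to a C function taking two ``restrict`` pointers.
    Because both arguments are ``noalias``, the load and the store below
    may be treated as independent of each other:

    .. code-block:: llvm

        define void @copy_one(i32* noalias nocapture %dst,
                              i32* noalias nocapture readonly %src) {
          %v = load i32* %src
          store i32 %v, i32* %dst
          ret void
        }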
|
|
``nocapture``
|
|
This indicates that the callee does not make any copies of the
|
|
pointer that outlive the callee itself. This is not a valid
|
|
attribute for return values.
|
|
|
|
.. _nest:
|
|
|
|
``nest``
|
|
This indicates that the pointer parameter can be excised using the
|
|
:ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
|
|
attribute for return values and can only be applied to one parameter.
|
|
|
|
``returned``
|
|
This indicates that the function always returns the argument as its return
|
|
value. This is an optimization hint to the code generator when generating
|
|
the caller, allowing tail call optimization and omission of register saves
|
|
and restores in some cases; it is not checked or enforced when generating
|
|
the callee. The parameter and the function return type must be valid
|
|
operands for the :ref:`bitcast instruction <i_bitcast>`. This is not a
|
|
valid attribute for return values and can only be applied to one parameter.
|
|
|
|
``nonnull``
    This indicates that the parameter or return pointer is not null. This
    attribute may only be applied to pointer-typed parameters. This is not
    checked or enforced by LLVM; the caller must ensure that the pointer
    passed in is non-null, or the callee must ensure that the returned pointer
    is non-null.

``dereferenceable(<n>)``
    This indicates that the parameter or return pointer is dereferenceable. This
    attribute may only be applied to pointer-typed parameters. A pointer that
    is dereferenceable can be loaded from speculatively without a risk of
    trapping. The number of bytes known to be dereferenceable must be provided
    in parentheses. It is legal for the number of bytes to be less than the
    size of the pointee type. The ``nonnull`` attribute does not imply
    dereferenceability (consider a pointer to one element past the end of an
    array); however, ``dereferenceable(<n>)`` does imply ``nonnull`` in
    ``addrspace(0)`` (which is the default address space).
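
    The difference between the two attributes can be sketched as follows
    (``@first_element`` and ``@use_name`` are hypothetical names). The first
    function promises four readable bytes behind ``%p``; the second only
    promises that its argument is not null:

    .. code-block:: llvm

        define i32 @first_element(i32* dereferenceable(4) %p) {
          ; This load may be hoisted or speculated, since it cannot trap.
          %v = load i32* %p
          ret i32 %v
        }

        declare void @use_name(i8* nonnull)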
|
|
|
|
.. _gc:

Garbage Collector Names
-----------------------
|
|
|
|
Each function may specify a garbage collector name, which is simply a
|
|
string:
|
|
|
|
.. code-block:: llvm

    define void @f() gc "name" { ... }
|
|
|
|
The compiler declares the supported values of *name*. Specifying a
|
|
collector will cause the compiler to alter its output in order to
|
|
support the named garbage collection algorithm.
|
|
|
|
.. _prefixdata:

Prefix Data
-----------
|
|
|
|
Prefix data is data associated with a function which the code generator
|
|
will emit immediately before the function body. The purpose of this feature
|
|
is to allow frontends to associate language-specific runtime metadata with
|
|
specific functions and make it available through the function pointer while
|
|
still allowing the function pointer to be called. To access the data for a
|
|
given function, a program may bitcast the function pointer to a pointer to
|
|
the constant's type. This implies that the IR symbol points to the start
|
|
of the prefix data.
|
|
|
|
To maintain the semantics of ordinary function calls, the prefix data must
|
|
have a particular format. Specifically, it must begin with a sequence of
|
|
bytes which decode to a sequence of machine instructions, valid for the
|
|
module's target, which transfer control to the point immediately succeeding
|
|
the prefix data, without performing any other visible action. This allows
|
|
the inliner and other passes to reason about the semantics of the function
|
|
definition without needing to reason about the prefix data. Obviously this
|
|
makes the format of the prefix data highly target dependent.
|
|
|
|
Prefix data is laid out as if it were an initializer for a global variable
|
|
of the prefix data's type. No padding is automatically placed between the
|
|
prefix data and the function body. If padding is required, it must be part
|
|
of the prefix data.
|
|
|
|
A trivial example of valid prefix data for the x86 architecture is ``i8 144``,
|
|
which encodes the ``nop`` instruction:
|
|
|
|
.. code-block:: llvm

    define void @f() prefix i8 144 { ... }
|
|
|
|
Generally prefix data can be formed by encoding a relative branch instruction
|
|
which skips the metadata, as in this example of valid prefix data for the
|
|
x86_64 architecture, where the first two bytes encode ``jmp .+10``:
|
|
|
|
.. code-block:: llvm

    %0 = type <{ i8, i8, i8* }>

    define void @f() prefix %0 <{ i8 235, i8 8, i8* @md}> { ... }
|
|
|
|
A function may have prefix data but no body. This has similar semantics
|
|
to the ``available_externally`` linkage in that the data may be used by the
|
|
optimizers but will not be emitted in the object file.
|
|
|
|
.. _attrgrp:

Attribute Groups
----------------
|
|
|
|
Attribute groups are groups of attributes that are referenced by objects within
the IR. They are important for keeping ``.ll`` files readable, because a lot of
functions will use the same set of attributes. In the degenerate case of a
``.ll`` file that corresponds to a single ``.c`` file, the single attribute
group will capture the important command line flags used to build that file.
|
|
|
|
An attribute group is a module-level object. To use an attribute group, an
|
|
object references the attribute group's ID (e.g. ``#37``). An object may refer
|
|
to more than one attribute group. In that situation, the attributes from the
|
|
different groups are merged.
|
|
|
|
Here is an example of attribute groups for a function that should always be
|
|
inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:
|
|
|
|
.. code-block:: llvm

    ; Target-independent attributes:
    attributes #0 = { alwaysinline alignstack=4 }

    ; Target-dependent attributes:
    attributes #1 = { "no-sse" }

    ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
    define void @f() #0 #1 { ... }
|
|
|
|
.. _fnattrs:

Function Attributes
-------------------
|
|
|
|
Function attributes are set to communicate additional information about
|
|
a function. Function attributes are considered to be part of the
|
|
function, not of the function type, so functions with different function
|
|
attributes can have the same function type.
|
|
|
|
Function attributes are simple keywords that follow the function's
argument list. If multiple attributes are needed, they are space
separated. For example:
|
|
|
|
.. code-block:: llvm

    define void @f() noinline { ... }
    define void @f() alwaysinline { ... }
    define void @f() alwaysinline optsize { ... }
    define void @f() optsize { ... }
|
|
|
|
``alignstack(<n>)``
|
|
This attribute indicates that, when emitting the prologue and
|
|
epilogue, the backend should forcibly align the stack pointer.
|
|
Specify the desired alignment, which must be a power of two, in
|
|
parentheses.
|
|
``alwaysinline``
|
|
This attribute indicates that the inliner should attempt to inline
|
|
this function into callers whenever possible, ignoring any active
|
|
inlining size threshold for this caller.
|
|
``builtin``
|
|
This indicates that the callee function at a call site should be
|
|
recognized as a built-in function, even though the function's declaration
|
|
uses the ``nobuiltin`` attribute. This is only valid at call sites for
|
|
direct calls to functions that are declared with the ``nobuiltin``
|
|
attribute.
|
|
``cold``
|
|
This attribute indicates that this function is rarely called. When
|
|
computing edge weights, basic blocks post-dominated by a cold
|
|
function call are also considered to be cold; and, thus, given low
|
|
weight.
|
|
``inlinehint``
|
|
This attribute indicates that the source code contained a hint that
|
|
inlining this function is desirable (such as the "inline" keyword in
|
|
C/C++). It is just a hint; it imposes no requirements on the
|
|
inliner.
|
|
``jumptable``
|
|
This attribute indicates that the function should be added to a
|
|
jump-instruction table at code-generation time, and that all address-taken
|
|
references to this function should be replaced with a reference to the
|
|
appropriate jump-instruction-table function pointer. Note that this creates
|
|
a new pointer for the original function, which means that code that depends
|
|
on function-pointer identity can break. So, any function annotated with
|
|
``jumptable`` must also be ``unnamed_addr``.
|
|
``minsize``
|
|
This attribute suggests that optimization passes and code generator
|
|
passes make choices that keep the code size of this function as small
|
|
as possible and perform optimizations that may sacrifice runtime
|
|
performance in order to minimize the size of the generated code.
|
|
``naked``
|
|
This attribute disables prologue / epilogue emission for the
|
|
function. This can have very system-specific consequences.
|
|
``nobuiltin``
    This indicates that the callee function at a call site is not recognized as
    a built-in function. LLVM will retain the original call and not replace it
    with equivalent code based on the semantics of the built-in function, unless
    the call site uses the ``builtin`` attribute. This is valid at call sites
    and on function declarations and definitions.
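
    The interaction between ``nobuiltin`` and ``builtin`` can be sketched as
    follows. The wrapper ``@alloc16`` is a hypothetical name, and using
    ``@malloc`` as the callee is only an illustration:

    .. code-block:: llvm

        ; By default, calls to this declaration are not treated as built-ins.
        declare noalias i8* @malloc(i64) nobuiltin

        define i8* @alloc16() {
          ; This one direct call opts back in to built-in recognition.
          %p = call noalias i8* @malloc(i64 16) builtin
          ret i8* %p
        }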
|
|
``noduplicate``
|
|
This attribute indicates that calls to the function cannot be
|
|
duplicated. A call to a ``noduplicate`` function may be moved
|
|
within its parent function, but may not be duplicated within
|
|
its parent function.
|
|
|
|
A function containing a ``noduplicate`` call may still
|
|
be an inlining candidate, provided that the call is not
|
|
duplicated by inlining. That implies that the function has
|
|
internal linkage and only has one call site, so the original
|
|
call is dead after inlining.
|
|
``noimplicitfloat``
    This attribute disables implicit floating point instructions.
|
|
``noinline``
|
|
This attribute indicates that the inliner should never inline this
|
|
function in any situation. This attribute may not be used together
|
|
with the ``alwaysinline`` attribute.
|
|
``nonlazybind``
|
|
This attribute suppresses lazy symbol binding for the function. This
|
|
may make calls to the function faster, at the cost of extra program
|
|
startup time if the function is not called during program startup.
|
|
``noredzone``
|
|
This attribute indicates that the code generator should not use a
|
|
red zone, even if the target-specific ABI normally permits it.
|
|
``noreturn``
|
|
This function attribute indicates that the function never returns
|
|
normally. This produces undefined behavior at runtime if the
|
|
function ever does dynamically return.
|
|
``nounwind``
|
|
This function attribute indicates that the function never returns
|
|
with an unwind or exceptional control flow. If the function does
|
|
unwind, its runtime behavior is undefined.
|
|
``optnone``
    This function attribute indicates that the function is not optimized
    by any optimization or code generator passes with the
    exception of interprocedural optimization passes.
    This attribute cannot be used together with the ``alwaysinline``
    attribute; this attribute is also incompatible
    with the ``minsize`` attribute and the ``optsize`` attribute.

    This attribute requires the ``noinline`` attribute to be specified on
    the function as well, so the function is never inlined into any caller.
    Only functions with the ``alwaysinline`` attribute are valid
    candidates for inlining into the body of this function.
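
    Since ``optnone`` requires ``noinline``, the two attributes appear
    together in practice. A minimal sketch with a hypothetical function name:

    .. code-block:: llvm

        define void @keep_as_written() noinline optnone {
          ret void
        }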
|
|
``optsize``
|
|
This attribute suggests that optimization passes and code generator
|
|
passes make choices that keep the code size of this function low,
|
|
and otherwise do optimizations specifically to reduce code size as
|
|
long as they do not significantly impact runtime performance.
|
|
``readnone``
|
|
On a function, this attribute indicates that the function computes its
|
|
result (or decides to unwind an exception) based strictly on its arguments,
|
|
without dereferencing any pointer arguments or otherwise accessing
|
|
any mutable state (e.g. memory, control registers, etc) visible to
|
|
caller functions. It does not write through any pointer arguments
|
|
(including ``byval`` arguments) and never changes any state visible
|
|
to callers. This means that it cannot unwind exceptions by calling
|
|
the ``C++`` exception throwing methods.
|
|
|
|
On an argument, this attribute indicates that the function does not
|
|
dereference that pointer argument, even though it may read or write the
|
|
memory that the pointer points to if accessed through other pointers.
|
|
``readonly``
|
|
On a function, this attribute indicates that the function does not write
|
|
through any pointer arguments (including ``byval`` arguments) or otherwise
|
|
modify any state (e.g. memory, control registers, etc) visible to
|
|
caller functions. It may dereference pointer arguments and read
|
|
state that may be set in the caller. A readonly function always
|
|
returns the same value (or unwinds an exception identically) when
|
|
called with the same set of arguments and global state. It cannot
|
|
unwind an exception by calling the ``C++`` exception throwing
|
|
methods.
|
|
|
|
On an argument, this attribute indicates that the function does not write
|
|
through this pointer argument, even though it may write to the memory that
|
|
the pointer points to.
|
|
``returns_twice``
|
|
This attribute indicates that this function can return twice. The C
|
|
``setjmp`` is an example of such a function. The compiler disables
|
|
some optimizations (like tail calls) in the caller of these
|
|
functions.
|
|
``sanitize_address``
|
|
This attribute indicates that AddressSanitizer checks
|
|
(dynamic address safety analysis) are enabled for this function.
|
|
``sanitize_memory``
|
|
This attribute indicates that MemorySanitizer checks (dynamic detection
|
|
of accesses to uninitialized memory) are enabled for this function.
|
|
``sanitize_thread``
|
|
This attribute indicates that ThreadSanitizer checks
|
|
(dynamic thread safety analysis) are enabled for this function.
|
|
``ssp``
|
|
This attribute indicates that the function should emit a stack
|
|
smashing protector. It is in the form of a "canary" --- a random value
|
|
placed on the stack before the local variables that's checked upon
|
|
return from the function to see if it has been overwritten. A
|
|
heuristic is used to determine if a function needs stack protectors
|
|
or not. The heuristic used will enable protectors for functions with:
|
|
|
|
- Character arrays larger than ``ssp-buffer-size`` (default 8).
|
|
- Aggregates containing character arrays larger than ``ssp-buffer-size``.
|
|
- Calls to alloca() with variable sizes or constant sizes greater than
|
|
``ssp-buffer-size``.
|
|
|
|
Variables that are identified as requiring a protector will be arranged
|
|
on the stack such that they are adjacent to the stack protector guard.
|
|
|
|
If a function that has an ``ssp`` attribute is inlined into a
|
|
function that doesn't have an ``ssp`` attribute, then the resulting
|
|
function will have an ``ssp`` attribute.
|
|
``sspreq``
|
|
This attribute indicates that the function should *always* emit a
|
|
stack smashing protector. This overrides the ``ssp`` function
|
|
attribute.
|
|
|
|
Variables that are identified as requiring a protector will be arranged
|
|
on the stack such that they are adjacent to the stack protector guard.
|
|
The specific layout rules are:
|
|
|
|
#. Large arrays and structures containing large arrays
|
|
(``>= ssp-buffer-size``) are closest to the stack protector.
|
|
#. Small arrays and structures containing small arrays
|
|
(``< ssp-buffer-size``) are 2nd closest to the protector.
|
|
#. Variables that have had their address taken are 3rd closest to the
|
|
protector.
|
|
|
|
If a function that has an ``sspreq`` attribute is inlined into a
|
|
function that doesn't have an ``sspreq`` attribute or which has an
|
|
``ssp`` or ``sspstrong`` attribute, then the resulting function will have
|
|
an ``sspreq`` attribute.
|
|
``sspstrong``
|
|
This attribute indicates that the function should emit a stack smashing
|
|
protector. This attribute causes a strong heuristic to be used when
|
|
determining if a function needs stack protectors. The strong heuristic
|
|
will enable protectors for functions with:
|
|
|
|
- Arrays of any size and type
|
|
- Aggregates containing an array of any size and type.
|
|
- Calls to alloca().
|
|
- Local variables that have had their address taken.
|
|
|
|
Variables that are identified as requiring a protector will be arranged
|
|
on the stack such that they are adjacent to the stack protector guard.
|
|
The specific layout rules are:
|
|
|
|
#. Large arrays and structures containing large arrays
|
|
(``>= ssp-buffer-size``) are closest to the stack protector.
|
|
#. Small arrays and structures containing small arrays
|
|
(``< ssp-buffer-size``) are 2nd closest to the protector.
|
|
#. Variables that have had their address taken are 3rd closest to the
|
|
protector.
|
|
|
|
This overrides the ``ssp`` function attribute.
|
|
|
|
If a function that has an ``sspstrong`` attribute is inlined into a
|
|
function that doesn't have an ``sspstrong`` attribute, then the
|
|
resulting function will have an ``sspstrong`` attribute.
|
|
``uwtable``
    This attribute indicates that the ABI being targeted requires that
    an unwind table entry be produced for this function even if we can
    show that no exceptions pass through it. This is normally the case for
    the ELF x86-64 ABI, but it can be disabled for some compilation
    units.
|
|
|
|
.. _moduleasm:

Module-Level Inline Assembly
----------------------------
|
|
|
|
Modules may contain "module-level inline asm" blocks, which correspond
to the GCC "file scope inline asm" blocks. These blocks are internally
concatenated by LLVM and treated as a single unit, but may be separated
in the ``.ll`` file if desired. The syntax is very simple:
|
|
|
|
.. code-block:: llvm

    module asm "inline asm code goes here"
    module asm "more can go here"
|
|
|
|
The strings can contain any character by escaping non-printable
|
|
characters. The escape sequence used is simply "\\xx" where "xx" is the
|
|
two digit hex code for the number.
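
As a sketch of the escape syntax, the block below embeds a newline
(``\0A``) and a tab (``\09``) in a single string. The symbol name
``my_marker`` is hypothetical, and the assembly text assumes a target
whose assembler accepts GNU-style directives and a ``nop`` mnemonic:

.. code-block:: llvm

    ; Expands to three assembly lines: a directive, a label, and an
    ; indented instruction.
    module asm ".globl my_marker\0Amy_marker:\0A\09nop"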
|
|
|
|
The inline asm code is simply printed to the machine code ``.s`` file when
assembly code is generated.
|
|
|
|
.. _langref_datalayout:

Data Layout
-----------
|
|
|
|
A module may specify a target specific data layout string that specifies
|
|
how data is to be laid out in memory. The syntax for the data layout is
|
|
simply:
|
|
|
|
.. code-block:: llvm
|
|
|
|
target datalayout = "layout specification"
|
|
|
|
The *layout specification* consists of a list of specifications
|
|
separated by the minus sign character ('-'). Each specification starts
|
|
with a letter and may include other information after the letter to
|
|
define some aspect of the data layout. The specifications accepted are
|
|
as follows:
|
|
|
|
``E``
|
|
Specifies that the target lays out data in big-endian form. That is,
|
|
the most significant byte is stored at the lowest address
|
|
location.
|
|
``e``
|
|
Specifies that the target lays out data in little-endian form. That
|
|
is, the least significant byte is stored at the lowest address
|
|
location.
|
|
``S<size>``
|
|
Specifies the natural alignment of the stack in bits. Alignment
|
|
promotion of stack variables is limited to the natural stack
|
|
alignment to avoid dynamic stack realignment. The stack alignment
|
|
must be a multiple of 8 bits. If omitted, the natural stack
|
|
alignment defaults to "unspecified", which does not prevent any
|
|
alignment promotions.
|
|
``p[n]:<size>:<abi>:<pref>``
|
|
This specifies the *size* of a pointer and its ``<abi>`` and
|
|
``<pref>``\erred alignments for address space ``n``. All sizes are in
|
|
bits. The address space, ``n``, is optional and, if not specified,
|
|
denotes the default address space 0. The value of ``n`` must be
|
|
in the range [1,2^23).
|
|
``i<size>:<abi>:<pref>``
|
|
This specifies the alignment for an integer type of a given bit
|
|
``<size>``. The value of ``<size>`` must be in the range [1,2^23).
|
|
``v<size>:<abi>:<pref>``
|
|
This specifies the alignment for a vector type of a given bit
|
|
``<size>``.
|
|
``f<size>:<abi>:<pref>``
|
|
This specifies the alignment for a floating point type of a given bit
|
|
``<size>``. Only values of ``<size>`` that are supported by the target
|
|
will work. 32 (float) and 64 (double) are supported on all targets; 80
|
|
or 128 (different flavors of long double) are also supported on some
|
|
targets.
|
|
``a:<abi>:<pref>``
|
|
This specifies the alignment for an object of aggregate type.
|
|
``m:<mangling>``
|
|
If present, specifies that LLVM names are mangled in the output. The
|
|
options are:
|
|
|
|
* ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
|
|
* ``m``: Mips mangling: Private symbols get a ``$`` prefix.
|
|
* ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
|
|
symbols get a ``_`` prefix.
|
|
* ``w``: Windows COFF mangling: Similar to Mach-O, but stdcall and fastcall
|
|
functions also get a suffix based on the frame size.
|
|
``n<size1>:<size2>:<size3>...``
|
|
This specifies a set of native integer widths for the target CPU in
|
|
bits. For example, it might contain ``n32`` for 32-bit PowerPC,
|
|
``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
|
|
this set are considered to support most general arithmetic operations
|
|
efficiently.
|
|
|
|
On every specification that takes a ``<abi>:<pref>``, specifying the
|
|
``<pref>`` alignment is optional. If omitted, the preceding ``:``
|
|
should be omitted too and ``<pref>`` will be equal to ``<abi>``.
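
To see how the specifications combine, consider the sketch below. The string
resembles what a frontend might emit for an x86-64 ELF target; treat the exact
value as an assumption for illustration, not as a normative layout.

.. code-block:: llvm

    ; e            little-endian
    ; m:e          ELF mangling (private symbols get a .L prefix)
    ; i64:64       i64 has 64-bit ABI alignment; <pref> is omitted, so it
    ;              equals <abi>
    ; f80:128      the 80-bit floating point type is 128-bit aligned
    ; n8:16:32:64  native integer widths of the CPU
    ; S128         natural stack alignment is 128 bits
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"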
|
|
|
|
When constructing the data layout for a given target, LLVM starts with a
|
|
default set of specifications which are then (possibly) overridden by
|
|
the specifications in the ``datalayout`` keyword. The default
|
|
specifications are given in this list:
|
|
|
|
- ``E`` - big endian
|
|
- ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
|
|
- ``p[n]:64:64:64`` - Other address spaces are assumed to be the
|
|
same as the default address space.
|
|
- ``S0`` - natural stack alignment is unspecified
|
|
- ``i1:8:8`` - i1 is 8-bit (byte) aligned
|
|
- ``i8:8:8`` - i8 is 8-bit (byte) aligned
|
|
- ``i16:16:16`` - i16 is 16-bit aligned
|
|
- ``i32:32:32`` - i32 is 32-bit aligned
|
|
- ``i64:32:64`` - i64 has ABI alignment of 32 bits but preferred
|
|
alignment of 64 bits
|
|
- ``f16:16:16`` - half is 16-bit aligned
|
|
- ``f32:32:32`` - float is 32-bit aligned
|
|
- ``f64:64:64`` - double is 64-bit aligned
|
|
- ``f128:128:128`` - quad is 128-bit aligned
|
|
- ``v64:64:64`` - 64-bit vector is 64-bit aligned
|
|
- ``v128:128:128`` - 128-bit vector is 128-bit aligned
|
|
- ``a:0:64`` - aggregates are 64-bit aligned
|
|
|
|
When LLVM is determining the alignment for a given type, it uses the
|
|
following rules:
|
|
|
|
#. If the type sought is an exact match for one of the specifications,
|
|
that specification is used.
|
|
#. If no match is found, and the type sought is an integer type, then
|
|
the smallest integer type that is larger than the bitwidth of the
|
|
sought type is used. If none of the specifications are larger than
|
|
the bitwidth then the largest integer type is used. For example,
|
|
given the default specifications above, the i7 type will use the
|
|
alignment of i8 (next largest) while both i65 and i256 will use the
|
|
alignment of i64 (largest specified).
|
|
#. If no match is found, and the type sought is a vector type, then the
|
|
largest vector type that is smaller than the sought vector type will
|
|
be used as a fallback. This happens because ``<128 x double>`` can be
|
|
implemented in terms of 64 ``<2 x double>``, for example.
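
The lookup rules can be traced on a small worked example. The sketch assumes
that only the default specifications listed above are in effect; the type name
``%sample`` is hypothetical.

.. code-block:: llvm

    ; i7          -> rule 2: no i7 entry; the smallest larger entry is
    ;                i8:8:8 (ABI 8 bits, preferred 8 bits)
    ; i65         -> rule 2: no entry is larger than 65 bits; the largest
    ;                entry is i64:32:64 (ABI 32 bits, preferred 64 bits)
    ; <4 x float> -> rule 1: 128 bits wide, exact match for v128:128:128
    %sample = type { i7, i65, <4 x float> }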
|
|
|
|
The function of the data layout string may not be what you expect.
|
|
Notably, this is not a specification from the frontend of what alignment
|
|
the code generator should use.
|
|
|
|
Instead, if specified, the target data layout is required to match what
|
|
the ultimate *code generator* expects. This string is used by the
|
|
mid-level optimizers to improve code, and this only works if it matches
|
|
what the ultimate code generator uses. If you would like to generate IR
|
|
that does not embed this target-specific detail into the IR, then you
|
|
don't have to specify the string. This will disable some optimizations
|
|
that require precise layout information, but this also prevents those
|
|
optimizations from introducing target specificity into the IR.
|
|
|
|
.. _langref_triple:
|
|
|
|
Target Triple
|
|
-------------
|
|
|
|
A module may specify a target triple string that describes the target
|
|
host. The syntax for the target triple is simply:
|
|
|
|
.. code-block:: llvm
|
|
|
|
target triple = "x86_64-apple-macosx10.7.0"
|
|
|
|
The *target triple* string consists of a series of identifiers delimited
|
|
by the minus sign character ('-'). The canonical forms are:
|
|
|
|
::
|
|
|
|
ARCHITECTURE-VENDOR-OPERATING_SYSTEM
|
|
ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
|
|
|
|
This information is passed along to the backend so that it generates
|
|
code for the proper architecture. It's possible to override this on the
|
|
command line with the ``-mtriple`` command line option.
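
As an illustration of the four-component form, the triple below is a commonly
seen value for 64-bit Linux; it is given only as an example, not as a required
spelling.

.. code-block:: llvm

    ; ARCHITECTURE = x86_64, VENDOR = unknown,
    ; OPERATING_SYSTEM = linux, ENVIRONMENT = gnu
    target triple = "x86_64-unknown-linux-gnu"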
|
|
|
|
.. _pointeraliasing:
|
|
|
|
Pointer Aliasing Rules
|
|
----------------------
|
|
|
|
Any memory access must be done through a pointer value associated with
|
|
an address range of the memory access, otherwise the behavior is
|
|
undefined. Pointer values are associated with address ranges according
|
|
to the following rules:
|
|
|
|
- A pointer value is associated with the addresses associated with any
|
|
value it is *based* on.
|
|
- An address of a global variable is associated with the address range
|
|
of the variable's storage.
|
|
- The result value of an allocation instruction is associated with the
|
|
address range of the allocated storage.
|
|
- A null pointer in the default address-space is associated with no
|
|
address.
|
|
- An integer constant other than zero or a pointer value returned from
|
|
a function not defined within LLVM may be associated with address
|
|
ranges allocated through mechanisms other than those provided by
|
|
LLVM. Such ranges shall not overlap with any ranges of addresses
|
|
allocated by mechanisms provided by LLVM.
|
|
|
|
A pointer value is *based* on another pointer value according to the
|
|
following rules:
|
|
|
|
- A pointer value formed from a ``getelementptr`` operation is *based*
|
|
on the first operand of the ``getelementptr``.
|
|
- The result value of a ``bitcast`` is *based* on the operand of the
|
|
``bitcast``.
|
|
- A pointer value formed by an ``inttoptr`` is *based* on all pointer
|
|
values that contribute (directly or indirectly) to the computation of
|
|
the pointer's value.
|
|
- The "*based* on" relationship is transitive.
|
|
|
|
Note that this definition of *"based"* is intentionally similar to the
|
|
definition of *"based"* in C99, though it is slightly weaker.
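
The *based* relationship can be followed through a short function. This is a
sketch only; the names ``@g`` and ``@based_on`` are hypothetical, and the code
uses the typed-pointer syntax of this revision.

.. code-block:: llvm

    @g = global [4 x i32] zeroinitializer

    define i32 @based_on() {
    entry:
      ; Result of an allocation: associated with the allocated storage.
      %slot = alloca i32
      ; getelementptr: based on its first operand, @g.
      %elt = getelementptr [4 x i32]* @g, i64 0, i64 2
      ; bitcast: based on %elt and, transitively, on @g.
      %raw = bitcast i32* %elt to i8*
      ; inttoptr: based on every pointer that contributed to the integer,
      ; which here is only %slot.
      %int = ptrtoint i32* %slot to i64
      %back = inttoptr i64 %int to i32*
      store i32 7, i32* %back
      %v = load i32* %slot
      ret i32 %v
    }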
|
|
|
|
LLVM IR does not associate types with memory. The result type of a
|
|
``load`` merely indicates the size and alignment of the memory from
|
|
which to load, as well as the interpretation of the value. The first
|
|
operand type of a ``store`` similarly only indicates the size and
|
|
alignment of the store.
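
Because memory itself is untyped, the same bytes may be written as one type and
read as another. A minimal sketch, with the hypothetical function name
``@reinterpret``:

.. code-block:: llvm

    define i32 @reinterpret(float %f) {
    entry:
      %slot = alloca float
      store float %f, float* %slot       ; writes 4 bytes
      %as_int = bitcast float* %slot to i32*
      %bits = load i32* %as_int          ; reads the same 4 bytes as an i32
      ret i32 %bits
    }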
|
|
|
|
Consequently, type-based alias analysis, aka TBAA, aka
|
|
``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
|
|
:ref:`Metadata <metadata>` may be used to encode additional information
|
|
which specialized optimization passes may use to implement type-based
|
|
alias analysis.
|
|
|
|
.. _volatile:
|
|
|
|
Volatile Memory Accesses
|
|
------------------------
|
|
|
|
Certain memory accesses, such as :ref:`load <i_load>`'s,
|
|
:ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
|
|
marked ``volatile``. The optimizers must not change the number of
|
|
volatile operations or change their order of execution relative to other
|
|
volatile operations. The optimizers *may* change the order of volatile
|
|
operations relative to non-volatile operations. This is not Java's
|
|
"volatile" and has no cross-thread synchronization behavior.
|
|
|
|
IR-level volatile loads and stores cannot safely be optimized into
|
|
llvm.memcpy or llvm.memmove intrinsics even when those intrinsics are
|
|
flagged volatile. Likewise, the backend should never split or merge
|
|
target-legal volatile load/store instructions.
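
A small sketch of what the rules above mean in practice. The function name
``@poll`` and the idea that ``%reg`` points at a device register are
assumptions for illustration.

.. code-block:: llvm

    define i32 @poll(i32* %reg) {
    entry:
      ; Two volatile loads of the same address: the optimizer must keep
      ; both and must keep them in this order.
      %first = load volatile i32* %reg, align 4
      %second = load volatile i32* %reg, align 4
      ; The volatile store may not be removed or moved ahead of the
      ; volatile loads.
      store volatile i32 0, i32* %reg, align 4
      %sum = add i32 %first, %second
      ret i32 %sum
    }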
|
|
|
|
.. admonition:: Rationale
|
|
|
|
Platforms may rely on volatile loads and stores of natively supported
|
|
data width to be executed as a single instruction. For example, in C
|
|
this holds for an l-value of volatile primitive type with native
|
|
hardware support, but not necessarily for aggregate types. The
|
|
frontend upholds these expectations, which are intentionally
|
|
unspecified in the IR. The rules above ensure that IR transformations
|
|
do not violate the frontend's contract with the language.
|
|
|
|
.. _memmodel:
|
|
|
|
Memory Model for Concurrent Operations
|
|
--------------------------------------
|
|
|
|
The LLVM IR does not define any way to start parallel threads of
|
|
execution or to register signal handlers. Nonetheless, there are
|
|
platform-specific ways to create them, and we define LLVM IR's behavior
|
|
in their presence. This model is inspired by the C++0x memory model.
|
|
|
|
For a more informal introduction to this model, see the :doc:`Atomics`.
|
|
|
|
We define a *happens-before* partial order as the least partial order
|
|
that
|
|
|
|
- Is a superset of single-thread program order, and
|
|
- When a *synchronizes-with* ``b``, includes an edge from ``a`` to
|
|
``b``. *Synchronizes-with* pairs are introduced by platform-specific
|
|
techniques, like pthread locks, thread creation, thread joining,
|
|
etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
|
|
Constraints <ordering>`).
|
|
|
|
Note that program order does not introduce *happens-before* edges
|
|
between a thread and signals executing inside that thread.
|
|
|
|
Every (defined) read operation (load instructions, memcpy, atomic
|
|
loads/read-modify-writes, etc.) R reads a series of bytes written by
|
|
(defined) write operations (store instructions, atomic
|
|
stores/read-modify-writes, memcpy, etc.). For the purposes of this
|
|
section, initialized globals are considered to have a write of the
|
|
initializer which is atomic and happens before any other read or write
|
|
of the memory in question. For each byte of a read R, R\ :sub:`byte`
|
|
may see any write to the same byte, except:
|
|
|
|
- If write\ :sub:`1` happens before write\ :sub:`2`, and
|
|
write\ :sub:`2` happens before R\ :sub:`byte`, then
|
|
R\ :sub:`byte` does not see write\ :sub:`1`.
|
|
- If R\ :sub:`byte` happens before write\ :sub:`3`, then
|
|
R\ :sub:`byte` does not see write\ :sub:`3`.
|
|
|
|
Given that definition, R\ :sub:`byte` is defined as follows:
|
|
|
|
- If R is volatile, the result is target-dependent. (Volatile is
|
|
supposed to give guarantees which can support ``sig_atomic_t`` in
|
|
C/C++, and may be used for accesses to addresses that do not behave
|
|
like normal memory. It does not generally provide cross-thread
|
|
synchronization.)
|
|
- Otherwise, if there is no write to the same byte that happens before
|
|
R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
|
|
- Otherwise, if R\ :sub:`byte` may see exactly one write,
|
|
R\ :sub:`byte` returns the value written by that write.
|
|
- Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
|
|
see are atomic, it chooses one of the values written. See the :ref:`Atomic
|
|
Memory Ordering Constraints <ordering>` section for additional
|
|
constraints on how the choice is made.
|
|
- Otherwise R\ :sub:`byte` returns ``undef``.
|
|
|
|
R returns the value composed of the series of bytes it read. This
|
|
implies that some bytes within the value may be ``undef`` **without**
|
|
the entire value being ``undef``. Note that this only defines the
|
|
semantics of the operation; it doesn't mean that targets will emit more
|
|
than one instruction to read the series of bytes.
|
|
|
|
Note that in cases where none of the atomic intrinsics are used, this
|
|
model places only one restriction on IR transformations on top of what
|
|
is required for single-threaded execution: introducing a store to a byte
|
|
which might not otherwise be stored is not allowed in general.
|
|
(Specifically, in the case where another thread might write to and read
|
|
from an address, introducing a store can change a load that may see
|
|
exactly one write into a load that may see multiple writes.)
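
The restriction on introducing stores can be illustrated with a pair of
functions. This is a sketch under the assumption that another thread may access
``%p`` concurrently; the names ``@orig`` and ``@bad`` are hypothetical.

.. code-block:: llvm

    ; Original: the store executes only when %cond is true.
    define void @orig(i1 %cond, i32* %p) {
    entry:
      br i1 %cond, label %then, label %done
    then:
      store i32 1, i32* %p
      br label %done
    done:
      ret void
    }

    ; Not a valid replacement in general: it stores to %p on every path.
    ; When %cond is false it writes back the old value, a write that the
    ; original never performed and that another thread's load may now see.
    define void @bad(i1 %cond, i32* %p) {
    entry:
      %old = load i32* %p
      %new = select i1 %cond, i32 1, i32 %old
      store i32 %new, i32* %p
      ret void
    }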
|
|
|
|
.. _ordering:
|
|
|
|
Atomic Memory Ordering Constraints
|
|
----------------------------------
|
|
|
|
Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
|
|
:ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
|
|
:ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
|
|
ordering parameters that determine which other atomic instructions on
|
|
the same address they *synchronize with*. These semantics are borrowed
|
|
from Java and C++0x, but are somewhat more colloquial. If these
|
|
descriptions aren't precise enough, check those specs (see spec
|
|
references in the :doc:`atomics guide <Atomics>`).
|
|
:ref:`fence <i_fence>` instructions treat these orderings somewhat
|
|
differently since they don't take an address. See that instruction's
|
|
documentation for details.
|
|
|
|
For a simpler introduction to the ordering constraints, see the
|
|
:doc:`Atomics`.
|
|
|
|
``unordered``
|
|
The set of values that can be read is governed by the happens-before
|
|
partial order. A value cannot be read unless some operation wrote
|
|
it. This is intended to provide a guarantee strong enough to model
|
|
Java's non-volatile shared variables. This ordering cannot be
|
|
specified for read-modify-write operations; it is not strong enough
|
|
to make them atomic in any interesting way.
|
|
``monotonic``
|
|
In addition to the guarantees of ``unordered``, there is a single
|
|
total order for modifications by ``monotonic`` operations on each
|
|
address. All modification orders must be compatible with the
|
|
happens-before order. There is no guarantee that the modification
|
|
orders can be combined to a global total order for the whole program
|
|
(and this often will not be possible). The read in an atomic
|
|
read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
|
|
:ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
|
|
order immediately before the value it writes. If one atomic read
|
|
happens before another atomic read of the same address, the later
|
|
read must see the same value or a later value in the address's
|
|
modification order. This disallows reordering of ``monotonic`` (or
|
|
stronger) operations on the same address. If an address is written
|
|
``monotonic``-ally by one thread, and other threads ``monotonic``-ally
|
|
read that address repeatedly, the other threads must eventually see
|
|
the write. This corresponds to the C++0x/C1x
|
|
``memory_order_relaxed``.
|
|
``acquire``
|
|
In addition to the guarantees of ``monotonic``, a
|
|
*synchronizes-with* edge may be formed with a ``release`` operation.
|
|
This is intended to model C++'s ``memory_order_acquire``.
|
|
``release``
|
|
In addition to the guarantees of ``monotonic``, if this operation
|
|
writes a value which is subsequently read by an ``acquire``
|
|
operation, it *synchronizes-with* that operation. (This isn't a
|
|
complete description; see the C++0x definition of a release
|
|
sequence.) This corresponds to the C++0x/C1x
|
|
``memory_order_release``.
|
|
``acq_rel`` (acquire+release)
|
|
Acts as both an ``acquire`` and ``release`` operation on its
|
|
address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
|
|
``seq_cst`` (sequentially consistent)
|
|
In addition to the guarantees of ``acq_rel`` (``acquire`` for an
|
|
operation that only reads, ``release`` for an operation that only
|
|
writes), there is a global total order on all
|
|
sequentially-consistent operations on all addresses, which is
|
|
consistent with the *happens-before* partial order and with the
|
|
modification orders of all the affected addresses. Each
|
|
sequentially-consistent read sees the last preceding write to the
|
|
same address in this global order. This corresponds to the C++0x/C1x
|
|
``memory_order_seq_cst`` and Java volatile.
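
A common use of ``release`` and ``acquire`` is publishing data through a flag.
The sketch below assumes that ``@producer`` and ``@consumer`` run on different
threads created by some platform-specific mechanism; all names are
hypothetical.

.. code-block:: llvm

    @data = global i32 0
    @ready = global i32 0

    define void @producer() {
    entry:
      store i32 42, i32* @data                          ; plain store
      store atomic i32 1, i32* @ready release, align 4  ; publish
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin
    spin:
      %flag = load atomic i32* @ready acquire, align 4
      %set = icmp eq i32 %flag, 1
      br i1 %set, label %done, label %spin
    done:
      ; The acquire load read the value written by the release store, so
      ; the store to @data happens before this load.
      %v = load i32* @data
      ret i32 %v
    }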
|
|
|
|
.. _singlethread:
|
|
|
|
If an atomic operation is marked ``singlethread``, it only *synchronizes
|
|
with* or participates in modification and seq\_cst total orderings with
|
|
other operations running in the same thread (for example, in signal
|
|
handlers).
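
A brief sketch of the ``singlethread`` marker, assuming a flag that is shared
only between a thread and a signal handler running in that same thread. The
names ``@handler_flag`` and ``@set_flag`` are hypothetical.

.. code-block:: llvm

    @handler_flag = global i32 0

    define void @set_flag() {
    entry:
      ; Ordered only with respect to other operations in the same thread,
      ; such as a signal handler that interrupts it.
      store atomic i32 1, i32* @handler_flag singlethread seq_cst, align 4
      fence singlethread seq_cst
      ret void
    }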
|
|
|
|
.. _fastmath:
|
|
|
|
Fast-Math Flags
|
|
---------------
|
|
|
|
LLVM IR floating-point binary ops (:ref:`fadd <i_fadd>`,
|
|
:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
|
|
:ref:`frem <i_frem>`) have the following flags that can be set to enable
|
|
otherwise unsafe floating point optimizations:
|
|
|
|
``nnan``
|
|
No NaNs - Allow optimizations to assume the arguments and result are not
|
|
NaN. Such optimizations are required to retain defined behavior over
|
|
NaNs, but the value of the result is undefined.
|
|
|
|
``ninf``
|
|
No Infs - Allow optimizations to assume the arguments and result are not
|
|
+/-Inf. Such optimizations are required to retain defined behavior over
|
|
+/-Inf, but the value of the result is undefined.
|
|
|
|
``nsz``
|
|
No Signed Zeros - Allow optimizations to treat the sign of a zero
|
|
argument or result as insignificant.
|
|
|
|
``arcp``
|
|
Allow Reciprocal - Allow optimizations to use the reciprocal of an
|
|
argument rather than perform division.
|
|
|
|
``fast``
|
|
Fast - Allow algebraically equivalent transformations that may
|
|
dramatically change results in floating point (e.g. reassociate). This
|
|
flag implies all the others.
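
The flags are written between the opcode and the type. A minimal sketch, with
the hypothetical function name ``@fmf``:

.. code-block:: llvm

    define float @fmf(float %a, float %b, float %c) {
    entry:
      ; Several flags may be combined on one instruction.
      %sum = fadd nnan ninf float %a, %b
      ; arcp permits rewriting the division as a multiply by 1/%c.
      %quot = fdiv arcp float %sum, %c
      ; fast implies nnan, ninf, nsz and arcp.
      %res = fmul fast float %quot, %a
      ret float %res
    }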
|
|
|
|
.. _uselistorder:
|
|
|
|
Use-list Order Directives
|
|
-------------------------
|
|
|
|
Use-list directives encode the in-memory order of each use-list, allowing the
|
|
order to be recreated. ``<order-indexes>`` is a comma-separated list of
|
|
indexes that are assigned to the referenced value's uses. The referenced
|
|
value's use-list is immediately sorted by these indexes.
|
|
|
|
Use-list directives may appear at function scope or global scope. They are not
|
|
instructions, and have no effect on the semantics of the IR. When they're at
|
|
function scope, they must appear after the terminator of the final basic block.
|
|
|
|
If basic blocks have their address taken via ``blockaddress()`` expressions,
|
|
``uselistorder_bb`` can be used to reorder their use-lists from outside their
|
|
function's scope.
|
|
|
|
:Syntax:
|
|
|
|
::
|
|
|
|
uselistorder <ty> <value>, { <order-indexes> }
|
|
uselistorder_bb @function, %block, { <order-indexes> }
|
|
|
|
:Examples:
|
|
|
|
::
|
|
|
|
define void @foo(i32 %arg1, i32 %arg2) {
|
|
entry:
|
|
; ... instructions ...
|
|
bb:
|
|
; ... instructions ...
|
|
|
|
; At function scope.
|
|
uselistorder i32 %arg1, { 1, 0, 2 }
|
|
uselistorder label %bb, { 1, 0 }
|
|
}
|
|
|
|
; At global scope.
|
|
uselistorder i32* @global, { 1, 2, 0 }
|
|
uselistorder i32 7, { 1, 0 }
|
|
uselistorder i32 (i32) @bar, { 1, 0 }
|
|
uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }
|
|
|
|
.. _typesystem:
|
|
|
|
Type System
|
|
===========
|
|
|
|
The LLVM type system is one of the most important features of the
|
|
intermediate representation. Being typed enables a number of
|
|
optimizations to be performed on the intermediate representation
|
|
directly, without having to do extra analyses on the side before the
|
|
transformation. A strong type system makes it easier to read the
|
|
generated code and enables novel analyses and transformations that are
|
|
not feasible to perform on normal three address code representations.
|
|
|
|
.. _t_void:
|
|
|
|
Void Type
|
|
---------
|
|
|
|
:Overview:
|
|
|
|
|
|
The void type does not represent any value and has no size.
|
|
|
|
:Syntax:
|
|
|
|
|
|
::
|
|
|
|
void
|
|
|
|
|
|
.. _t_function:
|
|
|
|
Function Type
|
|
-------------
|
|
|
|
:Overview:
|
|
|
|
|
|
The function type can be thought of as a function signature. It consists of a
|
|
return type and a list of formal parameter types. The return type of a function
|
|
type is a void type or first class type --- except for :ref:`label <t_label>`
|
|
and :ref:`metadata <t_metadata>` types.
|
|
|
|
:Syntax:
|
|
|
|
::
|
|
|
|
<returntype> (<parameter list>)
|
|
|
|
...where '``<parameter list>``' is a comma-separated list of type
|
|
specifiers. Optionally, the parameter list may include a type ``...``, which
|
|
indicates that the function takes a variable number of arguments. Variable
|
|
argument functions can access their arguments with the :ref:`variable argument
|
|
handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
|
|
except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.
|
|
|
|
:Examples:
|
|
|
|
+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
|
| ``i32 (i32)`` | function taking an ``i32``, returning an ``i32`` |
|
|
+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
|
| ``float (i16, i32 *) *`` | :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a :ref:`pointer <t_pointer>` to ``i32``, returning ``float``. |
|
|
+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
|
| ``i32 (i8*, ...)`` | A vararg function that takes at least one :ref:`pointer <t_pointer>` to ``i8`` (char in C), which returns an integer. This is the signature for ``printf`` in LLVM. |
|
|
+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
|
| ``{i32, i32} (i32)`` | A function taking an ``i32``, returning a :ref:`structure <t_struct>` containing two ``i32`` values |
|
|
+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
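
Function types appear in declarations, in pointer-to-function parameters and at
call sites. The sketch below uses the call syntax of this revision; ``@fmt``
and ``@call_it`` are hypothetical names.

.. code-block:: llvm

    declare i32 @printf(i8*, ...)

    @fmt = private constant [4 x i8] c"%d\0A\00"

    define i32 @call_it(i32 (i32)* %fp) {
    entry:
      ; Indirect call through a pointer to a function of type i32 (i32).
      %r = call i32 %fp(i32 3)
      %fmtp = getelementptr [4 x i8]* @fmt, i64 0, i64 0
      ; A call to a vararg function spells out the function pointer type.
      %n = call i32 (i8*, ...)* @printf(i8* %fmtp, i32 %r)
      ret i32 %n
    }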
|
|
|
|
.. _t_firstclass:
|
|
|
|
First Class Types
|
|
-----------------
|
|
|
|
The :ref:`first class <t_firstclass>` types are perhaps the most important.
|
|
Values of these types are the only ones which can be produced by
|
|
instructions.
|
|
|
|
.. _t_single_value:
|
|
|
|
Single Value Types
|
|
^^^^^^^^^^^^^^^^^^
|
|
|
|
These are the types that are valid in registers from CodeGen's perspective.
|
|
|
|
.. _t_integer:
|
|
|
|
Integer Type
|
|
""""""""""""
|
|
|
|
:Overview:
|
|
|
|
The integer type is a very simple type that simply specifies an
|
|
arbitrary bit width for the integer type desired. Any bit width from 1
|
|
bit to 2\ :sup:`23`\ -1 (about 8 million) can be specified.
|
|
|
|
:Syntax:
|
|
|
|
::
|
|
|
|
iN
|
|
|
|
The number of bits the integer will occupy is specified by the ``N``
|
|
value.
|
|
|
|
:Examples:
|
|
|
|
+----------------+------------------------------------------------+
|
|
| ``i1`` | a single-bit integer. |
|
|
+----------------+------------------------------------------------+
|
|
| ``i32`` | a 32-bit integer. |
|
|
+----------------+------------------------------------------------+
|
|
| ``i1942652`` | a really big integer of over 1 million bits. |
|
|
+----------------+------------------------------------------------+
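
Integer widths need not be powers of two. A minimal sketch, with the
hypothetical function name ``@next``:

.. code-block:: llvm

    define i32 @next(i7 %x) {
    entry:
      %y = add i7 %x, 1          ; 7-bit addition, wraps modulo 2^7
      %wide = zext i7 %y to i32  ; widen to a more common width
      ret i32 %wide
    }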
|
|
|
|
.. _t_floating:
|
|
|
|
Floating Point Types
|
|
""""""""""""""""""""
|
|
|
|
.. list-table::
|
|
:header-rows: 1
|
|
|
|
* - Type
|
|
- Description
|
|
|
|
* - ``half``
|
|
- 16-bit floating point value
|
|
|
|
* - ``float``
|
|
- 32-bit floating point value
|
|
|
|
* - ``double``
|
|
- 64-bit floating point value
|
|
|
|
* - ``fp128``
|
|
- 128-bit floating point value (112-bit mantissa)
|
|
|
|
* - ``x86_fp80``
|
|
- 80-bit floating point value (X87)
|
|
|
|
* - ``ppc_fp128``
|
|
- 128-bit floating point value (two 64-bits)
|
|
|
|
X86_mmx Type
|
|
""""""""""""
|
|
|
|
:Overview:
|
|
|
|
The x86_mmx type represents a value held in an MMX register on an x86
|
|
machine. The operations allowed on it are quite limited: parameters and
|
|
return values, load and store, and bitcast. User-specified MMX
|
|
instructions are represented as intrinsic or asm calls with arguments
|
|
and/or results of this type. There are no arrays, vectors or constants
|
|
of this type.
|
|
|
|
:Syntax:
|
|
|
|
::
|
|
|
|
x86_mmx
|
|
|
|
|
|
.. _t_pointer:
|
|
|
|
Pointer Type
|
|
""""""""""""
|
|
|
|
:Overview:
|
|
|
|
The pointer type is used to specify memory locations. Pointers are
|
|
commonly used to reference objects in memory.
|
|
|
|
Pointer types may have an optional address space attribute defining the
|
|
numbered address space where the pointed-to object resides. The default
|
|
address space is number zero. The semantics of non-zero address spaces
|
|
are target-specific.
|
|
|
|
Note that LLVM does not permit pointers to void (``void*``) nor does it
|
|
permit pointers to labels (``label*``). Use ``i8*`` instead.
|
|
|
|
:Syntax:
|
|
|
|
::
|
|
|
|
<type> *
|
|
|
|
:Examples:
|
|
|
|
+-------------------------+--------------------------------------------------------------------------------------------------------------+
|
|
| ``[4 x i32]*`` | A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four ``i32`` values. |
|
|
+-------------------------+--------------------------------------------------------------------------------------------------------------+
|
|
| ``i32 (i32*) *`` | A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32*``, returning an ``i32``. |
|
|
+-------------------------+--------------------------------------------------------------------------------------------------------------+
|
|
| ``i32 addrspace(5)*`` | A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in address space #5. |
|
|
+-------------------------+--------------------------------------------------------------------------------------------------------------+
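
The following sketch shows an address space qualifier in use, and ``i8*``
standing in for C's ``void*``. The meaning of address space 5 is
target-specific; ``@counter``, ``@read5`` and the ``@malloc`` signature are
assumptions for illustration.

.. code-block:: llvm

    @counter = addrspace(5) global i32 0

    define i32 @read5(i32 addrspace(5)* %p) {
    entry:
      %v = load i32 addrspace(5)* %p
      ret i32 %v
    }

    ; There is no void*; a generic byte pointer is written i8*.
    declare i8* @malloc(i64)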
|
|
|
|
.. _t_vector:
|
|
|
|
Vector Type
|
|
"""""""""""
|
|
|
|
:Overview:
|
|
|
|
A vector type is a simple derived type that represents a vector of
|
|
elements. Vector types are used when multiple primitive data are
|
|
operated on in parallel using a single instruction (SIMD). A vector type
|
|
requires a size (number of elements) and an underlying primitive data
|
|
type. Vector types are considered :ref:`first class <t_firstclass>`.
|
|
|
|
:Syntax:
|
|
|
|
::
|
|
|
|
< <# elements> x <elementtype> >
|
|
|
|
The number of elements is a constant integer value larger than 0;
|
|
elementtype may be any integer, floating point or pointer type. Vectors
|
|
of size zero are not allowed.
|
|
|
|
:Examples:
|
|
|
|
+-------------------+--------------------------------------------------+
|
|
| ``<4 x i32>`` | Vector of 4 32-bit integer values. |
|
|
+-------------------+--------------------------------------------------+
|
|
| ``<8 x float>`` | Vector of 8 32-bit floating-point values. |
|
|
+-------------------+--------------------------------------------------+
|
|
| ``<2 x i64>`` | Vector of 2 64-bit integer values. |
|
|
+-------------------+--------------------------------------------------+
|
|
| ``<4 x i64*>`` | Vector of 4 pointers to 64-bit integer values. |
|
|
+-------------------+--------------------------------------------------+
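
Most arithmetic instructions accept vector operands and work element by
element. A minimal sketch, with the hypothetical function name ``@vadd``:

.. code-block:: llvm

    define <4 x i32> @vadd(<4 x i32> %a, <4 x i32> %b) {
    entry:
      ; One add instruction computes four 32-bit sums.
      %sum = add <4 x i32> %a, %b
      ret <4 x i32> %sum
    }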
|
|
|
|
.. _t_label:
|
|
|
|
Label Type
|
|
^^^^^^^^^^
|
|
|
|
:Overview:
|
|
|
|
The label type represents code labels.
|
|
|
|
:Syntax:
|
|
|
|
::
|
|
|
|
label
|
|
|
|
.. _t_metadata:
|
|
|
|
Metadata Type
|
|
^^^^^^^^^^^^^
|
|
|
|
:Overview:
|
|
|
|
The metadata type represents embedded metadata. No derived types may be
|
|
created from metadata except for :ref:`function <t_function>` arguments.
|
|
|
|
:Syntax:
|
|
|
|
::
|
|
|
|
metadata
|
|
|
|
.. _t_aggregate:
|
|
|
|
Aggregate Types
|
|
^^^^^^^^^^^^^^^
|
|
|
|
Aggregate Types are a subset of derived types that can contain multiple
|
|
member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
|
|
aggregate types. :ref:`Vectors <t_vector>` are not considered to be
|
|
aggregate types.
|
|
|
|
.. _t_array:

Array Type
""""""""""

:Overview:

The array type is a very simple derived type that arranges elements
sequentially in memory. The array type requires a size (number of
elements) and an underlying data type.

:Syntax:

::

      [<# elements> x <elementtype>]

The number of elements is a constant integer value; ``elementtype`` may
be any type with a size.

:Examples:

+------------------+--------------------------------------+
| ``[40 x i32]``   | Array of 40 32-bit integer values.   |
+------------------+--------------------------------------+
| ``[41 x i32]``   | Array of 41 32-bit integer values.   |
+------------------+--------------------------------------+
| ``[4 x i8]``     | Array of 4 8-bit integer values.     |
+------------------+--------------------------------------+

Here are some examples of multidimensional arrays:

+-----------------------------+----------------------------------------------------------+
| ``[3 x [4 x i32]]``         | 3x4 array of 32-bit integer values.                      |
+-----------------------------+----------------------------------------------------------+
| ``[12 x [10 x float]]``     | 12x10 array of single precision floating point values.   |
+-----------------------------+----------------------------------------------------------+
| ``[2 x [3 x [4 x i16]]]``   | 2x3x4 array of 16-bit integer values.                    |
+-----------------------------+----------------------------------------------------------+

There is no restriction on indexing beyond the end of the array implied
by a static type (though there are restrictions on indexing beyond the
bounds of an allocated object in some cases). This means that
single-dimension 'variable sized array' addressing can be implemented in
LLVM with a zero length array type. An implementation of 'Pascal-style
arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
example.
|
|
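As a rough sketch of the zero length array idiom described above, the
following function reads one element of a Pascal-style array. The names
``%pascal_array`` and ``@pascal_get`` are illustrative only, and the
caller is assumed to have allocated enough storage for the elements it
indexes:

.. code-block:: llvm

    ; Hypothetical layout: an element count followed by the elements.
    %pascal_array = type { i32, [0 x float] }

    define float @pascal_get(%pascal_array* %arr, i32 %idx) {
      ; The last index steps past the static bound of the [0 x float] member.
      %elt = getelementptr %pascal_array* %arr, i32 0, i32 1, i32 %idx
      %val = load float* %elt
      ret float %val
    }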
|
|
.. _t_struct:

Structure Type
""""""""""""""

:Overview:

The structure type is used to represent a collection of data members
together in memory. The elements of a structure may be any type that has
a size.

Structures in memory are accessed using '``load``' and '``store``' by
getting a pointer to a field with the '``getelementptr``' instruction.
Structures in registers are accessed using the '``extractvalue``' and
'``insertvalue``' instructions.

Structures may optionally be "packed" structures, which indicate that
the alignment of the struct is one byte, and that there is no padding
between the elements. In non-packed structs, padding between field types
is inserted as defined by the DataLayout string in the module, which is
required to match what the underlying code generator expects.

Structures can either be "literal" or "identified". A literal structure
is defined inline with other types (e.g. ``{i32, i32}*``) whereas
identified types are always defined at the top level with a name.
Literal types are uniqued by their contents and can never be recursive
or opaque since there is no way to write one. Identified types can be
recursive, can be opaque, and are never uniqued.

:Syntax:

::

      %T1 = type { <type list> }     ; Identified normal struct type
      %T2 = type <{ <type list> }>   ; Identified packed struct type

:Examples:

+------------------------------+------------------------------------------------------------------+
| ``{ i32, i32, i32 }``        | A triple of three ``i32`` values                                 |
+------------------------------+------------------------------------------------------------------+
| ``{ float, i32 (i32) * }``   | A pair, where the first element is a ``float`` and the second    |
|                              | element is a :ref:`pointer <t_pointer>` to a                     |
|                              | :ref:`function <t_function>` that takes an ``i32``, returning an |
|                              | ``i32``.                                                         |
+------------------------------+------------------------------------------------------------------+
| ``<{ i8, i32 }>``            | A packed struct known to be 5 bytes in size.                     |
+------------------------------+------------------------------------------------------------------+
|
|
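The two access styles mentioned in the overview can be sketched as
follows. The type name ``%pair`` and both function names are made up for
illustration; the first function reads a field of a structure in memory,
the second updates a field of a structure held in a register:

.. code-block:: llvm

    %pair = type { i32, float }   ; illustrative identified struct

    ; In memory: compute the field address, then load through it.
    define float @second_from_memory(%pair* %p) {
      %addr = getelementptr inbounds %pair* %p, i32 0, i32 1
      %val = load float* %addr
      ret float %val
    }

    ; In a register: read field 0, then build a new value with it replaced.
    define %pair @add_to_first(%pair %agg, i32 %x) {
      %old = extractvalue %pair %agg, 0
      %sum = add i32 %old, %x
      %new = insertvalue %pair %agg, i32 %sum, 0
      ret %pair %new
    }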
|
|
.. _t_opaque:

Opaque Structure Types
""""""""""""""""""""""

:Overview:

Opaque structure types are used to represent named structure types that
do not have a body specified. This corresponds (for example) to the C
notion of a forward declared structure.

:Syntax:

::

      %X = type opaque
      %52 = type opaque

:Examples:

+--------------+-------------------+
| ``opaque``   | An opaque type.   |
+--------------+-------------------+
|
|
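A minimal sketch of how an opaque type is typically used: values of the
type are only ever handled through pointers, much like a forward
declared C structure. The names ``%struct.handle``, ``@open_handle`` and
``@close_handle`` are hypothetical:

.. code-block:: llvm

    %struct.handle = type opaque

    declare %struct.handle* @open_handle()
    declare void @close_handle(%struct.handle*)

    define void @use_handle() {
      ; The body of %struct.handle is never needed; only its address is passed.
      %h = call %struct.handle* @open_handle()
      call void @close_handle(%struct.handle* %h)
      ret void
    }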
|
|
.. _constants:

Constants
=========

LLVM has several different basic types of constants. This section
describes them all and their syntax.

Simple Constants
----------------

**Boolean constants**
    The two strings '``true``' and '``false``' are both valid constants
    of the ``i1`` type.
**Integer constants**
    Standard integers (such as '4') are constants of the
    :ref:`integer <t_integer>` type. Negative numbers may be used with
    integer types.
**Floating point constants**
    Floating point constants use standard decimal notation (e.g.
    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
    hexadecimal notation (see below). The assembler requires the exact
    decimal value of a floating-point constant. For example, the
    assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
    decimal in binary. Floating point constants must have a :ref:`floating
    point <t_floating>` type.
**Null pointer constants**
    The identifier '``null``' is recognized as a null pointer constant
    and must be of :ref:`pointer type <t_pointer>`.

The one non-intuitive notation for constants is the hexadecimal form of
floating point constants. For example, the form
'``double 0x432ff973cafa8000``' is equivalent to (but harder to read
than) '``double 4.5e+15``'. The only time hexadecimal floating point
constants are required (and the only time that they are generated by the
disassembler) is when a floating point constant must be emitted but it
cannot be represented as a decimal floating point number in a reasonable
number of digits. For example, NaNs, infinities, and other special
values are represented in their IEEE hexadecimal format so that assembly
and disassembly do not cause any bits to change in the constants.

When using the hexadecimal form, constants of types half, float, and
double are represented using the 16-digit form shown above (which
matches the IEEE 754 representation for double); half and float values
must, however, be exactly representable as IEEE 754 half and single
precision, respectively. Hexadecimal format is always used for long
double, and there are three forms of long double. The 80-bit format used
by x86 is represented as ``0xK`` followed by 20 hexadecimal digits. The
128-bit format used by PowerPC (two adjacent doubles) is represented by
``0xM`` followed by 32 hexadecimal digits. The IEEE 128-bit format is
represented by ``0xL`` followed by 32 hexadecimal digits. Long doubles
will only work if they match the long double format on your target.
The IEEE 16-bit format (half precision) is represented by ``0xH``
followed by 4 hexadecimal digits. All hexadecimal formats are big-endian
(sign bit at the left).
|
|
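The notations above can be put side by side in a short sketch. The
global names are arbitrary, and the bit patterns are the standard IEEE
encodings of the values named in the comments:

.. code-block:: llvm

    @dec     = global double 4.5e+15
    @hex     = global double 0x432FF973CAFA8000       ; same value as @dec
    @f_dec   = global float 1.25                      ; exactly representable
    @f_hex   = global float 0x3FF4000000000000        ; 1.25, in the 16-digit double form
    @inf     = global double 0x7FF0000000000000       ; +infinity
    @h_one   = global half 0xH3C00                    ; 1.0 in half precision
    @x87_one = global x86_fp80 0xK3FFF8000000000000000 ; 1.0 in the x86 80-bit format

Note that ``@f_hex`` is written with sixteen digits even though its type
is ``float``; the value only has to be exactly representable in single
precision.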
|
|
There are no constants of type x86_mmx.
|
|
|
|
.. _complexconstants:

Complex Constants
-----------------

Complex constants are a (potentially recursive) combination of simple
constants and smaller complex constants.

**Structure constants**
    Structure constants are represented with notation similar to
    structure type definitions (a comma separated list of elements,
    surrounded by braces (``{}``)). For example:
    "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as
    "``@G = external global i32``". Structure constants must have
    :ref:`structure type <t_struct>`, and the number and types of elements
    must match those specified by the type.
**Array constants**
    Array constants are represented with notation similar to array type
    definitions (a comma separated list of elements, surrounded by
    square brackets (``[]``)). For example:
    "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
    :ref:`array type <t_array>`, and the number and types of elements must
    match those specified by the type. As a special case, character array
    constants may also be represented as a double-quoted string using the ``c``
    prefix. For example: "``c"Hello World\0A\00"``".
**Vector constants**
    Vector constants are represented with notation similar to vector
    type definitions (a comma separated list of elements, surrounded by
    less-than/greater-than's (``<>``)). For example:
    "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
    must have :ref:`vector type <t_vector>`, and the number and types of
    elements must match those specified by the type.
**Zero initialization**
    The string '``zeroinitializer``' can be used to zero initialize a
    value of *any* type, including scalar and
    :ref:`aggregate <t_aggregate>` types. This is often used to avoid
    having to print large zero initializers (e.g. for large arrays) and
    is always exactly equivalent to using explicit zero initializers.
**Metadata node**
    A metadata node is a structure-like constant with :ref:`metadata
    type <t_metadata>`. For example:
    "``metadata !{ i32 0, metadata !"test" }``". Unlike other
    constants that are meant to be interpreted as part of the
    instruction stream, metadata is a place to attach additional
    information such as debug info.
|
|
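For orientation, here is a small sketch that uses each of the aggregate
constant forms above as a global initializer. Only ``@G`` comes from the
text; the other global names are arbitrary:

.. code-block:: llvm

    @G    = external global i32

    @Str  = global { i32, float, i32* } { i32 4, float 17.0, i32* @G }
    @Arr  = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    @Msg  = constant [13 x i8] c"Hello World\0A\00"   ; 11 characters, newline, NUL
    @Vec  = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >
    @Zero = global [1024 x i32] zeroinitializer       ; same as 1024 explicit "i32 0" elements

In each case the number and types of the elements match the type that
precedes the constant.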
|
|
Global Variable and Function Addresses
--------------------------------------

The addresses of :ref:`global variables <globalvars>` and
:ref:`functions <functionstructure>` are always implicitly valid
(link-time) constants. These constants are explicitly referenced when
the :ref:`identifier for the global <identifiers>` is used and always have
:ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
file:

.. code-block:: llvm

    @X = global i32 17
    @Y = global i32 42
    @Z = global [2 x i32*] [ i32* @X, i32* @Y ]
|
|
|
|
.. _undefvalues:

Undefined Values
----------------

The string '``undef``' can be used anywhere a constant is expected, and
indicates that the user of the value may receive an unspecified
bit-pattern. Undefined values may be of any type (other than '``label``'
or '``void``') and can be used anywhere a constant is permitted.

Undefined values are useful because they indicate to the compiler that
the program is well defined no matter what value is used. This gives the
compiler more freedom to optimize. Here are some examples of
(potentially surprising) transformations that are valid (in pseudo IR):

.. code-block:: llvm

      %A = add %X, undef
      %B = sub %X, undef
      %C = xor %X, undef
    Safe:
      %A = undef
      %B = undef
      %C = undef

This is safe because all of the output bits are affected by the undef
bits. Any output bit can have a zero or one depending on the input bits.

.. code-block:: llvm

      %A = or %X, undef
      %B = and %X, undef
    Safe:
      %A = -1
      %B = 0
    Unsafe:
      %A = undef
      %B = undef

These logical operations have bits that are not always affected by the
input. For example, if ``%X`` has a zero bit, then the output of the
'``and``' operation will always be a zero for that bit, no matter what
the corresponding bit from the '``undef``' is. As such, it is unsafe to
optimize or assume that the result of the '``and``' is '``undef``'.
However, it is safe to assume that all bits of the '``undef``' could be
0, and optimize the '``and``' to 0. Likewise, it is safe to assume that
all the bits of the '``undef``' operand to the '``or``' could be set,
allowing the '``or``' to be folded to -1.
|
|
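The same reasoning can be written out in fully typed IR. This is only a
sketch with made-up function names; the comments restate which folds the
rules above permit:

.. code-block:: llvm

    define i32 @and_with_undef(i32 %X) {
      %B = and i32 %X, undef
      ret i32 %B               ; may be folded to "ret i32 0", but not to "ret i32 undef"
    }

    define i32 @or_with_undef(i32 %X) {
      %A = or i32 %X, undef
      ret i32 %A               ; may be folded to "ret i32 -1", but not to "ret i32 undef"
    }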
|
|
.. code-block:: llvm

      %A = select undef, %X, %Y
      %B = select undef, 42, %Y
      %C = select %X, %Y, undef
    Safe:
      %A = %X     (or %Y)
      %B = 42     (or %Y)
      %C = %Y
    Unsafe:
      %A = undef
      %B = undef
      %C = undef

This set of examples shows that undefined '``select``' (and conditional
branch) conditions can go *either way*, but they have to come from one
of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
both known to have a clear low bit, then ``%A`` would have to have a
cleared low bit. However, in the ``%C`` example, the optimizer is
allowed to assume that the '``undef``' operand could be the same as
``%Y``, allowing the whole '``select``' to be eliminated.

.. code-block:: llvm

      %A = xor undef, undef

      %B = undef
      %C = xor %B, %B

      %D = undef
      %E = icmp slt %D, 4
      %F = icmp sge %D, 4

    Safe:
      %A = undef
      %B = undef
      %C = undef
      %D = undef
      %E = undef
      %F = undef

This example points out that two '``undef``' operands are not
necessarily the same. This matches C semantics, but it can surprise
people who assume that "``X^X``" is always zero, even if
``X`` is undefined. This isn't true for a number of reasons, but the
short answer is that an '``undef``' "variable" can arbitrarily change
its value over its "live range". This is true because the variable
doesn't actually *have a live range*. Instead, the value is logically
read from arbitrary registers that happen to be around when needed, so
the value is not necessarily consistent over time. In fact, ``%A`` and
``%C`` need to have the same semantics or the core LLVM "replace all
uses with" concept would not hold.
|
|
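A typed sketch of the last point, using a hypothetical function name:
because each use of '``undef``' may observe a different value, two
comparisons that look mutually exclusive can both be true.

.. code-block:: llvm

    define i1 @both_may_hold() {
      %E = icmp slt i32 undef, 4    ; this use of undef may be less than 4 ...
      %F = icmp sge i32 undef, 4    ; ... while this use may be 4 or greater
      %G = and i1 %E, %F
      ret i1 %G                     ; so %G is not guaranteed to be false
    }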
|
|
.. code-block:: llvm

      %A = fdiv undef, %X
      %B = fdiv %X, undef
    Safe:
      %A = undef
      %B: unreachable

These examples show the crucial difference between an *undefined value*
and *undefined behavior*. An undefined value (like '``undef``') is
allowed to have an arbitrary bit-pattern. This means that the ``%A``
operation can be constant folded to '``undef``', because the '``undef``'
could be an SNaN, and ``fdiv`` is not (currently) defined on SNaN's.
However, in the second example, we can make a more aggressive
assumption: because the ``undef`` is allowed to be an arbitrary value,
we are allowed to assume that it could be zero. Since a divide by zero
has *undefined behavior*, we are allowed to assume that the operation
does not execute at all. This allows us to delete the divide and all
code after it. Because the undefined operation "can't happen", the
optimizer can assume that it occurs in dead code.

.. code-block:: llvm

    a:  store undef -> %X
    b:  store %X -> undef
    Safe:
    a: <deleted>
    b: unreachable

These examples reiterate the ``fdiv`` example: a store *of* an undefined
value can be assumed to not have any effect; we can assume that the
value is overwritten with bits that happen to match what was already
there. However, a store *to* an undefined location could clobber
arbitrary memory; therefore, it has undefined behavior.
|
|
|
|
.. _poisonvalues:

Poison Values
-------------

Poison values are similar to :ref:`undef values <undefvalues>`; however,
they also represent the fact that an instruction or constant expression
that cannot evoke side effects has nevertheless detected a condition
that results in undefined behavior.

There is currently no way of representing a poison value in the IR; they
only exist when produced by operations such as :ref:`add <i_add>` with
the ``nsw`` flag.
|
|
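As a minimal sketch (the function name is arbitrary), the following
``add`` overflows the signed 32-bit range while carrying the ``nsw``
flag, so its result is a poison value rather than a wrapped integer:

.. code-block:: llvm

    define i32 @signed_overflow() {
      ; 2147483647 is the largest signed i32; adding 1 overflows, and the
      ; nsw flag turns that overflow into a poison value.
      %poison = add nsw i32 2147483647, 1
      ret i32 %poison
    }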
|
|
Poison value behavior is defined in terms of value *dependence*:

-  Values other than :ref:`phi <i_phi>` nodes depend on their operands.
-  :ref:`Phi <i_phi>` nodes depend on the operand corresponding to
   their dynamic predecessor basic block.
-  Function arguments depend on the corresponding actual argument values
   in the dynamic callers of their functions.
-  :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>`
   instructions that dynamically transfer control back to them.
-  :ref:`Invoke <i_invoke>` instructions depend on the
   :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing
   call instructions that dynamically transfer control back to them.
-  Non-volatile loads and stores depend on the most recent stores to all
   of the referenced memory addresses, following the order in the IR
   (including loads and stores implied by intrinsics such as
   :ref:`@llvm.memcpy <int_memcpy>`).
-  An instruction with externally visible side effects depends on the
   most recent preceding instruction with externally visible side
   effects, following the order in the IR. (This includes :ref:`volatile
   operations <volatile>`.)
-  An instruction *control-depends* on a :ref:`terminator
   instruction <terminators>` if the terminator instruction has
   multiple successors and the instruction is always executed when
   control transfers to one of the successors, and may not be executed
   when control is transferred to another.
-  Additionally, an instruction also *control-depends* on a terminator
   instruction if the set of instructions it otherwise depends on would
   be different if the terminator had transferred control to a different
   successor.
-  Dependence is transitive.

Poison values have the same behavior as :ref:`undef values <undefvalues>`,
with the additional effect that any instruction that has a *dependence*
on a poison value has undefined behavior.

Here are some examples:

.. code-block:: llvm

    entry:
      %poison = sub nuw i32 0, 1           ; Results in a poison value.
      %still_poison = and i32 %poison, 0   ; 0, but also poison.
      %poison_yet_again = getelementptr i32* @h, i32 %still_poison
      store i32 0, i32* %poison_yet_again  ; memory at @h[0] is poisoned

      store i32 %poison, i32* @g           ; Poison value stored to memory.
      %poison2 = load i32* @g              ; Poison value loaded back from memory.

      store volatile i32 %poison, i32* @g  ; External observation; undefined behavior.

      %narrowaddr = bitcast i32* @g to i16*
      %wideaddr = bitcast i32* @g to i64*
      %poison3 = load i16* %narrowaddr     ; Returns a poison value.
      %poison4 = load i64* %wideaddr       ; Returns a poison value.

      %cmp = icmp slt i32 %poison, 0       ; Returns a poison value.
      br i1 %cmp, label %true, label %end  ; Branch to either destination.

    true:
      store volatile i32 0, i32* @g        ; This is control-dependent on %cmp, so
                                           ; it has undefined behavior.
      br label %end

    end:
      %p = phi i32 [ 0, %entry ], [ 1, %true ]
                                           ; Both edges into this PHI are
                                           ; control-dependent on %cmp, so this
                                           ; always results in a poison value.

      store volatile i32 0, i32* @g        ; This would depend on the store in %true
                                           ; if %cmp is true, or the store in %entry
                                           ; otherwise, so this is undefined behavior.

      br i1 %cmp, label %second_true, label %second_end
                                           ; The same branch again, but this time the
                                           ; true block doesn't have side effects.

    second_true:
      ; No side effects!
      ret void

    second_end:
      store volatile i32 0, i32* @g        ; This time, the instruction always depends
                                           ; on the store in %end. Also, it is
                                           ; control-equivalent to %end, so this is
                                           ; well-defined (ignoring earlier undefined
                                           ; behavior in this example).
|
|
|
|
.. _blockaddress:

Addresses of Basic Blocks
-------------------------

``blockaddress(@function, %block)``

The '``blockaddress``' constant computes the address of the specified
basic block in the specified function, and always has an ``i8*`` type.
Taking the address of the entry block is illegal.

This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' instruction, or for comparisons
against null. Pointer equality tests between label addresses result in
undefined behavior; comparison against null is fine, though, and
no label is equal to the null pointer. This may be passed around as an
opaque pointer sized value as long as the bits are not inspected. This
allows ``ptrtoint`` and arithmetic to be performed on these values so
long as the original value is reconstituted before the ``indirectbr``
instruction.

Finally, some targets may provide defined semantics when using the value
as the operand to an inline assembly, but that is target specific.
|
|
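A small sketch of the intended use, with illustrative function and label
names: two block addresses are selected between and then consumed by an
``indirectbr``, which lists every block it may reach.

.. code-block:: llvm

    define i32 @dispatch(i1 %c) {
    entry:
      ; Neither %a nor %b is the entry block, so taking their addresses is legal.
      %target = select i1 %c, i8* blockaddress(@dispatch, %a),
                              i8* blockaddress(@dispatch, %b)
      indirectbr i8* %target, [ label %a, label %b ]

    a:
      ret i32 1

    b:
      ret i32 2
    }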
|
|
.. _constantexprs:

Constant Expressions
--------------------

Constant expressions are used to allow expressions involving other
constants to be used as constants. Constant expressions may be of any
:ref:`first class <t_firstclass>` type and may involve any LLVM operation
that does not have side effects (e.g. load and call are not supported).
The following is the syntax for constant expressions:

``trunc (CST to TYPE)``
    Truncate a constant to another type. The bit size of CST must be
    larger than the bit size of TYPE. Both types must be integers.
``zext (CST to TYPE)``
    Zero extend a constant to another type. The bit size of CST must be
    smaller than the bit size of TYPE. Both types must be integers.
``sext (CST to TYPE)``
    Sign extend a constant to another type. The bit size of CST must be
    smaller than the bit size of TYPE. Both types must be integers.
``fptrunc (CST to TYPE)``
    Truncate a floating point constant to another floating point type.
    The size of CST must be larger than the size of TYPE. Both types
    must be floating point.
``fpext (CST to TYPE)``
    Floating point extend a constant to another type. The size of CST
    must be smaller than or equal to the size of TYPE. Both types must be
    floating point.
``fptoui (CST to TYPE)``
    Convert a floating point constant to the corresponding unsigned
    integer constant. TYPE must be a scalar or vector integer type. CST
    must be of scalar or vector floating point type. Both CST and TYPE
    must be scalars, or vectors of the same number of elements. If the
    value won't fit in the integer type, the results are undefined.
``fptosi (CST to TYPE)``
    Convert a floating point constant to the corresponding signed
    integer constant. TYPE must be a scalar or vector integer type. CST
    must be of scalar or vector floating point type. Both CST and TYPE
    must be scalars, or vectors of the same number of elements. If the
    value won't fit in the integer type, the results are undefined.
``uitofp (CST to TYPE)``
    Convert an unsigned integer constant to the corresponding floating
    point constant. TYPE must be a scalar or vector floating point type.
    CST must be of scalar or vector integer type. Both CST and TYPE must
    be scalars, or vectors of the same number of elements. If the value
    won't fit in the floating point type, the results are undefined.
``sitofp (CST to TYPE)``
    Convert a signed integer constant to the corresponding floating
    point constant. TYPE must be a scalar or vector floating point type.
    CST must be of scalar or vector integer type. Both CST and TYPE must
    be scalars, or vectors of the same number of elements. If the value
    won't fit in the floating point type, the results are undefined.
``ptrtoint (CST to TYPE)``
    Convert a pointer typed constant to the corresponding integer
    constant. ``TYPE`` must be an integer type. ``CST`` must be of
    pointer type. The ``CST`` value is zero extended, truncated, or
    unchanged to make it fit in ``TYPE``.
``inttoptr (CST to TYPE)``
    Convert an integer constant to a pointer constant. TYPE must be a
    pointer type. CST must be of integer type. The CST value is zero
    extended, truncated, or unchanged to make it fit in a pointer size.
    This one is *really* dangerous!
``bitcast (CST to TYPE)``
    Convert a constant, CST, to another TYPE. The constraints of the
    operands are the same as those for the :ref:`bitcast
    instruction <i_bitcast>`.
``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointers, CST, to another
    TYPE in a different address space. The constraints of the operands are the
    same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
``getelementptr (CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (CSTPTR, IDX0, IDX1, ...)``
    Perform the :ref:`getelementptr operation <i_getelementptr>` on
    constants. As with the :ref:`getelementptr <i_getelementptr>`
    instruction, the index list may have zero or more indexes, which are
    required to make sense for the type of "CSTPTR".
``select (COND, VAL1, VAL2)``
    Perform the :ref:`select operation <i_select>` on constants.
``icmp COND (VAL1, VAL2)``
    Perform the :ref:`icmp operation <i_icmp>` on constants.
``fcmp COND (VAL1, VAL2)``
    Perform the :ref:`fcmp operation <i_fcmp>` on constants.
``extractelement (VAL, IDX)``
    Perform the :ref:`extractelement operation <i_extractelement>` on
    constants.
``insertelement (VAL, ELT, IDX)``
    Perform the :ref:`insertelement operation <i_insertelement>` on
    constants.
``shufflevector (VEC1, VEC2, IDXMASK)``
    Perform the :ref:`shufflevector operation <i_shufflevector>` on
    constants.
``extractvalue (VAL, IDX0, IDX1, ...)``
    Perform the :ref:`extractvalue operation <i_extractvalue>` on
    constants. The index list is interpreted in a similar manner as
    indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At
    least one index value must be specified.
``insertvalue (VAL, ELT, IDX0, IDX1, ...)``
    Perform the :ref:`insertvalue operation <i_insertvalue>` on constants.
    The index list is interpreted in a similar manner as indices in a
    ':ref:`getelementptr <i_getelementptr>`' operation. At least one index
    value must be specified.
``OPCODE (LHS, RHS)``
    Perform the specified operation on the LHS and RHS constants. OPCODE
    may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
    binary <bitwiseops>` operations. The constraints on operands are
    the same as those for the corresponding instruction (e.g. no bitwise
    operations on floating point values are allowed).
|
|
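A few of the forms above, sketched as global initializers. All of the
global names are arbitrary, and the pointer-sized integer is assumed to
be ``i64`` for the ``ptrtoint`` line:

.. code-block:: llvm

    @arr  = global [4 x i32] [ i32 1, i32 2, i32 3, i32 4 ]

    ; Address of @arr[2], computed without executing any instruction.
    @elt2 = global i32* getelementptr inbounds ([4 x i32]* @arr, i32 0, i32 2)

    ; Casts between pointer, integer and other pointer types.
    @addr = global i64 ptrtoint ([4 x i32]* @arr to i64)
    @raw  = global i8* bitcast ([4 x i32]* @arr to i8*)

    ; Integer conversions and a binary operation on constants.
    @wide = global i64 zext (i32 7 to i64)
    @sum  = global i32 add (i32 40, i32 2)

Constant expressions nest, so any operand position above could itself
hold another constant expression of the right type.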
|
|
Other Values
============

.. _inlineasmexprs:

Inline Assembler Expressions
----------------------------

LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
Inline Assembly <moduleasm>`) through the use of a special value. This
value represents the inline assembler as a string (containing the
instructions to emit), a list of operand constraints (stored as a
string), a flag that indicates whether or not the inline asm expression
has side effects, and a flag indicating whether the function containing
the asm needs to align its stack conservatively. An example inline
assembler expression is:

.. code-block:: llvm

    i32 (i32) asm "bswap $0", "=r,r"

Inline assembler expressions may **only** be used as the callee operand
of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
Thus, typically we have:

.. code-block:: llvm

    %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)

Inline asms with side effects not visible in the constraint list must be
marked as having side effects. This is done through the use of the
'``sideeffect``' keyword, like so:

.. code-block:: llvm

    call void asm sideeffect "eieio", ""()

In some cases inline asms will contain code that will not work unless
the stack is aligned in some way, such as calls or SSE instructions on
x86, yet will not contain code that does that alignment within the asm.
The compiler should make conservative assumptions about what the asm
might contain and should generate its usual stack alignment code in the
prologue if the '``alignstack``' keyword is present:

.. code-block:: llvm

    call void asm alignstack "eieio", ""()

Inline asms also support using non-standard assembly dialects. The
assumed dialect is ATT. When the '``inteldialect``' keyword is present,
the inline asm is using the Intel dialect. Currently, ATT and Intel are
the only supported dialects. An example is:

.. code-block:: llvm

    call void asm inteldialect "eieio", ""()

If multiple keywords appear the '``sideeffect``' keyword must come
first, the '``alignstack``' keyword second and the '``inteldialect``'
keyword last.
|
|
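To illustrate the required keyword order, here is a sketch that uses all
three keywords at once. The function name is made up, and the ``nop``
instruction assumes an x86 target:

.. code-block:: llvm

    define void @all_keywords() {
      ; Order matters: sideeffect, then alignstack, then inteldialect.
      call void asm sideeffect alignstack inteldialect "nop", ""()
      ret void
    }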
|
|
Inline Asm Metadata
^^^^^^^^^^^^^^^^^^^

The call instructions that wrap inline asm nodes may have a
"``!srcloc``" MDNode attached to them that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
it. For example:

.. code-block:: llvm

    call void asm sideeffect "something bad", ""(), !srcloc !42
    ...
    !42 = metadata !{ i32 1234567 }

It is up to the front-end to make sense of the magic numbers it places
in the IR. If the MDNode contains multiple constants, the code generator
will use the one that corresponds to the line of the asm that the error
occurs on.
|
|
|
|
.. _metadata:

Metadata Nodes and Metadata Strings
-----------------------------------

LLVM IR allows metadata to be attached to instructions in the program
that can convey extra information about the code to the optimizers and
code generator. One example application of metadata is source-level
debug information. There are two metadata primitives: strings and nodes.
All metadata has the ``metadata`` type and is identified in syntax by a
preceding exclamation point ('``!``').

A metadata string is a string surrounded by double quotes. It can
contain any character by escaping non-printable characters with
"``\xx``" where "``xx``" is the two digit hex code. For example:
"``!"test\00"``".

Metadata nodes are represented with notation similar to structure
constants (a comma separated list of elements, surrounded by braces and
preceded by an exclamation point). Metadata nodes can have any values as
their operand. For example:

.. code-block:: llvm

    !{ metadata !"test\00", i32 10}

A :ref:`named metadata <namedmetadatastructure>` is a collection of
metadata nodes, which can be looked up in the module symbol table. For
example:

.. code-block:: llvm

    !foo = !{!4, !3}

Metadata can be used as function arguments. Here the ``llvm.dbg.value``
function uses two metadata arguments:

.. code-block:: llvm

    call void @llvm.dbg.value(metadata !24, i64 0, metadata !25)

Metadata can be attached to an instruction. Here metadata ``!21`` is
attached to the ``add`` instruction using the ``!dbg`` identifier:

.. code-block:: llvm

    %indvar.next = add i64 %indvar, 1, !dbg !21

More information about specific metadata nodes recognized by the
optimizers and code generator is found below.
|
|
|
|
'``tbaa``' Metadata
|
|
^^^^^^^^^^^^^^^^^^^
|
|
|
|
In LLVM IR, memory does not have types, so LLVM's own type system is not
|
|
suitable for doing TBAA. Instead, metadata is added to the IR to
|
|
describe a type system of a higher level language. This can be used to
|
|
implement typical C/C++ TBAA, but it can also be used to implement
|
|
custom alias analysis behavior for other languages.
|
|
|
|
The current metadata format is very simple. TBAA metadata nodes have up
|
|
to three fields, e.g.:
|
|
|
|
.. code-block:: llvm
|
|
|
|
!0 = metadata !{ metadata !"an example type tree" }
|
|
!1 = metadata !{ metadata !"int", metadata !0 }
|
|
!2 = metadata !{ metadata !"float", metadata !0 }
|
|
!3 = metadata !{ metadata !"const float", metadata !2, i64 1 }
|
|
|
|
The first field is an identity field. It can be any value, usually a
|
|
metadata string, which uniquely identifies the type. The most important
|
|
name in the tree is the name of the root node. Two trees with different
|
|
root node names are entirely disjoint, even if they have leaves with
|
|
common names.
|
|
|
|
The second field identifies the type's parent node in the tree, or is
|
|
null or omitted for a root node. A type is considered to alias all of
|
|
its descendants and all of its ancestors in the tree. Also, a type is
|
|
considered to alias all types in other trees, so that bitcode produced
|
|
from multiple front-ends is handled conservatively.
|
|
|
|
If the third field is present, it's an integer which if equal to 1
|
|
indicates that the type is "constant" (meaning
|
|
``pointsToConstantMemory`` should return true; see `other useful
|
|
AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_).
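
As a minimal sketch of how these nodes are used (the function and value names
are illustrative and not taken from any front-end), the nodes from a type tree
like the one above are attached to memory accesses with the ``!tbaa``
identifier. Because ``"int"`` and ``"float"`` are siblings, neither is an
ancestor or a descendant of the other, so TBAA may treat the two accesses as
not aliasing:

.. code-block:: llvm

    define float @example(i32* %ip, float* %fp) {
    entry:
      store i32 1, i32* %ip, align 4, !tbaa !1    ; access tagged as "int"
      %v = load float* %fp, align 4, !tbaa !2     ; access tagged as "float"
      ret float %v
    }

    !0 = metadata !{ metadata !"an example type tree" }
    !1 = metadata !{ metadata !"int", metadata !0 }
    !2 = metadata !{ metadata !"float", metadata !0 }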
|
|
|
|
'``tbaa.struct``' Metadata
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
The :ref:`llvm.memcpy <int_memcpy>` is often used to implement
|
|
aggregate assignment operations in C and similar languages. However, it
|
|
is defined to copy a contiguous region of memory, which is more than
|
|
strictly necessary for aggregate types which contain holes due to
|
|
padding. Also, it doesn't contain any TBAA information about the fields
|
|
of the aggregate.
|
|
|
|
``!tbaa.struct`` metadata can describe which memory subregions in a
|
|
memcpy are padding and what the TBAA tags of the struct are.
|
|
|
|
The current metadata format is very simple. ``!tbaa.struct`` metadata
|
|
nodes are a list of operands which are in conceptual groups of three.
|
|
For each group of three, the first operand gives the offset of a
|
|
field in bytes, the second gives its size in bytes, and the third gives
|
|
its tbaa tag. For example:
|
|
|
|
.. code-block:: llvm
|
|
|
|
!4 = metadata !{ i64 0, i64 4, metadata !1, i64 8, i64 4, metadata !2 }
|
|
|
|
This describes a struct with two fields. The first is at offset 0 bytes
|
|
with size 4 bytes, and has tbaa tag !1. The second is at offset 8 bytes
|
|
and has size 4 bytes and has tbaa tag !2.
|
|
|
|
Note that the fields need not be contiguous. In this example, there is a
|
|
4 byte gap between the two fields. This gap represents padding which
|
|
does not carry useful data and need not be preserved.
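
A sketch of how such a node might be attached to a copy is shown here. The
function name ``@copy`` and its arguments are hypothetical; the 12-byte length
is chosen to cover both fields and the gap between them, and ``!1`` and ``!2``
stand for the TBAA nodes of the two fields:

.. code-block:: llvm

    declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i32, i1)

    define void @copy(i8* %dst, i8* %src) {
    entry:
      ; Copy 12 bytes with 4-byte alignment; the copy is not volatile.
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 12,
                                           i32 4, i1 false), !tbaa.struct !4
      ret void
    }

    !0 = metadata !{ metadata !"an example type tree" }
    !1 = metadata !{ metadata !"int", metadata !0 }
    !2 = metadata !{ metadata !"float", metadata !0 }
    !4 = metadata !{ i64 0, i64 4, metadata !1, i64 8, i64 4, metadata !2 }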
|
|
|
|
'``noalias``' and '``alias.scope``' Metadata
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
|
|
noalias memory-access sets. This means that some collection of memory access
|
|
instructions (loads, stores, memory-accessing calls, etc.) that carry
|
|
``noalias`` metadata can be explicitly specified not to alias with some other
|
|
collection of memory access instructions that carry ``alias.scope`` metadata.
|
|
Each type of metadata specifies a list of scopes where each scope has an id and
|
|
a domain. When evaluating an aliasing query, if for some domain, the set
|
|
of scopes with that domain in one instruction's ``alias.scope`` list is a
|
|
subset of (or equal to) the set of scopes for that domain in another
|
|
instruction's ``noalias`` list, then the two memory accesses are assumed not to
|
|
alias.
|
|
|
|
The metadata identifying each domain is itself a list containing one or two
|
|
entries. The first entry is the name of the domain. Note that if the name is a
|
|
string then it can be combined across functions and translation units. A
|
|
self-reference can be used to create globally unique domain names. A
|
|
descriptive string may optionally be provided as a second list entry.
|
|
|
|
The metadata identifying each scope is also itself a list containing two or
|
|
three entries. The first entry is the name of the scope. Note that if the name
|
|
is a string then it can be combined across functions and translation units. A
|
|
self-reference can be used to create globally unique scope names. A metadata
|
|
reference to the scope's domain is the second entry. A descriptive string may
|
|
optionally be provided as a third list entry.
|
|
|
|
For example,
|
|
|
|
.. code-block:: llvm
|
|
|
|
; Two scope domains:
|
|
!0 = metadata !{metadata !0}
|
|
!1 = metadata !{metadata !1}
|
|
|
|
; Some scopes in these domains:
|
|
!2 = metadata !{metadata !2, metadata !0}
|
|
!3 = metadata !{metadata !3, metadata !0}
|
|
!4 = metadata !{metadata !4, metadata !1}
|
|
|
|
; Some scope lists:
|
|
!5 = metadata !{metadata !4} ; A list containing only scope !4
|
|
!6 = metadata !{metadata !4, metadata !3, metadata !2}
|
|
!7 = metadata !{metadata !3}
|
|
|
|
; These two instructions don't alias:
|
|
%0 = load float* %c, align 4, !alias.scope !5
|
|
store float %0, float* %arrayidx.i, align 4, !noalias !5
|
|
|
|
; These two instructions also don't alias (for domain !1, the set of scopes
|
|
; in the !alias.scope equals that in the !noalias list):
|
|
%2 = load float* %c, align 4, !alias.scope !5
|
|
store float %2, float* %arrayidx.i2, align 4, !noalias !6
|
|
|
|
; These two instructions may alias (for domain !0, the set of scopes in
|
|
; the !noalias list is not a superset of, or equal to, the scopes in the
|
|
; !alias.scope list):
|
|
%2 = load float* %c, align 4, !alias.scope !6
|
|
store float %0, float* %arrayidx.i, align 4, !noalias !7
|
|
|
|
'``fpmath``' Metadata
|
|
^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
``fpmath`` metadata may be attached to any instruction of floating point
|
|
type. It can be used to express the maximum acceptable error in the
|
|
result of that instruction, in ULPs, thus potentially allowing the
|
|
compiler to use a more efficient but less accurate method of computing
|
|
it. ULP is defined as follows:
|
|
|
|
If ``x`` is a real number that lies between two finite consecutive
|
|
floating-point numbers ``a`` and ``b``, without being equal to one
|
|
of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
|
|
distance between the two non-equal finite floating-point numbers
|
|
nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.
|
|
|
|
The metadata node shall consist of a single positive floating point
|
|
number representing the maximum relative error, for example:
|
|
|
|
.. code-block:: llvm
|
|
|
|
!0 = metadata !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs
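
For illustration, the node is attached to an instruction with the ``!fpmath``
identifier. In this sketch the function name ``@approx_div`` is hypothetical;
the division is allowed to be computed with an error of up to 2.5 ULPs:

.. code-block:: llvm

    define float @approx_div(float %a, float %b) {
    entry:
      %r = fdiv float %a, %b, !fpmath !0   ; result may be off by up to 2.5 ULPs
      ret float %r
    }

    !0 = metadata !{ float 2.5 }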
|
|
|
|
'``range``' Metadata
|
|
^^^^^^^^^^^^^^^^^^^^
|
|
|
|
``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
|
|
integer types. It expresses the possible ranges the loaded value or the value
|
|
returned by the called function at this call site is in. The ranges are
|
|
represented with a flattened list of integers. The loaded value or the value
|
|
returned is known to be in the union of the ranges defined by each consecutive
|
|
pair. Each pair has the following properties:
|
|
|
|
- The type must match the type loaded by the instruction.
|
|
- The pair ``a,b`` represents the range ``[a,b)``.
|
|
- Both ``a`` and ``b`` are constants.
|
|
- The range is allowed to wrap.
|
|
- The range should not represent the full or empty set. That is,
|
|
``a!=b``.
|
|
|
|
In addition, the pairs must be in signed order of the lower bound and
|
|
they must be non-contiguous.
|
|
|
|
Examples:
|
|
|
|
.. code-block:: llvm
|
|
|
|
%a = load i8* %x, align 1, !range !0 ; Can only be 0 or 1
|
|
%b = load i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
|
|
%c = call i8 @foo(), !range !2 ; Can only be 0, 1, 3, 4 or 5
|
|
%d = invoke i8 @bar() to label %cont
|
|
unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
|
|
...
|
|
!0 = metadata !{ i8 0, i8 2 }
|
|
!1 = metadata !{ i8 255, i8 2 }
|
|
!2 = metadata !{ i8 0, i8 2, i8 3, i8 6 }
|
|
!3 = metadata !{ i8 -2, i8 0, i8 3, i8 6 }
|
|
|
|
'``llvm.loop``'
|
|
^^^^^^^^^^^^^^^
|
|
|
|
It is sometimes useful to attach information to loop constructs. Currently,
|
|
loop metadata is implemented as metadata attached to the branch instruction
|
|
in the loop latch block. This type of metadata refers to a metadata node that is
|
|
guaranteed to be separate for each loop. The loop identifier metadata is
|
|
specified with the name ``llvm.loop``.
|
|
|
|
The loop identifier metadata is implemented using a metadata that refers to
|
|
itself to avoid merging it with any other identifier metadata, e.g.,
|
|
during module linkage or function inlining. That is, each loop should refer
|
|
to its own identification metadata, even if the loops reside in separate functions.
|
|
The following example contains loop identifier metadata for two separate loop
|
|
constructs:
|
|
|
|
.. code-block:: llvm
|
|
|
|
!0 = metadata !{ metadata !0 }
|
|
!1 = metadata !{ metadata !1 }
|
|
|
|
The loop identifier metadata can be used to specify additional
|
|
per-loop metadata. Any operands after the first operand can be treated
|
|
as user-defined metadata. For example the ``llvm.loop.unroll.count``
|
|
suggests an unroll factor to the loop unroller:
|
|
|
|
.. code-block:: llvm
|
|
|
|
br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
|
|
...
|
|
!0 = metadata !{ metadata !0, metadata !1 }
|
|
!1 = metadata !{ metadata !"llvm.loop.unroll.count", i32 4 }
|
|
|
|
'``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
|
|
used to control per-loop vectorization and interleaving parameters such as
|
|
vectorization width and interleave count. These metadata should be used in
|
|
conjunction with ``llvm.loop`` loop identification metadata. The
|
|
``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
|
|
optimization hints and the optimizer will only interleave and vectorize loops if
|
|
it believes it is safe to do so. The ``llvm.mem.parallel_loop_access`` metadata
|
|
which contains information about loop-carried memory dependencies can be helpful
|
|
in determining the safety of these transformations.
|
|
|
|
'``llvm.loop.interleave.count``' Metadata
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
This metadata suggests an interleave count to the loop interleaver.
|
|
The first operand is the string ``llvm.loop.interleave.count`` and the
|
|
second operand is an integer specifying the interleave count. For
|
|
example:
|
|
|
|
.. code-block:: llvm
|
|
|
|
!0 = metadata !{ metadata !"llvm.loop.interleave.count", i32 4 }
|
|
|
|
Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
|
|
multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
|
|
then the interleave count will be determined automatically.
|
|
|
|
'``llvm.loop.vectorize.enable``' Metadata
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
This metadata selectively enables or disables vectorization for the loop. The
|
|
first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
|
|
is a bit. If the bit operand value is 1 vectorization is enabled. A value of
|
|
0 disables vectorization:
|
|
|
|
.. code-block:: llvm
|
|
|
|
!0 = metadata !{ metadata !"llvm.loop.vectorize.enable", i1 0 }
|
|
!1 = metadata !{ metadata !"llvm.loop.vectorize.enable", i1 1 }
|
|
|
|
'``llvm.loop.vectorize.width``' Metadata
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
This metadata sets the target width of the vectorizer. The first
|
|
operand is the string ``llvm.loop.vectorize.width`` and the second
|
|
operand is an integer specifying the width. For example:
|
|
|
|
.. code-block:: llvm
|
|
|
|
!0 = metadata !{ metadata !"llvm.loop.vectorize.width", i32 4 }
|
|
|
|
Note that setting ``llvm.loop.vectorize.width`` to 1 disables
|
|
vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
|
|
0 or if the loop does not have this metadata the width will be
|
|
determined automatically.
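
As a sketch of how these hints combine with loop identification metadata, the
following attaches both a vectorization width and an interleave count to one
loop. The block labels and the chosen values are illustrative only:

.. code-block:: llvm

      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
    ...
    ; The first operand is the self-reference that identifies the loop;
    ; the remaining operands are the per-loop hints.
    !0 = metadata !{ metadata !0, metadata !1, metadata !2 }
    !1 = metadata !{ metadata !"llvm.loop.vectorize.width", i32 4 }
    !2 = metadata !{ metadata !"llvm.loop.interleave.count", i32 2 }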
|
|
|
|
'``llvm.loop.unroll``'
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
|
|
optimization hints such as the unroll factor. ``llvm.loop.unroll``
|
|
metadata should be used in conjunction with ``llvm.loop`` loop
|
|
identification metadata. The ``llvm.loop.unroll`` metadata are only
|
|
optimization hints and the unrolling will only be performed if the
|
|
optimizer believes it is safe to do so.
|
|
|
|
'``llvm.loop.unroll.count``' Metadata
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
This metadata suggests an unroll factor to the loop unroller. The
|
|
first operand is the string ``llvm.loop.unroll.count`` and the second
|
|
operand is a positive integer specifying the unroll factor. For
|
|
example:
|
|
|
|
.. code-block:: llvm
|
|
|
|
!0 = metadata !{ metadata !"llvm.loop.unroll.count", i32 4 }
|
|
|
|
If the trip count of the loop is less than the unroll count the loop
|
|
will be partially unrolled.
|
|
|
|
'``llvm.loop.unroll.disable``' Metadata
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
This metadata disables loop unrolling. The metadata has a single operand
|
|
which is the string ``llvm.loop.unroll.disable``. For example:
|
|
|
|
.. code-block:: llvm
|
|
|
|
!0 = metadata !{ metadata !"llvm.loop.unroll.disable" }
|
|
|
|
'``llvm.loop.unroll.full``' Metadata
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
This metadata suggests that the loop should be unrolled fully. The
metadata has a single operand which is the string ``llvm.loop.unroll.full``.
|
|
For example:
|
|
|
|
.. code-block:: llvm
|
|
|
|
!0 = metadata !{ metadata !"llvm.loop.unroll.full" }
|
|
|
|
'``llvm.mem``'
|
|
^^^^^^^^^^^^^^^
|
|
|
|
Metadata types used to annotate memory accesses with information helpful
|
|
for optimizations are prefixed with ``llvm.mem``.
|
|
|
|
'``llvm.mem.parallel_loop_access``' Metadata
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
The ``llvm.mem.parallel_loop_access`` metadata refers to a loop identifier,
|
|
or metadata containing a list of loop identifiers for nested loops.
|
|
The metadata is attached to memory accessing instructions and denotes that
|
|
no loop-carried memory dependence exists between it and other instructions denoted
|
|
with the same loop identifier.
|
|
|
|
Precisely, given two instructions ``m1`` and ``m2`` that both have the
|
|
``llvm.mem.parallel_loop_access`` metadata, with ``L1`` and ``L2`` being the
|
|
set of loops associated with that metadata, respectively, then there is no loop
|
|
carried dependence between ``m1`` and ``m2`` for loops in both ``L1`` and
|
|
``L2``.
|
|
|
|
As a special case, if all memory accessing instructions in a loop have
|
|
``llvm.mem.parallel_loop_access`` metadata that refers to that loop, then the
|
|
loop has no loop carried memory dependences and is considered to be a parallel
|
|
loop.
|
|
|
|
Note that if not all memory access instructions have such metadata referring to
|
|
the loop, then the loop is not considered trivially parallel. Additional
|
|
memory dependence analysis is required to make that determination. As a fail
|
|
safe mechanism, this causes loops that were originally parallel to be considered
|
|
sequential (if optimization passes that are unaware of the parallel semantics
|
|
insert new memory instructions into the loop body).
|
|
|
|
Example of a loop that is considered parallel due to its correct use of
|
|
both ``llvm.loop`` and ``llvm.mem.parallel_loop_access``
|
|
metadata types that refer to the same loop identifier metadata.
|
|
|
|
.. code-block:: llvm
|
|
|
|
for.body:
|
|
...
|
|
%val0 = load i32* %arrayidx, !llvm.mem.parallel_loop_access !0
|
|
...
|
|
store i32 %val0, i32* %arrayidx1, !llvm.mem.parallel_loop_access !0
|
|
...
|
|
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
|
|
|
|
for.end:
|
|
...
|
|
!0 = metadata !{ metadata !0 }
|
|
|
|
It is also possible to have nested parallel loops. In that case the
|
|
memory accesses refer to a list of loop identifier metadata nodes instead of
|
|
the loop identifier metadata node directly:
|
|
|
|
.. code-block:: llvm
|
|
|
|
outer.for.body:
|
|
...
|
|
%val1 = load i32* %arrayidx3, !llvm.mem.parallel_loop_access !2
|
|
...
|
|
br label %inner.for.body
|
|
|
|
inner.for.body:
|
|
...
|
|
%val0 = load i32* %arrayidx1, !llvm.mem.parallel_loop_access !0
|
|
...
|
|
store i32 %val0, i32* %arrayidx2, !llvm.mem.parallel_loop_access !0
|
|
...
|
|
br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1
|
|
|
|
inner.for.end:
|
|
...
|
|
store i32 %val1, i32* %arrayidx4, !llvm.mem.parallel_loop_access !2
|
|
...
|
|
br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2
|
|
|
|
outer.for.end: ; preds = %inner.for.end
|
|
...
|
|
!0 = metadata !{ metadata !1, metadata !2 } ; a list of loop identifiers
|
|
!1 = metadata !{ metadata !1 } ; an identifier for the inner loop
|
|
!2 = metadata !{ metadata !2 } ; an identifier for the outer loop
|
|
|
|
Module Flags Metadata
|
|
=====================
|
|
|
|
Information about the module as a whole is difficult to convey to LLVM's
|
|
subsystems. The LLVM IR isn't sufficient to transmit this information.
|
|
The ``llvm.module.flags`` named metadata exists in order to facilitate
|
|
this. These flags are in the form of key / value pairs --- much like a
|
|
dictionary --- making it easy for any subsystem that cares about a flag to
|
|
look it up.
|
|
|
|
The ``llvm.module.flags`` metadata contains a list of metadata triplets.
|
|
Each triplet has the following form:
|
|
|
|
- The first element is a *behavior* flag, which specifies the behavior
|
|
when two (or more) modules are merged together and the linker encounters two
|
|
(or more) metadata with the same ID. The supported behaviors are
|
|
described below.
|
|
- The second element is a metadata string that is a unique ID for the
|
|
metadata. Each module may only have one flag entry for each unique ID (not
|
|
including entries with the **Require** behavior).
|
|
- The third element is the value of the flag.
|
|
|
|
When two (or more) modules are merged together, the resulting
|
|
``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
|
|
each unique metadata ID string, there will be exactly one entry in the merged
|
|
module's ``llvm.module.flags`` metadata table, and the value for that entry will
|
|
be determined by the merge behavior flag, as described below. The only exception
|
|
is that entries with the *Require* behavior are always preserved.
|
|
|
|
The following behaviors are supported:
|
|
|
|
.. list-table::
|
|
:header-rows: 1
|
|
:widths: 10 90
|
|
|
|
* - Value
|
|
- Behavior
|
|
|
|
* - 1
|
|
- **Error**
|
|
Emits an error if two values disagree, otherwise the resulting value
|
|
is that of the operands.
|
|
|
|
* - 2
|
|
- **Warning**
|
|
Emits a warning if two values disagree. The result value will be the
|
|
operand for the flag from the first module being linked.
|
|
|
|
* - 3
|
|
- **Require**
|
|
Adds a requirement that another module flag be present and have a
|
|
specified value after linking is performed. The value must be a
|
|
metadata pair, where the first element of the pair is the ID of the
|
|
module flag to be restricted, and the second element of the pair is
|
|
the value the module flag should be restricted to. This behavior can
|
|
be used to restrict the allowable results (via triggering of an
|
|
error) of linking IDs with the **Override** behavior.
|
|
|
|
* - 4
|
|
- **Override**
|
|
Uses the specified value, regardless of the behavior or value of the
|
|
other module. If both modules specify **Override**, but the values
|
|
differ, an error will be emitted.
|
|
|
|
* - 5
|
|
- **Append**
|
|
Appends the two values, which are required to be metadata nodes.
|
|
|
|
* - 6
|
|
- **AppendUnique**
|
|
Appends the two values, which are required to be metadata
|
|
nodes. However, duplicate entries in the second list are dropped
|
|
during the append operation.
|
|
|
|
It is an error for a particular unique flag ID to have multiple behaviors,
|
|
except in the case of **Require** (which adds restrictions on another metadata
|
|
value) or **Override**.
|
|
|
|
An example of module flags:
|
|
|
|
.. code-block:: llvm
|
|
|
|
!0 = metadata !{ i32 1, metadata !"foo", i32 1 }
|
|
!1 = metadata !{ i32 4, metadata !"bar", i32 37 }
|
|
!2 = metadata !{ i32 2, metadata !"qux", i32 42 }
|
|
!3 = metadata !{ i32 3, metadata !"qux",
|
|
metadata !{
|
|
metadata !"foo", i32 1
|
|
}
|
|
}
|
|
!llvm.module.flags = !{ !0, !1, !2, !3 }
|
|
|
|
- Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
|
|
if two or more ``!"foo"`` flags are seen is to emit an error if their
|
|
values are not equal.
|
|
|
|
- Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
|
|
behavior if two or more ``!"bar"`` flags are seen is to use the value
|
|
'37'.
|
|
|
|
- Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
|
|
behavior if two or more ``!"qux"`` flags are seen is to emit a
|
|
warning if their values are not equal.
|
|
|
|
- Metadata ``!3`` has the ID ``!"qux"`` and the value:
|
|
|
|
::
|
|
|
|
metadata !{ metadata !"foo", i32 1 }
|
|
|
|
The behavior is to emit an error if the ``llvm.module.flags`` does not
|
|
contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
|
|
performed.
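
To illustrate merging, consider the following sketch of two modules that both
define a flag with the hypothetical ID ``!"bar"``. One uses the **Override**
behavior and the other uses **Warning**, a combination that is permitted
because one of the two behaviors is **Override**:

.. code-block:: llvm

    ; Module A
    !0 = metadata !{ i32 4, metadata !"bar", i32 37 }   ; Override
    !llvm.module.flags = !{ !0 }

    ; Module B
    !0 = metadata !{ i32 2, metadata !"bar", i32 12 }   ; Warning
    !llvm.module.flags = !{ !0 }

    ; After linking A and B, the flag "bar" has the value 37, because the
    ; Override entry wins regardless of the other module's behavior or value.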
|
|
|
|
Objective-C Garbage Collection Module Flags Metadata
|
|
----------------------------------------------------
|
|
|
|
On the Mach-O platform, Objective-C stores metadata about garbage
|
|
collection in a special section called "image info". The metadata
|
|
consists of a version number and a bitmask specifying what types of
|
|
garbage collection are supported (if any) by the file. If two or more
|
|
modules are linked together their garbage collection metadata needs to
|
|
be merged rather than appended together.
|
|
|
|
The Objective-C garbage collection module flags metadata consists of the
|
|
following key-value pairs:
|
|
|
|
.. list-table::
|
|
:header-rows: 1
|
|
:widths: 30 70
|
|
|
|
* - Key
|
|
- Value
|
|
|
|
* - ``Objective-C Version``
|
|
- **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.
|
|
|
|
* - ``Objective-C Image Info Version``
|
|
- **[Required]** --- The version of the image info section. Currently
|
|
always 0.
|
|
|
|
* - ``Objective-C Image Info Section``
|
|
- **[Required]** --- The section in which to place the metadata. Valid values are
|
|
``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
|
|
``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
|
|
Objective-C ABI version 2.
|
|
|
|
* - ``Objective-C Garbage Collection``
|
|
- **[Required]** --- Specifies whether garbage collection is supported or
|
|
not. Valid values are 0, for no garbage collection, and 2, for garbage
|
|
collection supported.
|
|
|
|
* - ``Objective-C GC Only``
|
|
- **[Optional]** --- Specifies that only garbage collection is supported.
|
|
If present, its value must be 6. This flag requires that the
|
|
``Objective-C Garbage Collection`` flag have the value 2.
|
|
|
|
Some important flag interactions:
|
|
|
|
- If a module with ``Objective-C Garbage Collection`` set to 0 is
|
|
merged with a module with ``Objective-C Garbage Collection`` set to
|
|
2, then the resulting module has the
|
|
``Objective-C Garbage Collection`` flag set to 0.
|
|
- A module with ``Objective-C Garbage Collection`` set to 0 cannot be
|
|
merged with a module with ``Objective-C GC Only`` set to 6.
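
A sketch of what these flags might look like for a module built for
Objective-C ABI version 2 without garbage collection follows. The keys and
values come from the table above; the merge behaviors (the first operand of
each entry) are assumptions made for illustration and are not specified here:

.. code-block:: llvm

    !0 = metadata !{ i32 1, metadata !"Objective-C Version", i32 2 }
    !1 = metadata !{ i32 1, metadata !"Objective-C Image Info Version", i32 0 }
    !2 = metadata !{ i32 1, metadata !"Objective-C Image Info Section",
                     metadata !"__DATA,__objc_imageinfo, regular, no_dead_strip" }
    !3 = metadata !{ i32 4, metadata !"Objective-C Garbage Collection", i32 0 }
    !llvm.module.flags = !{ !0, !1, !2, !3 }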
|
|
|
|
Automatic Linker Flags Module Flags Metadata
|
|
--------------------------------------------
|
|
|
|
Some targets support embedding flags to the linker inside individual object
|
|
files. Typically this is used in conjunction with language extensions which
|
|
allow source files to explicitly declare the libraries they depend on, and have
|
|
these automatically be transmitted to the linker via object files.
|
|
|
|
These flags are encoded in the IR using metadata in the module flags section,
|
|
using the ``Linker Options`` key. The merge behavior for this flag is required
|
|
to be ``AppendUnique``, and the value for the key is expected to be a metadata
|
|
node which should be a list of other metadata nodes, each of which should be a
|
|
list of metadata strings defining linker options.
|
|
|
|
For example, the following metadata section specifies two separate sets of
|
|
linker options, presumably to link against ``libz`` and the ``Cocoa``
|
|
framework::
|
|
|
|
!0 = metadata !{ i32 6, metadata !"Linker Options",
|
|
metadata !{
|
|
metadata !{ metadata !"-lz" },
|
|
metadata !{ metadata !"-framework", metadata !"Cocoa" } } }
|
|
!llvm.module.flags = !{ !0 }
|
|
|
|
The metadata encoding as lists of lists of options, as opposed to a collapsed
|
|
list of options, is chosen so that the IR encoding can use multiple option
|
|
strings to specify e.g., a single library, while still having that specifier be
|
|
preserved as an atomic element that can be recognized by a target specific
|
|
assembly writer or object file emitter.
|
|
|
|
Each individual option is required to be either a valid option for the target's
|
|
linker, or an option that is reserved by the target specific assembly writer or
|
|
object file emitter. No other aspect of these options is defined by the IR.
|
|
|
|
C type width Module Flags Metadata
|
|
----------------------------------
|
|
|
|
The ARM backend emits a section into each generated object file describing the
|
|
options that it was compiled with (in a compiler-independent way) to prevent
|
|
linking incompatible objects, and to allow automatic library selection. Some
|
|
of these options are not visible at the IR level, namely wchar_t width and enum
|
|
width.
|
|
|
|
To pass this information to the backend, these options are encoded in module
|
|
flags metadata, using the following key-value pairs:
|
|
|
|
.. list-table::
|
|
:header-rows: 1
|
|
:widths: 30 70
|
|
|
|
* - Key
|
|
- Value
|
|
|
|
* - short_wchar
|
|
- * 0 --- sizeof(wchar_t) == 4
|
|
* 1 --- sizeof(wchar_t) == 2
|
|
|
|
* - short_enum
|
|
- * 0 --- Enums are at least as large as an ``int``.
|
|
* 1 --- Enums are stored in the smallest integer type which can
|
|
represent all of its values.
|
|
|
|
For example, the following metadata section specifies that the module was
compiled with a ``wchar_t`` width of 2 bytes, and that enums are at least as
large as an ``int``::
|
|
|
|
!llvm.module.flags = !{!0, !1}
|
|
!0 = metadata !{i32 1, metadata !"short_wchar", i32 1}
|
|
!1 = metadata !{i32 1, metadata !"short_enum", i32 0}
|
|
|
|
.. _intrinsicglobalvariables:
|
|
|
|
Intrinsic Global Variables
|
|
==========================
|
|
|
|
LLVM has a number of "magic" global variables that contain data that
|
|
affect code generation or other IR semantics. These are documented here.
|
|
All globals of this sort should have a section specified as
|
|
"``llvm.metadata``". This section and all globals that start with
|
|
"``llvm.``" are reserved for use by LLVM.
|
|
|
|
.. _gv_llvmused:
|
|
|
|
The '``llvm.used``' Global Variable
|
|
-----------------------------------
|
|
|
|
The ``@llvm.used`` global is an array which has
|
|
:ref:`appending linkage <linkage_appending>`. This array contains a list of
|
|
pointers to named global variables, functions and aliases which may optionally
|
|
have a pointer cast formed of bitcast or getelementptr. For example, a legal
|
|
use of it is:
|
|
|
|
.. code-block:: llvm
|
|
|
|
@X = global i8 4
|
|
@Y = global i32 123
|
|
|
|
@llvm.used = appending global [2 x i8*] [
|
|
i8* @X,
|
|
i8* bitcast (i32* @Y to i8*)
|
|
], section "llvm.metadata"
|
|
|
|
If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
|
|
and linker are required to treat the symbol as if there is a reference to the
|
|
symbol that it cannot see (which is why they have to be named). For example, if
|
|
a variable has internal linkage and no references other than that from the
|
|
``@llvm.used`` list, it cannot be deleted. This is commonly used to represent
|
|
references from inline asms and other things the compiler cannot "see", and
|
|
corresponds to "``__attribute__((used))``" in GNU C.
|
|
|
|
On some targets, the code generator must emit a directive to the
|
|
assembler or object file to prevent the assembler and linker from
|
|
modifying or discarding the symbol.
|
|
|
|
.. _gv_llvmcompilerused:
|
|
|
|
The '``llvm.compiler.used``' Global Variable
|
|
--------------------------------------------
|
|
|
|
The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
|
|
directive, except that it only prevents the compiler from touching the
|
|
symbol. On targets that support it, this allows an intelligent linker to
|
|
optimize references to the symbol without being impeded as it would be
|
|
by ``@llvm.used``.
|
|
|
|
This construct should only be used in rare circumstances,
|
|
and should not be exposed to source languages.
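
By analogy with ``@llvm.used``, a use of it might look like the following
sketch, in which the global ``@Z`` is a hypothetical internal variable that
the compiler must keep even though nothing else in the module refers to it:

.. code-block:: llvm

    @Z = internal global i32 7

    @llvm.compiler.used = appending global [1 x i8*] [
       i8* bitcast (i32* @Z to i8*)
    ], section "llvm.metadata"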
|
|
|
|
.. _gv_llvmglobalctors:
|
|
|
|
The '``llvm.global_ctors``' Global Variable
|
|
-------------------------------------------
|
|
|
|
.. code-block:: llvm
|
|
|
|
%0 = type { i32, void ()*, i8* }
|
|
@llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]
|
|
|
|
The ``@llvm.global_ctors`` array contains a list of constructor
|
|
functions, priorities, and an optional associated global or function.
|
|
The functions referenced by this array will be called in ascending order
|
|
of priority (i.e. lowest first) when the module is loaded. The order of
|
|
functions with the same priority is not defined.
|
|
|
|
If the third field is present, non-null, and points to a global variable
|
|
or function, the initializer function will only run if the associated
|
|
data from the current module is not discarded.
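
As a sketch of the ordering rule, the following array registers two
constructors with different priorities. The function names and the priority
``101`` are illustrative; ``@init_early`` has the lower priority value and is
therefore called before ``@init_late``:

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }
    @llvm.global_ctors = appending global [2 x %0] [
      %0 { i32 101,   void ()* @init_early, i8* null },
      %0 { i32 65535, void ()* @init_late,  i8* null }
    ]

    define internal void @init_early() {
      ret void
    }

    define internal void @init_late() {
      ret void
    }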
|
|
|
|
.. _llvmglobaldtors:
|
|
|
|
The '``llvm.global_dtors``' Global Variable
|
|
-------------------------------------------
|
|
|
|
.. code-block:: llvm
|
|
|
|
%0 = type { i32, void ()*, i8* }
|
|
@llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }]
|
|
|
|
The ``@llvm.global_dtors`` array contains a list of destructor
|
|
functions, priorities, and an optional associated global or function.
|
|
The functions referenced by this array will be called in descending
|
|
order of priority (i.e. highest first) when the module is unloaded. The
|
|
order of functions with the same priority is not defined.
|
|
|
|
If the third field is present, non-null, and points to a global variable
|
|
or function, the destructor function will only run if the associated
|
|
data from the current module is not discarded.
|
|
|
|
Instruction Reference
|
|
=====================
|
|
|
|
The LLVM instruction set consists of several different classifications
|
|
of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
|
|
instructions <binaryops>`, :ref:`bitwise binary
|
|
instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
|
|
:ref:`other instructions <otherops>`.
|
|
|
|
.. _terminators:
|
|
|
|
Terminator Instructions
|
|
-----------------------
|
|
|
|
As mentioned :ref:`previously <functionstructure>`, every basic block in a
|
|
program ends with a "Terminator" instruction, which indicates which
|
|
block should be executed after the current block is finished. These
|
|
terminator instructions typically yield a '``void``' value: they produce
|
|
control flow, not values (the one exception being the
|
|
':ref:`invoke <i_invoke>`' instruction).
|
|
|
|
The terminator instructions are: ':ref:`ret <i_ret>`',
|
|
':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
|
|
':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
|
|
':ref:`resume <i_resume>`', and ':ref:`unreachable <i_unreachable>`'.
|
|
|
|
.. _i_ret:
|
|
|
|
'``ret``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
ret <type> <value> ; Return a value from a non-void function
|
|
ret void ; Return from void function
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``ret``' instruction is used to return control flow (and optionally
|
|
a value) from a function back to the caller.
|
|
|
|
There are two forms of the '``ret``' instruction: one that returns a
|
|
value and then causes control flow, and one that just causes control
|
|
flow to occur.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``ret``' instruction optionally accepts a single argument, the
|
|
return value. The type of the return value must be a ':ref:`first
|
|
class <t_firstclass>`' type.
|
|
|
|
A function is not :ref:`well formed <wellformed>` if it has a non-void
return type and contains a '``ret``' instruction with no return value or
a return value with a type that does not match its type, or if it has a
void return type and contains a '``ret``' instruction with a return
value.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
When the '``ret``' instruction is executed, control flow returns back to
the calling function's context. If the caller is a
":ref:`call <i_call>`" instruction, execution continues at the
instruction after the call. If the caller is an
":ref:`invoke <i_invoke>`" instruction, execution continues at the
beginning of the "normal" destination block. If the instruction returns
a value, that value becomes the return value of the call or invoke
instruction.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
ret i32 5 ; Return an integer value of 5
|
|
ret void ; Return from a void function
|
|
ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2
|
|
|
|
.. _i_br:
|
|
|
|
'``br``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
br i1 <cond>, label <iftrue>, label <iffalse>
|
|
br label <dest> ; Unconditional branch
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``br``' instruction is used to cause control flow to transfer to a
|
|
different basic block in the current function. There are two forms of
|
|
this instruction, corresponding to a conditional branch and an
|
|
unconditional branch.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The conditional branch form of the '``br``' instruction takes a single
|
|
'``i1``' value and two '``label``' values. The unconditional form of the
|
|
'``br``' instruction takes a single '``label``' value as a target.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
Upon execution of a conditional '``br``' instruction, the '``i1``'
argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If the value is ``false``, control flows
to the '``iffalse``' ``label`` argument. An unconditional '``br``'
instruction always transfers control to its '``dest``' ``label``
argument.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
Test:
|
|
%cond = icmp eq i32 %a, %b
|
|
br i1 %cond, label %IfEqual, label %IfUnequal
|
|
IfEqual:
|
|
ret i32 1
|
|
IfUnequal:
|
|
ret i32 0
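
Both forms of '``br``' commonly appear together in loops. The following is
a minimal sketch, not taken from any particular front end: the function
name ``@sum_to`` and all value names are hypothetical, and ``%n`` is
assumed to be at least 1. It sums the integers from 0 to ``%n - 1``.

.. code-block:: llvm

   define i32 @sum_to(i32 %n) {
   entry:
     br label %loop                          ; unconditional branch into the loop

   loop:
     %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
     %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
     %acc.next = add i32 %acc, %i            ; accumulate the current index
     %i.next = add i32 %i, 1
     %done = icmp eq i32 %i.next, %n
     br i1 %done, label %exit, label %loop   ; conditional back edge

   exit:
     ret i32 %acc.next
   }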
|
|
|
|
.. _i_switch:
|
|
|
|
'``switch``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``switch``' instruction is used to transfer control flow to one of
|
|
several different places. It is a generalization of the '``br``'
|
|
instruction, allowing a branch to occur to one of many possible
|
|
destinations.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``switch``' instruction uses three parameters: an integer
|
|
comparison value '``value``', a default '``label``' destination, and an
|
|
array of pairs of comparison value constants and '``label``'s. The table
|
|
is not allowed to contain duplicate constant entries.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The ``switch`` instruction specifies a table of values and destinations.
|
|
When the '``switch``' instruction is executed, this table is searched
|
|
for the given value. If the value is found, control flow is transferred
|
|
to the corresponding destination; otherwise, control flow is transferred
|
|
to the default destination.
|
|
|
|
Implementation:
|
|
"""""""""""""""
|
|
|
|
Depending on properties of the target machine and the particular
|
|
``switch`` instruction, this instruction may be code generated in
|
|
different ways. For example, it could be generated as a series of
|
|
chained conditional branches or with a lookup table.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
; Emulate a conditional br instruction
|
|
%Val = zext i1 %value to i32
|
|
switch i32 %Val, label %truedest [ i32 0, label %falsedest ]
|
|
|
|
; Emulate an unconditional br instruction
|
|
switch i32 0, label %dest [ ]
|
|
|
|
; Implement a jump table:
|
|
switch i32 %val, label %otherwise [ i32 0, label %onzero
|
|
i32 1, label %onone
|
|
i32 2, label %ontwo ]
|
|
|
|
.. _i_indirectbr:
|
|
|
|
'``indirectbr``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ]
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``indirectbr``' instruction implements an indirect branch to a
label within the current function, whose address is specified by
"``address``". The address must be derived from a
:ref:`blockaddress <blockaddress>` constant.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``address``' argument is the address of the label to jump to. The
|
|
rest of the arguments indicate the full set of possible destinations
|
|
that the address may point to. Blocks are allowed to occur multiple
|
|
times in the destination list, though this isn't particularly useful.
|
|
|
|
This destination list is required so that dataflow analysis has an
|
|
accurate understanding of the CFG.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
Control transfers to the block specified in the address argument. All
|
|
possible destination blocks must be listed in the label list, otherwise
|
|
this instruction has undefined behavior. This implies that jumps to
|
|
labels defined in other functions have undefined behavior as well.
|
|
|
|
Implementation:
|
|
"""""""""""""""
|
|
|
|
This is typically implemented with a jump through a register.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]
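
To show where the address operand typically comes from, here is a small
self-contained sketch. The function name ``@dispatch`` and the block names
are hypothetical. The address is selected between two ``blockaddress``
constants, and both possible targets appear in the destination list.

.. code-block:: llvm

   define i32 @dispatch(i1 %flag) {
   entry:
     ; Pick one of two block addresses taken within this function.
     %addr = select i1 %flag, i8* blockaddress(@dispatch, %bb1),
                              i8* blockaddress(@dispatch, %bb2)
     ; Every block that %addr may point to is listed.
     indirectbr i8* %addr, [ label %bb1, label %bb2 ]

   bb1:
     ret i32 1

   bb2:
     ret i32 2
   }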
|
|
|
|
.. _i_invoke:
|
|
|
|
'``invoke``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = invoke [cconv] [ret attrs] <ptr to function ty> <function ptr val>(<function args>) [fn attrs]
|
|
to label <normal label> unwind label <exception label>
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``invoke``' instruction causes control to transfer to a specified
|
|
function, with the possibility of control flow transfer to either the
|
|
'``normal``' label or the '``exception``' label. If the callee function
|
|
returns with the "``ret``" instruction, control flow will return to the
|
|
"normal" label. If the callee (or any indirect callees) returns via the
|
|
":ref:`resume <i_resume>`" instruction or other exception handling
|
|
mechanism, control is interrupted and continued at the dynamically
|
|
nearest "exception" label.
|
|
|
|
The '``exception``' label is a `landing
pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
'``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
This instruction requires several arguments:
|
|
|
|
#. The optional "cconv" marker indicates which :ref:`calling
|
|
convention <callingconv>` the call should use. If none is
|
|
specified, the call defaults to using C calling conventions.
|
|
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
|
|
values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
|
|
are valid here.
|
|
#. '``ptr to function ty``': shall be the signature of the pointer to
|
|
function value being invoked. In most cases, this is a direct
|
|
function invocation, but indirect ``invoke``'s are just as possible,
|
|
branching off an arbitrary pointer to function value.
|
|
#. '``function ptr val``': An LLVM value containing a pointer to a
|
|
function to be invoked.
|
|
#. '``function args``': argument list whose types match the function
|
|
signature argument types and parameter attributes. All arguments must
|
|
be of :ref:`first class <t_firstclass>` type. If the function signature
|
|
indicates the function accepts a variable number of arguments, the
|
|
extra arguments can be specified.
|
|
#. '``normal label``': the label reached when the called function
|
|
executes a '``ret``' instruction.
|
|
#. '``exception label``': the label reached when a callee returns via
|
|
the :ref:`resume <i_resume>` instruction or other exception handling
|
|
mechanism.
|
|
#. The optional :ref:`function attributes <fnattrs>` list. Only
|
|
'``noreturn``', '``nounwind``', '``readonly``' and '``readnone``'
|
|
attributes are valid here.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This instruction is designed to operate as a standard '``call``'
|
|
instruction in most regards. The primary difference is that it
|
|
establishes an association with a label, which is used by the runtime
|
|
library to unwind the stack.
|
|
|
|
This instruction is used in languages with destructors to ensure that
|
|
proper cleanup is performed in the case of either a ``longjmp`` or a
|
|
thrown exception. Additionally, this is important for implementation of
|
|
'``catch``' clauses in high-level languages that support them.
|
|
|
|
For the purposes of the SSA form, the definition of the value returned
|
|
by the '``invoke``' instruction is deemed to occur on the edge from the
|
|
current block to the "normal" label. If the callee unwinds then no
|
|
return value is available.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%retval = invoke i32 @Test(i32 15) to label %Continue
|
|
unwind label %TestCleanup ; i32:retval set
|
|
%retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
|
|
unwind label %TestCleanup ; i32:retval set
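
The fragments above can be placed in context with a sketch of a complete
caller. This is illustrative only: the function name ``@caller`` is
hypothetical, and the use of ``@__gxx_personality_v0`` as the personality
routine is an assumption borrowed from typical C++ code. The landing pad
here performs no cleanup of its own and simply resumes unwinding.

.. code-block:: llvm

   declare i32 @Test(i32)
   declare i32 @__gxx_personality_v0(...)   ; assumed personality routine

   define i32 @caller() {
   entry:
     %retval = invoke i32 @Test(i32 15)
                 to label %Continue unwind label %TestCleanup

   Continue:                                ; %retval is only available here
     ret i32 %retval

   TestCleanup:                             ; reached if @Test unwinds
     %exn = landingpad { i8*, i32 }
              personality i32 (...)* @__gxx_personality_v0
              cleanup
     resume { i8*, i32 } %exn               ; continue propagating the exception
   }

Note that ``%retval`` is used only in ``%Continue``, consistent with the
rule that its definition occurs on the edge to the "normal" label.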
|
|
|
|
.. _i_resume:
|
|
|
|
'``resume``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
resume <type> <value>
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``resume``' instruction is a terminator instruction that has no
|
|
successors.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``resume``' instruction requires one argument, which must have the
|
|
same type as the result of any '``landingpad``' instruction in the same
|
|
function.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``resume``' instruction resumes propagation of an existing
|
|
(in-flight) exception whose unwinding was interrupted with a
|
|
:ref:`landingpad <i_landingpad>` instruction.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
resume { i8*, i32 } %exn
|
|
|
|
.. _i_unreachable:
|
|
|
|
'``unreachable``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
unreachable
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``unreachable``' instruction has no defined semantics. This
|
|
instruction is used to inform the optimizer that a particular portion of
|
|
the code is not reachable. This can be used to indicate that the code
|
|
after a no-return function cannot be reached, and other facts.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``unreachable``' instruction has no defined semantics.
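
A typical use, sketched here with a hypothetical wrapper function
``@fail`` and assuming an external ``@abort`` that never returns, is to
terminate the block that follows a call to a no-return function:

.. code-block:: llvm

   declare void @abort() noreturn

   define void @fail() {
   entry:
     call void @abort() noreturn
     unreachable              ; control never gets past the call above
   }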
|
|
|
|
.. _binaryops:
|
|
|
|
Binary Operations
|
|
-----------------
|
|
|
|
Binary operators are used to do most of the computation in a program.
|
|
They require two operands of the same type, execute an operation on
|
|
them, and produce a single value. The operands might represent multiple
|
|
data, as is the case with the :ref:`vector <t_vector>` data type. The
|
|
result value has the same type as its operands.
|
|
|
|
There are several different binary operators:
|
|
|
|
.. _i_add:
|
|
|
|
'``add``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = add <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = add nuw <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = add nsw <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = add nuw nsw <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``add``' instruction returns the sum of its two operands.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The two arguments to the '``add``' instruction must be
|
|
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
|
|
arguments must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The value produced is the integer sum of the two operands.
|
|
|
|
If the sum has unsigned overflow, the result returned is the
|
|
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
|
|
the result.
|
|
|
|
Because LLVM integers use a two's complement representation, this
|
|
instruction is appropriate for both signed and unsigned integers.
|
|
|
|
``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
|
|
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
|
|
result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
|
|
unsigned and/or signed overflow, respectively, occurs.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = add i32 4, %var ; yields i32:result = 4 + %var
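
The effect of the wrap flags can be seen on small constants. The value
names below are hypothetical, and ``i8`` is chosen only to keep the
numbers small:

.. code-block:: llvm

   %wrap = add i8 127, 1        ; yields i8 -128 (128 modulo 2^8, read as signed)
   %ok   = add nuw i8 127, 1    ; yields i8 -128 (128 unsigned); no unsigned wrap
   %p1   = add nsw i8 127, 1    ; poison: 128 does not fit in a signed i8
   %p2   = add nuw i8 -1, 1     ; poison: 255 + 1 wraps when unsigned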
|
|
|
|
.. _i_fadd:
|
|
|
|
'``fadd``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = fadd [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``fadd``' instruction returns the sum of its two operands.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The two arguments to the '``fadd``' instruction must be :ref:`floating
|
|
point <t_floating>` or :ref:`vector <t_vector>` of floating point values.
|
|
Both arguments must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The value produced is the floating point sum of the two operands. This
instruction can also take any number of :ref:`fast-math flags <fastmath>`,
which are optimization hints to enable otherwise unsafe floating point
optimizations.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = fadd float 4.0, %var ; yields float:result = 4.0 + %var
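
As a brief sketch of the flag syntax (the value names are hypothetical),
fast-math flags are written between the opcode and the type:

.. code-block:: llvm

   %s1 = fadd float %a, %b             ; strict floating point addition
   %s2 = fadd nnan ninf float %a, %b   ; optimizer may assume no NaNs or infinities
   %s3 = fadd fast float %a, %b        ; all fast-math flags implied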
|
|
|
|
'``sub``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = sub <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = sub nuw <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = sub nsw <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = sub nuw nsw <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``sub``' instruction returns the difference of its two operands.
|
|
|
|
Note that the '``sub``' instruction is used to represent the '``neg``'
|
|
instruction present in most other intermediate representations.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The two arguments to the '``sub``' instruction must be
|
|
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
|
|
arguments must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The value produced is the integer difference of the two operands.
|
|
|
|
If the difference has unsigned overflow, the result returned is the
|
|
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
|
|
the result.
|
|
|
|
Because LLVM integers use a two's complement representation, this
|
|
instruction is appropriate for both signed and unsigned integers.
|
|
|
|
``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
|
|
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
|
|
result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
|
|
unsigned and/or signed overflow, respectively, occurs.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = sub i32 4, %var ; yields i32:result = 4 - %var
|
|
<result> = sub i32 0, %var ; yields i32:result = -%var
|
|
|
|
.. _i_fsub:
|
|
|
|
'``fsub``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = fsub [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``fsub``' instruction returns the difference of its two operands.
|
|
|
|
Note that the '``fsub``' instruction is used to represent the '``fneg``'
|
|
instruction present in most other intermediate representations.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The two arguments to the '``fsub``' instruction must be :ref:`floating
|
|
point <t_floating>` or :ref:`vector <t_vector>` of floating point values.
|
|
Both arguments must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The value produced is the floating point difference of the two operands.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating point optimizations.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = fsub float 4.0, %var ; yields float:result = 4.0 - %var
|
|
<result> = fsub float -0.0, %var ; yields float:result = -%var
|
|
|
|
'``mul``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = mul <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = mul nuw <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = mul nsw <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = mul nuw nsw <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``mul``' instruction returns the product of its two operands.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The two arguments to the '``mul``' instruction must be
|
|
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
|
|
arguments must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The value produced is the integer product of the two operands.
|
|
|
|
If the result of the multiplication has unsigned overflow, the result
|
|
returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
|
|
bit width of the result.
|
|
|
|
Because LLVM integers use a two's complement representation, and the
|
|
result is the same width as the operands, this instruction returns the
|
|
correct result for both signed and unsigned integers. If a full product
|
|
(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
|
|
sign-extended or zero-extended as appropriate to the width of the full
|
|
product.
|
|
|
|
``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
|
|
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
|
|
result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
|
|
unsigned and/or signed overflow, respectively, occurs.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = mul i32 4, %var ; yields i32:result = 4 * %var
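
When the full product is needed, the operands are widened first. The
following sketch uses hypothetical value names and assumes ``%x`` and
``%y`` are ``i32`` values:

.. code-block:: llvm

   ; Signed 32 x 32 -> 64 bit product
   %x.s = sext i32 %x to i64
   %y.s = sext i32 %y to i64
   %sprod = mul i64 %x.s, %y.s

   ; Unsigned 32 x 32 -> 64 bit product
   %x.u = zext i32 %x to i64
   %y.u = zext i32 %y to i64
   %uprod = mul i64 %x.u, %y.u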
|
|
|
|
.. _i_fmul:
|
|
|
|
'``fmul``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = fmul [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``fmul``' instruction returns the product of its two operands.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The two arguments to the '``fmul``' instruction must be :ref:`floating
|
|
point <t_floating>` or :ref:`vector <t_vector>` of floating point values.
|
|
Both arguments must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The value produced is the floating point product of the two operands.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating point optimizations.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = fmul float 4.0, %var ; yields float:result = 4.0 * %var
|
|
|
|
'``udiv``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = udiv <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = udiv exact <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``udiv``' instruction returns the quotient of its two operands.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The two arguments to the '``udiv``' instruction must be
|
|
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
|
|
arguments must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The value produced is the unsigned integer quotient of the two operands.
|
|
|
|
Note that unsigned integer division and signed integer division are
|
|
distinct operations; for signed integer division, use '``sdiv``'.
|
|
|
|
Division by zero leads to undefined behavior.
|
|
|
|
If the ``exact`` keyword is present, the result value of the ``udiv`` is
|
|
a :ref:`poison value <poisonvalues>` if %op1 is not a multiple of %op2 (as
|
|
such, "((a udiv exact b) mul b) == a").
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = udiv i32 4, %var ; yields i32:result = 4 / %var
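
A few constant cases, with hypothetical value names, illustrate the
unsigned interpretation and the ``exact`` keyword:

.. code-block:: llvm

   %q1 = udiv i32 7, 2          ; yields i32 3
   %q2 = udiv i8 -1, 2          ; yields i8 127 (-1 is 255 when unsigned)
   %q3 = udiv exact i32 8, 2    ; yields i32 4 (8 is a multiple of 2)
   %q4 = udiv exact i32 7, 2    ; poison: 7 is not a multiple of 2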
|
|
|
|
'``sdiv``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = sdiv <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = sdiv exact <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``sdiv``' instruction returns the quotient of its two operands.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The two arguments to the '``sdiv``' instruction must be
|
|
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
|
|
arguments must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The value produced is the signed integer quotient of the two operands
|
|
rounded towards zero.
|
|
|
|
Note that signed integer division and unsigned integer division are
|
|
distinct operations; for unsigned integer division, use '``udiv``'.
|
|
|
|
Division by zero leads to undefined behavior. Overflow also leads to
|
|
undefined behavior; this is a rare case, but can occur, for example, by
|
|
doing a 32-bit division of -2147483648 by -1.
|
|
|
|
If the ``exact`` keyword is present, the result value of the ``sdiv`` is
|
|
a :ref:`poison value <poisonvalues>` if the result would be rounded.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = sdiv i32 4, %var ; yields i32:result = 4 / %var
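
The rounding direction and the overflow case can be illustrated with
constants (the value names are hypothetical):

.. code-block:: llvm

   %q1 = sdiv i32 -7, 2             ; yields i32 -3 (rounded towards zero, not -4)
   %q2 = sdiv i32 7, -2             ; yields i32 -3
   %q3 = sdiv exact i32 -8, 2       ; yields i32 -4 (no rounding needed)
   %q4 = sdiv exact i32 -7, 2       ; poison: the result would be rounded
   %q5 = sdiv i32 -2147483648, -1   ; undefined behavior: overflow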
|
|
|
|
.. _i_fdiv:
|
|
|
|
'``fdiv``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = fdiv [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``fdiv``' instruction returns the quotient of its two operands.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The two arguments to the '``fdiv``' instruction must be :ref:`floating
|
|
point <t_floating>` or :ref:`vector <t_vector>` of floating point values.
|
|
Both arguments must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The value produced is the floating point quotient of the two operands.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating point optimizations.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = fdiv float 4.0, %var ; yields float:result = 4.0 / %var
|
|
|
|
'``urem``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = urem <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``urem``' instruction returns the remainder from the unsigned
|
|
division of its two arguments.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The two arguments to the '``urem``' instruction must be
|
|
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
|
|
arguments must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This instruction returns the unsigned integer *remainder* of a division.
|
|
This instruction always performs an unsigned division to get the
|
|
remainder.
|
|
|
|
Note that unsigned integer remainder and signed integer remainder are
|
|
distinct operations; for signed integer remainder, use '``srem``'.
|
|
|
|
Taking the remainder of a division by zero leads to undefined behavior.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = urem i32 4, %var ; yields i32:result = 4 % %var
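
Because the operands are always treated as unsigned, a bit pattern that
would be negative as a signed value is interpreted as a large positive
one. A short sketch with hypothetical value names:

.. code-block:: llvm

   %r1 = urem i32 7, 3      ; yields i32 1
   %r2 = urem i8 -1, 10     ; yields i8 5 (-1 is 255 when unsigned; 255 = 25 * 10 + 5)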
|
|
|
|
'``srem``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = srem <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``srem``' instruction returns the remainder from the signed
|
|
division of its two operands. This instruction can also take
|
|
:ref:`vector <t_vector>` versions of the values in which case the elements
|
|
must be integers.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The two arguments to the '``srem``' instruction must be
|
|
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
|
|
arguments must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This instruction returns the *remainder* of a division (where the result
|
|
is either zero or has the same sign as the dividend, ``op1``), not the
|
|
*modulo* operator (where the result is either zero or has the same sign
|
|
as the divisor, ``op2``) of a value. For more information about the
|
|
difference, see `The Math
|
|
Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
|
|
table of how this is implemented in various languages, please see
|
|
`Wikipedia: modulo
|
|
operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.
|
|
|
|
Note that signed integer remainder and unsigned integer remainder are
|
|
distinct operations; for unsigned integer remainder, use '``urem``'.
|
|
|
|
Taking the remainder of a division by zero leads to undefined behavior.
|
|
Overflow also leads to undefined behavior; this is a rare case, but can
|
|
occur, for example, by taking the remainder of a 32-bit division of
|
|
-2147483648 by -1. (The remainder doesn't actually overflow, but this
|
|
rule lets srem be implemented using instructions that return both the
|
|
result of the division and the remainder.)
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = srem i32 4, %var ; yields i32:result = 4 % %var
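
The sign rule described above can be checked on all four sign
combinations (the value names are hypothetical). In each case the result
is zero or takes the sign of the dividend, ``op1``:

.. code-block:: llvm

   %r1 = srem i32 7, 3      ; yields i32 1
   %r2 = srem i32 -7, 3     ; yields i32 -1 (a modulo operator would give 2)
   %r3 = srem i32 7, -3     ; yields i32 1  (a modulo operator would give -2)
   %r4 = srem i32 -7, -3    ; yields i32 -1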
|
|
|
|
.. _i_frem:
|
|
|
|
'``frem``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = frem [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``frem``' instruction returns the remainder from the division of
|
|
its two operands.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The two arguments to the '``frem``' instruction must be :ref:`floating
|
|
point <t_floating>` or :ref:`vector <t_vector>` of floating point values.
|
|
Both arguments must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This instruction returns the *remainder* of a division. The remainder
has the same sign as the dividend. This instruction can also take any
number of :ref:`fast-math flags <fastmath>`, which are optimization hints
to enable otherwise unsafe floating point optimizations.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = frem float 4.0, %var ; yields float:result = 4.0 % %var
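
As with '``srem``', the sign of the result follows the dividend. A small
sketch with hypothetical value names, using constants that are exactly
representable as ``float``:

.. code-block:: llvm

   %r1 = frem float 5.5, 2.0     ; yields float 1.5
   %r2 = frem float -5.5, 2.0    ; yields float -1.5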
|
|
|
|
.. _bitwiseops:
|
|
|
|
Bitwise Binary Operations
|
|
-------------------------
|
|
|
|
Bitwise binary operators are used to do various forms of bit-twiddling
|
|
in a program. They are generally very efficient instructions and can
|
|
commonly be strength reduced from other instructions. They require two
|
|
operands of the same type, execute an operation on them, and produce a
|
|
single value. The resulting value is the same type as its operands.
|
|
|
|
'``shl``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = shl <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = shl nuw <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = shl nsw <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = shl nuw nsw <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``shl``' instruction returns the first operand shifted to the left
|
|
a specified number of bits.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
Both arguments to the '``shl``' instruction must be the same
|
|
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
|
|
'``op2``' is treated as an unsigned value.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
|
|
where ``n`` is the width of the result. If ``op2`` is (statically or
|
|
dynamically) negative or equal to or larger than the number of bits in
|
|
``op1``, the result is undefined. If the arguments are vectors, each
|
|
vector element of ``op1`` is shifted by the corresponding shift amount
|
|
in ``op2``.
|
|
|
|
If the ``nuw`` keyword is present, then the shift produces a :ref:`poison
|
|
value <poisonvalues>` if it shifts out any non-zero bits. If the
|
|
``nsw`` keyword is present, then the shift produces a :ref:`poison
|
|
value <poisonvalues>` if it shifts out any bits that disagree with the
|
|
resultant sign bit. As such, NUW/NSW have the same semantics as they
|
|
would if the shift were expressed as a mul instruction with the same
|
|
nsw/nuw bits in (mul %op1, (shl 1, %op2)).
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = shl i32 4, %var ; yields i32: 4 << %var
|
|
<result> = shl i32 4, 2 ; yields i32: 16
|
|
<result> = shl i32 1, 10 ; yields i32: 1024
|
|
<result> = shl i32 1, 32 ; undefined
|
|
<result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 2, i32 4>
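
The ``nuw`` and ``nsw`` rules can be worked through on ``i8`` constants.
The value names are hypothetical. Each case matches the equivalent
multiplication by 4:

.. code-block:: llvm

   %a = shl nuw i8 32, 2     ; yields i8 -128 (128 unsigned): only zero bits shifted out
   %b = shl nuw i8 64, 2     ; poison: a non-zero bit is shifted out (64 * 4 = 256)
   %c = shl nsw i8 -32, 2    ; yields i8 -128: shifted-out bits match the sign bit
   %d = shl nsw i8 32, 2     ; poison: shifted-out bits (0) disagree with the sign bit (1)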
|
|
|
|
'``lshr``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = lshr <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = lshr exact <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``lshr``' instruction (logical shift right) returns the first
|
|
operand shifted to the right a specified number of bits with zero fill.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
Both arguments to the '``lshr``' instruction must be the same
|
|
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
|
|
'``op2``' is treated as an unsigned value.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This instruction always performs a logical shift right operation. The
|
|
most significant bits of the result will be filled with zero bits after
|
|
the shift. If ``op2`` is (statically or dynamically) equal to or larger
|
|
than the number of bits in ``op1``, the result is undefined. If the
|
|
arguments are vectors, each vector element of ``op1`` is shifted by the
|
|
corresponding shift amount in ``op2``.
|
|
|
|
If the ``exact`` keyword is present, the result value of the ``lshr`` is
|
|
a :ref:`poison value <poisonvalues>` if any of the bits shifted out are
|
|
non-zero.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = lshr i32 4, 1 ; yields i32:result = 2
|
|
<result> = lshr i32 4, 2 ; yields i32:result = 1
|
|
<result> = lshr i8 4, 3 ; yields i8:result = 0
|
|
<result> = lshr i8 -2, 1 ; yields i8:result = 0x7F
|
|
<result> = lshr i32 1, 32 ; undefined
|
|
<result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>
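
The ``exact`` keyword can be illustrated on ``i8`` constants (the value
names are hypothetical):

.. code-block:: llvm

   %a = lshr i8 9, 2          ; yields i8 2 (the low bits 01 are discarded)
   %b = lshr exact i8 8, 2    ; yields i8 2 (only zero bits are shifted out)
   %c = lshr exact i8 9, 2    ; poison: a non-zero bit is shifted out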
|
|
|
|
'``ashr``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = ashr <ty> <op1>, <op2> ; yields ty:result
|
|
<result> = ashr exact <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``ashr``' instruction (arithmetic shift right) returns the first
|
|
operand shifted to the right a specified number of bits with sign
|
|
extension.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
Both arguments to the '``ashr``' instruction must be the same
|
|
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
|
|
'``op2``' is treated as an unsigned value.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This instruction always performs an arithmetic shift right operation.
|
|
The most significant bits of the result will be filled with the sign bit
|
|
of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
|
|
than the number of bits in ``op1``, the result is undefined. If the
|
|
arguments are vectors, each vector element of ``op1`` is shifted by the
|
|
corresponding shift amount in ``op2``.
|
|
|
|
If the ``exact`` keyword is present, the result value of the ``ashr`` is
|
|
a :ref:`poison value <poisonvalues>` if any of the bits shifted out are
|
|
non-zero.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = ashr i32 4, 1 ; yields i32:result = 2
|
|
<result> = ashr i32 4, 2 ; yields i32:result = 1
|
|
<result> = ashr i8 4, 3 ; yields i8:result = 0
|
|
<result> = ashr i8 -2, 1 ; yields i8:result = -1
|
|
<result> = ashr i32 1, 32 ; undefined
|
|
<result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3> ; yields: result=<2 x i32> < i32 -1, i32 0>
|
|
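
To contrast sign fill with the zero fill of ``lshr``, the following
sketch shifts the same ``i8`` bit pattern (``-8``, i.e. ``0xF8``) both
ways and then applies ``exact``. The value names are arbitrary and chosen
only for this illustration:

.. code-block:: llvm

   %l = lshr i8 -8, 2          ; yields i8: 62 (0x3E), high bits filled with zero
   %a = ashr i8 -8, 2          ; yields i8: -2 (0xFE), high bits filled with the sign bit
   %e = ashr exact i8 -8, 2    ; yields i8: -2; the two shifted-out bits are zero
   %p = ashr exact i8 -7, 2    ; poison value: bit 0 of -7 (0xF9) is shifted out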
|
|
'``and``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = and <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``and``' instruction returns the bitwise logical and of its two
|
|
operands.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The two arguments to the '``and``' instruction must be
|
|
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
|
|
arguments must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The truth table used for the '``and``' instruction is:
|
|
|
|
+-----+-----+-----+
|
|
| In0 | In1 | Out |
|
|
+-----+-----+-----+
|
|
| 0 | 0 | 0 |
|
|
+-----+-----+-----+
|
|
| 0 | 1 | 0 |
|
|
+-----+-----+-----+
|
|
| 1 | 0 | 0 |
|
|
+-----+-----+-----+
|
|
| 1 | 1 | 1 |
|
|
+-----+-----+-----+
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = and i32 4, %var ; yields i32:result = 4 & %var
|
|
<result> = and i32 15, 40 ; yields i32:result = 8
|
|
<result> = and i32 4, 8 ; yields i32:result = 0
|
|
|
|
'``or``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = or <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``or``' instruction returns the bitwise logical inclusive or of its
|
|
two operands.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The two arguments to the '``or``' instruction must be
|
|
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
|
|
arguments must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The truth table used for the '``or``' instruction is:
|
|
|
|
+-----+-----+-----+
|
|
| In0 | In1 | Out |
|
|
+-----+-----+-----+
|
|
| 0 | 0 | 0 |
|
|
+-----+-----+-----+
|
|
| 0 | 1 | 1 |
|
|
+-----+-----+-----+
|
|
| 1 | 0 | 1 |
|
|
+-----+-----+-----+
|
|
| 1 | 1 | 1 |
|
|
+-----+-----+-----+
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = or i32 4, %var ; yields i32:result = 4 | %var
|
|
<result> = or i32 15, 40 ; yields i32:result = 47
|
|
<result> = or i32 4, 8 ; yields i32:result = 12
|
|
|
|
'``xor``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = xor <ty> <op1>, <op2> ; yields ty:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``xor``' instruction returns the bitwise logical exclusive or of
|
|
its two operands. The ``xor`` is used to implement the "one's
|
|
complement" operation, which is the "~" operator in C.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The two arguments to the '``xor``' instruction must be
|
|
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
|
|
arguments must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The truth table used for the '``xor``' instruction is:
|
|
|
|
+-----+-----+-----+
|
|
| In0 | In1 | Out |
|
|
+-----+-----+-----+
|
|
| 0 | 0 | 0 |
|
|
+-----+-----+-----+
|
|
| 0 | 1 | 1 |
|
|
+-----+-----+-----+
|
|
| 1 | 0 | 1 |
|
|
+-----+-----+-----+
|
|
| 1 | 1 | 0 |
|
|
+-----+-----+-----+
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = xor i32 4, %var ; yields i32:result = 4 ^ %var
|
|
<result> = xor i32 15, 40 ; yields i32:result = 39
|
|
<result> = xor i32 4, 8 ; yields i32:result = 12
|
|
<result> = xor i32 %V, -1 ; yields i32:result = ~%V
|
|
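
The three bitwise instructions are often combined to manipulate
individual bits. The function below is a minimal sketch of that pattern;
the name ``@bitops`` and the particular constants are assumptions made for
illustration and do not come from LLVM itself:

.. code-block:: llvm

   ; Hypothetical helper: mask, set, toggle, then complement.
   define i32 @bitops(i32 %x) {
   entry:
     %low  = and i32 %x, 255       ; keep only the low 8 bits
     %set  = or i32 %low, 256      ; set bit 8
     %flip = xor i32 %set, 1       ; toggle bit 0
     %not  = xor i32 %flip, -1     ; bitwise complement, as in C's ~
     ret i32 %not
   }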
|
|
Vector Operations
|
|
-----------------
|
|
|
|
LLVM supports several instructions to represent vector operations in a
|
|
target-independent manner. These instructions cover the element-access
|
|
and vector-specific operations needed to process vectors effectively.
|
|
While LLVM does directly support these vector operations, many
|
|
sophisticated algorithms will want to use target-specific intrinsics to
|
|
take full advantage of a specific target.
|
|
|
|
.. _i_extractelement:
|
|
|
|
'``extractelement``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = extractelement <n x <ty>> <val>, <ty2> <idx> ; yields <ty>
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``extractelement``' instruction extracts a single scalar element
|
|
from a vector at a specified index.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first operand of an '``extractelement``' instruction is a value of
|
|
:ref:`vector <t_vector>` type. The second operand is an index indicating
|
|
the position from which to extract the element. The index may be a
|
|
variable of any integer type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The result is a scalar of the same type as the element type of ``val``.
|
|
Its value is the value at position ``idx`` of ``val``. If ``idx``
|
|
exceeds the length of ``val``, the results are undefined.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = extractelement <4 x i32> %vec, i32 0 ; yields i32
|
|
|
|
.. _i_insertelement:
|
|
|
|
'``insertelement``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <n x <ty>>
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``insertelement``' instruction inserts a scalar element into a
|
|
vector at a specified index.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first operand of an '``insertelement``' instruction is a value of
|
|
:ref:`vector <t_vector>` type. The second operand is a scalar value whose
|
|
type must equal the element type of the first operand. The third operand
|
|
is an index indicating the position at which to insert the value. The
|
|
index may be a variable of any integer type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The result is a vector of the same type as ``val``. Its element values
|
|
are those of ``val`` except at position ``idx``, where it gets the value
|
|
``elt``. If ``idx`` exceeds the length of ``val``, the results are
|
|
undefined.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = insertelement <4 x i32> %vec, i32 1, i32 0 ; yields <4 x i32>
|
|
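
``insertelement`` and ``extractelement`` are commonly chained to build a
vector from scalars and read a lane back. A minimal sketch, assuming a
hypothetical function ``@second_lane`` that is not part of LLVM:

.. code-block:: llvm

   ; Hypothetical helper: pack two scalars, then read lane 1.
   define i32 @second_lane(i32 %a, i32 %b) {
   entry:
     %v0 = insertelement <2 x i32> undef, i32 %a, i32 0   ; <%a, undef>
     %v1 = insertelement <2 x i32> %v0, i32 %b, i32 1     ; <%a, %b>
     %r  = extractelement <2 x i32> %v1, i32 1            ; yields %b
     ret i32 %r
   }

Each ``insertelement`` produces a new vector value; ``%v0`` is unchanged
by the second insertion.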
|
|
.. _i_shufflevector:
|
|
|
|
'``shufflevector``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask> ; yields <m x <ty>>
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``shufflevector``' instruction constructs a permutation of elements
|
|
from two input vectors, returning a vector with the same element type as
|
|
the input and length that is the same as the shuffle mask.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first two operands of a '``shufflevector``' instruction are vectors
|
|
with the same type. The third argument is a shuffle mask whose element
|
|
type is always 'i32'. The result of the instruction is a vector whose
|
|
length is the same as the shuffle mask and whose element type is the
|
|
same as the element type of the first two operands.
|
|
|
|
The shuffle mask operand is required to be a constant vector with either
|
|
constant integer or undef values.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The elements of the two input vectors are numbered from left to right
|
|
across both of the vectors. The shuffle mask operand specifies, for each
|
|
element of the result vector, which element of the two input vectors the
|
|
result element gets. The element selector may be undef (meaning "don't
|
|
care") and the second operand may be undef if performing a shuffle from
|
|
only one vector.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
|
|
<4 x i32> <i32 0, i32 4, i32 1, i32 5> ; yields <4 x i32>
|
|
<result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
|
|
<4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32> - Identity shuffle.
|
|
<result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
|
|
<4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32>
|
|
<result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
|
|
<8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 > ; yields <8 x i32>
|
|
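
Two further sketches may help with the mask numbering described above:
with ``<4 x i32>`` inputs, mask values 0 to 3 select from the first
operand and 4 to 7 from the second. The function names ``@reverse`` and
``@splat`` are hypothetical and used only for this illustration:

.. code-block:: llvm

   ; Hypothetical helper: reverse the four lanes of %v.
   define <4 x i32> @reverse(<4 x i32> %v) {
   entry:
     %r = shufflevector <4 x i32> %v, <4 x i32> undef,
                        <4 x i32> <i32 3, i32 2, i32 1, i32 0>
     ret <4 x i32> %r
   }

   ; Hypothetical helper: broadcast %x to every lane.
   define <4 x i32> @splat(i32 %x) {
   entry:
     %v = insertelement <4 x i32> undef, i32 %x, i32 0
     ; An all-zero mask selects element 0 of %v for each result lane.
     %s = shufflevector <4 x i32> %v, <4 x i32> undef,
                        <4 x i32> zeroinitializer
     ret <4 x i32> %s
   }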
|
|
Aggregate Operations
|
|
--------------------
|
|
|
|
LLVM supports several instructions for working with
|
|
:ref:`aggregate <t_aggregate>` values.
|
|
|
|
.. _i_extractvalue:
|
|
|
|
'``extractvalue``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``extractvalue``' instruction extracts the value of a member field
|
|
from an :ref:`aggregate <t_aggregate>` value.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first operand of an '``extractvalue``' instruction is a value of
|
|
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The remaining operands are
|
|
constant indices to specify which value to extract in a similar manner
|
|
as indices in a '``getelementptr``' instruction.
|
|
|
|
The major differences to ``getelementptr`` indexing are:
|
|
|
|
- Since the value being indexed is not a pointer, the first index is
|
|
omitted and assumed to be zero.
|
|
- At least one index must be specified.
|
|
- Not only struct indices but also array indices must be in bounds.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The result is the value at the position in the aggregate specified by
|
|
the index operands.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = extractvalue {i32, float} %agg, 0 ; yields i32
|
|
|
|
.. _i_insertvalue:
|
|
|
|
'``insertvalue``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}* ; yields <aggregate type>
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``insertvalue``' instruction inserts a value into a member field in
|
|
an :ref:`aggregate <t_aggregate>` value.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first operand of an '``insertvalue``' instruction is a value of
|
|
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
|
|
a first-class value to insert. The following operands are constant
|
|
indices indicating the position at which to insert the value in a
|
|
similar manner as indices in an '``extractvalue``' instruction. The value
|
|
to insert must have the same type as the value identified by the
|
|
indices.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The result is an aggregate of the same type as ``val``. Its value is
|
|
that of ``val`` except that the value at the position specified by the
|
|
indices is that of ``elt``.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%agg1 = insertvalue {i32, float} undef, i32 1, 0 ; yields {i32 1, float undef}
|
|
%agg2 = insertvalue {i32, float} %agg1, float %val, 1 ; yields {i32 1, float %val}
|
|
%agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0 ; yields {i32 undef, {float %val}}
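
The two aggregate instructions can be combined to update fields of a
nested aggregate by value. The following is a sketch under assumptions:
the function name ``@bump`` and the aggregate layout
``{ i32, [2 x float] }`` are invented for illustration.

.. code-block:: llvm

   ; Hypothetical helper: increment field 0 and copy array element 1 to element 0.
   define { i32, [2 x float] } @bump({ i32, [2 x float] } %agg) {
   entry:
     %n    = extractvalue { i32, [2 x float] } %agg, 0      ; the i32 field
     %f    = extractvalue { i32, [2 x float] } %agg, 1, 1   ; second array element
     %n1   = add i32 %n, 1
     %agg1 = insertvalue { i32, [2 x float] } %agg, i32 %n1, 0
     %agg2 = insertvalue { i32, [2 x float] } %agg1, float %f, 1, 0
     ret { i32, [2 x float] } %agg2
   }

Note that the index lists ``1, 1`` and ``1, 0`` step first into the array
field and then into the array, with no leading zero index as
``getelementptr`` would require.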
|
|
|
|
.. _memoryops:
|
|
|
|
Memory Access and Addressing Operations
|
|
---------------------------------------
|
|
|
|
A key design point of an SSA-based representation is how it represents
|
|
memory. In LLVM, no memory locations are in SSA form, which makes things
|
|
very simple. This section describes how to read, write, and allocate
|
|
memory in LLVM.
|
|
|
|
.. _i_alloca:
|
|
|
|
'``alloca``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] ; yields type*:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``alloca``' instruction allocates memory on the stack frame of the
|
|
currently executing function, to be automatically released when this
|
|
function returns to its caller. The object is always allocated in the
|
|
generic address space (address space zero).
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
|
|
bytes of memory on the runtime stack, returning a pointer of the
|
|
appropriate type to the program. If "NumElements" is specified, it is
|
|
the number of elements allocated, otherwise "NumElements" is defaulted
|
|
to be one. If a constant alignment is specified, the result of the
|
|
allocation is guaranteed to be aligned to at least that boundary. The
|
|
alignment may not be greater than ``1 << 29``. If not specified, or if
|
|
zero, the target can choose to align the allocation on any convenient
|
|
boundary compatible with the type.
|
|
|
|
'``type``' may be any sized type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
Memory is allocated; a pointer is returned. The operation is undefined
|
|
if there is insufficient stack space for the allocation. '``alloca``'d
|
|
memory is automatically released when the function returns. The
|
|
'``alloca``' instruction is commonly used to represent automatic
|
|
variables that must have an address available. When the function returns
|
|
(either with the ``ret`` or ``resume`` instructions), the memory is
|
|
reclaimed. Allocating zero bytes is legal, but the result is undefined.
|
|
The order in which memory is allocated (i.e., which way the stack grows)
|
|
is not specified.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%ptr = alloca i32 ; yields i32*:ptr
|
|
%ptr = alloca i32, i32 4 ; yields i32*:ptr
|
|
%ptr = alloca i32, i32 4, align 1024 ; yields i32*:ptr
|
|
%ptr = alloca i32, align 1024 ; yields i32*:ptr
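
A slightly fuller sketch shows an ``alloca`` with a ``NumElements``
operand used as a small scratch buffer. The function name ``@sum_pair``
is hypothetical, and the code uses the same typed-pointer syntax as the
other examples in this document:

.. code-block:: llvm

   ; Hypothetical helper: spill two arguments to the stack and add them.
   define i32 @sum_pair(i32 %a, i32 %b) {
   entry:
     %buf = alloca i32, i32 2, align 4              ; room for two i32 values
     %p0  = getelementptr inbounds i32* %buf, i32 0
     %p1  = getelementptr inbounds i32* %buf, i32 1
     store i32 %a, i32* %p0, align 4
     store i32 %b, i32* %p1, align 4
     %x   = load i32* %p0, align 4
     %y   = load i32* %p1, align 4
     %s   = add i32 %x, %y
     ret i32 %s                                     ; %buf is released on return
   }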
|
|
|
|
.. _i_load:
|
|
|
|
'``load``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = load [volatile] <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.load !<index>]
|
|
<result> = load atomic [volatile] <ty>* <pointer> [singlethread] <ordering>, align <alignment>
|
|
!<index> = !{ i32 1 }
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``load``' instruction is used to read from memory.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument to the ``load`` instruction specifies the memory address
|
|
from which to load. The pointer must point to a :ref:`first
|
|
class <t_firstclass>` type. If the ``load`` is marked as ``volatile``,
|
|
then the optimizer is not allowed to modify the number or order of
|
|
execution of this ``load`` with other :ref:`volatile
|
|
operations <volatile>`.
|
|
|
|
If the ``load`` is marked as ``atomic``, it takes an extra
|
|
:ref:`ordering <ordering>` and optional ``singlethread`` argument. The
|
|
``release`` and ``acq_rel`` orderings are not valid on ``load``
|
|
instructions. Atomic loads produce :ref:`defined <memmodel>` results
|
|
when they may see multiple atomic stores. The type of the pointee must
|
|
be an integer type whose bit width is a power of two greater than or
|
|
equal to eight and less than or equal to a target-specific size limit.
|
|
``align`` must be explicitly specified on atomic loads, and the load has
|
|
undefined behavior if the alignment is not set to a value which is at
|
|
least the size in bytes of the pointee. ``!nontemporal`` does not have
|
|
any defined semantics for atomic loads.
|
|
|
|
The optional constant ``align`` argument specifies the alignment of the
|
|
operation (that is, the alignment of the memory address). A value of 0
|
|
or an omitted ``align`` argument means that the operation has the ABI
|
|
alignment for the target. It is the responsibility of the code emitter
|
|
to ensure that the alignment information is correct. Overestimating the
|
|
alignment results in undefined behavior. Underestimating the alignment
|
|
may produce less efficient code. An alignment of 1 is always safe. The
|
|
maximum possible alignment is ``1 << 29``.
|
|
|
|
The optional ``!nontemporal`` metadata must reference a single
|
|
metadata name ``<index>`` corresponding to a metadata node with one
|
|
``i32`` entry of value 1. The existence of the ``!nontemporal``
|
|
metadata on the instruction tells the optimizer and code generator
|
|
that this load is not expected to be reused in the cache. The code
|
|
generator may select special instructions to save cache bandwidth, such
|
|
as the ``MOVNT`` instruction on x86.
|
|
|
|
The optional ``!invariant.load`` metadata must reference a single
|
|
metadata name ``<index>`` corresponding to a metadata node with no
|
|
entries. The existence of the ``!invariant.load`` metadata on the
|
|
instruction tells the optimizer and code generator that this load
|
|
address points to memory which does not change value during program
|
|
execution. The optimizer may then move this load around, for example, by
|
|
hoisting it out of loops using loop invariant code motion.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The location of memory pointed to is loaded. If the value being loaded
|
|
is of scalar type then the number of bytes read does not exceed the
|
|
minimum number of bytes needed to hold all bits of the type. For
|
|
example, loading an ``i24`` reads at most three bytes. When loading a
|
|
value of a type like ``i20`` with a size that is not an integral number
|
|
of bytes, the result is undefined if the value was not originally
|
|
written using a store of the same type.
|
|
|
|
Examples:
|
|
"""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%ptr = alloca i32 ; yields i32*:ptr
|
|
store i32 3, i32* %ptr ; yields void
|
|
%val = load i32* %ptr ; yields i32:val = i32 3
|
|
|
|
.. _i_store:
|
|
|
|
'``store``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>] ; yields void
|
|
store atomic [volatile] <ty> <value>, <ty>* <pointer> [singlethread] <ordering>, align <alignment> ; yields void
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``store``' instruction is used to write to memory.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
There are two arguments to the ``store`` instruction: a value to store
|
|
and an address at which to store it. The type of the ``<pointer>``
|
|
operand must be a pointer to the :ref:`first class <t_firstclass>` type of
|
|
the ``<value>`` operand. If the ``store`` is marked as ``volatile``,
|
|
then the optimizer is not allowed to modify the number or order of
|
|
execution of this ``store`` with other :ref:`volatile
|
|
operations <volatile>`.
|
|
|
|
If the ``store`` is marked as ``atomic``, it takes an extra
|
|
:ref:`ordering <ordering>` and optional ``singlethread`` argument. The
|
|
``acquire`` and ``acq_rel`` orderings aren't valid on ``store``
|
|
instructions. Atomic loads produce :ref:`defined <memmodel>` results
|
|
when they may see multiple atomic stores. The type of the pointee must
|
|
be an integer type whose bit width is a power of two greater than or
|
|
equal to eight and less than or equal to a target-specific size limit.
|
|
``align`` must be explicitly specified on atomic stores, and the store
|
|
has undefined behavior if the alignment is not set to a value which is
|
|
at least the size in bytes of the pointee. ``!nontemporal`` does not
|
|
have any defined semantics for atomic stores.
|
|
|
|
The optional constant ``align`` argument specifies the alignment of the
|
|
operation (that is, the alignment of the memory address). A value of 0
|
|
or an omitted ``align`` argument means that the operation has the ABI
|
|
alignment for the target. It is the responsibility of the code emitter
|
|
to ensure that the alignment information is correct. Overestimating the
|
|
alignment results in undefined behavior. Underestimating the
|
|
alignment may produce less efficient code. An alignment of 1 is always
|
|
safe. The maximum possible alignment is ``1 << 29``.
|
|
|
|
The optional ``!nontemporal`` metadata must reference a single metadata
|
|
name ``<index>`` corresponding to a metadata node with one ``i32`` entry of
|
|
value 1. The existence of the ``!nontemporal`` metadata on the instruction
|
|
tells the optimizer and code generator that this store is not expected to
|
|
be reused in the cache. The code generator may select special
|
|
instructions to save cache bandwidth, such as the MOVNT instruction on
|
|
x86.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The contents of memory are updated to contain ``<value>`` at the
|
|
location specified by the ``<pointer>`` operand. If ``<value>`` is
|
|
of scalar type then the number of bytes written does not exceed the
|
|
minimum number of bytes needed to hold all bits of the type. For
|
|
example, storing an ``i24`` writes at most three bytes. When writing a
|
|
value of a type like ``i20`` with a size that is not an integral number
|
|
of bytes, it is unspecified what happens to the extra bits that do not
|
|
belong to the type, but they will typically be overwritten.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%ptr = alloca i32 ; yields i32*:ptr
|
|
store i32 3, i32* %ptr ; yields void
|
|
%val = load i32* %ptr ; yields i32:val = i32 3
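
Atomic stores and loads are typically paired to publish data between
threads. The following is a minimal sketch of that pattern, not a
prescribed idiom; the globals ``@data`` and ``@flag`` and the function
names are assumptions made for illustration:

.. code-block:: llvm

   @data = global i32 0
   @flag = global i32 0

   ; Hypothetical producer: write the payload, then publish it.
   define void @producer() {
   entry:
     store i32 42, i32* @data, align 4
     store atomic i32 1, i32* @flag release, align 4
     ret void
   }

   ; Hypothetical consumer: wait for the flag, then read the payload.
   define i32 @consumer() {
   entry:
     br label %spin

   spin:
     %f = load atomic i32* @flag acquire, align 4
     %ready = icmp eq i32 %f, 1
     br i1 %ready, label %done, label %spin

   done:
     %v = load i32* @data, align 4
     ret i32 %v
   }

The ``release`` ordering is used on the store and ``acquire`` on the load
because, as stated above, ``acquire`` is not valid on ``store`` and
``release`` is not valid on ``load``.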
|
|
|
|
.. _i_fence:
|
|
|
|
'``fence``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
fence [singlethread] <ordering> ; yields void
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``fence``' instruction is used to introduce happens-before edges
|
|
between operations.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
'``fence``' instructions take an :ref:`ordering <ordering>` argument which
|
|
defines what *synchronizes-with* edges they add. They can only be given
|
|
``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
A fence A which has (at least) ``release`` ordering semantics
|
|
*synchronizes with* a fence B with (at least) ``acquire`` ordering
|
|
semantics if and only if there exist atomic operations X and Y, both
|
|
operating on some atomic object M, such that A is sequenced before X, X
|
|
modifies M (either directly or through some side effect of a sequence
|
|
headed by X), Y is sequenced before B, and Y observes M. This provides a
|
|
*happens-before* dependency between A and B. Rather than an explicit
|
|
``fence``, one (but not both) of the atomic operations X or Y might
|
|
provide a ``release`` or ``acquire`` (resp.) ordering constraint and
|
|
still *synchronize-with* the explicit ``fence`` and establish the
|
|
*happens-before* edge.
|
|
|
|
A ``fence`` which has ``seq_cst`` ordering, in addition to having both
|
|
``acquire`` and ``release`` semantics specified above, participates in
|
|
the global program order of other ``seq_cst`` operations and/or fences.
|
|
|
|
The optional ":ref:`singlethread <singlethread>`" argument specifies
|
|
that the fence only synchronizes with other fences in the same thread.
|
|
(This is useful for interacting with signal handlers.)
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
fence acquire ; yields void
|
|
fence singlethread seq_cst ; yields void
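
The roles of A, X, Y and B in the semantics above can be sketched with
two functions. This is an illustration under assumptions: the names
``@publish`` and ``@consume`` and their pointer parameters are
hypothetical, and the two functions are assumed to run in different
threads.

.. code-block:: llvm

   ; Hypothetical writer: fence A is sequenced before atomic store X.
   define void @publish(i32* %data, i32* %flag) {
   entry:
     store i32 42, i32* %data, align 4
     fence release                                       ; A
     store atomic i32 1, i32* %flag monotonic, align 4   ; X modifies M
     ret void
   }

   ; Hypothetical reader: atomic load Y is sequenced before fence B.
   define i32 @consume(i32* %data, i32* %flag) {
   entry:
     br label %spin

   spin:
     %f = load atomic i32* %flag monotonic, align 4      ; Y observes M
     %ready = icmp ne i32 %f, 0
     br i1 %ready, label %done, label %spin

   done:
     fence acquire                                       ; B
     %v = load i32* %data, align 4
     ret i32 %v
   }

Here the atomic operations themselves are only ``monotonic``; the
ordering between the two threads comes from the fences.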
|
|
|
|
.. _i_cmpxchg:
|
|
|
|
'``cmpxchg``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [singlethread] <success ordering> <failure ordering> ; yields { ty, i1 }
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``cmpxchg``' instruction is used to atomically modify memory. It
|
|
loads a value in memory and compares it to a given value. If they are
|
|
equal, it tries to store a new value into the memory.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
There are three arguments to the '``cmpxchg``' instruction: an address
|
|
to operate on, a value to compare to the value currently at that
|
|
address, and a new value to place at that address if the compared values
|
|
are equal. The type of '<cmp>' must be an integer type whose bit width
|
|
is a power of two greater than or equal to eight and less than or equal
|
|
to a target-specific size limit. '<cmp>' and '<new>' must have the same
|
|
type, and the type of '<pointer>' must be a pointer to that type. If the
|
|
``cmpxchg`` is marked as ``volatile``, then the optimizer is not allowed
|
|
to modify the number or order of execution of this ``cmpxchg`` with
|
|
other :ref:`volatile operations <volatile>`.
|
|
|
|
The success and failure :ref:`ordering <ordering>` arguments specify how this
|
|
``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
|
|
must be at least ``monotonic``, the ordering constraint on failure must be no
|
|
stronger than that on success, and the failure ordering cannot be either
|
|
``release`` or ``acq_rel``.
|
|
|
|
The optional "``singlethread``" argument declares that the ``cmpxchg``
|
|
is only atomic with respect to code (usually signal handlers) running in
|
|
the same thread as the ``cmpxchg``. Otherwise the cmpxchg is atomic with
|
|
respect to all other code in the system.
|
|
|
|
The pointer passed into cmpxchg must have alignment greater than or
|
|
equal to the size in memory of the operand.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The contents of memory at the location specified by the '``<pointer>``' operand
|
|
are read and compared to '``<cmp>``'; if the read value is equal, the
|
|
'``<new>``' is written. The original value at the location is returned, together
|
|
with a flag indicating success (true) or failure (false).
|
|
|
|
If the cmpxchg operation is marked as ``weak`` then a spurious failure is
|
|
permitted: the operation may not write ``<new>`` even if the comparison
|
|
matched.
|
|
|
|
If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
|
|
if the value loaded equals ``cmp``.
|
|
|
|
A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
|
|
identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
|
|
load with an ordering parameter determined by the second ordering parameter.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
entry:
|
|
%orig = load atomic i32* %ptr unordered, align 4 ; yields i32
|
|
br label %loop
|
|
|
|
loop:
|
|
%cmp = phi i32 [ %orig, %entry ], [ %value_loaded, %loop ]
|
|
%squared = mul i32 %cmp, %cmp
|
|
%val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields { i32, i1 }
|
|
%value_loaded = extractvalue { i32, i1 } %val_success, 0
|
|
%success = extractvalue { i32, i1 } %val_success, 1
|
|
br i1 %success, label %done, label %loop
|
|
|
|
done:
|
|
...
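
A shorter use of ``cmpxchg`` that needs no retry loop is a try-lock. The
sketch below assumes a lock word that holds 0 when free and 1 when held;
that convention and the function names ``@try_lock`` and ``@unlock`` are
hypothetical:

.. code-block:: llvm

   ; Hypothetical helper: returns true if the lock was acquired.
   define i1 @try_lock(i32* %lock) {
   entry:
     ; Strong cmpxchg: the i1 is 1 if and only if the loaded value was 0.
     %pair = cmpxchg i32* %lock, i32 0, i32 1 acquire monotonic
     %ok   = extractvalue { i32, i1 } %pair, 1
     ret i1 %ok
   }

   ; Hypothetical helper: release the lock taken by @try_lock.
   define void @unlock(i32* %lock) {
   entry:
     store atomic i32 0, i32* %lock release, align 4
     ret void
   }

The failure ordering is ``monotonic``, which satisfies the rule that it
be no stronger than the success ordering and neither ``release`` nor
``acq_rel``.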
|
|
|
|
.. _i_atomicrmw:
|
|
|
|
'``atomicrmw``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [singlethread] <ordering> ; yields ty
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``atomicrmw``' instruction is used to atomically modify memory.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
There are three arguments to the '``atomicrmw``' instruction: an
|
|
operation to apply, an address whose value to modify, and an argument to the
|
|
operation. The operation must be one of the following keywords:
|
|
|
|
- xchg
|
|
- add
|
|
- sub
|
|
- and
|
|
- nand
|
|
- or
|
|
- xor
|
|
- max
|
|
- min
|
|
- umax
|
|
- umin
|
|
|
|
The type of '<value>' must be an integer type whose bit width is a power
|
|
of two greater than or equal to eight and less than or equal to a
|
|
target-specific size limit. The type of the '``<pointer>``' operand must
|
|
be a pointer to that type. If the ``atomicrmw`` is marked as
|
|
``volatile``, then the optimizer is not allowed to modify the number or
|
|
order of execution of this ``atomicrmw`` with other :ref:`volatile
|
|
operations <volatile>`.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The contents of memory at the location specified by the '``<pointer>``'
|
|
operand are atomically read, modified, and written back. The original
|
|
value at the location is returned. The modification is specified by the
|
|
operation argument:
|
|
|
|
- xchg: ``*ptr = val``
|
|
- add: ``*ptr = *ptr + val``
|
|
- sub: ``*ptr = *ptr - val``
|
|
- and: ``*ptr = *ptr & val``
|
|
- nand: ``*ptr = ~(*ptr & val)``
|
|
- or: ``*ptr = *ptr | val``
|
|
- xor: ``*ptr = *ptr ^ val``
|
|
- max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
|
|
- min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
|
|
- umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned
|
|
comparison)
|
|
- umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned
|
|
comparison)
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%old = atomicrmw add i32* %ptr, i32 1 acquire ; yields i32
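
Because ``atomicrmw`` returns the value held before the modification, the
result can be used to tell what the operation did. A minimal sketch,
with the hypothetical function names ``@next_ticket`` and
``@raise_high_water``:

.. code-block:: llvm

   ; Hypothetical helper: hand out consecutive ticket numbers.
   define i32 @next_ticket(i32* %counter) {
   entry:
     %old = atomicrmw add i32* %counter, i32 1 seq_cst   ; value before the add
     ret i32 %old
   }

   ; Hypothetical helper: record %sample if it exceeds the stored maximum.
   define i1 @raise_high_water(i32* %mark, i32 %sample) {
   entry:
     %old    = atomicrmw umax i32* %mark, i32 %sample monotonic
     %raised = icmp ugt i32 %sample, %old   ; true if %sample became the new maximum
     ret i1 %raised
   }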
|
|
|
|
.. _i_getelementptr:
|
|
|
|
'``getelementptr``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = getelementptr <pty>* <ptrval>{, <ty> <idx>}*
|
|
<result> = getelementptr inbounds <pty>* <ptrval>{, <ty> <idx>}*
|
|
<result> = getelementptr <ptr vector> ptrval, <vector index type> idx
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``getelementptr``' instruction is used to get the address of a
|
|
subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
|
|
address calculation only and does not access memory.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument is always a pointer or a vector of pointers, and
|
|
forms the basis of the calculation. The remaining arguments are indices
|
|
that indicate which of the elements of the aggregate object are indexed.
|
|
The interpretation of each index is dependent on the type being indexed
|
|
into. The first index always indexes the pointer value given as the
|
|
first argument, the second index indexes a value of the type pointed to
|
|
(not necessarily the value directly pointed to, since the first index
|
|
can be non-zero), etc. The first type indexed into must be a pointer
|
|
value, subsequent types can be arrays, vectors, and structs. Note that
|
|
subsequent types being indexed into can never be pointers, since that
|
|
would require loading the pointer before continuing calculation.
|
|
|
|
The type of each index argument depends on the type it is indexing into.
|
|
When indexing into an (optionally packed) structure, only ``i32`` integer
|
|
**constants** are allowed (when using a vector of indices they must all
|
|
be the **same** ``i32`` integer constant). When indexing into an array,
|
|
pointer or vector, integers of any width are allowed, and they are not
|
|
required to be constant. These integers are treated as signed values
|
|
where relevant.
|
|
|
|
For example, let's consider a C code fragment and how it gets compiled
|
|
to LLVM:
|
|
|
|
.. code-block:: c

    struct RT {
      char A;
      int B[10][20];
      char C;
    };
    struct ST {
      int X;
      double Y;
      struct RT Z;
    };

    int *foo(struct ST *s) {
      return &s[1].Z.B[5][13];
    }
|
|
|
|
The LLVM code generated by Clang is:
|
|
|
|
.. code-block:: llvm

    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
    %struct.ST = type { i32, double, %struct.RT }

    define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
    entry:
      %arrayidx = getelementptr inbounds %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
      ret i32* %arrayidx
    }
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
In the example above, the first index is indexing into the
|
|
'``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
|
|
= '``{ i32, double, %struct.RT }``' type, a structure. The second index
|
|
indexes into the third element of the structure, yielding a
|
|
'``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
|
|
structure. The third index indexes into the second element of the
|
|
structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
|
|
dimensions of the array are subscripted into, yielding an '``i32``'
|
|
type. The '``getelementptr``' instruction returns a pointer to this
|
|
element, thus computing a value of '``i32*``' type.
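
To make the offset arithmetic concrete, the following C sketch (an
illustration only, not a description of how LLVM is implemented) adds up the
byte offsets implied by each index in the example and compares the sum with
the address the C compiler computes for the same expression. The helper names
``gep_offset`` and ``c_offset`` are hypothetical.

.. code-block:: c

    #include <stddef.h>

    struct RT { char A; int B[10][20]; char C; };
    struct ST { int X; double Y; struct RT Z; };

    /* Byte offset implied by the indices i64 1, i32 2, i32 1, i64 5, i64 13. */
    size_t gep_offset(void) {
      return 1 * sizeof(struct ST)     /* first index: step over one struct ST */
           + offsetof(struct ST, Z)    /* field 2 of struct ST */
           + offsetof(struct RT, B)    /* field 1 of struct RT */
           + 5 * sizeof(int[20])       /* row 5 of [10 x [20 x i32]] */
           + 13 * sizeof(int);         /* element 13 of [20 x i32] */
    }

    /* The same offset as the C compiler computes it for &s[1].Z.B[5][13]. */
    size_t c_offset(struct ST *s) {
      return (size_t)((char *)&s[1].Z.B[5][13] - (char *)s);
    }

Both functions only compute addresses; neither reads nor writes the
structure, which mirrors the fact that ``getelementptr`` does not access
memory.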
|
|
|
|
Note that it is perfectly legal to index partially through a structure,
|
|
returning a pointer to an inner element. Because of this, the LLVM code
|
|
for the given testcase is equivalent to:
|
|
|
|
.. code-block:: llvm

    define i32* @foo(%struct.ST* %s) {
      %t1 = getelementptr %struct.ST* %s, i32 1                 ; yields %struct.ST*:%t1
      %t2 = getelementptr %struct.ST* %t1, i32 0, i32 2         ; yields %struct.RT*:%t2
      %t3 = getelementptr %struct.RT* %t2, i32 0, i32 1         ; yields [10 x [20 x i32]]*:%t3
      %t4 = getelementptr [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
      %t5 = getelementptr [20 x i32]* %t4, i32 0, i32 13        ; yields i32*:%t5
      ret i32* %t5
    }
|
|
|
|
If the ``inbounds`` keyword is present, the result value of the
|
|
``getelementptr`` is a :ref:`poison value <poisonvalues>` if the base
|
|
pointer is not an *in bounds* address of an allocated object, or if any
|
|
of the addresses that would be formed by successive addition of the
|
|
offsets implied by the indices to the base address with infinitely
|
|
precise signed arithmetic are not an *in bounds* address of that
|
|
allocated object. The *in bounds* addresses for an allocated object are
|
|
all the addresses that point into the object, plus the address one byte
|
|
past the end. In cases where the base is a vector of pointers the
|
|
``inbounds`` keyword applies to each of the computations element-wise.
|
|
|
|
If the ``inbounds`` keyword is not present, the offsets are added to the
|
|
base address with silently-wrapping two's complement arithmetic. If the
|
|
offsets have a different width from the pointer, they are sign-extended
|
|
or truncated to the width of the pointer. The result value of the
|
|
``getelementptr`` may be outside the object pointed to by the base
|
|
pointer. The result value may not necessarily be used to access memory
|
|
though, even if it happens to point into allocated storage. See the
|
|
:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
|
|
information.
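
The wrapping behaviour without ``inbounds`` can be modelled roughly in C as
follows. This is a sketch under stated assumptions: a target with 64-bit
pointers, a single ``i32`` index, and a hypothetical helper name
``gep_no_inbounds``. It is not the LLVM implementation.

.. code-block:: c

    #include <stdint.h>

    /* Model of a non-inbounds getelementptr with one index: the index is
       sign-extended to pointer width and the scaled offset is added with
       silently-wrapping two's complement arithmetic. */
    uint64_t gep_no_inbounds(uint64_t base, int32_t idx, uint64_t elem_size) {
      uint64_t wide = (uint64_t)(int64_t)idx;  /* sign-extend i32 to i64 */
      return base + wide * elem_size;          /* unsigned math wraps mod 2^64 */
    }

A negative index therefore moves the address backwards, and an address that
falls off either end of the address space wraps around instead of trapping.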
|
|
|
|
The getelementptr instruction is often confusing. For some more insight
|
|
into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
; yields [12 x i8]*:aptr
|
|
%aptr = getelementptr {i32, [12 x i8]}* %saptr, i64 0, i32 1
|
|
; yields i8*:vptr
|
|
%vptr = getelementptr {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
|
|
; yields i8*:eptr
|
|
%eptr = getelementptr [12 x i8]* %aptr, i64 0, i32 1
|
|
; yields i32*:iptr
|
|
%iptr = getelementptr [10 x i32]* @arr, i16 0, i16 0
|
|
|
|
In cases where the pointer argument is a vector of pointers, each index
|
|
must be a vector with the same number of elements. For example:
|
|
|
|
.. code-block:: llvm

    %A = getelementptr <4 x i8*> %ptrs, <4 x i64> %offsets
|
|
|
|
Conversion Operations
|
|
---------------------
|
|
|
|
The instructions in this category are the conversion instructions
|
|
(casting) which all take a single operand and a type. They perform
|
|
various bit conversions on the operand.
|
|
|
|
'``trunc .. to``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = trunc <ty> <value> to <ty2> ; yields ty2
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``trunc``' instruction truncates its operand to the type ``ty2``.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``trunc``' instruction takes a value to truncate and a type to truncate
it to. Both types must be :ref:`integer <t_integer>` types, or vectors
|
|
of the same number of integers. The bit size of the ``value`` must be
|
|
larger than the bit size of the destination type, ``ty2``. Equal sized
|
|
types are not allowed.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``trunc``' instruction truncates the high order bits in ``value``
|
|
and converts the remaining bits to ``ty2``. Since the source size must
|
|
be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
|
|
It will always truncate bits.
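
In C terms, truncation amounts to keeping only the low-order bits. The
following sketch is an illustration with hypothetical helper names; it
reproduces the scalar results listed in the example for this instruction.

.. code-block:: c

    #include <stdint.h>

    /* trunc i32 -> i8: keep the low 8 bits and drop the rest. */
    uint8_t trunc_i32_to_i8(uint32_t v) { return (uint8_t)(v & 0xFFu); }

    /* trunc i32 -> i1: keep only the lowest bit. */
    unsigned trunc_i32_to_i1(uint32_t v) { return v & 1u; }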
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%X = trunc i32 257 to i8 ; yields i8:1
|
|
%Y = trunc i32 123 to i1 ; yields i1:true
|
|
%Z = trunc i32 122 to i1 ; yields i1:false
|
|
%W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>
|
|
|
|
'``zext .. to``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = zext <ty> <value> to <ty2> ; yields ty2
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``zext``' instruction zero extends its operand to type ``ty2``.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``zext``' instruction takes a value to cast, and a type to cast it
|
|
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
|
|
the same number of integers. The bit size of the ``value`` must be
|
|
smaller than the bit size of the destination type, ``ty2``.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The ``zext`` fills the high order bits of the ``value`` with zero bits
|
|
until it reaches the size of the destination type, ``ty2``.
|
|
|
|
When zero extending from i1, the result will always be either 0 or 1.
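
As an informal C analogy (the helper names are hypothetical), zero extension
corresponds to widening an unsigned value: the original bits are preserved
and every new high-order bit is zero.

.. code-block:: c

    #include <stdint.h>

    /* zext i8 -> i32: the upper 24 bits of the result are zero. */
    uint32_t zext_i8_to_i32(uint8_t v) { return (uint32_t)v; }

    /* zext i1 -> i32: the result is always 0 or 1. */
    uint32_t zext_i1_to_i32(unsigned bit) { return bit & 1u; }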
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%X = zext i32 257 to i64 ; yields i64:257
|
|
%Y = zext i1 true to i32 ; yields i32:1
|
|
%Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
|
|
|
|
'``sext .. to``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = sext <ty> <value> to <ty2> ; yields ty2
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``sext``' instruction sign extends ``value`` to the type ``ty2``.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``sext``' instruction takes a value to cast, and a type to cast it
|
|
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
|
|
the same number of integers. The bit size of the ``value`` must be
|
|
smaller than the bit size of the destination type, ``ty2``.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``sext``' instruction performs a sign extension by copying the sign
|
|
bit (highest order bit) of the ``value`` until it reaches the bit size
|
|
of the type ``ty2``.
|
|
|
|
When sign extending from i1, the extension always results in -1 or 0.
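
As an informal C analogy (the helper names are hypothetical), sign extension
corresponds to widening a signed value: the sign bit is replicated into every
new high-order bit, so the numeric value is unchanged.

.. code-block:: c

    #include <stdint.h>

    /* sext i8 -> i16: -1 (0xFF) becomes -1 (0xFFFF). */
    int16_t sext_i8_to_i16(int8_t v) { return (int16_t)v; }

    /* sext i1 -> i32: a set bit becomes all ones (-1), a clear bit 0. */
    int32_t sext_i1_to_i32(unsigned bit) { return (bit & 1u) ? -1 : 0; }

Reading the 16-bit result of ``sext_i8_to_i16(-1)`` as an unsigned number
gives 65535, which is the same bit pattern as -1.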
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm

    %X = sext i8  -1 to i16      ; yields i16:65535 (0xFFFF, i.e. -1 as a signed value)
    %Y = sext i1 true to i32             ; yields i32:-1
    %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
|
|
|
|
'``fptrunc .. to``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = fptrunc <ty> <value> to <ty2> ; yields ty2
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``fptrunc``' instruction takes a :ref:`floating point <t_floating>`
|
|
value to cast and a :ref:`floating point <t_floating>` type to cast it to.
|
|
The size of ``value`` must be larger than the size of ``ty2``. This
|
|
implies that ``fptrunc`` cannot be used to make a *no-op cast*.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``fptrunc``' instruction truncates a ``value`` from a larger
|
|
:ref:`floating point <t_floating>` type to a smaller :ref:`floating
|
|
point <t_floating>` type. If the value cannot fit within the
|
|
destination type, ``ty2``, then the results are undefined.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%X = fptrunc double 123.0 to float ; yields float:123.0
|
|
%Y = fptrunc double 1.0E+300 to float ; yields undefined
|
|
|
|
'``fpext .. to``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = fpext <ty> <value> to <ty2> ; yields ty2
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``fpext``' extends a floating point ``value`` to a larger floating
|
|
point value.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``fpext``' instruction takes a :ref:`floating point <t_floating>`
|
|
``value`` to cast, and a :ref:`floating point <t_floating>` type to cast it
|
|
to. The source type must be smaller than the destination type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``fpext``' instruction extends the ``value`` from a smaller
|
|
:ref:`floating point <t_floating>` type to a larger :ref:`floating
|
|
point <t_floating>` type. The ``fpext`` cannot be used to make a
|
|
*no-op cast* because it always changes bits. Use ``bitcast`` to make a
|
|
*no-op cast* for a floating point cast.
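
The C conversions between ``float`` and ``double`` behave analogously and can
serve as a rough illustration. The sketch assumes an IEEE-754 ``float`` and
``double``, and the helper names are hypothetical. Extending is exact, so a
value that starts out as a ``float`` survives a round trip through
``double``.

.. code-block:: c

    /* fpext float -> double: every float value is exactly representable. */
    double fpext_f32_to_f64(float v) { return (double)v; }

    /* fptrunc double -> float: the inverse narrowing conversion. */
    float fptrunc_f64_to_f32(double v) { return (float)v; }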
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%X = fpext float 3.125 to double ; yields double:3.125000e+00
|
|
%Y = fpext double %X to fp128 ; yields fp128:0xL00000000000000004000900000000000
|
|
|
|
'``fptoui .. to``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = fptoui <ty> <value> to <ty2> ; yields ty2
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``fptoui``' converts a floating point ``value`` to its unsigned
|
|
integer equivalent of type ``ty2``.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``fptoui``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``fptoui``' instruction converts its :ref:`floating
|
|
point <t_floating>` operand into the nearest (rounding towards zero)
|
|
unsigned integer value. If the value cannot fit in ``ty2``, the results
|
|
are undefined.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%X = fptoui double 123.0 to i32 ; yields i32:123
|
|
%Y = fptoui float 1.0E+300 to i1 ; yields undefined:1
|
|
%Z = fptoui float 1.04E+17 to i8 ; yields undefined:1
|
|
|
|
'``fptosi .. to``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = fptosi <ty> <value> to <ty2> ; yields ty2
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``fptosi``' instruction converts :ref:`floating point <t_floating>`
|
|
``value`` to type ``ty2``.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``fptosi``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``fptosi``' instruction converts its :ref:`floating
|
|
point <t_floating>` operand into the nearest (rounding towards zero)
|
|
signed integer value. If the value cannot fit in ``ty2``, the results
|
|
are undefined.
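
The rounding rule shared by '``fptoui``' and '``fptosi``' matches the C
floating point to integer conversion, which also discards the fractional part
(rounds towards zero) and is undefined when the value does not fit. The
following sketch uses hypothetical helper names and is only an illustration.

.. code-block:: c

    #include <stdint.h>

    /* fptosi double -> i32: the fractional part is discarded. */
    int32_t fptosi_f64_to_i32(double v) { return (int32_t)v; }

    /* fptoui double -> i32: same rounding, unsigned result. */
    uint32_t fptoui_f64_to_i32(double v) { return (uint32_t)v; }

Note that -123.9 converts to -123, not -124, because rounding is towards zero
rather than towards negative infinity.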
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%X = fptosi double -123.0 to i32 ; yields i32:-123
|
|
%Y = fptosi float 1.0E-247 to i1 ; yields undefined:1
|
|
%Z = fptosi float 1.04E+17 to i8 ; yields undefined:1
|
|
|
|
'``uitofp .. to``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = uitofp <ty> <value> to <ty2> ; yields ty2
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``uitofp``' instruction regards ``value`` as an unsigned integer
|
|
and converts that value to the ``ty2`` type.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating point
type with the same number of elements as ``ty``.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``uitofp``' instruction interprets its operand as an unsigned
|
|
integer quantity and converts it to the corresponding floating point
|
|
value. If the value cannot fit in the floating point value, the results
|
|
are undefined.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%X = uitofp i32 257 to float ; yields float:257.0
|
|
%Y = uitofp i8 -1 to double ; yields double:255.0
|
|
|
|
'``sitofp .. to``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = sitofp <ty> <value> to <ty2> ; yields ty2
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``sitofp``' instruction regards ``value`` as a signed integer and
|
|
converts that value to the ``ty2`` type.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating point
type with the same number of elements as ``ty``.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``sitofp``' instruction interprets its operand as a signed integer
|
|
quantity and converts it to the corresponding floating point value. If
|
|
the value cannot fit in the floating point value, the results are
|
|
undefined.
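
The difference between '``uitofp``' and '``sitofp``' lies only in how the
same bits are interpreted. The C sketch below is an illustration with
hypothetical helper names. It assumes the usual two's complement behaviour
when an out-of-range value is converted to ``int8_t``, which is
implementation-defined in older C standards.

.. code-block:: c

    #include <stdint.h>

    /* uitofp i8 -> double: read the bits as an unsigned number. */
    double uitofp_i8_to_f64(uint8_t bits) { return (double)bits; }

    /* sitofp i8 -> double: read the same bits as a signed number. */
    double sitofp_i8_to_f64(uint8_t bits) { return (double)(int8_t)bits; }

For the bit pattern ``0xFF`` (written ``i8 -1`` in the examples) the first
function yields 255.0 and the second yields -1.0.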
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%X = sitofp i32 257 to float ; yields float:257.0
|
|
%Y = sitofp i8 -1 to double ; yields double:-1.0
|
|
|
|
.. _i_ptrtoint:
|
|
|
|
'``ptrtoint .. to``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = ptrtoint <ty> <value> to <ty2> ; yields ty2
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``ptrtoint``' instruction converts the pointer or a vector of
|
|
pointers ``value`` to the integer (or vector of integers) type ``ty2``.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
type to cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` or
a vector of integers type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``ptrtoint``' instruction converts ``value`` to integer type
|
|
``ty2`` by interpreting the pointer value as an integer and either
|
|
truncating or zero extending that value to the size of the integer type.
|
|
If ``value`` is smaller than ``ty2`` then a zero extension is done. If
|
|
``value`` is larger than ``ty2`` then a truncation is done. If they are
|
|
the same size, then nothing is done (*no-op cast*) other than a type
|
|
change.
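
A rough C analogy uses ``uintptr_t``, assuming the common case in which it
has the same width as a pointer. The helper names are hypothetical and the
sketch is an illustration, not a statement about any particular target.

.. code-block:: c

    #include <stdint.h>

    /* ptrtoint to a pointer-sized integer: a no-op cast. */
    uintptr_t ptr_to_int(const void *p) { return (uintptr_t)p; }

    /* ptrtoint to i8: only the low 8 bits of the address survive. */
    uint8_t ptr_to_i8(const void *p) { return (uint8_t)(uintptr_t)p; }

    /* The inverse conversion from a pointer-sized integer to a pointer. */
    void *int_to_ptr(uintptr_t v) { return (void *)v; }

Converting to a pointer-sized integer and back recovers the original pointer,
whereas the truncated form loses the high-order address bits.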
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%X = ptrtoint i32* %P to i8 ; yields truncation on 32-bit architecture
|
|
%Y = ptrtoint i32* %P to i64 ; yields zero extension on 32-bit architecture
|
|
%Z = ptrtoint <4 x i32*> %P to <4 x i64>; yields vector zero extension for a vector of addresses on 32-bit architecture
|
|
|
|
.. _i_inttoptr:
|
|
|
|
'``inttoptr .. to``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = inttoptr <ty> <value> to <ty2> ; yields ty2
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``inttoptr``' instruction converts an integer ``value`` to a
|
|
pointer type, ``ty2``.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
|
|
cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
|
|
applying either a zero extension or a truncation depending on the size
|
|
of the integer ``value``. If ``value`` is larger than the size of a
|
|
pointer then a truncation is done. If ``value`` is smaller than the size
|
|
of a pointer then a zero extension is done. If they are the same size,
|
|
nothing is done (*no-op cast*).
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm

    %X = inttoptr i32 255 to i32*          ; yields zero extension on 64-bit architecture
    %Y = inttoptr i32 255 to i32*          ; yields no-op on 32-bit architecture
    %Z = inttoptr i64 0 to i32*            ; yields truncation on 32-bit architecture
    %W = inttoptr <4 x i32> %G to <4 x i8*>; yields element-wise conversion of vector G to four pointers
|
|
|
|
.. _i_bitcast:
|
|
|
|
'``bitcast .. to``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = bitcast <ty> <value> to <ty2> ; yields ty2
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
|
|
changing any bits.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``bitcast``' instruction takes a value to cast, which must be a
|
|
non-aggregate first class value, and a type to cast it to, which must
|
|
also be a non-aggregate :ref:`first class <t_firstclass>` type. The
|
|
bit sizes of ``value`` and the destination type, ``ty2``, must be
|
|
identical. If the source type is a pointer, the destination type must
|
|
also be a pointer of the same size. This instruction supports bitwise
|
|
conversion of vectors to integers and to vectors of other types (as
|
|
long as they have the same size).
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
|
|
is always a *no-op cast* because no bits change with this
|
|
conversion. The conversion is done as if the ``value`` had been stored
|
|
to memory and read back as type ``ty2``. Pointer (or vector of
|
|
pointers) types may only be converted to other pointer (or vector of
|
|
pointers) types with the same address space through this instruction.
|
|
To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
|
|
or :ref:`ptrtoint <i_ptrtoint>` instructions first.
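
The "stored to memory and read back" model can be written directly in C with
``memcpy``. This sketch assumes a 32-bit IEEE-754 ``float``; the helper name
is hypothetical.

.. code-block:: c

    #include <stdint.h>
    #include <string.h>

    /* bitcast float -> i32: copy the bits unchanged into an integer. */
    uint32_t bitcast_f32_to_i32(float f) {
      uint32_t bits;
      memcpy(&bits, &f, sizeof bits);  /* store as float, read back as i32 */
      return bits;
    }

Unlike a numeric conversion such as ``fptoui``, the value 1.0 does not become
the integer 1; it becomes the integer whose bits encode 1.0.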
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm

    %X = bitcast i8 255 to i8              ; yields i8:-1
    %Y = bitcast i32* %x to i16*           ; yields i16*:%x
    %Z = bitcast <2 x i32> %V to i64       ; yields i64: %V
    %W = bitcast <2 x i32*> %V to <2 x i64*>   ; yields <2 x i64*>
|
|
|
|
.. _i_addrspacecast:
|
|
|
|
'``addrspacecast .. to``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = addrspacecast <pty> <ptrval> to <pty2> ; yields pty2
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
|
|
address space ``n`` to type ``pty2`` in address space ``m``.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``addrspacecast``' instruction takes a pointer or vector of pointers
|
|
to cast and a pointer type to cast it to, which must have a different
|
|
address space.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``addrspacecast``' instruction converts the pointer value
|
|
``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
|
|
value modification, depending on the target and the address space
|
|
pair. Pointer conversions within the same address space must be
|
|
performed with the ``bitcast`` instruction. Note that if the address space
|
|
conversion is legal then both result and operand refer to the same memory
|
|
location.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%X = addrspacecast i32* %x to i32 addrspace(1)* ; yields i32 addrspace(1)*:%x
|
|
%Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)* ; yields i64 addrspace(2)*:%y
|
|
%Z = addrspacecast <4 x i32*> %z to <4 x float addrspace(3)*> ; yields <4 x float addrspace(3)*>:%z
|
|
|
|
.. _otherops:
|
|
|
|
Other Operations
|
|
----------------
|
|
|
|
The instructions in this category are the "miscellaneous" instructions,
|
|
which defy better classification.
|
|
|
|
.. _i_icmp:
|
|
|
|
'``icmp``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = icmp <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``icmp``' instruction returns a boolean value or a vector of
|
|
boolean values based on comparison of its two integer, integer vector,
|
|
pointer, or pointer vector operands.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``icmp``' instruction takes three operands. The first operand is
|
|
the condition code indicating the kind of comparison to perform. It is
|
|
not a value, just a keyword. The possible condition codes are:
|
|
|
|
#. ``eq``: equal
|
|
#. ``ne``: not equal
|
|
#. ``ugt``: unsigned greater than
|
|
#. ``uge``: unsigned greater or equal
|
|
#. ``ult``: unsigned less than
|
|
#. ``ule``: unsigned less or equal
|
|
#. ``sgt``: signed greater than
|
|
#. ``sge``: signed greater or equal
|
|
#. ``slt``: signed less than
|
|
#. ``sle``: signed less or equal
|
|
|
|
The remaining two arguments must be of :ref:`integer <t_integer>`,
:ref:`pointer <t_pointer>`, or integer :ref:`vector <t_vector>` type. They
must also have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``icmp``' compares ``op1`` and ``op2`` according to the condition
|
|
code given as ``cond``. The comparison performed always yields either an
|
|
:ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:
|
|
|
|
#. ``eq``: yields ``true`` if the operands are equal, ``false``
|
|
otherwise. No sign interpretation is necessary or performed.
|
|
#. ``ne``: yields ``true`` if the operands are unequal, ``false``
|
|
otherwise. No sign interpretation is necessary or performed.
|
|
#. ``ugt``: interprets the operands as unsigned values and yields
|
|
``true`` if ``op1`` is greater than ``op2``.
|
|
#. ``uge``: interprets the operands as unsigned values and yields
|
|
``true`` if ``op1`` is greater than or equal to ``op2``.
|
|
#. ``ult``: interprets the operands as unsigned values and yields
|
|
``true`` if ``op1`` is less than ``op2``.
|
|
#. ``ule``: interprets the operands as unsigned values and yields
|
|
``true`` if ``op1`` is less than or equal to ``op2``.
|
|
#. ``sgt``: interprets the operands as signed values and yields ``true``
|
|
if ``op1`` is greater than ``op2``.
|
|
#. ``sge``: interprets the operands as signed values and yields ``true``
|
|
if ``op1`` is greater than or equal to ``op2``.
|
|
#. ``slt``: interprets the operands as signed values and yields ``true``
|
|
if ``op1`` is less than ``op2``.
|
|
#. ``sle``: interprets the operands as signed values and yields ``true``
|
|
if ``op1`` is less than or equal to ``op2``.
|
|
|
|
If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
|
|
are compared as if they were integers.
|
|
|
|
If the operands are integer vectors, then they are compared element by
|
|
element. The result is an ``i1`` vector with the same number of elements
|
|
as the values being compared. Otherwise, the result is an ``i1``.
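
Because LLVM integers carry no signedness, the condition code alone decides
how the bits are read. The C sketch below is an illustration with
hypothetical helper names. It assumes two's complement conversion to
``int16_t`` and compares the same pair of 16-bit patterns both ways.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* icmp ule i16: compare the bit patterns as unsigned numbers. */
    bool icmp_ule_i16(uint16_t a, uint16_t b) { return a <= b; }

    /* icmp sle i16: compare the same bit patterns as signed numbers. */
    bool icmp_sle_i16(uint16_t a, uint16_t b) {
      return (int16_t)a <= (int16_t)b;
    }

With the operands ``-4`` and ``5``, the unsigned comparison sees 65532 and 5
and yields ``false``, while the signed comparison yields ``true``.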
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = icmp eq i32 4, 5 ; yields: result=false
|
|
<result> = icmp ne float* %X, %X ; yields: result=false
|
|
<result> = icmp ult i16 4, 5 ; yields: result=true
|
|
<result> = icmp sgt i16 4, 5 ; yields: result=false
|
|
<result> = icmp ule i16 -4, 5 ; yields: result=false
|
|
<result> = icmp sge i16 4, 5 ; yields: result=false
|
|
|
|
Note that the code generator does not yet support vector types with the
|
|
``icmp`` instruction.
|
|
|
|
.. _i_fcmp:
|
|
|
|
'``fcmp``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = fcmp <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``fcmp``' instruction returns a boolean value or vector of boolean
|
|
values based on comparison of its operands.
|
|
|
|
If the operands are floating point scalars, then the result type is a
|
|
boolean (:ref:`i1 <t_integer>`).
|
|
|
|
If the operands are floating point vectors, then the result type is a
|
|
vector of boolean with the same number of elements as the operands being
|
|
compared.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``fcmp``' instruction takes three operands. The first operand is
|
|
the condition code indicating the kind of comparison to perform. It is
|
|
not a value, just a keyword. The possible condition codes are:
|
|
|
|
#. ``false``: no comparison, always returns false
|
|
#. ``oeq``: ordered and equal
|
|
#. ``ogt``: ordered and greater than
|
|
#. ``oge``: ordered and greater than or equal
|
|
#. ``olt``: ordered and less than
|
|
#. ``ole``: ordered and less than or equal
|
|
#. ``one``: ordered and not equal
|
|
#. ``ord``: ordered (no nans)
|
|
#. ``ueq``: unordered or equal
|
|
#. ``ugt``: unordered or greater than
|
|
#. ``uge``: unordered or greater than or equal
|
|
#. ``ult``: unordered or less than
|
|
#. ``ule``: unordered or less than or equal
|
|
#. ``une``: unordered or not equal
|
|
#. ``uno``: unordered (either nans)
|
|
#. ``true``: no comparison, always returns true
|
|
|
|
*Ordered* means that neither operand is a QNAN while *unordered* means
|
|
that either operand may be a QNAN.
|
|
|
|
The ``op1`` and ``op2`` arguments must each be of :ref:`floating
point <t_floating>` type or a :ref:`vector <t_vector>` of floating point
type. They must have identical types.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
|
|
condition code given as ``cond``. If the operands are vectors, then the
|
|
vectors are compared element by element. Each comparison performed
|
|
always yields an :ref:`i1 <t_integer>` result, as follows:
|
|
|
|
#. ``false``: always yields ``false``, regardless of operands.
|
|
#. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
|
|
is equal to ``op2``.
|
|
#. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
|
|
is greater than ``op2``.
|
|
#. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
|
|
is greater than or equal to ``op2``.
|
|
#. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
|
|
is less than ``op2``.
|
|
#. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
|
|
is less than or equal to ``op2``.
|
|
#. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
|
|
is not equal to ``op2``.
|
|
#. ``ord``: yields ``true`` if both operands are not a QNAN.
|
|
#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
|
|
equal to ``op2``.
|
|
#. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
|
|
greater than ``op2``.
|
|
#. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
|
|
greater than or equal to ``op2``.
|
|
#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
|
|
less than ``op2``.
|
|
#. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
|
|
less than or equal to ``op2``.
|
|
#. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
|
|
not equal to ``op2``.
|
|
#. ``uno``: yields ``true`` if either operand is a QNAN.
|
|
#. ``true``: always yields ``true``, regardless of operands.
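
The ordered and unordered predicates listed above can be sketched in C for a
few representative condition codes. This is an illustration with hypothetical
helper names. It uses ``isnan`` from ``<math.h>``, which is true for any NaN,
as a stand-in for the QNAN test described in the text.

.. code-block:: c

    #include <math.h>
    #include <stdbool.h>

    /* ord: neither operand is a NaN.  uno: at least one operand is a NaN. */
    bool fcmp_ord(double a, double b) { return !isnan(a) && !isnan(b); }
    bool fcmp_uno(double a, double b) { return isnan(a) || isnan(b); }

    /* Ordered predicates require both operands to be non-NaN. */
    bool fcmp_oeq(double a, double b) { return fcmp_ord(a, b) && a == b; }
    bool fcmp_one(double a, double b) { return fcmp_ord(a, b) && a != b; }

    /* Unordered predicates also succeed whenever a NaN is involved. */
    bool fcmp_ueq(double a, double b) { return fcmp_uno(a, b) || a == b; }
    bool fcmp_une(double a, double b) { return fcmp_uno(a, b) || a != b; }

The remaining ordered and unordered codes follow the same pattern with the
relational operator swapped.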
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
<result> = fcmp oeq float 4.0, 5.0 ; yields: result=false
|
|
<result> = fcmp one float 4.0, 5.0 ; yields: result=true
|
|
<result> = fcmp olt float 4.0, 5.0 ; yields: result=true
|
|
<result> = fcmp ueq double 1.0, 2.0 ; yields: result=false
|
|
|
|
Note that the code generator does not yet support vector types with the
|
|
``fcmp`` instruction.
|
|
|
|
.. _i_phi:
|
|
|
|
'``phi``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = phi <ty> [ <val0>, <label0>], ...
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``phi``' instruction is used to implement the φ node in the SSA
|
|
graph representing the function.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The type of the incoming values is specified with the first type field.
|
|
After this, the '``phi``' instruction takes a list of pairs as
|
|
arguments, with one pair for each predecessor basic block of the current
|
|
block. Only values of :ref:`first class <t_firstclass>` type may be used as
|
|
the value arguments to the PHI node. Only labels may be used as the
|
|
label arguments.
|
|
|
|
There must be no non-phi instructions between the start of a basic block
|
|
and the PHI instructions: i.e. PHI instructions must be first in a basic
|
|
block.
|
|
|
|
For the purposes of the SSA form, the use of each incoming value is
|
|
deemed to occur on the edge from the corresponding predecessor block to
|
|
the current block (but after any definition of an '``invoke``'
|
|
instruction's return value on the same edge).
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
At runtime, the '``phi``' instruction logically takes on the value
|
|
specified by the pair corresponding to the predecessor basic block that
|
|
executed just prior to the current block.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
Loop: ; Infinite loop that counts from 0 on up...
|
|
%indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
|
|
%nextindvar = add i32 %indvar, 1
|
|
br label %Loop
|
|
|
|
.. _i_select:
|
|
|
|
'``select``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = select selty <cond>, <ty> <val1>, <ty> <val2> ; yields ty
|
|
|
|
selty is either i1 or {<N x i1>}
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``select``' instruction is used to choose one value based on a
|
|
condition, without IR-level branching.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``select``' instruction requires an 'i1' value or a vector of 'i1'
|
|
values indicating the condition, and two values of the same :ref:`first
|
|
class <t_firstclass>` type. If the val1/val2 are vectors and the
|
|
condition is a scalar, then entire vectors are selected, not individual
|
|
elements.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
If the condition is an i1 and it evaluates to 1, the instruction returns
|
|
the first value argument; otherwise, it returns the second value
|
|
argument.
|
|
|
|
If the condition is a vector of i1, then the value arguments must be
|
|
vectors of the same size, and the selection is done element by element.
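
Both forms can be sketched in C. This is an illustration with hypothetical
helper names, using a fixed width of four elements for the vector case.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* select i1 %cond, i8 %a, i8 %b: pick one whole value. */
    int8_t select_i8(bool cond, int8_t a, int8_t b) { return cond ? a : b; }

    /* select <4 x i1> %cond, <4 x i8> %a, <4 x i8> %b: pick per element. */
    void select_v4i8(const bool cond[4], const int8_t a[4],
                     const int8_t b[4], int8_t out[4]) {
      for (int i = 0; i < 4; ++i)
        out[i] = cond[i] ? a[i] : b[i];
    }

In the vector form each lane of the condition chooses independently, so the
result may mix elements from both value operands.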
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%X = select i1 true, i8 17, i8 42 ; yields i8:17
|
|
|
|
.. _i_call:
|
|
|
|
'``call``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<result> = [tail | musttail] call [cconv] [ret attrs] <ty> [<fnty>*] <fnptrval>(<function args>) [fn attrs]
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``call``' instruction represents a simple function call.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
This instruction requires several arguments:
|
|
|
|
#. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
|
|
should perform tail call optimization. The ``tail`` marker is a hint that
|
|
`can be ignored <CodeGenerator.html#sibcallopt>`_. The ``musttail`` marker
|
|
means that the call must be tail call optimized in order for the program to
|
|
be correct. The ``musttail`` marker provides these guarantees:
|
|
|
|
#. The call will not cause unbounded stack growth if it is part of a
|
|
recursive cycle in the call graph.
|
|
#. Arguments with the :ref:`inalloca <attr_inalloca>` attribute are
|
|
forwarded in place.
|
|
|
|
Both markers imply that the callee does not access allocas or varargs from
|
|
the caller. Calls marked ``musttail`` must obey the following additional
|
|
rules:
|
|
|
|
- The call must immediately precede a :ref:`ret <i_ret>` instruction,
|
|
or a pointer bitcast followed by a ret instruction.
|
|
- The ret instruction must return the (possibly bitcasted) value
|
|
produced by the call or void.
|
|
- The caller and callee prototypes must match. Pointer types of
|
|
parameters or return types may differ in pointee type, but not
|
|
in address space.
|
|
- The calling conventions of the caller and callee must match.
|
|
- All ABI-impacting function attributes, such as sret, byval, inreg,
|
|
returned, and inalloca, must match.
|
|
- The callee must be varargs iff the caller is varargs. Bitcasting a
|
|
non-varargs function to the appropriate varargs type is legal so
|
|
long as the non-varargs prefixes obey the other rules.
|
|
|
|
Tail call optimization for calls marked ``tail`` is guaranteed to occur if
|
|
the following conditions are met:
|
|
|
|
- Caller and callee both have the calling convention ``fastcc``.
|
|
- The call is in tail position (ret immediately follows call and ret
|
|
uses value of call or is void).
|
|
- Option ``-tailcallopt`` is enabled, or
|
|
``llvm::GuaranteedTailCallOpt`` is ``true``.
|
|
- `Platform-specific constraints are
|
|
met. <CodeGenerator.html#tailcallopt>`_
|
|
|
|
#. The optional "cconv" marker indicates which :ref:`calling
|
|
convention <callingconv>` the call should use. If none is
|
|
specified, the call defaults to using C calling conventions. The
|
|
calling convention of the call must match the calling convention of
|
|
the target function, or else the behavior is undefined.
|
|
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
|
|
values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
|
|
are valid here.
|
|
#. '``ty``': the type of the call instruction itself which is also the
|
|
type of the return value. Functions that return no value are marked
|
|
``void``.
|
|
#. '``fnty``': shall be the signature of the pointer to function value
|
|
being invoked. The argument types must match the types implied by
|
|
this signature. This type can be omitted if the function is not
|
|
varargs and if the function type does not return a pointer to a
|
|
function.
|
|
#. '``fnptrval``': An LLVM value containing a pointer to a function to
|
|
be invoked. In most cases, this is a direct function invocation, but
|
|
indirect calls are just as possible, calling an arbitrary pointer
|
|
to function value.
|
|
#. '``function args``': argument list whose types match the function
|
|
signature argument types and parameter attributes. All arguments must
|
|
be of :ref:`first class <t_firstclass>` type. If the function signature
|
|
indicates the function accepts a variable number of arguments, the
|
|
extra arguments can be specified.
|
|
#. The optional :ref:`function attributes <fnattrs>` list. Only
|
|
'``noreturn``', '``nounwind``', '``readonly``' and '``readnone``'
|
|
attributes are valid here.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``call``' instruction is used to cause control flow to transfer to
|
|
a specified function, with its incoming arguments bound to the specified
|
|
values. Upon a '``ret``' instruction in the called function, control
|
|
flow continues with the instruction after the function call, and the
|
|
return value of the function is bound to the result argument.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%retval = call i32 @test(i32 %argc)
|
|
call i32 (i8*, ...)* @printf(i8* %msg, i32 12, i8 42) ; yields i32
|
|
%X = tail call i32 @foo() ; yields i32
|
|
%Y = tail call fastcc i32 @foo() ; yields i32
|
|
call void %foo(i8 97 signext)
|
|
|
|
%struct.A = type { i32, i8 }
|
|
%r = call %struct.A @foo() ; yields { i32, i8 }
|
|
%gr = extractvalue %struct.A %r, 0 ; yields i32
|
|
%gr1 = extractvalue %struct.A %r, 1 ; yields i8
|
|
call void @foo() noreturn ; indicates that @foo never returns normally
|
|
%ZZ = call zeroext i32 @bar() ; Return value is zero extended
|
|
|
|
LLVM treats calls to some functions with names and arguments that match
|
|
the standard C99 library as being the C99 library functions, and may
|
|
perform optimizations or generate code for them under that assumption.
|
|
This is something we'd like to change in the future to provide better
|
|
support for freestanding environments and non-C-based languages.
|
|
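The ``musttail`` rules and the conditions for guaranteed ``tail`` call
optimization listed under Arguments can be illustrated with a minimal
sketch. The names ``@callee``, ``@forward`` and ``@sum_to`` are
hypothetical. Whether the second call is actually optimized still
depends on ``-tailcallopt`` and on the platform-specific constraints.

.. code-block:: llvm

   declare i32 @callee(i32, i8*)

   ; musttail: the prototypes and calling conventions match, the call
   ; immediately precedes the ret, and the ret returns the call's value.
   define i32 @forward(i32 %n, i8* %p) {
     %r = musttail call i32 @callee(i32 %n, i8* %p)
     ret i32 %r
   }

   ; tail + fastcc: caller and callee are both fastcc and the call is in
   ; tail position.
   define fastcc i32 @sum_to(i32 %n, i32 %acc) {
   entry:
     %done = icmp eq i32 %n, 0
     br i1 %done, label %exit, label %recurse

   recurse:
     %n1 = sub i32 %n, 1
     %acc1 = add i32 %acc, %n
     %r = tail call fastcc i32 @sum_to(i32 %n1, i32 %acc1)
     ret i32 %r

   exit:
     ret i32 %acc
   }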
|
|
.. _i_va_arg:
|
|
|
|
'``va_arg``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<resultval> = va_arg <va_list*> <arglist>, <argty>
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``va_arg``' instruction is used to access arguments passed through
|
|
the "variable argument" area of a function call. It is used to implement
|
|
the ``va_arg`` macro in C.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
This instruction takes a ``va_list*`` value and the type of the
|
|
argument. It returns a value of the specified argument type and
|
|
increments the ``va_list`` to point to the next argument. The actual
|
|
type of ``va_list`` is target specific.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``va_arg``' instruction loads an argument of the specified type
|
|
from the specified ``va_list`` and causes the ``va_list`` to point to
|
|
the next argument. For more information, see the variable argument
|
|
handling :ref:`Intrinsic Functions <int_varargs>`.
|
|
|
|
It is legal for this instruction to be called in a function which does
|
|
not take a variable number of arguments, for example, the ``vfprintf``
|
|
function.
|
|
|
|
``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
|
|
function <intrinsics>` because it takes a type as an argument.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
See the :ref:`variable argument processing <int_varargs>` section.
|
|
|
|
Note that the code generator does not yet fully support va\_arg on many
|
|
targets. Also, it does not currently support va\_arg with aggregate
|
|
types on any target.
|
|
|
|
.. _i_landingpad:
|
|
|
|
'``landingpad``' Instruction
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
<resultval> = landingpad <resultty> personality <type> <pers_fn> <clause>+
|
|
<resultval> = landingpad <resultty> personality <type> <pers_fn> cleanup <clause>*
|
|
|
|
<clause> := catch <type> <value>
|
|
<clause> := filter <array constant type> <array constant>
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``landingpad``' instruction is used by `LLVM's exception handling
|
|
system <ExceptionHandling.html#overview>`_ to specify that a basic block
|
|
is a landing pad --- one where the exception lands, and corresponds to the
|
|
code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
|
|
defines values supplied by the personality function (``pers_fn``) upon
|
|
re-entry to the function. The ``resultval`` has the type ``resultty``.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
This instruction takes a ``pers_fn`` value. This is the personality
|
|
function associated with the unwinding mechanism. The optional
|
|
``cleanup`` flag indicates that the landing pad block is a cleanup.
|
|
|
|
A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
|
|
contains the global variable representing the "type" that may be caught
|
|
or filtered respectively. Unlike the ``catch`` clause, the ``filter``
|
|
clause takes an array constant as its argument. Use
|
|
"``[0 x i8**] undef``" for a filter which cannot throw. The
|
|
'``landingpad``' instruction must contain *at least* one ``clause`` or
|
|
the ``cleanup`` flag.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``landingpad``' instruction defines the values which are set by the
|
|
personality function (``pers_fn``) upon re-entry to the function, and
|
|
therefore the "result type" of the ``landingpad`` instruction. As with
|
|
calling conventions, how the personality function results are
|
|
represented in LLVM IR is target specific.
|
|
|
|
The clauses are applied in order from top to bottom. If two
|
|
``landingpad`` instructions are merged together through inlining, the
|
|
clauses from the calling function are appended to the list of clauses.
|
|
When the call stack is being unwound due to an exception being thrown,
|
|
the exception is compared against each ``clause`` in turn. If it doesn't
|
|
match any of the clauses, and the ``cleanup`` flag is not set, then
|
|
unwinding continues further up the call stack.
|
|
|
|
The ``landingpad`` instruction has several restrictions:
|
|
|
|
- A landing pad block is a basic block which is the unwind destination
|
|
of an '``invoke``' instruction.
|
|
- A landing pad block must have a '``landingpad``' instruction as its
|
|
first non-PHI instruction.
|
|
- There can be only one '``landingpad``' instruction within the landing
|
|
pad block.
|
|
- A basic block that is not a landing pad block may not include a
|
|
'``landingpad``' instruction.
|
|
- All '``landingpad``' instructions in a function must have the same
|
|
personality function.
|
|
|
|
Example:
|
|
""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
;; A landing pad which can catch an integer.
|
|
%res = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
|
|
catch i8** @_ZTIi
|
|
;; A landing pad that is a cleanup.
|
|
%res = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
|
|
cleanup
|
|
;; A landing pad which can catch an integer and can only throw a double.
|
|
%res = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
|
|
catch i8** @_ZTIi
|
|
filter [1 x i8**] [@_ZTId]
|
|
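For context, the sketch below places a landing pad in a complete
function, written in the syntax used by this document. The names
``@may_throw`` and ``@try_call`` are hypothetical. The block also
simplifies real exception handling: a front-end would normally inspect
the selector value and call into the language runtime before resuming
normal execution.

.. code-block:: llvm

   @_ZTIi = external constant i8*

   declare void @may_throw()
   declare i32 @__gxx_personality_v0(...)

   define i32 @try_call() {
   entry:
     invoke void @may_throw()
             to label %ok unwind label %lpad

   ok:
     ret i32 0

   lpad:
     ; The landingpad is the first non-PHI instruction of the unwind
     ; destination of the invoke above.
     %exn = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
               catch i8** @_ZTIi
     ret i32 1
   }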
|
|
.. _intrinsics:
|
|
|
|
Intrinsic Functions
|
|
===================
|
|
|
|
LLVM supports the notion of an "intrinsic function". These functions
|
|
have well known names and semantics and are required to follow certain
|
|
restrictions. Overall, these intrinsics represent an extension mechanism
|
|
for the LLVM language that does not require changing all of the
|
|
transformations in LLVM when adding to the language (or the bitcode
|
|
reader/writer, the parser, etc...).
|
|
|
|
Intrinsic function names must all start with an "``llvm.``" prefix. This
|
|
prefix is reserved in LLVM for intrinsic names; thus, function names may
|
|
not begin with this prefix. Intrinsic functions must always be external
|
|
functions: you cannot define the body of intrinsic functions. Intrinsic
|
|
functions may only be used in call or invoke instructions: it is illegal
|
|
to take the address of an intrinsic function. Additionally, because
|
|
intrinsic functions are part of the LLVM language, any that are added
must be documented here.
|
|
|
|
Some intrinsic functions can be overloaded, i.e., the intrinsic
|
|
represents a family of functions that perform the same operation but on
|
|
different data types. Because LLVM can represent over 8 million
|
|
different integer types, overloading is used commonly to allow an
|
|
intrinsic function to operate on any integer type. One or more of the
|
|
argument types or the result type can be overloaded to accept any
|
|
integer type. Argument types may also be defined as exactly matching a
|
|
previous argument's type or the result type. This allows an intrinsic
|
|
function which accepts multiple arguments, but needs all of them to be
|
|
of the same type, to only be overloaded with respect to a single
|
|
argument or the result.
|
|
|
|
An overloaded intrinsic has the names of its overloaded argument
|
|
types encoded into its function name, each preceded by a period. Only
|
|
those types which are overloaded result in a name suffix. Arguments
|
|
whose type is matched against another type do not. For example, the
|
|
``llvm.ctpop`` function can take an integer of any width and returns an
|
|
integer of exactly the same integer width. This leads to a family of
|
|
functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
|
|
``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
|
|
overloaded, and only one type suffix is required. Because the argument's
|
|
type is matched against the return type, it does not require its own
|
|
name suffix.
|
|
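The naming rule can be seen in the declarations below, a small sketch
built on the ``llvm.ctpop`` family mentioned above. The wrapper function
``@popcount8`` is hypothetical.

.. code-block:: llvm

   ; One suffix per overloaded type; the argument type is matched against
   ; the return type, so it contributes no suffix of its own.
   declare i8  @llvm.ctpop.i8(i8)
   declare i29 @llvm.ctpop.i29(i29)

   define i8 @popcount8(i8 %val) {
     %n = call i8 @llvm.ctpop.i8(i8 %val)
     ret i8 %n
   }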
|
|
To learn how to add an intrinsic function, please see the `Extending
|
|
LLVM Guide <ExtendingLLVM.html>`_.
|
|
|
|
.. _int_varargs:
|
|
|
|
Variable Argument Handling Intrinsics
|
|
-------------------------------------
|
|
|
|
Variable argument support is defined in LLVM with the
|
|
:ref:`va_arg <i_va_arg>` instruction and these three intrinsic
|
|
functions. These functions are related to the similarly named macros
|
|
defined in the ``<stdarg.h>`` header file.
|
|
|
|
All of these functions operate on arguments that use a target-specific
|
|
value type "``va_list``". The LLVM assembly language reference manual
|
|
does not define what this type is, so all transformations should be
|
|
prepared to handle these functions regardless of the type used.
|
|
|
|
This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
|
|
variable argument handling intrinsic functions are used.
|
|
|
|
.. code-block:: llvm
|
|
|
|
define i32 @test(i32 %X, ...) {
|
|
; Initialize variable argument processing
|
|
%ap = alloca i8*
|
|
%ap2 = bitcast i8** %ap to i8*
|
|
call void @llvm.va_start(i8* %ap2)
|
|
|
|
; Read a single integer argument
|
|
%tmp = va_arg i8** %ap, i32
|
|
|
|
; Demonstrate usage of llvm.va_copy and llvm.va_end
|
|
%aq = alloca i8*
|
|
%aq2 = bitcast i8** %aq to i8*
|
|
call void @llvm.va_copy(i8* %aq2, i8* %ap2)
|
|
call void @llvm.va_end(i8* %aq2)
|
|
|
|
; Stop processing of arguments.
|
|
call void @llvm.va_end(i8* %ap2)
|
|
ret i32 %tmp
|
|
}
|
|
|
|
declare void @llvm.va_start(i8*)
|
|
declare void @llvm.va_copy(i8*, i8*)
|
|
declare void @llvm.va_end(i8*)
|
|
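As a slightly larger sketch, the hypothetical function ``@sum`` below
reads ``%n`` integer arguments in a loop. Like the example above, it
assumes a target on which ``va_list`` can be represented as an ``i8*``;
other targets use a different type.

.. code-block:: llvm

   define i32 @sum(i32 %n, ...) {
   entry:
     %ap = alloca i8*
     %ap2 = bitcast i8** %ap to i8*
     call void @llvm.va_start(i8* %ap2)
     br label %loop

   loop:
     %i = phi i32 [ 0, %entry ], [ %i.next, %body ]
     %acc = phi i32 [ 0, %entry ], [ %acc.next, %body ]
     %more = icmp slt i32 %i, %n
     br i1 %more, label %body, label %done

   body:
     ; Each va_arg reads one i32 and advances the va_list.
     %v = va_arg i8** %ap, i32
     %acc.next = add i32 %acc, %v
     %i.next = add i32 %i, 1
     br label %loop

   done:
     call void @llvm.va_end(i8* %ap2)
     ret i32 %acc
   }

   declare void @llvm.va_start(i8*)
   declare void @llvm.va_end(i8*)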
|
|
.. _int_va_start:
|
|
|
|
'``llvm.va_start``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.va_start(i8* <arglist>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.va_start``' intrinsic initializes ``*<arglist>`` for
|
|
subsequent use by ``va_arg``.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument is a pointer to a ``va_list`` element to initialize.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
|
|
available in C. In a target-dependent way, it initializes the
|
|
``va_list`` element to which the argument points, so that the next call
|
|
to ``va_arg`` will produce the first variable argument passed to the
|
|
function. Unlike the C ``va_start`` macro, this intrinsic does not need
|
|
to know the last argument of the function as the compiler can figure
|
|
that out.
|
|
|
|
'``llvm.va_end``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.va_end(i8* <arglist>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.va_end``' intrinsic destroys ``*<arglist>``, which has been
|
|
initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument is a pointer to a ``va_list`` to destroy.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
|
|
available in C. In a target-dependent way, it destroys the ``va_list``
|
|
element to which the argument points. Calls to
|
|
:ref:`llvm.va_start <int_va_start>` and
|
|
:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
|
|
``llvm.va_end``.
|
|
|
|
.. _int_va_copy:
|
|
|
|
'``llvm.va_copy``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.va_copy(i8* <destarglist>, i8* <srcarglist>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.va_copy``' intrinsic copies the current argument position
|
|
from the source argument list to the destination argument list.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument is a pointer to a ``va_list`` element to initialize.
|
|
The second argument is a pointer to a ``va_list`` element to copy from.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
|
|
available in C. In a target-dependent way, it copies the source
|
|
``va_list`` element into the destination ``va_list`` element. This
|
|
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
|
|
arbitrarily complex and require, for example, memory allocation.
|
|
|
|
Accurate Garbage Collection Intrinsics
|
|
--------------------------------------
|
|
|
|
LLVM support for `Accurate Garbage Collection <GarbageCollection.html>`_
|
|
(GC) requires the implementation and generation of these intrinsics.
|
|
These intrinsics allow identification of :ref:`GC roots on the
|
|
stack <int_gcroot>`, as well as garbage collector implementations that
|
|
require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
|
|
Front-ends for type-safe garbage collected languages should generate
|
|
these intrinsics to make use of the LLVM garbage collectors. For more
|
|
details, see `Accurate Garbage Collection with
|
|
LLVM <GarbageCollection.html>`_.
|
|
|
|
The garbage collection intrinsics only operate on objects in the generic
|
|
address space (address space zero).
|
|
|
|
.. _int_gcroot:
|
|
|
|
'``llvm.gcroot``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
|
|
the code generator, and allows some metadata to be associated with it.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument specifies the address of a stack object that contains
|
|
the root pointer. The second pointer (which must be either a constant or
|
|
a global value address) contains the meta-data to be associated with the
|
|
root.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
At runtime, a call to this intrinsic stores a null pointer into the
|
|
"ptrloc" location. At compile-time, the code generator generates
|
|
information to allow the runtime to find the pointer at GC safe points.
|
|
The '``llvm.gcroot``' intrinsic may only be used in a function which
|
|
:ref:`specifies a GC algorithm <gc>`.
|
|
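A minimal sketch of declaring a root follows. It assumes the
``shadow-stack`` collector strategy; the allocator ``@my_alloc`` and the
function ``@may_collect`` are hypothetical placeholders for the language
runtime.

.. code-block:: llvm

   declare void @llvm.gcroot(i8**, i8*)
   declare i8* @my_alloc(i64)      ; hypothetical runtime allocator
   declare void @may_collect()     ; hypothetical call that may trigger a GC

   define void @keep_alive() gc "shadow-stack" {
   entry:
     ; The root is a stack slot; no metadata is attached (null).
     %root = alloca i8*
     call void @llvm.gcroot(i8** %root, i8* null)

     %obj = call i8* @my_alloc(i64 16)
     store i8* %obj, i8** %root
     call void @may_collect()
     ret void
   }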
|
|
.. _int_gcread:
|
|
|
|
'``llvm.gcread``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.gcread``' intrinsic identifies reads of references from heap
|
|
locations, allowing garbage collector implementations that require read
|
|
barriers.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The second argument is the address to read from, which should be an
|
|
address allocated from the garbage collector. The first argument is a
|
|
pointer to the start of the referenced object, if needed by the language
|
|
runtime (otherwise null).
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.gcread``' intrinsic has the same semantics as a load
|
|
instruction, but may be replaced with substantially more complex code by
|
|
the garbage collector runtime, as needed. The '``llvm.gcread``'
|
|
intrinsic may only be used in a function which :ref:`specifies a GC
|
|
algorithm <gc>`.
|
|
|
|
.. _int_gcwrite:
|
|
|
|
'``llvm.gcwrite``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
|
|
locations, allowing garbage collector implementations that require write
|
|
barriers (such as generational or reference counting collectors).
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument is the reference to store, the second is the start of
|
|
the object to store it to, and the third is the address of the field of
|
|
Obj to store to. If the runtime does not require a pointer to the
|
|
object, Obj may be null.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.gcwrite``' intrinsic has the same semantics as a store
|
|
instruction, but may be replaced with substantially more complex code by
|
|
the garbage collector runtime, as needed. The '``llvm.gcwrite``'
|
|
intrinsic may only be used in a function which :ref:`specifies a GC
|
|
algorithm <gc>`.
|
|
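The read and write barriers can be combined as in the sketch below,
which copies a reference from a field of one heap object to a field of
another. The function ``@copy_field`` and its parameter names are
hypothetical, and the ``shadow-stack`` strategy is assumed only so that
the function specifies a GC algorithm.

.. code-block:: llvm

   declare i8* @llvm.gcread(i8*, i8**)
   declare void @llvm.gcwrite(i8*, i8*, i8**)

   define void @copy_field(i8* %src.obj, i8** %src.field,
                           i8* %dst.obj, i8** %dst.field) gc "shadow-stack" {
     ; Behaves like a load of *%src.field, but is visible to the collector.
     %ref = call i8* @llvm.gcread(i8* %src.obj, i8** %src.field)
     ; Behaves like a store of %ref to *%dst.field, but is visible to the
     ; collector.
     call void @llvm.gcwrite(i8* %ref, i8* %dst.obj, i8** %dst.field)
     ret void
   }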
|
|
Code Generator Intrinsics
|
|
-------------------------
|
|
|
|
These intrinsics are provided by LLVM to expose special features that
|
|
may only be implemented with code generator support.
|
|
|
|
'``llvm.returnaddress``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare i8* @llvm.returnaddress(i32 <level>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.returnaddress``' intrinsic attempts to compute a
|
|
target-specific value indicating the return address of the current
|
|
function or one of its callers.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument to this intrinsic indicates which function to return the
|
|
address for. Zero indicates the calling function, one indicates its
|
|
caller, etc. The argument is **required** to be a constant integer
|
|
value.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.returnaddress``' intrinsic either returns a pointer
|
|
indicating the return address of the specified call frame, or zero if it
|
|
cannot be identified. The value returned by this intrinsic is likely to
|
|
be incorrect or 0 for arguments other than zero, so it should only be
|
|
used for debugging purposes.
|
|
|
|
Note that calling this intrinsic does not prevent function inlining or
|
|
other aggressive transformations, so the value returned may not be that
|
|
of the obvious source-language caller.
|
|
|
|
'``llvm.frameaddress``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare i8* @llvm.frameaddress(i32 <level>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.frameaddress``' intrinsic attempts to return the
|
|
target-specific frame pointer value for the specified stack frame.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument to this intrinsic indicates which function to return the
|
|
frame pointer for. Zero indicates the calling function, one indicates
|
|
its caller, etc. The argument is **required** to be a constant integer
|
|
value.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.frameaddress``' intrinsic either returns a pointer
|
|
indicating the frame address of the specified call frame, or zero if it
|
|
cannot be identified. The value returned by this intrinsic is likely to
|
|
be incorrect or 0 for arguments other than zero, so it should only be
|
|
used for debugging purposes.
|
|
|
|
Note that calling this intrinsic does not prevent function inlining or
|
|
other aggressive transformations, so the value returned may not be that
|
|
of the obvious source-language caller.
|
|
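A short sketch of both intrinsics with the only level that is expected
to be reliable, zero. The wrapper function names are hypothetical.

.. code-block:: llvm

   declare i8* @llvm.returnaddress(i32)
   declare i8* @llvm.frameaddress(i32)

   ; The level must be a constant integer.
   define i8* @my_return_address() {
     %ra = call i8* @llvm.returnaddress(i32 0)
     ret i8* %ra
   }

   define i8* @my_frame_address() {
     %fp = call i8* @llvm.frameaddress(i32 0)
     ret i8* %fp
   }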
|
|
.. _int_read_register:
|
|
.. _int_write_register:
|
|
|
|
'``llvm.read_register``' and '``llvm.write_register``' Intrinsics
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare i32 @llvm.read_register.i32(metadata)
|
|
declare i64 @llvm.read_register.i64(metadata)
|
|
declare void @llvm.write_register.i32(metadata, i32 <value>)
|
|
declare void @llvm.write_register.i64(metadata, i64 <value>)
|
|
!0 = metadata !{metadata !"sp\00"}
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.read_register``' and '``llvm.write_register``' intrinsics
|
|
provide access to the named register. The register must be valid on
|
|
the architecture being compiled to. The type needs to be compatible
|
|
with the register being read.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.read_register``' intrinsic returns the current value of the
|
|
register, where possible. The '``llvm.write_register``' intrinsic sets
|
|
the current value of the register, where possible.
|
|
|
|
This is useful to implement named register global variables that need
|
|
to always be mapped to a specific register, as is common practice on
|
|
bare-metal programs including OS kernels.
|
|
|
|
The compiler doesn't check for register availability or use of the named
|
|
register in surrounding code, including inline assembly. Because of that,
|
|
allocatable registers are not supported.
|
|
|
|
Warning: So far it only works with the stack pointer on selected
|
|
architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
|
|
work is needed to support other registers and even more so, allocatable
|
|
registers.
|
|
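A minimal sketch of reading the stack pointer, using the metadata syntax
shown under Syntax. The function name ``@get_sp`` is hypothetical, and
the register name ``sp`` with a 64-bit width is an assumption that suits
a target such as AArch64; other targets name the register differently.

.. code-block:: llvm

   declare i64 @llvm.read_register.i64(metadata)

   define i64 @get_sp() {
     %sp = call i64 @llvm.read_register.i64(metadata !0)
     ret i64 %sp
   }

   !0 = metadata !{metadata !"sp\00"}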
|
|
.. _int_stacksave:
|
|
|
|
'``llvm.stacksave``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare i8* @llvm.stacksave()
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.stacksave``' intrinsic is used to remember the current state
|
|
of the function stack, for use with
|
|
:ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
|
|
implementing language features like scoped automatic variable sized
|
|
arrays in C99.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This intrinsic returns an opaque pointer value that can be passed to
|
|
:ref:`llvm.stackrestore <int_stackrestore>`. When an
|
|
``llvm.stackrestore`` intrinsic is executed with a value saved from
|
|
``llvm.stacksave``, it effectively restores the state of the stack to
|
|
the state it was in when the ``llvm.stacksave`` intrinsic executed. In
|
|
practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that
|
|
were allocated after the ``llvm.stacksave`` was executed.
|
|
|
|
.. _int_stackrestore:
|
|
|
|
'``llvm.stackrestore``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.stackrestore(i8* %ptr)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.stackrestore``' intrinsic is used to restore the state of
|
|
the function stack to the state it was in when the corresponding
|
|
:ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
|
|
useful for implementing language features like scoped automatic variable
|
|
sized arrays in C99.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
See the description for :ref:`llvm.stacksave <int_stacksave>`.
|
|
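The pairing of the two intrinsics can be sketched as follows, roughly
what a front-end might emit for a scoped variable-length array. The
names ``@scoped_buffer`` and ``@use`` are hypothetical.

.. code-block:: llvm

   declare i8* @llvm.stacksave()
   declare void @llvm.stackrestore(i8*)
   declare void @use(i8*)          ; hypothetical consumer of the buffer

   define void @scoped_buffer(i32 %n) {
     ; Remember the stack state at scope entry.
     %saved = call i8* @llvm.stacksave()
     %buf = alloca i8, i32 %n
     call void @use(i8* %buf)
     ; Pop %buf (and any later allocas) at scope exit.
     call void @llvm.stackrestore(i8* %saved)
     ret void
   }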
|
|
'``llvm.prefetch``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.prefetch(i8* <address>, i32 <rw>, i32 <locality>, i32 <cache type>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.prefetch``' intrinsic is a hint to the code generator to
|
|
insert a prefetch instruction if supported; otherwise, it is a noop.
|
|
Prefetches have no effect on the behavior of the program but can change
|
|
its performance characteristics.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
``address`` is the address to be prefetched, ``rw`` is the specifier
|
|
determining if the fetch should be for a read (0) or write (1), and
|
|
``locality`` is a temporal locality specifier ranging from (0) - no
|
|
locality, to (3) - extremely local keep in cache. The ``cache type``
|
|
specifies whether the prefetch is performed on the data (1) or
|
|
instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
|
|
arguments must be constant integers.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This intrinsic does not modify the behavior of the program. In
|
|
particular, prefetches cannot trap and do not produce a value. On
|
|
targets that support this intrinsic, the prefetch can provide hints to
|
|
the processor cache for better performance.
|
|
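For example, the sketch below requests a read prefetch into the data
cache with the highest locality. The function name ``@warm`` is
hypothetical.

.. code-block:: llvm

   declare void @llvm.prefetch(i8*, i32, i32, i32)

   define void @warm(i8* %p) {
     ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data)
     call void @llvm.prefetch(i8* %p, i32 0, i32 3, i32 1)
     ret void
   }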
|
|
'``llvm.pcmarker``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.pcmarker(i32 <id>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.pcmarker``' intrinsic is a method to export a Program
|
|
Counter (PC) in a region of code to simulators and other tools. The
|
|
method is target specific, but it is expected that the marker will use
|
|
exported symbols to transmit the PC of the marker. The marker makes no
|
|
guarantees that it will remain with any specific instruction after
|
|
optimizations. It is possible that the presence of a marker will inhibit
|
|
optimizations. The intended use is to be inserted after optimizations to
|
|
allow correlations of simulation runs.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
``id`` is a numerical id identifying the marker.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This intrinsic does not modify the behavior of the program. Backends
|
|
that do not support this intrinsic may ignore it.
|
|
|
|
'``llvm.readcyclecounter``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare i64 @llvm.readcyclecounter()
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
|
|
counter register (or similar low latency, high accuracy clocks) on those
|
|
targets that support it. On X86, it should map to RDTSC. On Alpha, it
|
|
should map to RPCC. As the backing counters overflow quickly (on the
|
|
order of 9 seconds on Alpha), this should only be used for small
|
|
timings.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
When directly supported, reading the cycle counter should not modify any
|
|
memory. Implementations are allowed to return either an application
|
|
specific value or a system wide value. On backends without support, this
|
|
is lowered to a constant 0.
|
|
|
|
Note that runtime support may be conditional on the privilege level the code is
|
|
running at and the host platform.
|
|
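A typical use is to bracket a short region of code, as in this sketch.
The names ``@time_work`` and ``@work`` are hypothetical, and the result
is meaningful only on targets that support the counter.

.. code-block:: llvm

   declare i64 @llvm.readcyclecounter()
   declare void @work()            ; hypothetical code under measurement

   define i64 @time_work() {
     %t0 = call i64 @llvm.readcyclecounter()
     call void @work()
     %t1 = call i64 @llvm.readcyclecounter()
     ; Elapsed cycles; 0 on backends that lower the intrinsic to constant 0.
     %dt = sub i64 %t1, %t0
     ret i64 %dt
   }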
|
|
'``llvm.clear_cache``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.clear_cache(i8*, i8*)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
|
|
in the specified range to the execution unit of the processor. On
|
|
targets with non-unified instruction and data cache, the implementation
|
|
flushes the instruction cache.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
On platforms with coherent instruction and data caches (e.g. x86), this
|
|
intrinsic is a nop. On platforms with non-coherent instruction and data
|
|
cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
|
|
instructions or a system call, if cache flushing requires special
|
|
privileges.
|
|
|
|
The default behavior is to emit a call to ``__clear_cache`` from the run
|
|
time library.
|
|
|
|
This intrinsic does *not* empty the instruction pipeline. Modifications
|
|
of the current function are outside the scope of the intrinsic.
|
|
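A sketch of a call, as might follow writing freshly generated code into
a buffer. The function name ``@publish_code`` is hypothetical, and the
two pointers are assumed to delimit the modified range.

.. code-block:: llvm

   declare void @llvm.clear_cache(i8*, i8*)

   define void @publish_code(i8* %begin, i8* %end) {
     ; Make the modified range visible to instruction fetch.
     call void @llvm.clear_cache(i8* %begin, i8* %end)
     ret void
   }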
|
|
Standard C Library Intrinsics
|
|
-----------------------------
|
|
|
|
LLVM provides intrinsics for a few important standard C library
|
|
functions. These intrinsics allow source-language front-ends to pass
|
|
information about the alignment of the pointer arguments to the code
|
|
generator, providing opportunity for more efficient code generation.
|
|
|
|
.. _int_memcpy:
|
|
|
|
'``llvm.memcpy``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
|
|
integer bit width and for different address spaces. Not all targets
|
|
support all bit widths however.
|
|
|
|
::
|
|
|
|
declare void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
|
|
i32 <len>, i32 <align>, i1 <isvolatile>)
|
|
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
|
|
i64 <len>, i32 <align>, i1 <isvolatile>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
|
|
source location to the destination location.
|
|
|
|
Note that, unlike the standard libc function, the ``llvm.memcpy.*``
|
|
intrinsics do not return a value, take extra alignment/isvolatile
arguments, and the pointers can be in specified address spaces.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument is a pointer to the destination, the second is a
|
|
pointer to the source. The third argument is an integer argument
|
|
specifying the number of bytes to copy, the fourth argument is the
|
|
alignment of the source and destination locations, and the fifth is a
|
|
boolean indicating a volatile access.
|
|
|
|
If the call to this intrinsic has an alignment value that is not 0 or 1,
|
|
then the caller guarantees that both the source and destination pointers
|
|
are aligned to that boundary.
|
|
|
|
If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
|
|
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
|
|
very cleanly specified and it is unwise to depend on it.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.memcpy.*``' intrinsics copy ``len`` bytes of memory from the
source location to the destination location, which are not allowed to
overlap. If the pointers are known to be aligned to some boundary, this
can be specified as the fourth argument; otherwise it should be set to 0
or 1 (both meaning no alignment).
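
For illustration, a call that copies 16 bytes between two buffers might
look like the sketch below. The function name ``@copy16`` and the
assumption that both pointers are 4-byte aligned and do not overlap are
ours, not requirements of the intrinsic.

.. code-block:: llvm

      declare void @llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)

      define void @copy16(i8* %dst, i8* %src) {
      entry:
        ; len = 16 bytes, align = 4, isvolatile = false
        call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dst, i8* %src,
                                             i32 16, i32 4, i1 false)
        ret void
      }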
|
|
|
|
'``llvm.memmove``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.memmove`` on any
integer bit width and for different address spaces. Not all targets
support all bit widths however.
|
|
|
|
::
|
|
|
|
declare void @llvm.memmove.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
|
|
i32 <len>, i32 <align>, i1 <isvolatile>)
|
|
declare void @llvm.memmove.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
|
|
i64 <len>, i32 <align>, i1 <isvolatile>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.memmove.*``' intrinsics move a block of memory from the
|
|
source location to the destination location. It is similar to the
|
|
'``llvm.memcpy``' intrinsic but allows the two memory locations to
|
|
overlap.
|
|
|
|
Note that, unlike the standard libc function, the ``llvm.memmove.*``
|
|
intrinsics do not return a value, take extra alignment/isvolatile
arguments, and the pointers can be in specified address spaces.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument is a pointer to the destination, the second is a
|
|
pointer to the source. The third argument is an integer argument
|
|
specifying the number of bytes to copy, the fourth argument is the
|
|
alignment of the source and destination locations, and the fifth is a
|
|
boolean indicating a volatile access.
|
|
|
|
If the call to this intrinsic has an alignment value that is not 0 or 1,
|
|
then the caller guarantees that the source and destination pointers are
|
|
aligned to that boundary.
|
|
|
|
If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
|
|
is a :ref:`volatile operation <volatile>`. The detailed access behavior is
|
|
not very cleanly specified and it is unwise to depend on it.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.memmove.*``' intrinsics copy ``len`` bytes of memory from the
source location to the destination location, which may overlap. If the
pointers are known to be aligned to some boundary, this can be specified
as the fourth argument; otherwise it should be set to 0 or 1 (both
meaning no alignment).
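
A call has the same shape as for ``llvm.memcpy``. In the sketch below the
function name ``@shift_bytes`` is hypothetical; the two pointers may
refer to overlapping parts of the same buffer, and no alignment is
assumed.

.. code-block:: llvm

      declare void @llvm.memmove.p0i8.p0i8.i64(i8*, i8*, i64, i32, i1)

      define void @shift_bytes(i8* %dst, i8* %src, i64 %len) {
      entry:
        ; %dst and %src may overlap; align = 1 means no known alignment
        call void @llvm.memmove.p0i8.p0i8.i64(i8* %dst, i8* %src,
                                              i64 %len, i32 1, i1 false)
        ret void
      }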
|
|
|
|
'``llvm.memset.*``' Intrinsics
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
|
|
bit width and for different address spaces. However, not all targets
|
|
support all bit widths.
|
|
|
|
::
|
|
|
|
declare void @llvm.memset.p0i8.i32(i8* <dest>, i8 <val>,
|
|
i32 <len>, i32 <align>, i1 <isvolatile>)
|
|
declare void @llvm.memset.p0i8.i64(i8* <dest>, i8 <val>,
|
|
i64 <len>, i32 <align>, i1 <isvolatile>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.memset.*``' intrinsics fill a block of memory with a
|
|
particular byte value.
|
|
|
|
Note that, unlike the standard libc function, the ``llvm.memset``
|
|
intrinsic does not return a value and takes extra alignment/volatile
|
|
arguments. Also, the destination can be in an arbitrary address space.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument is a pointer to the destination to fill, the second
is the byte value with which to fill it, the third argument is an
integer argument specifying the number of bytes to fill, the fourth
argument is the known alignment of the destination location, and the
fifth is a boolean indicating a volatile access.
|
|
|
|
If the call to this intrinsic has an alignment value that is not 0 or 1,
|
|
then the caller guarantees that the destination pointer is aligned to
|
|
that boundary.
|
|
|
|
If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
|
|
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
|
|
very cleanly specified and it is unwise to depend on it.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.memset.*``' intrinsics fill ``len`` bytes of memory starting
at the destination location with the given byte value. If the
destination is known to be aligned to some boundary, this can be
specified as the fourth argument; otherwise it should be set to 0 or 1
(both meaning no alignment).
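
As a small sketch, zeroing a buffer of ``%n`` bytes could be written as
follows. The function name ``@zero_buffer`` is hypothetical and no
alignment is assumed.

.. code-block:: llvm

      declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i32, i1)

      define void @zero_buffer(i8* %buf, i64 %n) {
      entry:
        ; val = 0, len = %n, align = 1 (unknown), isvolatile = false
        call void @llvm.memset.p0i8.i64(i8* %buf, i8 0, i64 %n,
                                        i32 1, i1 false)
        ret void
      }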
|
|
|
|
'``llvm.sqrt.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.sqrt.f32(float %Val)
|
|
declare double @llvm.sqrt.f64(double %Val)
|
|
declare x86_fp80 @llvm.sqrt.f80(x86_fp80 %Val)
|
|
declare fp128 @llvm.sqrt.f128(fp128 %Val)
|
|
declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.sqrt``' intrinsics return the sqrt of the specified operand,
|
|
returning the same value as the libm '``sqrt``' functions would. Unlike
|
|
``sqrt`` in libm, however, ``llvm.sqrt`` has undefined behavior for
|
|
negative numbers other than -0.0 (which allows for better optimization,
|
|
because there is no need to worry about errno being set).
|
|
``llvm.sqrt(-0.0)`` is defined to return -0.0 like IEEE sqrt.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the square root of the specified operand if it is
a nonnegative floating point number, and returns -0.0 for an operand of
-0.0.
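
The sketch below shows one situation in which the restriction on
negative operands is harmless: a sum of squares is never a negative
number. The function name ``@norm2`` is hypothetical.

.. code-block:: llvm

      declare double @llvm.sqrt.f64(double)

      define double @norm2(double %x, double %y) {
      entry:
        %xx = fmul double %x, %x
        %yy = fmul double %y, %y
        %sum = fadd double %xx, %yy      ; a sum of squares is not negative
        %r = call double @llvm.sqrt.f64(double %sum)
        ret double %r
      }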
|
|
|
|
'``llvm.powi.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.powi`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.powi.f32(float %Val, i32 %power)
|
|
declare double @llvm.powi.f64(double %Val, i32 %power)
|
|
declare x86_fp80 @llvm.powi.f80(x86_fp80 %Val, i32 %power)
|
|
declare fp128 @llvm.powi.f128(fp128 %Val, i32 %power)
|
|
declare ppc_fp128 @llvm.powi.ppcf128(ppc_fp128 %Val, i32 %power)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.powi.*``' intrinsics return the first operand raised to the
|
|
specified (positive or negative) power. The order of evaluation of
|
|
multiplications is not defined. When a vector of floating point type is
|
|
used, the second argument remains a scalar integer value.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The second argument is an integer power, and the first is a value to
|
|
raise to that power.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the first value raised to the second power with an
|
|
unspecified sequence of rounding operations.
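
For example, cubing a value might be sketched as below (the function
name ``@cube`` is hypothetical). Because the sequence of rounding
operations is unspecified, the result need not be bit-identical to an
explicit chain of ``fmul`` instructions.

.. code-block:: llvm

      declare double @llvm.powi.f64(double, i32)

      define double @cube(double %x) {
      entry:
        ; the power is always a scalar i32, even for vector overloads
        %r = call double @llvm.powi.f64(double %x, i32 3)
        ret double %r
      }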
|
|
|
|
'``llvm.sin.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.sin`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.sin.f32(float %Val)
|
|
declare double @llvm.sin.f64(double %Val)
|
|
declare x86_fp80 @llvm.sin.f80(x86_fp80 %Val)
|
|
declare fp128 @llvm.sin.f128(fp128 %Val)
|
|
declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128 %Val)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.sin.*``' intrinsics return the sine of the operand.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the sine of the specified operand, returning the
|
|
same values as the libm ``sin`` functions would, and handles error
|
|
conditions in the same way.
|
|
|
|
'``llvm.cos.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.cos`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.cos.f32(float %Val)
|
|
declare double @llvm.cos.f64(double %Val)
|
|
declare x86_fp80 @llvm.cos.f80(x86_fp80 %Val)
|
|
declare fp128 @llvm.cos.f128(fp128 %Val)
|
|
declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128 %Val)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.cos.*``' intrinsics return the cosine of the operand.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the cosine of the specified operand, returning the
|
|
same values as the libm ``cos`` functions would, and handles error
|
|
conditions in the same way.
|
|
|
|
'``llvm.pow.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.pow`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
      declare float     @llvm.pow.f32(float  %Val, float %Power)
      declare double    @llvm.pow.f64(double %Val, double %Power)
      declare x86_fp80  @llvm.pow.f80(x86_fp80  %Val, x86_fp80 %Power)
      declare fp128     @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128  %Val, ppc_fp128 %Power)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.pow.*``' intrinsics return the first operand raised to the
|
|
specified (positive or negative) power.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The second argument is a floating point power, and the first is a value
|
|
to raise to that power.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the first value raised to the second power,
|
|
returning the same values as the libm ``pow`` functions would, and
|
|
handles error conditions in the same way.
|
|
|
|
'``llvm.exp.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.exp`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.exp.f32(float %Val)
|
|
declare double @llvm.exp.f64(double %Val)
|
|
declare x86_fp80 @llvm.exp.f80(x86_fp80 %Val)
|
|
declare fp128 @llvm.exp.f128(fp128 %Val)
|
|
declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128 %Val)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.exp.*``' intrinsics perform the exp function.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the same values as the libm ``exp`` functions
|
|
would, and handles error conditions in the same way.
|
|
|
|
'``llvm.exp2.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.exp2.f32(float %Val)
|
|
declare double @llvm.exp2.f64(double %Val)
|
|
declare x86_fp80 @llvm.exp2.f80(x86_fp80 %Val)
|
|
declare fp128 @llvm.exp2.f128(fp128 %Val)
|
|
declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128 %Val)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.exp2.*``' intrinsics perform the exp2 function.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the same values as the libm ``exp2`` functions
|
|
would, and handles error conditions in the same way.
|
|
|
|
'``llvm.log.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.log`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.log.f32(float %Val)
|
|
declare double @llvm.log.f64(double %Val)
|
|
declare x86_fp80 @llvm.log.f80(x86_fp80 %Val)
|
|
declare fp128 @llvm.log.f128(fp128 %Val)
|
|
declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128 %Val)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.log.*``' intrinsics perform the log function.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the same values as the libm ``log`` functions
|
|
would, and handles error conditions in the same way.
|
|
|
|
'``llvm.log10.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.log10`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.log10.f32(float %Val)
|
|
declare double @llvm.log10.f64(double %Val)
|
|
declare x86_fp80 @llvm.log10.f80(x86_fp80 %Val)
|
|
declare fp128 @llvm.log10.f128(fp128 %Val)
|
|
declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128 %Val)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.log10.*``' intrinsics perform the log10 function.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the same values as the libm ``log10`` functions
|
|
would, and handles error conditions in the same way.
|
|
|
|
'``llvm.log2.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.log2`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.log2.f32(float %Val)
|
|
declare double @llvm.log2.f64(double %Val)
|
|
declare x86_fp80 @llvm.log2.f80(x86_fp80 %Val)
|
|
declare fp128 @llvm.log2.f128(fp128 %Val)
|
|
declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128 %Val)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.log2.*``' intrinsics perform the log2 function.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the same values as the libm ``log2`` functions
|
|
would, and handles error conditions in the same way.
|
|
|
|
'``llvm.fma.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.fma`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.fma.f32(float %a, float %b, float %c)
|
|
declare double @llvm.fma.f64(double %a, double %b, double %c)
|
|
declare x86_fp80 @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
|
|
declare fp128 @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
|
|
declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.fma.*``' intrinsics perform the fused multiply-add
|
|
operation.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The arguments and return value are floating point numbers of the same
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the same values as the libm ``fma`` functions
|
|
would, and does not set errno.
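
A minimal sketch of a call, assuming the usual libm meaning of ``fma``
(the product and sum are computed as one operation with a single
rounding). The function name ``@axpy`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.fma.f32(float, float, float)

      define float @axpy(float %a, float %x, float %y) {
      entry:
        ; computes (%a * %x) + %y as one fused operation
        %r = call float @llvm.fma.f32(float %a, float %x, float %y)
        ret float %r
      }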
|
|
|
|
'``llvm.fabs.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.fabs.f32(float %Val)
|
|
declare double @llvm.fabs.f64(double %Val)
|
|
declare x86_fp80 @llvm.fabs.f80(x86_fp80 %Val)
|
|
declare fp128 @llvm.fabs.f128(fp128 %Val)
|
|
declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.fabs.*``' intrinsics return the absolute value of the
|
|
operand.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the same values as the libm ``fabs`` functions
|
|
would, and handles error conditions in the same way.
|
|
|
|
'``llvm.copysign.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.copysign.f32(float %Mag, float %Sgn)
|
|
declare double @llvm.copysign.f64(double %Mag, double %Sgn)
|
|
declare x86_fp80 @llvm.copysign.f80(x86_fp80 %Mag, x86_fp80 %Sgn)
|
|
declare fp128 @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
|
|
declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128 %Mag, ppc_fp128 %Sgn)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
|
|
first operand and the sign of the second operand.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The arguments and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the same values as the libm ``copysign``
|
|
functions would, and handles error conditions in the same way.
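
As a small illustration, the sketch below combines the magnitude of 3.0
with the sign of -0.0. The function name ``@negative_three`` is
hypothetical.

.. code-block:: llvm

      declare double @llvm.copysign.f64(double, double)

      define double @negative_three() {
      entry:
        ; magnitude from the first operand, sign from the second: -3.0
        %r = call double @llvm.copysign.f64(double 3.0, double -0.0)
        ret double %r
      }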
|
|
|
|
'``llvm.floor.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.floor`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.floor.f32(float %Val)
|
|
declare double @llvm.floor.f64(double %Val)
|
|
declare x86_fp80 @llvm.floor.f80(x86_fp80 %Val)
|
|
declare fp128 @llvm.floor.f128(fp128 %Val)
|
|
declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128 %Val)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.floor.*``' intrinsics return the floor of the operand.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the same values as the libm ``floor`` functions
|
|
would, and handles error conditions in the same way.
|
|
|
|
'``llvm.ceil.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.ceil.f32(float %Val)
|
|
declare double @llvm.ceil.f64(double %Val)
|
|
declare x86_fp80 @llvm.ceil.f80(x86_fp80 %Val)
|
|
declare fp128 @llvm.ceil.f128(fp128 %Val)
|
|
declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128 %Val)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the same values as the libm ``ceil`` functions
|
|
would, and handles error conditions in the same way.
|
|
|
|
'``llvm.trunc.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.trunc.f32(float %Val)
|
|
declare double @llvm.trunc.f64(double %Val)
|
|
declare x86_fp80 @llvm.trunc.f80(x86_fp80 %Val)
|
|
declare fp128 @llvm.trunc.f128(fp128 %Val)
|
|
declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128 %Val)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.trunc.*``' intrinsics return the operand rounded to the
|
|
nearest integer not larger in magnitude than the operand.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the same values as the libm ``trunc`` functions
|
|
would, and handles error conditions in the same way.
|
|
|
|
'``llvm.rint.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.rint`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.rint.f32(float %Val)
|
|
declare double @llvm.rint.f64(double %Val)
|
|
declare x86_fp80 @llvm.rint.f80(x86_fp80 %Val)
|
|
declare fp128 @llvm.rint.f128(fp128 %Val)
|
|
declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128 %Val)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.rint.*``' intrinsics return the operand rounded to the
|
|
nearest integer. It may raise an inexact floating-point exception if the
|
|
operand isn't an integer.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the same values as the libm ``rint`` functions
|
|
would, and handles error conditions in the same way.
|
|
|
|
'``llvm.nearbyint.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.nearbyint.f32(float %Val)
|
|
declare double @llvm.nearbyint.f64(double %Val)
|
|
declare x86_fp80 @llvm.nearbyint.f80(x86_fp80 %Val)
|
|
declare fp128 @llvm.nearbyint.f128(fp128 %Val)
|
|
declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128 %Val)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
|
|
nearest integer.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the same values as the libm ``nearbyint``
|
|
functions would, and handles error conditions in the same way.
|
|
|
|
'``llvm.round.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.round`` on any
|
|
floating point or vector of floating point type. Not all targets support
|
|
all types however.
|
|
|
|
::
|
|
|
|
declare float @llvm.round.f32(float %Val)
|
|
declare double @llvm.round.f64(double %Val)
|
|
declare x86_fp80 @llvm.round.f80(x86_fp80 %Val)
|
|
declare fp128 @llvm.round.f128(fp128 %Val)
|
|
declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128 %Val)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.round.*``' intrinsics return the operand rounded to the
|
|
nearest integer.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The argument and return value are floating point numbers of the same
|
|
type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This function returns the same values as the libm ``round``
|
|
functions would, and handles error conditions in the same way.
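
The rounding intrinsics differ mainly in how they treat values that lie
between two integers. The sketch below contrasts three of them; the
function names are hypothetical, and the values in the comments assume
the libm behavior that the intrinsics are defined to match (``round``
rounds halfway cases away from zero).

.. code-block:: llvm

      declare double @llvm.floor.f64(double)
      declare double @llvm.trunc.f64(double)
      declare double @llvm.round.f64(double)

      define double @floor_of(double %x) {
      entry:
        %r = call double @llvm.floor.f64(double %x)   ; -2.5 -> -3.0
        ret double %r
      }

      define double @trunc_of(double %x) {
      entry:
        %r = call double @llvm.trunc.f64(double %x)   ; -2.5 -> -2.0
        ret double %r
      }

      define double @round_of(double %x) {
      entry:
        %r = call double @llvm.round.f64(double %x)   ; -2.5 -> -3.0
        ret double %r
      }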
|
|
|
|
Bit Manipulation Intrinsics
|
|
---------------------------
|
|
|
|
LLVM provides intrinsics for a few important bit manipulation
|
|
operations. These allow efficient code generation for some algorithms.
|
|
|
|
'``llvm.bswap.*``' Intrinsics
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic function. You can use ``llvm.bswap`` on
any integer type that is an even number of bytes (i.e. BitWidth % 16 == 0).
|
|
|
|
::
|
|
|
|
declare i16 @llvm.bswap.i16(i16 <id>)
|
|
declare i32 @llvm.bswap.i32(i32 <id>)
|
|
declare i64 @llvm.bswap.i64(i64 <id>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.bswap``' family of intrinsics is used to byte swap integer
|
|
values with an even number of bytes (positive multiple of 16 bits).
|
|
These are useful for performing operations on data that is not in the
|
|
target's native byte order.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
|
|
and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
|
|
intrinsic returns an i32 value that has the four bytes of the input i32
|
|
swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
|
|
returned i32 will have its bytes in 3, 2, 1, 0 order. The
|
|
``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
|
|
concept to additional even-byte lengths (6 bytes, 8 bytes and more,
|
|
respectively).
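
A minimal sketch of converting a 32-bit value to the opposite byte
order; the function name ``@to_other_endian`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.bswap.i32(i32)

      define i32 @to_other_endian(i32 %x) {
      entry:
        ; e.g. an input of 0x11223344 yields 0x44332211
        %r = call i32 @llvm.bswap.i32(i32 %x)
        ret i32 %r
      }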
|
|
|
|
'``llvm.ctpop.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
|
|
bit width, or on any vector with integer elements. Not all targets
|
|
support all bit widths or vector types, however.
|
|
|
|
::
|
|
|
|
declare i8 @llvm.ctpop.i8(i8 <src>)
|
|
declare i16 @llvm.ctpop.i16(i16 <src>)
|
|
declare i32 @llvm.ctpop.i32(i32 <src>)
|
|
declare i64 @llvm.ctpop.i64(i64 <src>)
|
|
declare i256 @llvm.ctpop.i256(i256 <src>)
|
|
declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.ctpop``' family of intrinsics counts the number of bits set
|
|
in a value.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The only argument is the value to be counted. The argument may be of any
|
|
integer type, or a vector with integer elements. The return type must
|
|
match the argument type.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within
|
|
each element of a vector.
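
One common use, sketched below under a hypothetical function name, is
testing whether an unsigned value is a power of two: such a value has
exactly one bit set.

.. code-block:: llvm

      declare i32 @llvm.ctpop.i32(i32)

      define i1 @is_power_of_two(i32 %x) {
      entry:
        %bits = call i32 @llvm.ctpop.i32(i32 %x)
        %one = icmp eq i32 %bits, 1      ; exactly one bit set
        ret i1 %one
      }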
|
|
|
|
'``llvm.ctlz.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
|
|
integer bit width, or any vector whose elements are integers. Not all
|
|
targets support all bit widths or vector types, however.
|
|
|
|
::
|
|
|
|
      declare i8   @llvm.ctlz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.ctlz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.ctlz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.ctlz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.ctlz``' family of intrinsic functions counts the number of
|
|
leading zeros in a variable.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument is the value to be counted. This argument may be of
|
|
any integer type, or a vector with integer element type. The return
|
|
type must match the first argument type.
|
|
|
|
The second argument must be a constant and is a flag to indicate whether
|
|
the intrinsic should ensure that a zero as the first argument produces a
|
|
defined result. Historically some architectures did not provide a
|
|
defined result for zero values as efficiently, and many algorithms are
|
|
now predicated on avoiding zero-value inputs.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.ctlz``' intrinsic counts the leading (most significant)
|
|
zeros in a variable, or within each element of the vector. If
|
|
``src == 0`` then the result is the size in bits of the type of ``src``
|
|
if ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
|
|
``llvm.ctlz(i32 2) = 30``.
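
As a sketch of the defined-at-zero form, the bit length of a 32-bit
value can be computed as 32 minus the number of leading zeros. The
function name ``@bit_length`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.ctlz.i32(i32, i1)

      define i32 @bit_length(i32 %x) {
      entry:
        ; is_zero_undef = false, so %x == 0 gives 32 and a bit length of 0
        %lz = call i32 @llvm.ctlz.i32(i32 %x, i1 false)
        %len = sub i32 32, %lz           ; for %x == 2: 32 - 30 = 2
        ret i32 %len
      }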
|
|
|
|
'``llvm.cttz.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
|
|
integer bit width, or any vector of integer elements. Not all targets
|
|
support all bit widths or vector types, however.
|
|
|
|
::
|
|
|
|
      declare i8   @llvm.cttz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.cttz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.cttz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.cttz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.cttz``' family of intrinsic functions counts the number of
|
|
trailing zeros.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument is the value to be counted. This argument may be of
|
|
any integer type, or a vector with integer element type. The return
|
|
type must match the first argument type.
|
|
|
|
The second argument must be a constant and is a flag to indicate whether
|
|
the intrinsic should ensure that a zero as the first argument produces a
|
|
defined result. Historically some architectures did not provide a
|
|
defined result for zero values as efficiently, and many algorithms are
|
|
now predicated on avoiding zero-value inputs.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.cttz``' intrinsic counts the trailing (least significant)
|
|
zeros in a variable, or within each element of a vector. If ``src == 0``
|
|
then the result is the size in bits of the type of ``src`` if
|
|
``is_zero_undef == 0`` and ``undef`` otherwise. For example,
|
|
``llvm.cttz(2) = 1``.
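
The sketch below uses the ``is_zero_undef = true`` form to find the
index of the lowest set bit. The function name ``@lowest_set_bit`` is
hypothetical, and the caller is assumed to guarantee a nonzero input,
since the result is ``undef`` otherwise.

.. code-block:: llvm

      declare i32 @llvm.cttz.i32(i32, i1)

      define i32 @lowest_set_bit(i32 %x) {
      entry:
        ; result is undef if %x == 0; for %x == 2 the result is 1
        %tz = call i32 @llvm.cttz.i32(i32 %x, i1 true)
        ret i32 %tz
      }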
|
|
|
|
Arithmetic with Overflow Intrinsics
|
|
-----------------------------------
|
|
|
|
LLVM provides intrinsics for some arithmetic with overflow operations.
|
|
|
|
'``llvm.sadd.with.overflow.*``' Intrinsics
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
|
|
on any integer bit width.
|
|
|
|
::
|
|
|
|
declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
|
|
declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
|
|
declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
|
|
a signed addition of the two arguments, and indicate whether an overflow
|
|
occurred during the signed summation.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The arguments (%a and %b) and the first element of the result structure
|
|
may be of integer types of any bit width, but they must have the same
|
|
bit width. The second element of the result structure must be of type
|
|
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
|
|
addition.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
|
|
a signed addition of the two arguments. They return a structure --- the
|
|
first element of which is the signed summation, and the second element
|
|
of which is a bit specifying if the signed summation resulted in an
|
|
overflow.
|
|
|
|
Examples:
|
|
"""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
|
|
%sum = extractvalue {i32, i1} %res, 0
|
|
%obit = extractvalue {i32, i1} %res, 1
|
|
br i1 %obit, label %overflow, label %normal
|
|
|
|
'``llvm.uadd.with.overflow.*``' Intrinsics
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
|
|
on any integer bit width.
|
|
|
|
::
|
|
|
|
declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
|
|
declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
|
|
declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
|
|
an unsigned addition of the two arguments, and indicate whether a carry
|
|
occurred during the unsigned summation.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The arguments (%a and %b) and the first element of the result structure
|
|
may be of integer types of any bit width, but they must have the same
|
|
bit width. The second element of the result structure must be of type
|
|
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
|
|
addition.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
|
|
an unsigned addition of the two arguments. They return a structure --- the
|
|
first element of which is the sum, and the second element of which is a
|
|
bit specifying if the unsigned summation resulted in a carry.
|
|
|
|
Examples:
|
|
"""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
|
|
%sum = extractvalue {i32, i1} %res, 0
|
|
%obit = extractvalue {i32, i1} %res, 1
|
|
br i1 %obit, label %carry, label %normal
|
|
|
|
'``llvm.ssub.with.overflow.*``' Intrinsics
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
|
|
on any integer bit width.
|
|
|
|
::
|
|
|
|
declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
|
|
declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
|
|
declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
|
|
a signed subtraction of the two arguments, and indicate whether an
|
|
overflow occurred during the signed subtraction.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The arguments (%a and %b) and the first element of the result structure
|
|
may be of integer types of any bit width, but they must have the same
|
|
bit width. The second element of the result structure must be of type
|
|
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
|
|
subtraction.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
|
|
a signed subtraction of the two arguments. They return a structure --- the
|
|
first element of which is the difference, and the second element of
|
|
which is a bit specifying if the signed subtraction resulted in an
|
|
overflow.
|
|
|
|
Examples:
|
|
"""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
|
|
%sum = extractvalue {i32, i1} %res, 0
|
|
%obit = extractvalue {i32, i1} %res, 1
|
|
br i1 %obit, label %overflow, label %normal
|
|
|
|
'``llvm.usub.with.overflow.*``' Intrinsics
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
|
|
on any integer bit width.
|
|
|
|
::
|
|
|
|
declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
|
|
declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
|
|
declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.usub.with.overflow``' family of intrinsic functions perform
|
|
an unsigned subtraction of the two arguments, and indicate whether an
|
|
overflow occurred during the unsigned subtraction.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The arguments (%a and %b) and the first element of the result structure
|
|
may be of integer types of any bit width, but they must have the same
|
|
bit width. The second element of the result structure must be of type
|
|
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
|
|
subtraction.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.usub.with.overflow``' family of intrinsic functions perform
|
|
an unsigned subtraction of the two arguments. They return a structure ---
|
|
the first element of which is the difference, and the second element of
|
|
which is a bit specifying if the unsigned subtraction resulted in an
|
|
overflow.
|
|
|
|
Examples:
|
|
"""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
|
|
%sum = extractvalue {i32, i1} %res, 0
|
|
%obit = extractvalue {i32, i1} %res, 1
|
|
br i1 %obit, label %overflow, label %normal
|
|
|
|
'``llvm.smul.with.overflow.*``' Intrinsics
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
|
|
on any integer bit width.
|
|
|
|
::
|
|
|
|
declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
|
|
declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
|
|
declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.smul.with.overflow``' family of intrinsic functions perform
|
|
a signed multiplication of the two arguments, and indicate whether an
|
|
overflow occurred during the signed multiplication.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The arguments (%a and %b) and the first element of the result structure
|
|
may be of integer types of any bit width, but they must have the same
|
|
bit width. The second element of the result structure must be of type
|
|
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
|
|
multiplication.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.smul.with.overflow``' family of intrinsic functions perform
|
|
a signed multiplication of the two arguments. They return a structure ---
|
|
the first element of which is the product, and the second element
|
|
of which is a bit specifying if the signed multiplication resulted in an
|
|
overflow.
|
|
|
|
Examples:
|
|
"""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
|
|
%sum = extractvalue {i32, i1} %res, 0
|
|
%obit = extractvalue {i32, i1} %res, 1
|
|
br i1 %obit, label %overflow, label %normal
|
|
|
|
'``llvm.umul.with.overflow.*``' Intrinsics
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
|
|
on any integer bit width.
|
|
|
|
::
|
|
|
|
declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
|
|
declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
|
|
declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.umul.with.overflow``' family of intrinsic functions perform
|
|
an unsigned multiplication of the two arguments, and indicate whether an
|
|
overflow occurred during the unsigned multiplication.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The arguments (%a and %b) and the first element of the result structure
|
|
may be of integer types of any bit width, but they must have the same
|
|
bit width. The second element of the result structure must be of type
|
|
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
|
|
multiplication.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.umul.with.overflow``' family of intrinsic functions perform
|
|
an unsigned multiplication of the two arguments. They return a structure ---
|
|
the first element of which is the product, and the second
|
|
element of which is a bit specifying if the unsigned multiplication
|
|
resulted in an overflow.
|
|
|
|
Examples:
|
|
"""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
|
|
%sum = extractvalue {i32, i1} %res, 0
|
|
%obit = extractvalue {i32, i1} %res, 1
|
|
br i1 %obit, label %overflow, label %normal
|
|
|
|
Specialised Arithmetic Intrinsics
|
|
---------------------------------
|
|
|
|
'``llvm.fmuladd.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
|
|
declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
|
|
expressions that can be fused if the code generator determines that (a) the
|
|
target instruction set has support for a fused operation, and (b) that the
|
|
fused operation is more efficient than the equivalent, separate pair of mul
|
|
and add instructions.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
|
|
multiplicands, a and b, and an addend c.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The expression:
|
|
|
|
::
|
|
|
|
%0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
|
|
|
|
is equivalent to the expression a \* b + c, except that rounding will
|
|
not be performed between the multiplication and addition steps if the
|
|
code generator fuses the operations. Fusion is not guaranteed, even if
|
|
the target platform supports it. If a fused multiply-add is required the
|
|
corresponding llvm.fma.\* intrinsic function should be used
|
|
instead. Like '``llvm.fma.*``', this never sets errno.
|
|
|
|
Examples:
|
|
"""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c
|
|
|
|
Half Precision Floating Point Intrinsics
|
|
----------------------------------------
|
|
|
|
For most target platforms, half precision floating point is a
|
|
storage-only format. This means that it is a dense encoding (in memory)
|
|
but does not support computation in the format.
|
|
|
|
This means that code must first load the half-precision floating point
|
|
value as an i16, then convert it to float with
|
|
:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
|
|
then be performed on the float value (including extending to double
|
|
etc). To store the value back to memory, it is first converted to float
|
|
if needed, then converted to i16 with
|
|
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, then stored as an
|
|
i16 value.
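
The load, convert, compute, convert, store sequence described above can be
sketched as follows. The global ``@x`` and the function name
``@double_in_place`` are hypothetical; the intrinsic declarations follow the
Syntax sections below.

.. code-block:: llvm

      @x = global i16 0

      declare float @llvm.convert.from.fp16.f32(i16)
      declare i16 @llvm.convert.to.fp16.f32(float)

      define void @double_in_place() {
      entry:
        ; Load the raw half-precision bits and widen them to float.
        %bits = load i16* @x, align 2
        %f = call float @llvm.convert.from.fp16.f32(i16 %bits)
        ; Compute in single precision.
        %twice = fadd float %f, %f
        ; Narrow back to half precision and store the bits.
        %res = call i16 @llvm.convert.to.fp16.f32(float %twice)
        store i16 %res, i16* @x, align 2
        ret void
      }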
|
|
|
|
.. _int_convert_to_fp16:
|
|
|
|
'``llvm.convert.to.fp16``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare i16 @llvm.convert.to.fp16.f32(float %a)
|
|
declare i16 @llvm.convert.to.fp16.f64(double %a)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
|
|
conventional floating point type to half precision floating point format.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The intrinsic function takes a single argument - the value to be
|
|
converted.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
|
|
conventional floating point format to half precision floating point format. The
|
|
return value is an ``i16`` which contains the converted number.
|
|
|
|
Examples:
|
|
"""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%res = call i16 @llvm.convert.to.fp16.f32(float %a)
|
|
store i16 %res, i16* @x, align 2
|
|
|
|
.. _int_convert_from_fp16:
|
|
|
|
'``llvm.convert.from.fp16``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare float @llvm.convert.from.fp16.f32(i16 %a)
|
|
declare double @llvm.convert.from.fp16.f64(i16 %a)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.convert.from.fp16``' intrinsic function performs a
|
|
conversion from half precision floating point format to single precision
|
|
floating point format.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The intrinsic function takes a single argument - the value to be
|
|
converted.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The '``llvm.convert.from.fp16``' intrinsic function performs a
|
|
conversion from half precision floating point format to single
|
|
precision floating point format. The input half-float value is
|
|
represented by an ``i16`` value.
|
|
|
|
Examples:
|
|
"""""""""
|
|
|
|
.. code-block:: llvm
|
|
|
|
%a = load i16* @x, align 2
|
|
%res = call float @llvm.convert.from.fp16.f32(i16 %a)
|
|
|
|
Debugger Intrinsics
|
|
-------------------
|
|
|
|
The LLVM debugger intrinsics (which all start with ``llvm.dbg.``
|
|
prefix), are described in the `LLVM Source Level
|
|
Debugging <SourceLevelDebugging.html#format_common_intrinsics>`_
|
|
document.
|
|
|
|
Exception Handling Intrinsics
|
|
-----------------------------
|
|
|
|
The LLVM exception handling intrinsics (which all start with
|
|
``llvm.eh.`` prefix), are described in the `LLVM Exception
|
|
Handling <ExceptionHandling.html#format_common_intrinsics>`_ document.
|
|
|
|
.. _int_trampoline:
|
|
|
|
Trampoline Intrinsics
|
|
---------------------
|
|
|
|
These intrinsics make it possible to excise one parameter, marked with
|
|
the :ref:`nest <nest>` attribute, from a function. The result is a
|
|
callable function pointer lacking the nest parameter - the caller does
|
|
not need to provide a value for it. Instead, the value to use is stored
|
|
in advance in a "trampoline", a block of memory usually allocated on the
|
|
stack, which also contains code to splice the nest value into the
|
|
argument list. This is used to implement the GCC nested function address
|
|
extension.
|
|
|
|
For example, if the function is ``i32 f(i8* nest %c, i32 %x, i32 %y)``
|
|
then the resulting function pointer has signature ``i32 (i32, i32)*``.
|
|
It can be created as follows:
|
|
|
|
.. code-block:: llvm
|
|
|
|
%tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
|
|
%tramp1 = getelementptr [10 x i8]* %tramp, i32 0, i32 0
|
|
call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
|
|
%p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
|
|
%fp = bitcast i8* %p to i32 (i32, i32)*
|
|
|
|
The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
|
|
``%val = call i32 @f(i8* %nval, i32 %x, i32 %y)``.
|
|
|
|
.. _int_it:
|
|
|
|
'``llvm.init.trampoline``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
This fills the memory pointed to by ``tramp`` with executable code,
|
|
turning it into a trampoline.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The ``llvm.init.trampoline`` intrinsic takes three arguments, all
|
|
pointers. The ``tramp`` argument must point to a sufficiently large and
|
|
sufficiently aligned block of memory; this memory is written to by the
|
|
intrinsic. Note that the size and the alignment are target-specific -
|
|
LLVM currently provides no portable way of determining them, so a
|
|
front-end that generates this intrinsic needs to have some
|
|
target-specific knowledge. The ``func`` argument must hold a function
|
|
bitcast to an ``i8*``.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The block of memory pointed to by ``tramp`` is filled with target
|
|
dependent code, turning it into a function. Then ``tramp`` needs to be
|
|
passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
|
|
be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
|
|
function's signature is the same as that of ``func`` with any arguments
|
|
marked with the ``nest`` attribute removed. At most one such ``nest``
|
|
argument is allowed, and it must be of pointer type. Calling the new
|
|
function is equivalent to calling ``func`` with the same argument list,
|
|
but with ``nval`` used for the missing ``nest`` argument. If, after
|
|
calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
|
|
modified, then the effect of any later call to the returned function
|
|
pointer is undefined.
|
|
|
|
.. _int_at:
|
|
|
|
'``llvm.adjust.trampoline``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare i8* @llvm.adjust.trampoline(i8* <tramp>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
This performs any required machine-specific adjustment to the address of
|
|
a trampoline (passed as ``tramp``).
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
``tramp`` must point to a block of memory which already has trampoline
|
|
code filled in by a previous call to
|
|
:ref:`llvm.init.trampoline <int_it>`.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
On some architectures the address of the code to be executed needs to be
|
|
different than the address where the trampoline is actually stored. This
|
|
intrinsic returns the executable address corresponding to ``tramp``
|
|
after performing the required machine specific adjustments. The pointer
|
|
returned can then be :ref:`bitcast and executed <int_trampoline>`.
|
|
|
|
Memory Use Markers
|
|
------------------
|
|
|
|
This class of intrinsics provides information about the lifetime of
|
|
memory objects and ranges where variables are immutable.
|
|
|
|
.. _int_lifestart:
|
|
|
|
'``llvm.lifetime.start``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
|
|
object's lifetime.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument is a constant integer representing the size of the
|
|
object, or -1 if it is variable sized. The second argument is a pointer
|
|
to the object.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This intrinsic indicates that before this point in the code, the value
|
|
of the memory pointed to by ``ptr`` is dead. This means that it is known
|
|
to never be used and has an undefined value. A load from the pointer
|
|
that precedes this intrinsic can be replaced with ``'undef'``.
|
|
|
|
.. _int_lifeend:
|
|
|
|
'``llvm.lifetime.end``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.lifetime.end``' intrinsic specifies the end of a memory
|
|
object's lifetime.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument is a constant integer representing the size of the
|
|
object, or -1 if it is variable sized. The second argument is a pointer
|
|
to the object.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This intrinsic indicates that after this point in the code, the value of
|
|
the memory pointed to by ``ptr`` is dead. This means that it is known to
|
|
never be used and has an undefined value. Any stores into the memory
|
|
object following this intrinsic may be removed as dead.
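
As a hedged illustration, the two lifetime markers are usually paired around
the region in which a stack object is live. The function name
``@scoped_temp`` is only illustrative:

.. code-block:: llvm

      declare void @llvm.lifetime.start(i64, i8* nocapture)
      declare void @llvm.lifetime.end(i64, i8* nocapture)

      define i32 @scoped_temp(i32 %v) {
      entry:
        %tmp = alloca i32, align 4
        %p = bitcast i32* %tmp to i8*
        ; The 4 bytes at %p are dead before this call.
        call void @llvm.lifetime.start(i64 4, i8* %p)
        store i32 %v, i32* %tmp, align 4
        %r = load i32* %tmp, align 4
        ; The 4 bytes at %p are dead again after this call.
        call void @llvm.lifetime.end(i64 4, i8* %p)
        ret i32 %r
      }

The size argument here assumes a 4-byte ``i32``; a front-end would take it
from the target's data layout.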
|
|
|
|
'``llvm.invariant.start``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare {}* @llvm.invariant.start(i64 <size>, i8* nocapture <ptr>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.invariant.start``' intrinsic specifies that the contents of
|
|
a memory object will not change.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument is a constant integer representing the size of the
|
|
object, or -1 if it is variable sized. The second argument is a pointer
|
|
to the object.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This intrinsic indicates that until an ``llvm.invariant.end`` that uses
|
|
the return value, the referenced memory location is constant and
|
|
unchanging.
|
|
|
|
'``llvm.invariant.end``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.invariant.end({}* <start>, i64 <size>, i8* nocapture <ptr>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.invariant.end``' intrinsic specifies that the contents of a
|
|
memory object are mutable.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument is the matching ``llvm.invariant.start`` intrinsic.
|
|
The second argument is a constant integer representing the size of the
|
|
object, or -1 if it is variable sized, and the third argument is a
|
|
pointer to the object.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This intrinsic indicates that the memory is mutable again.
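
A minimal sketch of a matching ``llvm.invariant.start`` and
``llvm.invariant.end`` pair, assuming a caller-provided ``i32`` object (the
function name ``@read_twice`` is hypothetical):

.. code-block:: llvm

      declare {}* @llvm.invariant.start(i64, i8* nocapture)
      declare void @llvm.invariant.end({}*, i64, i8* nocapture)

      define i32 @read_twice(i32* %obj) {
      entry:
        %p = bitcast i32* %obj to i8*
        %inv = call {}* @llvm.invariant.start(i64 4, i8* %p)
        ; Between the markers the 4 bytes at %p do not change, so an
        ; optimizer may treat the second load as redundant.
        %a = load i32* %obj, align 4
        %b = load i32* %obj, align 4
        ; The token returned by invariant.start identifies the region to end.
        call void @llvm.invariant.end({}* %inv, i64 4, i8* %p)
        %sum = add i32 %a, %b
        ret i32 %sum
      }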
|
|
|
|
General Intrinsics
|
|
------------------
|
|
|
|
This class of intrinsics is designed to be generic and has no specific
|
|
purpose.
|
|
|
|
'``llvm.var.annotation``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32 <int>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.var.annotation``' intrinsic attaches an arbitrary string
annotation to a local variable.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument is a pointer to a value, the second is a pointer to a
|
|
global string, the third is a pointer to a global string which is the
|
|
source file name, and the last argument is the line number.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This intrinsic allows annotation of local variables with arbitrary
|
|
strings. This can be useful for special purpose optimizations that want
|
|
to look for these annotations. These have no other defined use; they are
|
|
ignored by code generation and optimization.
|
|
|
|
'``llvm.ptr.annotation.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
|
|
pointer to an integer of any width. *NOTE* you must specify an address space for
|
|
the pointer. The identifier for the default address space is the integer
|
|
'``0``'.
|
|
|
|
::
|
|
|
|
declare i8* @llvm.ptr.annotation.p<address space>i8(i8* <val>, i8* <str>, i8* <str>, i32 <int>)
|
|
declare i16* @llvm.ptr.annotation.p<address space>i16(i16* <val>, i8* <str>, i8* <str>, i32 <int>)
|
|
declare i32* @llvm.ptr.annotation.p<address space>i32(i32* <val>, i8* <str>, i8* <str>, i32 <int>)
|
|
declare i64* @llvm.ptr.annotation.p<address space>i64(i64* <val>, i8* <str>, i8* <str>, i32 <int>)
|
|
declare i256* @llvm.ptr.annotation.p<address space>i256(i256* <val>, i8* <str>, i8* <str>, i32 <int>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.ptr.annotation``' intrinsic attaches an arbitrary string
annotation to a pointer to an integer.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument is a pointer to an integer value of arbitrary bitwidth
|
|
(result of some expression), the second is a pointer to a global string, the
|
|
third is a pointer to a global string which is the source file name, and the
|
|
last argument is the line number. It returns the value of the first argument.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This intrinsic allows annotation of a pointer to an integer with arbitrary
|
|
strings. This can be useful for special purpose optimizations that want to look
|
|
for these annotations. These have no other defined use; they are ignored by code
|
|
generation and optimization.
|
|
|
|
'``llvm.annotation.*``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use '``llvm.annotation``' on
|
|
any integer bit width.
|
|
|
|
::
|
|
|
|
declare i8 @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32 <int>)
|
|
declare i16 @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32 <int>)
|
|
declare i32 @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32 <int>)
|
|
declare i64 @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32 <int>)
|
|
declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32 <int>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.annotation``' intrinsic attaches an arbitrary string annotation
to an integer expression.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The first argument is an integer value (result of some expression), the
|
|
second is a pointer to a global string, the third is a pointer to a
|
|
global string which is the source file name, and the last argument is
|
|
the line number. It returns the value of the first argument.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This intrinsic allows annotations to be put on arbitrary expressions
|
|
with arbitrary strings. This can be useful for special purpose
|
|
optimizations that want to look for these annotations. These have no
|
|
other defined use; they are ignored by code generation and optimization.
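
As an illustrative sketch, an integer expression can be annotated as shown
below. The globals ``@.str`` and ``@.file``, their contents, the line number
12, and the function name ``@tagged`` are all made up for the example:

.. code-block:: llvm

      @.str = private unnamed_addr constant [8 x i8] c"checked\00"
      @.file = private unnamed_addr constant [7 x i8] c"demo.c\00"

      declare i32 @llvm.annotation.i32(i32, i8*, i8*, i32)

      define i32 @tagged(i32 %v) {
      entry:
        ; Arguments: value, annotation string, source file name, line number.
        %r = call i32 @llvm.annotation.i32(i32 %v,
               i8* getelementptr inbounds ([8 x i8]* @.str, i32 0, i32 0),
               i8* getelementptr inbounds ([7 x i8]* @.file, i32 0, i32 0),
               i32 12)
        ; %r carries the same value as %v.
        ret i32 %r
      }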
|
|
|
|
'``llvm.trap``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.trap() noreturn nounwind
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.trap``' intrinsic causes the program to execute a trap.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
None.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This intrinsic is lowered to the target dependent trap instruction. If
|
|
the target does not have a trap instruction, this intrinsic will be
|
|
lowered to a call of the ``abort()`` function.
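
A short sketch of a typical use, guarding an operation that must not be
reached with a bad input. The function name ``@checked_div`` is
hypothetical:

.. code-block:: llvm

      declare void @llvm.trap() noreturn nounwind

      define i32 @checked_div(i32 %a, i32 %b) {
      entry:
        %is_zero = icmp eq i32 %b, 0
        br i1 %is_zero, label %fail, label %ok

      fail:
        ; The intrinsic is noreturn, so the block ends in unreachable.
        call void @llvm.trap()
        unreachable

      ok:
        %q = udiv i32 %a, %b
        ret i32 %q
      }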
|
|
|
|
'``llvm.debugtrap``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.debugtrap() nounwind
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The '``llvm.debugtrap``' intrinsic causes an execution trap intended to
request the attention of a debugger.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
None.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This intrinsic is lowered to code which is intended to cause an
|
|
execution trap with the intention of requesting the attention of a
|
|
debugger.
|
|
|
|
'``llvm.stackprotector``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
|
|
onto the stack at ``slot``. The stack slot is adjusted to ensure that it
|
|
is placed on the stack before local variables.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
|
|
The first argument is the value loaded from the stack guard
|
|
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
|
|
enough space to hold the value of the guard.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This intrinsic causes the prologue/epilogue inserter to force the position of
|
|
the ``AllocaInst`` stack slot to be before local variables on the stack. This is
|
|
to ensure that if a local variable on the stack is overwritten, it will destroy
|
|
the value of the guard. When the function exits, the guard on the stack is
|
|
checked against the original guard by ``llvm.stackprotectorcheck``. If they are
|
|
different, then ``llvm.stackprotectorcheck`` causes the program to abort by
|
|
calling the ``__stack_chk_fail()`` function.
|
|
|
|
'``llvm.stackprotectorcheck``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare void @llvm.stackprotectorcheck(i8** <guard>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The ``llvm.stackprotectorcheck`` intrinsic compares ``guard`` against an already
|
|
created stack protector and if they are not equal calls the
|
|
``__stack_chk_fail()`` function.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The ``llvm.stackprotectorcheck`` intrinsic requires one pointer argument, the
|
|
variable ``@__stack_chk_guard``.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This intrinsic is provided to perform the stack protector check by comparing
|
|
``guard`` with the stack slot created by ``llvm.stackprotector`` and if the
|
|
values do not match, calling the ``__stack_chk_fail()`` function.
|
|
|
|
The reason to provide this as an IR level intrinsic instead of implementing it
|
|
via other IR operations is that in order to perform this operation at the IR
|
|
level without an intrinsic, one would need to create additional basic blocks to
|
|
handle the success/failure cases. This makes it difficult to stop the stack
|
|
protector check from disrupting sibling tail calls in Codegen. With this
|
|
intrinsic, we are able to generate the stack protector basic blocks late in
|
|
codegen after the tail call decision has occurred.
|
|
|
|
'``llvm.objectsize``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
::
|
|
|
|
declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>)
|
|
declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The ``llvm.objectsize`` intrinsic is designed to provide information to
|
|
the optimizers to determine at compile time whether a) an operation
|
|
(like memcpy) will overflow a buffer that corresponds to an object, or
|
|
b) that a runtime check for overflow isn't necessary. An object in this
|
|
context means an allocation of a specific class, structure, array, or
|
|
other object.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The ``llvm.objectsize`` intrinsic takes two arguments. The first
|
|
argument is a pointer to or into the ``object``. The second argument is
|
|
a boolean and determines whether ``llvm.objectsize`` returns 0 (if true)
|
|
or -1 (if false) when the object size is unknown. The second argument
|
|
only accepts constants.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
The ``llvm.objectsize`` intrinsic is lowered to a constant representing
|
|
the size of the object concerned. If the size cannot be determined at
|
|
compile time, ``llvm.objectsize`` returns ``i32/i64 -1 or 0`` (depending
|
|
on the ``min`` argument).
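
A minimal sketch, using the declaration exactly as given in the Syntax
section above and a hypothetical function ``@buffer_size``. Whether the call
folds to a constant depends on what the optimizer can prove about the
pointer:

.. code-block:: llvm

      declare i64 @llvm.objectsize.i64(i8*, i1)

      define i64 @buffer_size() {
      entry:
        %buf = alloca [16 x i8], align 1
        %p = getelementptr [16 x i8]* %buf, i32 0, i32 0
        ; min = false: if the size cannot be determined, the result is -1.
        ; Here the object is a 16-byte alloca, so the size is knowable.
        %size = call i64 @llvm.objectsize.i64(i8* %p, i1 false)
        ret i64 %size
      }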
|
|
|
|
'``llvm.expect``' Intrinsic
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Syntax:
|
|
"""""""
|
|
|
|
This is an overloaded intrinsic. You can use ``llvm.expect`` on any
|
|
integer bit width.
|
|
|
|
::
|
|
|
|
declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
|
|
declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
|
|
declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)
|
|
|
|
Overview:
|
|
"""""""""
|
|
|
|
The ``llvm.expect`` intrinsic provides information about expected (the
|
|
most probable) value of ``val``, which can be used by optimizers.
|
|
|
|
Arguments:
|
|
""""""""""
|
|
|
|
The ``llvm.expect`` intrinsic takes two arguments. The first argument is
|
|
a value. The second argument is an expected value; this needs to be a
|
|
constant value; variables are not allowed.
|
|
|
|
Semantics:
|
|
""""""""""
|
|
|
|
This intrinsic is lowered to the ``val``.
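
As a hedged example, a front-end might mark a branch condition as unlikely
like this (the function name ``@clamp_negative`` is illustrative only):

.. code-block:: llvm

      declare i1 @llvm.expect.i1(i1, i1)

      define i32 @clamp_negative(i32 %x) {
      entry:
        %neg = icmp slt i32 %x, 0
        ; Hint that %neg is expected to be false; the result equals %neg.
        %likely = call i1 @llvm.expect.i1(i1 %neg, i1 false)
        br i1 %likely, label %rare, label %common

      rare:
        ret i32 0

      common:
        ret i32 %x
      }

The hint does not change the computed value; it only gives optimizers
information they can use, for example when laying out blocks.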

'``llvm.assume``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.assume(i1 %cond)

Overview:
"""""""""

The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
condition is true. This information can then be used in simplifying other parts
of the code.

Arguments:
""""""""""

The condition which the optimizer may assume is always true.

Semantics:
""""""""""

The intrinsic allows the optimizer to assume that the provided condition is
always true whenever the control flow reaches the intrinsic call. No code is
generated for this intrinsic, and instructions that contribute only to the
provided condition are not used for code generation. If the condition is
violated during execution, the behavior is undefined.

Please note that the optimizer might limit the transformations performed on
values used by the ``llvm.assume`` intrinsic in order to preserve the
instructions only used to form the intrinsic's input argument. This might prove
undesirable if the extra information provided by the ``llvm.assume`` intrinsic
does not cause sufficient overall improvement in code quality. For this reason,
``llvm.assume`` should not be used to document basic mathematical invariants
that the optimizer can otherwise deduce or facts that are of little use to the
optimizer.
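
The following sketch shows one way the assumption can be stated and
then exploited. The function ``@clamp_nonneg`` is hypothetical; it
asserts that ``%n`` is positive and afterwards tests whether ``%n`` is
negative.

.. code-block:: llvm

      declare void @llvm.assume(i1)

      define i32 @clamp_nonneg(i32 %n) {
      entry:
        ; State the fact: %n > 0 whenever control reaches this call.
        %positive = icmp sgt i32 %n, 0
        call void @llvm.assume(i1 %positive)

        ; Under that assumption this comparison can only be false, so an
        ; optimizer may simplify the select to %n.
        %negative = icmp slt i32 %n, 0
        %result = select i1 %negative, i32 0, i32 %n
        ret i32 %result
      }

Here ``%positive`` is used only by the ``llvm.assume`` call, so it is
one of the instructions that are kept for the optimizer's benefit but
not used for code generation. Calling ``@clamp_nonneg`` with a value
that is not positive has undefined behavior.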

'``llvm.donothing``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.donothing() nounwind readnone

Overview:
"""""""""

The ``llvm.donothing`` intrinsic doesn't perform any operation. It's the
only intrinsic that can be called with an invoke instruction.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic does nothing, and it's removed by optimizers and ignored
by codegen.
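
As an illustration of calling the intrinsic through ``invoke``, the
sketch below uses a hypothetical function ``@invoke_nothing``. The
personality function ``@__gxx_personality_v0`` is an assumption chosen
for the example, and the ``landingpad`` is written in the form where the
personality is attached to the instruction itself; other LLVM revisions
may spell this differently.

.. code-block:: llvm

      declare void @llvm.donothing() nounwind readnone
      ; Assumed personality routine, used only for illustration.
      declare i32 @__gxx_personality_v0(...)

      define void @invoke_nothing() {
      entry:
        ; The call performs no operation but still names both successors.
        invoke void @llvm.donothing()
                to label %cont unwind label %lpad

      cont:
        ret void

      lpad:
        %lp = landingpad { i8*, i32 }
                personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*)
                cleanup
        resume { i8*, i32 } %lp
      }

Since the intrinsic does nothing, the unwind edge to ``%lpad`` is never
taken at run time.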

Stack Map Intrinsics
--------------------

LLVM provides experimental intrinsics to support runtime patching
mechanisms commonly desired in dynamic language JITs. These intrinsics
are described in :doc:`StackMaps`.