mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2024-11-23 19:23:23 +01:00
e24a4fa7a0
Allow InvokeInst to have the second optional prof branch weight for its unwind branch. InvokeInst is a terminator with two successors. It might have its unwind branch taken many times. If so the BranchProbabilityInfo unwind branch heuristic can be inaccurate. This patch allows a higher accuracy calculated with both branch weights set. Changes: - A new section about InvokeInst is added to the BranchWeightMetadata page. It states the old information that missed in the doc and adds new about the second branch weight. - Verifier is changed to allow either 1 or 2 branch weights for InvokeInst. - A new test is written for BranchProbabilityInfo to demonstrate the main improvement of the simple fix in calcMetadataWeights(). - Several new testcases are created for Inliner. Those check that both weights are accounted for invoke instruction weight calculation. - PGOUseFunc::setBranchWeights() is fixed to be applicable to InvokeInst. Reviewers: davidxl, reames, xur, yamauchi Tags: #llvm Differential Revision: https://reviews.llvm.org/D80618
186 lines
5.5 KiB
ReStructuredText
186 lines
5.5 KiB
ReStructuredText
===========================
|
|
LLVM Branch Weight Metadata
|
|
===========================
|
|
|
|
.. contents::
|
|
:local:
|
|
|
|
Introduction
|
|
============
|
|
|
|
Branch Weight Metadata represents branch weights as its likeliness to be taken
|
|
(see :doc:`BlockFrequencyTerminology`). Metadata is assigned to an
|
|
``Instruction`` that is a terminator as a ``MDNode`` of the ``MD_prof`` kind.
|
|
The first operator is always a ``MDString`` node with the string
|
|
"branch_weights". Number of operators depends on the terminator type.
|
|
|
|
Branch weights might be fetch from the profiling file, or generated based on
|
|
`__builtin_expect`_ instruction.
|
|
|
|
All weights are represented as an unsigned 32-bit values, where higher value
|
|
indicates greater chance to be taken.
|
|
|
|
Supported Instructions
|
|
======================
|
|
|
|
``BranchInst``
|
|
^^^^^^^^^^^^^^
|
|
|
|
Metadata is only assigned to the conditional branches. There are two extra
|
|
operands for the true and the false branch.
|
|
|
|
.. code-block:: none
|
|
|
|
!0 = metadata !{
|
|
metadata !"branch_weights",
|
|
i32 <TRUE_BRANCH_WEIGHT>,
|
|
i32 <FALSE_BRANCH_WEIGHT>
|
|
}
|
|
|
|
``SwitchInst``
|
|
^^^^^^^^^^^^^^
|
|
|
|
Branch weights are assigned to every case (including the ``default`` case which
|
|
is always case #0).
|
|
|
|
.. code-block:: none
|
|
|
|
!0 = metadata !{
|
|
metadata !"branch_weights",
|
|
i32 <DEFAULT_BRANCH_WEIGHT>
|
|
[ , i32 <CASE_BRANCH_WEIGHT> ... ]
|
|
}
|
|
|
|
``IndirectBrInst``
|
|
^^^^^^^^^^^^^^^^^^
|
|
|
|
Branch weights are assigned to every destination.
|
|
|
|
.. code-block:: none
|
|
|
|
!0 = metadata !{
|
|
metadata !"branch_weights",
|
|
i32 <LABEL_BRANCH_WEIGHT>
|
|
[ , i32 <LABEL_BRANCH_WEIGHT> ... ]
|
|
}
|
|
|
|
``CallInst``
|
|
^^^^^^^^^^^^^^^^^^
|
|
|
|
Calls may have branch weight metadata, containing the execution count of
|
|
the call. It is currently used in SamplePGO mode only, to augment the
|
|
block and entry counts which may not be accurate with sampling.
|
|
|
|
.. code-block:: none
|
|
|
|
!0 = metadata !{
|
|
metadata !"branch_weights",
|
|
i32 <CALL_BRANCH_WEIGHT>
|
|
}
|
|
|
|
``InvokeInst``
|
|
^^^^^^^^^^^^^^^^^^
|
|
|
|
Invoke instruction may have branch weight metadata with one or two weights.
|
|
The second weight is optional and corresponds to the unwind branch.
|
|
If only one weight is set then it contains the execution count of the call
|
|
and used in SamplePGO mode only as described for the call instruction. If both
|
|
weights are specified then the second weight contains count of unwind branch
|
|
taken and the first weights contains the execution count of the call minus
|
|
the count of unwind branch taken. Both weights specified are used to calculate
|
|
BranchProbability as for BranchInst and for SamplePGO the sum of both weights
|
|
is used.
|
|
|
|
.. code-block:: none
|
|
|
|
!0 = metadata !{
|
|
metadata !"branch_weights",
|
|
i32 <INVOKE_NORMAL_WEIGHT>
|
|
[ , i32 <INVOKE_UNWIND_WEIGHT> ]
|
|
}
|
|
|
|
Other
|
|
^^^^^
|
|
|
|
Other terminator instructions are not allowed to contain Branch Weight Metadata.
|
|
|
|
.. _\__builtin_expect:
|
|
|
|
Built-in ``expect`` Instructions
|
|
================================
|
|
|
|
``__builtin_expect(long exp, long c)`` instruction provides branch prediction
|
|
information. The return value is the value of ``exp``.
|
|
|
|
It is especially useful in conditional statements. Currently Clang supports two
|
|
conditional statements:
|
|
|
|
``if`` statement
|
|
^^^^^^^^^^^^^^^^
|
|
|
|
The ``exp`` parameter is the condition. The ``c`` parameter is the expected
|
|
comparison value. If it is equal to 1 (true), the condition is likely to be
|
|
true, in other case condition is likely to be false. For example:
|
|
|
|
.. code-block:: c++
|
|
|
|
if (__builtin_expect(x > 0, 1)) {
|
|
// This block is likely to be taken.
|
|
}
|
|
|
|
``switch`` statement
|
|
^^^^^^^^^^^^^^^^^^^^
|
|
|
|
The ``exp`` parameter is the value. The ``c`` parameter is the expected
|
|
value. If the expected value doesn't show on the cases list, the ``default``
|
|
case is assumed to be likely taken.
|
|
|
|
.. code-block:: c++
|
|
|
|
switch (__builtin_expect(x, 5)) {
|
|
default: break;
|
|
case 0: // ...
|
|
case 3: // ...
|
|
case 5: // This case is likely to be taken.
|
|
}
|
|
|
|
CFG Modifications
|
|
=================
|
|
|
|
Branch Weight Metatada is not proof against CFG changes. If terminator operands'
|
|
are changed some action should be taken. In other case some misoptimizations may
|
|
occur due to incorrect branch prediction information.
|
|
|
|
Function Entry Counts
|
|
=====================
|
|
|
|
To allow comparing different functions during inter-procedural analysis and
|
|
optimization, ``MD_prof`` nodes can also be assigned to a function definition.
|
|
The first operand is a string indicating the name of the associated counter.
|
|
|
|
Currently, one counter is supported: "function_entry_count". The second operand
|
|
is a 64-bit counter that indicates the number of times that this function was
|
|
invoked (in the case of instrumentation-based profiles). In the case of
|
|
sampling-based profiles, this operand is an approximation of how many times
|
|
the function was invoked.
|
|
|
|
For example, in the code below, the instrumentation for function foo()
|
|
indicates that it was called 2,590 times at runtime.
|
|
|
|
.. code-block:: llvm
|
|
|
|
define i32 @foo() !prof !1 {
|
|
ret i32 0
|
|
}
|
|
!1 = !{!"function_entry_count", i64 2590}
|
|
|
|
If "function_entry_count" has more than 2 operands, the later operands are
|
|
the GUID of the functions that needs to be imported by ThinLTO. This is only
|
|
set by sampling based profile. It is needed because the sampling based profile
|
|
was collected on a binary that had already imported and inlined these functions,
|
|
and we need to ensure the IR matches in the ThinLTO backends for profile
|
|
annotation. The reason why we cannot annotate this on the callsite is that it
|
|
can only goes down 1 level in the call chain. For the cases where
|
|
foo_in_a_cc()->bar_in_b_cc()->baz_in_c_cc(), we will need to go down 2 levels
|
|
in the call chain to import both bar_in_b_cc and baz_in_c_cc.
|