2016-08-27 01:11:48 +02:00
|
|
|
====================================================
|
|
|
|
Using -opt-bisect-limit to debug optimization errors
|
|
|
|
====================================================
|
|
|
|
.. contents::
|
|
|
|
:local:
|
|
|
|
:depth: 1
|
|
|
|
|
|
|
|
Introduction
|
|
|
|
============
|
|
|
|
|
|
|
|
The -opt-bisect-limit option provides a way to disable all optimization passes
|
|
|
|
above a specified limit without modifying the way in which the Pass Managers
|
|
|
|
are populated. The intention of this option is to assist in tracking down
|
|
|
|
problems where incorrect transformations during optimization result in incorrect
|
|
|
|
run-time behavior.
|
|
|
|
|
|
|
|
This feature is implemented on an opt-in basis. Passes which can be safely
|
|
|
|
skipped while still allowing correct code generation call a function to
|
|
|
|
check the opt-bisect limit before performing optimizations. Passes which
|
|
|
|
either must be run or do not modify the IR do not perform this check and are
|
|
|
|
therefore never skipped. Generally, this means analysis passes, passes
|
|
|
|
that are run at CodeGenOpt::None and passes which are required for register
|
|
|
|
allocation.
|
|
|
|
|
|
|
|
The -opt-bisect-limit option can be used with any tool, including front ends
|
|
|
|
such as clang, that uses the core LLVM library for optimization and code
|
|
|
|
generation. The exact syntax for invoking the option is discussed below.
|
|
|
|
|
|
|
|
This feature is not intended to replace other debugging tools such as bugpoint.
|
|
|
|
Rather it provides an alternate course of action when reproducing the problem
|
|
|
|
requires a complex build infrastructure that would make using bugpoint
|
|
|
|
impractical or when reproducing the failure requires a sequence of
|
|
|
|
transformations that is difficult to replicate with tools like opt and llc.
|
|
|
|
|
|
|
|
|
|
|
|
Getting Started
|
|
|
|
===============
|
|
|
|
|
|
|
|
The -opt-bisect-limit command line option can be passed directly to tools such
|
|
|
|
as opt, llc and lli. The syntax is as follows:
|
|
|
|
|
|
|
|
::
|
|
|
|
|
|
|
|
<tool name> [other options] -opt-bisect-limit=<limit>
|
|
|
|
|
|
|
|
If a value of -1 is used the tool will perform all optimizations but a message
|
|
|
|
will be printed to stderr for each optimization that could be skipped
|
|
|
|
indicating the index value that is associated with that optimization. To skip
|
|
|
|
optimizations, pass the value of the last optimization to be performed as the
|
|
|
|
opt-bisect-limit. All optimizations with a higher index value will be skipped.
|
|
|
|
|
|
|
|
In order to use the -opt-bisect-limit option with a driver that provides a
|
|
|
|
wrapper around the LLVM core library, an additional prefix option may be
|
|
|
|
required, as defined by the driver. For example, to use this option with
|
|
|
|
clang, the "-mllvm" prefix must be used. A typical clang invocation would look
|
|
|
|
like this:
|
|
|
|
|
|
|
|
::
|
|
|
|
|
|
|
|
clang -O2 -mllvm -opt-bisect-limit=256 my_file.c
|
|
|
|
|
|
|
|
The -opt-bisect-limit option may also be applied to link-time optimizations by
|
2017-02-10 08:21:06 +01:00
|
|
|
using a prefix to indicate that this is a plug-in option for the linker. The
|
2016-08-27 01:11:48 +02:00
|
|
|
following syntax will set a bisect limit for LTO transformations:
|
|
|
|
|
|
|
|
::
|
|
|
|
|
2017-02-10 18:16:00 +01:00
|
|
|
# When using lld, or ld64 (macOS)
|
2017-02-10 08:21:06 +01:00
|
|
|
clang -flto -Wl,-mllvm,-opt-bisect-limit=256 my_file.o my_other_file.o
|
2017-02-10 18:16:00 +01:00
|
|
|
# When using Gold
|
|
|
|
clang -flto -Wl,-plugin-opt,-opt-bisect-limit=256 my_file.o my_other_file.o
|
2016-08-27 01:11:48 +02:00
|
|
|
|
|
|
|
LTO passes are run by a library instance invoked by the linker. Therefore any
|
|
|
|
passes run in the primary driver compilation phase are not affected by options
|
|
|
|
passed via '-Wl,-plugin-opt' and LTO passes are not affected by options
|
|
|
|
passed to the driver-invoked LLVM invocation via '-mllvm'.
|
|
|
|
|
|
|
|
|
|
|
|
Bisection Index Values
|
|
|
|
======================
|
|
|
|
|
|
|
|
The granularity of the optimizations associated with a single index value is
|
|
|
|
variable. Depending on how the optimization pass has been instrumented the
|
|
|
|
value may be associated with as much as all transformations that would have
|
|
|
|
been performed by an optimization pass on an IR unit for which it is invoked
|
|
|
|
(for instance, during a single call of runOnFunction for a FunctionPass) or as
|
|
|
|
little as a single transformation. The index values may also be nested so that
|
|
|
|
if an invocation of the pass is not skipped individual transformations within
|
|
|
|
that invocation may still be skipped.
|
|
|
|
|
|
|
|
The order of the values assigned is guaranteed to remain stable and consistent
|
|
|
|
from one run to the next up to and including the value specified as the limit.
|
|
|
|
Above the limit value skipping of optimizations can cause a change in the
|
|
|
|
numbering, but because all optimizations above the limit are skipped this
|
|
|
|
is not a problem.
|
|
|
|
|
|
|
|
When an opt-bisect index value refers to an entire invocation of the run
|
|
|
|
function for a pass, the pass will query whether or not it should be skipped
|
|
|
|
each time it is invoked and each invocation will be assigned a unique value.
|
|
|
|
For example, if a FunctionPass is used with a module containing three functions
|
|
|
|
a different index value will be assigned to the pass for each of the functions
|
|
|
|
as the pass is run. The pass may be run on two functions but skipped for the
|
|
|
|
third.
|
|
|
|
|
|
|
|
If the pass internally performs operations on a smaller IR unit the pass must be
|
|
|
|
specifically instrumented to enable bisection at this finer level of granularity
|
|
|
|
(see below for details).
|
|
|
|
|
|
|
|
|
|
|
|
Example Usage
|
|
|
|
=============
|
|
|
|
|
|
|
|
.. code-block:: console
|
|
|
|
|
|
|
|
$ opt -O2 -o test-opt.bc -opt-bisect-limit=16 test.ll
|
|
|
|
|
|
|
|
BISECT: running pass (1) Simplify the CFG on function (g)
|
|
|
|
BISECT: running pass (2) SROA on function (g)
|
|
|
|
BISECT: running pass (3) Early CSE on function (g)
|
|
|
|
BISECT: running pass (4) Infer set function attributes on module (test.ll)
|
|
|
|
BISECT: running pass (5) Interprocedural Sparse Conditional Constant Propagation on module (test.ll)
|
|
|
|
BISECT: running pass (6) Global Variable Optimizer on module (test.ll)
|
|
|
|
BISECT: running pass (7) Promote Memory to Register on function (g)
|
|
|
|
BISECT: running pass (8) Dead Argument Elimination on module (test.ll)
|
|
|
|
BISECT: running pass (9) Combine redundant instructions on function (g)
|
|
|
|
BISECT: running pass (10) Simplify the CFG on function (g)
|
|
|
|
BISECT: running pass (11) Remove unused exception handling info on SCC (<<null function>>)
|
|
|
|
BISECT: running pass (12) Function Integration/Inlining on SCC (<<null function>>)
|
|
|
|
BISECT: running pass (13) Deduce function attributes on SCC (<<null function>>)
|
|
|
|
BISECT: running pass (14) Remove unused exception handling info on SCC (f)
|
|
|
|
BISECT: running pass (15) Function Integration/Inlining on SCC (f)
|
|
|
|
BISECT: running pass (16) Deduce function attributes on SCC (f)
|
|
|
|
BISECT: NOT running pass (17) Remove unused exception handling info on SCC (g)
|
|
|
|
BISECT: NOT running pass (18) Function Integration/Inlining on SCC (g)
|
|
|
|
BISECT: NOT running pass (19) Deduce function attributes on SCC (g)
|
|
|
|
BISECT: NOT running pass (20) SROA on function (g)
|
|
|
|
BISECT: NOT running pass (21) Early CSE on function (g)
|
|
|
|
BISECT: NOT running pass (22) Speculatively execute instructions if target has divergent branches on function (g)
|
|
|
|
... etc. ...
|
|
|
|
|
|
|
|
|
|
|
|
Pass Skipping Implementation
|
|
|
|
============================
|
|
|
|
|
|
|
|
The -opt-bisect-limit implementation depends on individual passes opting in to
|
|
|
|
the opt-bisect process. The OptBisect object that manages the process is
|
|
|
|
entirely passive and has no knowledge of how any pass is implemented. When a
|
|
|
|
pass is run if the pass may be skipped, it should call the OptBisect object to
|
|
|
|
see if it should be skipped.
|
|
|
|
|
|
|
|
The OptBisect object is intended to be accessed through LLVMContext and each
|
|
|
|
Pass base class contains a helper function that abstracts the details in order
|
|
|
|
to make this check uniform across all passes. These helper functions are:
|
|
|
|
|
|
|
|
.. code-block:: c++
|
|
|
|
|
|
|
|
bool ModulePass::skipModule(Module &M);
|
|
|
|
bool CallGraphSCCPass::skipSCC(CallGraphSCC &SCC);
|
|
|
|
bool FunctionPass::skipFunction(const Function &F);
|
|
|
|
bool LoopPass::skipLoop(const Loop *L);
|
|
|
|
|
|
|
|
A MachineFunctionPass should use FunctionPass::skipFunction() as such:
|
|
|
|
|
|
|
|
.. code-block:: c++
|
|
|
|
|
|
|
|
bool MyMachineFunctionPass::runOnMachineFunction(Function &MF) {
|
|
|
|
if (skipFunction(*MF.getFunction())
|
2018-07-27 08:54:13 +02:00
|
|
|
return false;
|
2016-08-27 01:11:48 +02:00
|
|
|
// Otherwise, run the pass normally.
|
|
|
|
}
|
|
|
|
|
|
|
|
In addition to checking with the OptBisect class to see if the pass should be
|
|
|
|
skipped, the skipFunction(), skipLoop() and skipBasicBlock() helper functions
|
|
|
|
also look for the presence of the "optnone" function attribute. The calling
|
|
|
|
pass will be unable to determine whether it is being skipped because the
|
|
|
|
"optnone" attribute is present or because the opt-bisect-limit has been
|
|
|
|
reached. This is desirable because the behavior should be the same in either
|
|
|
|
case.
|
|
|
|
|
|
|
|
The majority of LLVM passes which can be skipped have already been instrumented
|
|
|
|
in the manner described above. If you are adding a new pass or believe you
|
|
|
|
have found a pass which is not being included in the opt-bisect process but
|
|
|
|
should be, you can add it as described above.
|
|
|
|
|
|
|
|
|
|
|
|
Adding Finer Granularity
|
|
|
|
========================
|
|
|
|
|
|
|
|
Once the pass in which an incorrect transformation is performed has been
|
|
|
|
determined, it may be useful to perform further analysis in order to determine
|
2017-03-11 02:41:03 +01:00
|
|
|
which specific transformation is causing the problem. Debug counters
|
|
|
|
can be used for this purpose.
|