Don't print a line multiple times, each for different inlining contexts, if
nothing happened in any context. This prevents situations like this:
[[
> main:
65 | if ((i * ni + j) % 20 == 0) fprintf
> print_array:
65 | if ((i * ni + j) % 20 == 0) fprintf
]]
which could happen if different optimizations were missed in different inlining
contexts.
llvm-svn: 291361
Because screen space is precious, if an optimization (vectorization, for
example) never happens, don't leave empty space for the associated markers on
every line of the output. This makes the output much more compact, and allows
for the later inclusion of markers for more (although perhaps rare)
optimizations.
llvm-svn: 283626
In the left part of the reports, we have things like U<number>; if some of
these numbers use more digits than others, we don't want a space in between the
U and the start of the number. Instead, the space should come afterward. This
way it is clear that the number goes with the U and not any other optimization
indicator that might come later on the line.
llvm-svn: 283518
When there are multiple optimizations on one line, record the vectorization
factors, etc. correctly (instead of incorrectly substituting default values).
llvm-svn: 283443
How code is optimized sometimes, perhaps often, depends on the context into
which it was inlined. This change allows llvm-opt-report to track the
differences between the optimizations performed, or not, in different contexts,
and when these differ, display those differences.
For example, this code:
$ cat /tmp/q.cpp
void bar();
void foo(int n) {
for (int i = 0; i < n; ++i)
bar();
}
void quack() {
foo(4);
}
void quack2() {
foo(4);
}
will now produce this report:
< /home/hfinkel/src/llvm/test/tools/llvm-opt-report/Inputs/q.cpp
2 | void bar();
3 | void foo(int n) {
[[
> foo(int):
4 | for (int i = 0; i < n; ++i)
> quack(), quack2():
4 U4 | for (int i = 0; i < n; ++i)
]]
5 | bar();
6 | }
7 |
8 | void quack() {
9 I | foo(4);
10 | }
11 |
12 | void quack2() {
13 I | foo(4);
14 | }
15 |
Note that the tool has demangled the function names, and grouped the reports
associated with line 4. This shows that the loop on line 4 was unrolled by a
factor of 4 when inlined into the functions quack() and quack2(), but not in
the function foo(int) itself.
llvm-svn: 283402
LLVM now has the ability to record information from optimization remarks in a
machine-consumable YAML file for later analysis. This can be enabled in opt
(see r282539), and D25225 adds a Clang flag to do the same. This patch adds
llvm-opt-report, a tool to generate basic optimization "listing" files
(annotated sources with information about what optimizations were performed)
from one of these YAML inputs.
D19678 proposed to add this capability directly to Clang, but this more-general
YAML-based infrastructure was the direction we decided upon in that review
thread.
For this optimization report, I focused on making the output as succinct as
possible while providing information on inlining and loop transformations. The
goal here is that the source code should still be easily readable in the
report. My primary inspiration here is the reports generated by Cray's tools
(http://docs.cray.com/books/S-2496-4101/html-S-2496-4101/z1112823641oswald.html).
These reports are highly regarded within the HPC community. Intel's compiler,
for example, also has an optimization-report capability
(https://software.intel.com/sites/default/files/managed/55/b1/new-compiler-optimization-reports.pdf).
$ cat /tmp/v.c
void bar();
void foo() { bar(); }
void Test(int *res, int *c, int *d, int *p, int n) {
int i;
#pragma clang loop vectorize(assume_safety)
for (i = 0; i < 1600; i++) {
res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
}
for (i = 0; i < 16; i++) {
res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
}
foo();
foo(); bar(); foo();
}
D25225 adds -fsave-optimization-record (and
-fsave-optimization-record=filename), and this would be used as follows:
$ clang -O3 -o /tmp/v.o -c /tmp/v.c -fsave-optimization-record
$ llvm-opt-report /tmp/v.yaml > /tmp/v.lst
$ cat /tmp/v.lst
< /tmp/v.c
2 | void bar();
3 | void foo() { bar(); }
4 |
5 | void Test(int *res, int *c, int *d, int *p, int n) {
6 | int i;
7 |
8 | #pragma clang loop vectorize(assume_safety)
9 V4,2 | for (i = 0; i < 1600; i++) {
10 | res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
11 | }
12 |
13 U16 | for (i = 0; i < 16; i++) {
14 | res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
15 | }
16 |
17 I | foo();
18 |
19 | foo(); bar(); foo();
I | ^
I | ^
20 | }
Each source line gets a prefix giving the line number, and a few columns for
important optimizations: inlining, loop unrolling and loop vectorization. An
'I' is printed next to a line where a function was inlined, a 'U' next to an
unrolled loop, and 'V' next to a vectorized loop. These are printed on the
relevant code line when that seems unambiguous, or on subsequent lines when
multiple potential options exist (messages, both positive and negative, from
the same optimization with different column numbers are taken to indicate
potential ambiguity). When on subsequent lines, a '^' is output in the relevant
column.
Annotated source for all relevant input files are put into the listing file
(each starting with '<' and then the file name).
You can disable having the unrolling/vectorization factors appear by using the
-s flag.
Differential Revision: https://reviews.llvm.org/D25262
llvm-svn: 283398