That patch is the fix for https://bugs.llvm.org/show_bug.cgi?id=40703
"wrong line number info for obj file compiled with -ffunction-sections"
bug. The problem happened with only .o files. If object file contains
several .text sections then line number information showed incorrectly.
The reason for this is that DwarfLineTable could not detect section which
corresponds to specified address(because address is the local to the
section). And as the result it could not select proper sequence in the
line table. The fix is to pass SectionIndex with the address. So that it
would be possible to differentiate addresses from various sections. With
this fix llvm-objdump shows correct line numbers for disassembled code.
Differential review: https://reviews.llvm.org/D58194
llvm-svn: 354972
llvm-readobj's error messages were broken for bad archive members. This
patch fixes them, and also adds testing for archive and thin archive
handling within llvm-readobj.
Reviewed by: rupprecht, grimar, higuoxing
Differential Revision: https://reviews.llvm.org/D58681
llvm-svn: 354960
Current PGO profile counts are not context sensitive. The branch probabilities
for the inlined functions are kept the same for all call-sites, and they might
be very different from the actual branch probabilities. These suboptimal
profiles can greatly affect some downstream optimizations, in particular for
the machine basic block placement optimization.
In this patch, we propose to have a post-inline PGO instrumentation/use pass,
which we called Context Sensitive PGO (CSPGO). For the users who want the best
possible performance, they can perform a second round of PGO instrument/use on
the top of the regular PGO. They will have two sets of profile counts. The
first pass profile will be manly for inline, indirect-call promotion, and
CGSCC simplification pass optimizations. The second pass profile is for
post-inline optimizations and code-gen optimizations.
A typical usage:
// Regular PGO instrumentation and generate pass1 profile.
> clang -O2 -fprofile-generate source.c -o gen
> ./gen
> llvm-profdata merge default.*profraw -o pass1.profdata
// CSPGO instrumentation.
> clang -O2 -fprofile-use=pass1.profdata -fcs-profile-generate -o gen2
> ./gen2
// Merge two sets of profiles
> llvm-profdata merge default.*profraw pass1.profdata -o profile.profdata
// Use the combined profile. Pass manager will invoke two PGO use passes.
> clang -O2 -fprofile-use=profile.profdata -o use
This change touches many components in the compiler. The reviewed patch
(D54175) will committed in phrases.
Differential Revision: https://reviews.llvm.org/D54175
llvm-svn: 354930
The --disassembler-options, or -M, are used to customize
the disassembler and affect its output.
The two implemented options allow selecting register names on ARM:
* With -Mreg-names-raw, the disassembler uses rNN for all registers.
* With -Mreg-names-std it prints sp, lr and pc for r13, r14 and r15,
which is the default behavior of llvm-objdump.
Differential Revision: https://reviews.llvm.org/D57680
llvm-svn: 354870
Recently, support was added to yaml2obj to allow dynamic sections to
have a list of entries, to make it easier to write tests with dynamic
sections. However, this change also removed the ability to provide
custom contents to the dynamic section, making it hard to test
malformed contents (e.g. because the section is not a valid size to
contain an array of entries). This change reinstates this. An error is
emitted if raw content and dynamic entries are both specified.
Reviewed by: grimar, ruiu
Differential Review: https://reviews.llvm.org/D58543
llvm-svn: 354770
Summary:
This reverts D50129 / rL338834: [XRay][tools] Use Support/JSON.h in llvm-xray convert
Abstractions are great.
Readable code is great.
JSON support library is a *good* idea.
However unfortunately, there is an internal detail that one needs
to be aware of in `llvm::json::Object` - it uses `llvm::DenseMap`.
So for **every** `llvm::json::Object`, even if you only store a single `int`
entry there, you pay the whole price of `llvm::DenseMap`.
Unfortunately, it matters for `llvm-xray`.
I was trying to analyse the `llvm-exegesis` analysis mode performance,
and for that i wanted to view the LLVM X-Ray log visualization in Chrome
trace viewer. And the `llvm-xray convert` is sluggish, and sometimes
even ended up being killed by OOM.
`xray-log.llvm-exegesis.lwZ0sT` was acquired from `llvm-exegesis`
(compiled with ` -fxray-instruction-threshold=128`)
analysis mode over `-benchmarks-file` with 10099 points (one full
latency measurement set), with normal runtime of 0.387s.
Timings:
Old: (copied from D58580)
```
$ perf stat -r 5 ./bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT
Performance counter stats for './bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT' (5 runs):
21346.24 msec task-clock # 1.000 CPUs utilized ( +- 0.28% )
314 context-switches # 14.701 M/sec ( +- 59.13% )
1 cpu-migrations # 0.037 M/sec ( +-100.00% )
2181354 page-faults # 102191.251 M/sec ( +- 0.02% )
85477442102 cycles # 4004415.019 GHz ( +- 0.28% ) (83.33%)
14526427066 stalled-cycles-frontend # 16.99% frontend cycles idle ( +- 0.70% ) (83.33%)
32371533721 stalled-cycles-backend # 37.87% backend cycles idle ( +- 0.27% ) (33.34%)
67896890228 instructions # 0.79 insn per cycle
# 0.48 stalled cycles per insn ( +- 0.03% ) (50.00%)
14592654840 branches # 683631198.653 M/sec ( +- 0.02% ) (66.67%)
212207534 branch-misses # 1.45% of all branches ( +- 0.94% ) (83.34%)
21.3502 +- 0.0585 seconds time elapsed ( +- 0.27% )
```
New:
```
$ perf stat -r 9 ./bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT
Performance counter stats for './bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT' (9 runs):
7178.38 msec task-clock # 1.000 CPUs utilized ( +- 0.26% )
182 context-switches # 25.402 M/sec ( +- 28.84% )
0 cpu-migrations # 0.046 M/sec ( +- 70.71% )
33701 page-faults # 4694.994 M/sec ( +- 0.88% )
28761053971 cycles # 4006833.933 GHz ( +- 0.26% ) (83.32%)
2028297997 stalled-cycles-frontend # 7.05% frontend cycles idle ( +- 1.61% ) (83.32%)
10773154901 stalled-cycles-backend # 37.46% backend cycles idle ( +- 0.38% ) (33.36%)
36199132874 instructions # 1.26 insn per cycle
# 0.30 stalled cycles per insn ( +- 0.03% ) (50.02%)
6434504227 branches # 896420204.421 M/sec ( +- 0.03% ) (66.68%)
73355176 branch-misses # 1.14% of all branches ( +- 1.46% ) (83.33%)
7.1807 +- 0.0190 seconds time elapsed ( +- 0.26% )
```
So using `llvm::json` nearly triples run-time on that test case.
(+3x is times, not percent.)
Memory:
Old:
```
total runtime: 39.88s.
bytes allocated in total (ignoring deallocations): 79.07GB (1.98GB/s)
calls to allocation functions: 33267816 (834135/s)
temporary memory allocations: 5832298 (146235/s)
peak heap memory consumption: 9.21GB
peak RSS (including heaptrack overhead): 147.98GB
total memory leaked: 1.09MB
```
New:
```
total runtime: 17.42s.
bytes allocated in total (ignoring deallocations): 5.12GB (293.86MB/s)
calls to allocation functions: 21382982 (1227284/s)
temporary memory allocations: 232858 (13364/s)
peak heap memory consumption: 350.69MB
peak RSS (including heaptrack overhead): 2.55GB
total memory leaked: 79.95KB
```
Diff:
```
total runtime: -22.46s.
bytes allocated in total (ignoring deallocations): -73.95GB (3.29GB/s)
calls to allocation functions: -11884834 (529155/s)
temporary memory allocations: -5599440 (249307/s)
peak heap memory consumption: -8.86GB
peak RSS (including heaptrack overhead): 0B
total memory leaked: -1.01MB
```
So using `llvm::json` increases *peak* memory consumption on *this* testcase ~+27x.
And total allocation count +15x. Both of these numbers are times, *not* percent.
And note that memory usage is clearly unbound with `llvm::json`, it directly depends
on the length of the log, so peak memory consumption is always increasing.
This isn't so with the dumb code, there is no accumulating memory consumption,
peak memory consumption is fixed. Naturally, that means it will handle *much*
larger logs without OOM'ing.
Readability is good, but the price is simply unacceptable here.
Too bad none of this analysis was done as part of the development/review D50129 itself.
Reviewers: dberris, kpw, sammccall
Reviewed By: dberris
Subscribers: riccibruno, hans, courbet, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58584
llvm-svn: 354764
This fixes https://bugs.llvm.org/show_bug.cgi?id=40786
("obj2yaml symbol output missing section index for SHN_ABS and SHN_COMMON symbols")
Since SHN_ABS and SHN_COMMON symbols are special, we should preserve
the st_shndx for them. The patch does this for them and the other special symbols.
The test case is based on the test provided by James Henderson at the bug page!
Differential revision: https://reviews.llvm.org/D58498
llvm-svn: 354661
Summary:
This removes calls to `error()`/`reportError()` in the main driver (llvm-objcopy.cpp) as well as the associated argv-parsing (CopyConfig.cpp). `logAllUnhandledErrors()` is now the main way to print errors.
There are still a few uses from within the per-arch drivers, so we can't delete them yet... but almost!
Reviewers: jhenderson, alexshap, espindola
Reviewed By: jhenderson
Subscribers: emaste, arichardson, jakehehrlich, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58316
llvm-svn: 354600
Summary:
Removing a large number of sections from a file with a lot of symbols can have abysmal (i.e. O(n^2)) performance, e.g. when running `--only-section` to extract one section out of a large file.
This comes from iterating over all symbols in the symbol table each time we remove a section, to remove references to the section we just removed.
Instead, do just one pass of symbol removal by passing a hash set of all the sections we'd like to remove references to.
This fixes a regression when running llvm-objcopy -j <one section> on an object file with many sections and symbols -- on my machine, running `objcopy -j .keep_me huge-input.o /tmp/foo.o` takes .3s with GNU objcopy, 1.3s with an updated llvm-objcopy, and 7+ minutes with llvm-objcopy prior to this patch.
Reviewers: MaskRay, jhenderson, jakehehrlich, alexshap, espindola
Reviewed By: MaskRay, jhenderson
Subscribers: echristo, emaste, arichardson, mgrang, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58296
llvm-svn: 354597
Summary:
This is to be consistent with the display of other MIPS section types.
This string is also used by binutils-gdb/binutils/readelf.c:get_mips_section_type_name
Since we are here, reorder the two enum constatns because SHT_MIPS_DWARF < SHT_MIPS_ABIFLAGS.
Reviewers: jhenderson, atanasyan
Reviewed By: jhenderson
Subscribers: aprantl, sdardis, arichardson, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58496
llvm-svn: 354571
This patch adds a lookup table to speed up resource queries in the ResourceManager.
This patch also moves helper function 'getResourceStateIndex()' from
ResourceManager.cpp to Support.h, so that we can reuse that logic in the
SummaryView (and potentially other views in llvm-mca).
No functional change intended.
llvm-svn: 354470
Summary:
Rename MemoryIndex to InitFlags and implement logic for determining
data segment layout in ObjectYAML and MC. Also adds a "passive" flag
for the .section assembler directive although this cannot be assembled
yet because the assembler does not support data sections.
Reviewers: sbc100, aardappel, aheejin, dschuff
Subscribers: jgravelle-google, hiraditya, sunfish, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57938
llvm-svn: 354397
The patch adds support for --hash-filenames to llvm-cov. This option adds md5
hash of the source path to the name of the generated .gcov file. The option is
crucial for cases where you have multiple files with the same name but can't
use --preserve-paths as resulting filenames exceed the limit.
from gcov(1):
```
-x
--hash-filenames
By default, gcov uses the full pathname of the source files to to
create an output filename. This can lead to long filenames that
can overflow filesystem limits. This option creates names of the
form source-file##md5.gcov, where the source-file component is
the final filename part and the md5 component is calculated from
the full mangled name that would have been used otherwise.
```
Patch by Igor Ignatev!
Differential Revision: https://reviews.llvm.org/D58370
llvm-svn: 354379
- Tests that use multiple short switches now test them grouped and ungrouped.
- Ensure the output of ungrouped and grouped variants is identical
Differential Revision: https://reviews.llvm.org/D57904
llvm-svn: 354375
Fix:
Replace
assert(!IO.getContext() && "The IO context is initialized already");
with
assert(IO.getContext() && "The IO context is not initialized");
(this was introduced in r354329, where I tried to quickfix the darwin BB
and seems copypasted the assert from the wrong place).
Original commit message:
The section is described here:
https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symverrqmts.html
Patch just teaches obj2yaml/yaml2obj to dump and parse such sections.
We did the finalization of string tables very late,
and I had to move the logic to make it a bit earlier.
That was needed in this patch since .gnu.version_r adds strings to .dynstr.
This might also be useful for implementing other special sections.
Everything else changed in this patch seems to be straightforward.
Differential revision: https://reviews.llvm.org/D58119
llvm-svn: 354335
The section is described here:
https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symverrqmts.html
Patch just teaches obj2yaml/yaml2obj to dump and parse such sections.
We did the finalization of string tables very late,
and I had to move the logic to make it a bit earlier.
That was needed in this patch since .gnu.version_r adds strings to .dynstr.
This might also be useful for implementing other special sections.
Everything else changed in this patch seems to be straightforward.
Differential revision: https://reviews.llvm.org/D58119
llvm-svn: 354328
This is for -D -reloc combination.
With this patch, we do not skip the zero bytes that have a relocation against
them when -reloc is used. If -reloc is not used, then the behavior will be the same.
Differential revision: https://reviews.llvm.org/D58174
llvm-svn: 354319
Summary:
This change fixes the `-no-llvm-bc` flag to work with object files within
archives. Currently the `-no-llvm-bc` flag works for regular object files, but
not static libraries, where it continues to show bitcode symbol info.
Original support was added in D4371.
Reviewers: compnerd, smeenai, pcc
Reviewed By: compnerd
Subscribers: rupprecht, keith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D48798
llvm-svn: 354196
Summary:
GNU ar has a `P` modifier that changes filename comparisons to use full paths instead of the basename. As noted in the GNU docs, regular archives are not created with full path names, so P is used to deal with archives created by other archive programs (e.g. see the updated `absolute-paths.test` test case).
Since thin archives use full path names -- paths are relative to the archive -- it seems very error prone to not imply P when dealing with thin archives, so P is implied in those cases. (I think this is a deviation from GNU ar that makes sense).
This fixes PR37436 via https://github.com/ClangBuiltLinux/linux/issues/33.
Reviewers: mstorsjo, pcc, ruiu, davide, david2050, rnk
Subscribers: tpimh, llvm-commits, nickdesaulniers
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57927
llvm-svn: 354044
Summary:
When adding one thin archive to another, we currently chop off the relative path to the flattened members. For instance, when adding `foo/child.a` (which contains `x.txt`) to `parent.a`, when flattening it we should add it as `foo/x.txt` (which exists) instead of `x.txt` (which does not exist).
As a note, this also undoes the `IsNew` parameter of handling relative paths in r288280. The unit test there still passes.
This was reported as part of testing the kernel build with llvm-ar: https://patchwork.kernel.org/patch/10767545/ (see the second point).
Reviewers: mstorsjo, pcc, ruiu, davide, david2050, inglorion
Reviewed By: ruiu
Subscribers: void, jdoerfert, tpimh, mgorny, hans, nickdesaulniers, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57842
llvm-svn: 353995