1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

10264 Commits

Author SHA1 Message Date
Alexey Lapshin
279a200561 [DebugInfo] add SectionedAddress to DebugInfo interfaces.
That patch is the fix for https://bugs.llvm.org/show_bug.cgi?id=40703
   "wrong line number info for obj file compiled with -ffunction-sections"
   bug. The problem happened with only .o files. If object file contains
   several .text sections then line number information showed incorrectly.
   The reason for this is that DwarfLineTable could not detect section which
   corresponds to specified address(because address is the local to the
   section). And as the result it could not select proper sequence in the
   line table. The fix is to pass SectionIndex with the address. So that it
   would be possible to differentiate addresses from various sections. With
   this fix llvm-objdump shows correct line numbers for disassembled code.

   Differential review: https://reviews.llvm.org/D58194

llvm-svn: 354972
2019-02-27 13:17:36 +00:00
George Rimar
25f42156b9 [llvm-objcopy] - Check for invalidated relocations when removing a section.
This is https://bugs.llvm.org/show_bug.cgi?id=40818

Removing a section that is used by relocation is an error
we did not report. The patch fixes that.

Differential revision: https://reviews.llvm.org/D58625

llvm-svn: 354962
2019-02-27 11:18:27 +00:00
James Henderson
bc60bd01e4 [llvm-readobj]Fix error messages for bad archive members and add testing for archive handling
llvm-readobj's error messages were broken for bad archive members. This
patch fixes them, and also adds testing for archive and thin archive
handling within llvm-readobj.

Reviewed by: rupprecht, grimar, higuoxing

Differential Revision: https://reviews.llvm.org/D58681

llvm-svn: 354960
2019-02-27 11:07:08 +00:00
Simon Pilgrim
42fad8ef25 Fix Wenum-compare gcc7 warning. NFCI.
llvm-svn: 354958
2019-02-27 10:19:53 +00:00
Fangrui Song
c8c777b508 [llvm-readobj] Print DF_1_DISPRELPND
The test will be added by D58677.

llvm-svn: 354955
2019-02-27 05:37:11 +00:00
Vlad Tsyrklevich
3def5ac311 Revert "[PGO] Context sensitive PGO (part 1)"
This reverts commit r354930, it was causing UBSan failures.

llvm-svn: 354953
2019-02-27 03:45:28 +00:00
Rong Xu
a0b71feeed [PGO] Context sensitive PGO (part 1)
Current PGO profile counts are not context sensitive. The branch probabilities
for the inlined functions are kept the same for all call-sites, and they might
be very different from the actual branch probabilities. These suboptimal
profiles can greatly affect some downstream optimizations, in particular for
the machine basic block placement optimization.

In this patch, we propose to have a post-inline PGO instrumentation/use pass,
which we called Context Sensitive PGO (CSPGO). For the users who want the best
possible performance, they can perform a second round of PGO instrument/use on
the top of the regular PGO. They will have two sets of profile counts. The
first pass profile will be manly for inline, indirect-call promotion, and
CGSCC simplification pass optimizations. The second pass profile is for
post-inline optimizations and code-gen optimizations.

A typical usage:
// Regular PGO instrumentation and generate pass1 profile.
> clang -O2 -fprofile-generate source.c -o gen
> ./gen
> llvm-profdata merge default.*profraw -o pass1.profdata
// CSPGO instrumentation.
> clang -O2 -fprofile-use=pass1.profdata -fcs-profile-generate -o gen2
> ./gen2
// Merge two sets of profiles
> llvm-profdata merge default.*profraw pass1.profdata -o profile.profdata
// Use the combined profile. Pass manager will invoke two PGO use passes.
> clang -O2 -fprofile-use=profile.profdata -o use

This change touches many components in the compiler. The reviewed patch
(D54175) will committed in phrases.

Differential Revision: https://reviews.llvm.org/D54175

llvm-svn: 354930
2019-02-26 22:37:46 +00:00
Xing GUO
ab7e1b886f [llvm-objdump] Add Version Definitions dumper
Summary: `llvm-objdump` needs a `Version Definitions` dumper.

Reviewers: grimar, jhenderson

Reviewed By: grimar, jhenderson

Subscribers: rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58615

llvm-svn: 354871
2019-02-26 13:06:16 +00:00
Igor Kudrin
4ca8c7b7fa [llvm-objdump] Implement -Mreg-names-raw/-std options.
The --disassembler-options, or -M, are used to customize
the disassembler and affect its output.

The two implemented options allow selecting register names on ARM:
* With -Mreg-names-raw, the disassembler uses rNN for all registers.
* With -Mreg-names-std it prints sp, lr and pc for r13, r14 and r15,
  which is the default behavior of llvm-objdump.

Differential Revision: https://reviews.llvm.org/D57680

llvm-svn: 354870
2019-02-26 12:15:14 +00:00
Clement Courbet
05fe47d6a1 [llvm-exegesis] Teach llvm-exegesis to handle instructions with multiple tied variables.
Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58285

llvm-svn: 354862
2019-02-26 10:54:45 +00:00
Eugene Leviant
1850b79062 [llvm-objcopy] Add --set-start, --change-start and --adjust-start
Differential revision: https://reviews.llvm.org/D58173

llvm-svn: 354854
2019-02-26 09:24:22 +00:00
Vlad Tsyrklevich
257ec8cb69 Revert "Improve "llvm-nm -f sysv" output for Elf files"
This reverts commit r354833, it was causing ASan test failures on
sanitizer-x86_64-linux-fast.

llvm-svn: 354849
2019-02-26 07:04:56 +00:00
Sunil Srivastava
7c3181b5bd Improve "llvm-nm -f sysv" output for Elf files
Specifically, compute and Print Type and Section columns.

Differential Revision: https://reviews.llvm.org/D58263

llvm-svn: 354833
2019-02-26 00:19:39 +00:00
Eugene Leviant
ea2ba5133c [llvm-objcopy] Add --add-symbol
Differential revision: https://reviews.llvm.org/D58234

llvm-svn: 354787
2019-02-25 14:12:41 +00:00
Xing GUO
8acff0230c [llvm-objdump] Add Version References dumper
Summary: Add symbol version dumper for [#30241](https://bugs.llvm.org/show_bug.cgi?id=30241)

Reviewers: jhenderson, MaskRay, kristina, emaste, grimar

Reviewed By: jhenderson, grimar

Subscribers: grimar, rupprecht, jakehehrlich, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D54697

llvm-svn: 354782
2019-02-25 13:13:19 +00:00
James Henderson
841d01b0ae [yaml2obj]Re-allow dynamic sections to have raw content
Recently, support was added to yaml2obj to allow dynamic sections to
have a list of entries, to make it easier to write tests with dynamic
sections. However, this change also removed the ability to provide
custom contents to the dynamic section, making it hard to test
malformed contents (e.g. because the section is not a valid size to
contain an array of entries). This change reinstates this. An error is
emitted if raw content and dynamic entries are both specified.

Reviewed by: grimar, ruiu

Differential Review: https://reviews.llvm.org/D58543

llvm-svn: 354770
2019-02-25 11:02:24 +00:00
Roman Lebedev
8691a62a19 [llvm-exegesis] Split Epsilon param into two (PR40787)
Summary:
This eps param is used for two distinct things:
* initial point clusterization
* checking clusters against the llvm values

What if one wants to only look at highly different clusters, without changing
the clustering itself? In particular, this helps to weed out noisy measurements
(since the clusterization epsilon is still small, so there is a better chance
that noisy measurements from the same opcode will go into different clusters)

By splitting it into two params it is now possible.

This is nearly-free performance-wise:
Old:
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency-1.yaml -analysis-inconsistencies-output-file=/tmp/clusters-old.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 10099 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'
...
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency-1.yaml -analysis-inconsistencies-output-file=/tmp/clusters-old.html' (25 runs):

            390.01 msec task-clock                #    0.998 CPUs utilized            ( +-  0.25% )
                12      context-switches          #   31.735 M/sec                    ( +- 27.38% )
                 0      cpu-migrations            #    0.000 K/sec
              4745      page-faults               # 12183.732 M/sec                   ( +-  0.54% )
        1562711900      cycles                    # 4012303.327 GHz                   ( +-  0.24% )  (82.90%)
         185567822      stalled-cycles-frontend   #   11.87% frontend cycles idle     ( +-  0.52% )  (83.30%)
         392106234      stalled-cycles-backend    #   25.09% backend cycles idle      ( +-  1.31% )  (33.79%)
        1839236666      instructions              #    1.18  insn per cycle
                                                  #    0.21  stalled cycles per insn  ( +-  0.15% )  (50.37%)
         407035764      branches                  # 1045074878.710 M/sec              ( +-  0.12% )  (66.80%)
          10896459      branch-misses             #    2.68% of all branches          ( +-  0.17% )  (83.20%)

          0.390629 +- 0.000972 seconds time elapsed  ( +-  0.25% )
```
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency.yml -analysis-inconsistencies-output-file=/tmp/clusters-old.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 50572 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'
...
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency.yml -analysis-inconsistencies-output-file=/tmp/clusters-old.html' (9 runs):

           6803.36 msec task-clock                #    0.999 CPUs utilized            ( +-  0.96% )
               262      context-switches          #   38.546 M/sec                    ( +- 23.06% )
                 0      cpu-migrations            #    0.065 M/sec                    ( +- 76.03% )
             13287      page-faults               # 1953.206 M/sec                    ( +-  0.32% )
       27252537904      cycles                    # 4006024.257 GHz                   ( +-  0.95% )  (83.31%)
        1496314935      stalled-cycles-frontend   #    5.49% frontend cycles idle     ( +-  0.97% )  (83.32%)
       16128404524      stalled-cycles-backend    #   59.18% backend cycles idle      ( +-  0.30% )  (33.37%)
       17611143370      instructions              #    0.65  insn per cycle
                                                  #    0.92  stalled cycles per insn  ( +-  0.05% )  (50.04%)
        3894906599      branches                  # 572537147.437 M/sec               ( +-  0.03% )  (66.69%)
         116314514      branch-misses             #    2.99% of all branches          ( +-  0.20% )  (83.35%)

            6.8118 +- 0.0689 seconds time elapsed  ( +-  1.01%)
```
New:
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency-1.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 10099 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new.html'
...
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency-1.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new.html' (25 runs):

            400.14 msec task-clock                #    0.998 CPUs utilized            ( +-  0.66% )
                12      context-switches          #   29.429 M/sec                    ( +- 25.95% )
                 0      cpu-migrations            #    0.100 M/sec                    ( +-100.00% )
              4714      page-faults               # 11796.496 M/sec                   ( +-  0.55% )
        1603131306      cycles                    # 4011840.105 GHz                   ( +-  0.66% )  (82.85%)
         199538509      stalled-cycles-frontend   #   12.45% frontend cycles idle     ( +-  2.40% )  (83.10%)
         402249109      stalled-cycles-backend    #   25.09% backend cycles idle      ( +-  1.19% )  (34.05%)
        1847783963      instructions              #    1.15  insn per cycle
                                                  #    0.22  stalled cycles per insn  ( +-  0.18% )  (50.64%)
         407162722      branches                  # 1018925730.631 M/sec              ( +-  0.12% )  (67.02%)
          10932779      branch-misses             #    2.69% of all branches          ( +-  0.51% )  (83.28%)

           0.40077 +- 0.00267 seconds time elapsed  ( +-  0.67% )

lebedevri@pini-pini:/build/llvm-build-Clang-release$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency.yml -analysis-inconsistencies-output-file=/tmp/clusters-new.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 50572 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new.html'
...
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency.yml -analysis-inconsistencies-output-file=/tmp/clusters-new.html' (9 runs):

           6947.79 msec task-clock                #    1.000 CPUs utilized            ( +-  0.90% )
               217      context-switches          #   31.236 M/sec                    ( +- 36.16% )
                 1      cpu-migrations            #    0.096 M/sec                    ( +- 50.00% )
             13258      page-faults               # 1908.389 M/sec                    ( +-  0.34% )
       27830796523      cycles                    # 4006032.286 GHz                   ( +-  0.89% )  (83.30%)
        1504554006      stalled-cycles-frontend   #    5.41% frontend cycles idle     ( +-  2.10% )  (83.32%)
       16716574843      stalled-cycles-backend    #   60.07% backend cycles idle      ( +-  0.65% )  (33.38%)
       17755545931      instructions              #    0.64  insn per cycle
                                                  #    0.94  stalled cycles per insn  ( +-  0.09% )  (50.04%)
        3897255686      branches                  # 560980426.597 M/sec               ( +-  0.06% )  (66.70%)
         117045395      branch-misses             #    3.00% of all branches          ( +-  0.47% )  (83.34%)

            6.9507 +- 0.0627 seconds time elapsed  ( +-  0.90% )
```

I.e. it's +2.6% slowdown for one whole sweep, or +2% for 5 whole sweeps.
Within noise i'd say.

Should help with [[ https://bugs.llvm.org/show_bug.cgi?id=40787 | PR40787 ]].

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, RKSimon, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58476

llvm-svn: 354767
2019-02-25 09:36:12 +00:00
Roman Lebedev
0ea4992de1 [XRay][tools] Revert "Use Support/JSON.h in llvm-xray convert"
Summary:
This reverts D50129 / rL338834: [XRay][tools] Use Support/JSON.h in llvm-xray convert

Abstractions are great.
Readable code is great.
JSON support library is a *good* idea.

However unfortunately, there is an internal detail that one needs
to be aware of in `llvm::json::Object` - it uses `llvm::DenseMap`.
So for **every** `llvm::json::Object`, even if you only store a single `int`
entry there, you pay the whole price of `llvm::DenseMap`.

Unfortunately, it matters for `llvm-xray`.

I was trying to analyse the `llvm-exegesis` analysis mode performance,
and for that i wanted to view the LLVM X-Ray log visualization in Chrome
trace viewer. And the `llvm-xray convert` is sluggish, and sometimes
even ended up being killed by OOM.

`xray-log.llvm-exegesis.lwZ0sT` was acquired from `llvm-exegesis`
(compiled with ` -fxray-instruction-threshold=128`)
analysis mode over `-benchmarks-file` with 10099 points (one full
latency measurement set), with normal runtime of 0.387s.

Timings:
Old: (copied from D58580)
```
$ perf stat -r 5 ./bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT

 Performance counter stats for './bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT' (5 runs):

          21346.24 msec task-clock                #    1.000 CPUs utilized            ( +-  0.28% )
               314      context-switches          #   14.701 M/sec                    ( +- 59.13% )
                 1      cpu-migrations            #    0.037 M/sec                    ( +-100.00% )
           2181354      page-faults               # 102191.251 M/sec                  ( +-  0.02% )
       85477442102      cycles                    # 4004415.019 GHz                   ( +-  0.28% )  (83.33%)
       14526427066      stalled-cycles-frontend   #   16.99% frontend cycles idle     ( +-  0.70% )  (83.33%)
       32371533721      stalled-cycles-backend    #   37.87% backend cycles idle      ( +-  0.27% )  (33.34%)
       67896890228      instructions              #    0.79  insn per cycle
                                                  #    0.48  stalled cycles per insn  ( +-  0.03% )  (50.00%)
       14592654840      branches                  # 683631198.653 M/sec               ( +-  0.02% )  (66.67%)
         212207534      branch-misses             #    1.45% of all branches          ( +-  0.94% )  (83.34%)

           21.3502 +- 0.0585 seconds time elapsed  ( +-  0.27% )
```
New:
```
$ perf stat -r 9 ./bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT

 Performance counter stats for './bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT' (9 runs):

           7178.38 msec task-clock                #    1.000 CPUs utilized            ( +-  0.26% )
               182      context-switches          #   25.402 M/sec                    ( +- 28.84% )
                 0      cpu-migrations            #    0.046 M/sec                    ( +- 70.71% )
             33701      page-faults               # 4694.994 M/sec                    ( +-  0.88% )
       28761053971      cycles                    # 4006833.933 GHz                   ( +-  0.26% )  (83.32%)
        2028297997      stalled-cycles-frontend   #    7.05% frontend cycles idle     ( +-  1.61% )  (83.32%)
       10773154901      stalled-cycles-backend    #   37.46% backend cycles idle      ( +-  0.38% )  (33.36%)
       36199132874      instructions              #    1.26  insn per cycle
                                                  #    0.30  stalled cycles per insn  ( +-  0.03% )  (50.02%)
        6434504227      branches                  # 896420204.421 M/sec               ( +-  0.03% )  (66.68%)
          73355176      branch-misses             #    1.14% of all branches          ( +-  1.46% )  (83.33%)

            7.1807 +- 0.0190 seconds time elapsed  ( +-  0.26% )
```

So using `llvm::json` nearly triples run-time on that test case.
(+3x is times, not percent.)

Memory:
Old:
```
total runtime: 39.88s.
bytes allocated in total (ignoring deallocations): 79.07GB (1.98GB/s)
calls to allocation functions: 33267816 (834135/s)
temporary memory allocations: 5832298 (146235/s)
peak heap memory consumption: 9.21GB
peak RSS (including heaptrack overhead): 147.98GB
total memory leaked: 1.09MB
```
New:
```
total runtime: 17.42s.
bytes allocated in total (ignoring deallocations): 5.12GB (293.86MB/s)
calls to allocation functions: 21382982 (1227284/s)
temporary memory allocations: 232858 (13364/s)
peak heap memory consumption: 350.69MB
peak RSS (including heaptrack overhead): 2.55GB
total memory leaked: 79.95KB
```
Diff:
```
total runtime: -22.46s.
bytes allocated in total (ignoring deallocations): -73.95GB (3.29GB/s)
calls to allocation functions: -11884834 (529155/s)
temporary memory allocations: -5599440 (249307/s)
peak heap memory consumption: -8.86GB
peak RSS (including heaptrack overhead): 0B
total memory leaked: -1.01MB
```
So using `llvm::json` increases *peak* memory consumption on *this* testcase ~+27x.
And total allocation count +15x. Both of these numbers are times, *not* percent.

And note that memory usage is clearly unbound with `llvm::json`, it directly depends
on the length of the log, so peak memory consumption is always increasing.
This isn't so with the dumb code, there is no accumulating memory consumption,
peak memory consumption is fixed. Naturally, that means it will handle *much*
larger logs without OOM'ing.

Readability is good, but the price is simply unacceptable here.
Too bad none of this analysis was done as part of the development/review D50129 itself.

Reviewers: dberris, kpw, sammccall

Reviewed By: dberris

Subscribers: riccibruno, hans, courbet, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58584

llvm-svn: 354764
2019-02-25 07:39:07 +00:00
George Rimar
ec3690fdaf [obj2yaml] - Do not miss section index for special symbols.
This fixes https://bugs.llvm.org/show_bug.cgi?id=40786 
("obj2yaml symbol output missing section index for SHN_ABS and SHN_COMMON symbols")

Since SHN_ABS and SHN_COMMON symbols are special, we should preserve
the st_shndx for them. The patch does this for them and the other special symbols.

The test case is based on the test provided by James Henderson at the bug page!

Differential revision: https://reviews.llvm.org/D58498

llvm-svn: 354661
2019-02-22 08:45:21 +00:00
Jordan Rupprecht
b79b906ee1 [llvm-objcopy][NFC] Add std::move() to fix older BB
llvm-svn: 354603
2019-02-21 17:24:55 +00:00
Jordan Rupprecht
b93a103dbb [llvm-objcopy][NFC] More error cleanup
Summary:
This removes calls to `error()`/`reportError()` in the main driver (llvm-objcopy.cpp) as well as the associated argv-parsing (CopyConfig.cpp). `logAllUnhandledErrors()` is now the main way to print errors.

There are still a few uses from within the per-arch drivers, so we can't delete them yet... but almost!

Reviewers: jhenderson, alexshap, espindola

Reviewed By: jhenderson

Subscribers: emaste, arichardson, jakehehrlich, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58316

llvm-svn: 354600
2019-02-21 17:05:19 +00:00
Jordan Rupprecht
d23e0bcdb9 [llvm-objcopy] Make removeSectionReferences batched
Summary:
Removing a large number of sections from a file with a lot of symbols can have abysmal (i.e. O(n^2)) performance, e.g. when running `--only-section` to extract one section out of a large file.

This comes from iterating over all symbols in the symbol table each time we remove a section, to remove references to the section we just removed.
Instead, do just one pass of symbol removal by passing a hash set of all the sections we'd like to remove references to.

This fixes a regression when running llvm-objcopy -j <one section> on an object file with many sections and symbols -- on my machine, running `objcopy -j .keep_me huge-input.o /tmp/foo.o` takes .3s with GNU objcopy, 1.3s with an updated llvm-objcopy, and 7+ minutes with llvm-objcopy prior to this patch.

Reviewers: MaskRay, jhenderson, jakehehrlich, alexshap, espindola

Reviewed By: MaskRay, jhenderson

Subscribers: echristo, emaste, arichardson, mgrang, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58296

llvm-svn: 354597
2019-02-21 16:45:42 +00:00
George Rimar
b6a05e83ac [yaml2obj][obj2yaml] - Support SHT_GNU_verdef (.gnu.version_d) section.
This patch adds support for parsing/dumping the .gnu.version section.

Description of the section is: https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symverdefs.html

Differential revision: https://reviews.llvm.org/D58437

llvm-svn: 354574
2019-02-21 12:21:43 +00:00
Fangrui Song
a13d56908a [llvm-readobj] Change "SHT_MIPS_DWARF" to "MIPS_DWARF"
Summary:
This is to be consistent with the display of other MIPS section types.
This string is also used by binutils-gdb/binutils/readelf.c:get_mips_section_type_name

Since we are here, reorder the two enum constatns because SHT_MIPS_DWARF < SHT_MIPS_ABIFLAGS.

Reviewers: jhenderson, atanasyan

Reviewed By: jhenderson

Subscribers: aprantl, sdardis, arichardson, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58496

llvm-svn: 354571
2019-02-21 11:35:41 +00:00
Fangrui Song
6412e8e039 Fix file header issues in fuzzers. NFC
llvm-svn: 354551
2019-02-21 07:57:14 +00:00
Fangrui Song
34a4c8f773 Fix some include order and file headers issues. NFC
llvm-svn: 354550
2019-02-21 07:42:31 +00:00
Andrea Di Biagio
8cb6c4646d [MCA][ResourceManager] Add a table that maps processor resource indices to processor resource identifiers.
This patch adds a lookup table to speed up resource queries in the ResourceManager.
This patch also moves helper function 'getResourceStateIndex()' from
ResourceManager.cpp to Support.h, so that we can reuse that logic in the
SummaryView (and potentially other views in llvm-mca).
No functional change intended.

llvm-svn: 354470
2019-02-20 14:53:18 +00:00
Hans Wennborg
25bdbd561c Fix the build with gcc/libstdc++ 4.8.2 after r354441
llvm-svn: 354469
2019-02-20 14:50:08 +00:00
George Rimar
4a031d3273 [yaml2elf] - Rename a variable. NFC.
Was suggested during review of D58441.

llvm-svn: 354463
2019-02-20 14:01:02 +00:00
George Rimar
5e2d8305b6 [yaml2obj] - Simplify implementation. NFCI.
Knowing about how types are declared for 32/64 bit platforms:
https://github.com/llvm-mirror/llvm/blob/master/include/llvm/BinaryFormat/ELF.h#L28

it is possible to simplify code that writes a binary a bit.
The patch does that.

Differential revision: https://reviews.llvm.org/D58441

llvm-svn: 354462
2019-02-20 13:58:43 +00:00
Roman Lebedev
2a88b46050 [llvm-exegesis] Opcode stabilization / reclusterization (PR40715)
Summary:
Given an instruction `Opcode`, we can make benchmarks (measurements) of the
instruction characteristics/performance. Then, to facilitate further analysis
we group the benchmarks with *similar* characteristics into clusters.
Now, this is all not entirely deterministic. Some instructions have variable
characteristics, depending on their arguments. And thus, if we do several
benchmarks of the same instruction `Opcode`, we may end up with *different*
performance characteristics measurements. And when we then do clustering,
these several benchmarks of the same instruction `Opcode` may end up being
clustered into *different* clusters. This is not great for further analysis.

We shall find every `Opcode` with benchmarks not in just one cluster, and move
*all* the benchmarks of said `Opcode` into one new unstable cluster per `Opcode`.

I have solved this by making `ClusterId` a bit field, adding a `IsUnstable` bit,
and introducing `-analysis-display-unstable-clusters` switch to toggle between
displaying stable-only clusters and unstable-only clusters.

The reclusterization is deterministically stable, produces identical reports
between runs. (Or at least that is what i'm seeing, maybe it isn't)

Timings/comparisons:
old (current trunk/head) {F8303582}
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-old.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'

 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-old.html' (25 runs):

           6624.73 msec task-clock                #    0.999 CPUs utilized            ( +-  0.53% )
               172      context-switches          #   25.965 M/sec                    ( +- 29.89% )
                 0      cpu-migrations            #    0.042 M/sec                    ( +- 56.54% )
             31073      page-faults               # 4690.754 M/sec                    ( +-  0.08% )
       26538711696      cycles                    # 4006230.292 GHz                   ( +-  0.53% )  (83.31%)
        2017496807      stalled-cycles-frontend   #    7.60% frontend cycles idle     ( +-  0.93% )  (83.32%)
       13403650062      stalled-cycles-backend    #   50.51% backend cycles idle      ( +-  0.33% )  (33.37%)
       19770706799      instructions              #    0.74  insn per cycle
                                                  #    0.68  stalled cycles per insn  ( +-  0.04% )  (50.04%)
        4419821812      branches                  # 667207369.714 M/sec               ( +-  0.03% )  (66.69%)
         121741669      branch-misses             #    2.75% of all branches          ( +-  0.28% )  (83.34%)

            6.6283 +- 0.0358 seconds time elapsed  ( +-  0.54% )
```

patch, with reclustering but without filtering (i.e. outputting all the stable *and* unstable clusters) {F8303586}
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-all.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-all.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-all.html'

 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-all.html' (25 runs):

           6475.29 msec task-clock                #    0.999 CPUs utilized            ( +-  0.31% )
               213      context-switches          #   32.952 M/sec                    ( +- 23.81% )
                 1      cpu-migrations            #    0.130 M/sec                    ( +- 43.84% )
             31287      page-faults               # 4832.057 M/sec                    ( +-  0.08% )
       25939086577      cycles                    # 4006160.279 GHz                   ( +-  0.31% )  (83.31%)
        1958812858      stalled-cycles-frontend   #    7.55% frontend cycles idle     ( +-  0.68% )  (83.32%)
       13218961512      stalled-cycles-backend    #   50.96% backend cycles idle      ( +-  0.29% )  (33.37%)
       19752995402      instructions              #    0.76  insn per cycle
                                                  #    0.67  stalled cycles per insn  ( +-  0.04% )  (50.04%)
        4417079244      branches                  # 682195472.305 M/sec               ( +-  0.03% )  (66.70%)
         121510065      branch-misses             #    2.75% of all branches          ( +-  0.19% )  (83.34%)

            6.4832 +- 0.0229 seconds time elapsed  ( +-  0.35% )
```
Funnily, *this* measurement shows that said reclustering actually improved performance.

patch, with reclustering, only the stable clusters {F8303594}
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-stable.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-stable.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-stable.html'

 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-stable.html' (25 runs):

           6387.71 msec task-clock                #    0.999 CPUs utilized            ( +-  0.13% )
               133      context-switches          #   20.792 M/sec                    ( +- 23.39% )
                 0      cpu-migrations            #    0.063 M/sec                    ( +- 61.24% )
             31318      page-faults               # 4903.256 M/sec                    ( +-  0.08% )
       25591984967      cycles                    # 4006786.266 GHz                   ( +-  0.13% )  (83.31%)
        1881234904      stalled-cycles-frontend   #    7.35% frontend cycles idle     ( +-  0.25% )  (83.33%)
       13209749965      stalled-cycles-backend    #   51.62% backend cycles idle      ( +-  0.16% )  (33.36%)
       19767554347      instructions              #    0.77  insn per cycle
                                                  #    0.67  stalled cycles per insn  ( +-  0.04% )  (50.03%)
        4417480305      branches                  # 691618858.046 M/sec               ( +-  0.03% )  (66.68%)
         118676358      branch-misses             #    2.69% of all branches          ( +-  0.07% )  (83.33%)

            6.3954 +- 0.0118 seconds time elapsed  ( +-  0.18% )
```
Performance improved even further?! Makes sense i guess, less clusters to print.

patch, with reclustering, only the unstable clusters {F8303601}
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-unstable.html -analysis-display-unstable-clusters
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-unstable.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 43970 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new-unstable.html'

 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-unstable.html -analysis-display-unstable-clusters' (25 runs):

           6124.96 msec task-clock                #    1.000 CPUs utilized            ( +-  0.20% )
               194      context-switches          #   31.709 M/sec                    ( +- 20.46% )
                 0      cpu-migrations            #    0.039 M/sec                    ( +- 49.77% )
             31413      page-faults               # 5129.261 M/sec                    ( +-  0.06% )
       24536794267      cycles                    # 4006425.858 GHz                   ( +-  0.19% )  (83.31%)
        1676085087      stalled-cycles-frontend   #    6.83% frontend cycles idle     ( +-  0.46% )  (83.32%)
       13035595603      stalled-cycles-backend    #   53.13% backend cycles idle      ( +-  0.16% )  (33.36%)
       18260877653      instructions              #    0.74  insn per cycle
                                                  #    0.71  stalled cycles per insn  ( +-  0.05% )  (50.03%)
        4112411983      branches                  # 671484364.603 M/sec               ( +-  0.03% )  (66.68%)
         114066929      branch-misses             #    2.77% of all branches          ( +-  0.11% )  (83.32%)

            6.1278 +- 0.0121 seconds time elapsed  ( +-  0.20% )
```
This tells us that the actual `-analysis-inconsistencies-output-file=` outputting only takes ~0.4 sec for 43970 benchmark points (3 whole sweeps)
(Also, wow this is fast, it used to take several minutes originally)

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=40715 | PR40715 ]].

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, jdoerfert, llvm-commits, RKSimon

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58355

llvm-svn: 354441
2019-02-20 09:14:04 +00:00
Thomas Lively
065031a098 [WebAssembly] Update MC for bulk memory
Summary:
Rename MemoryIndex to InitFlags and implement logic for determining
data segment layout in ObjectYAML and MC. Also adds a "passive" flag
for the .section assembler directive although this cannot be assembled
yet because the assembler does not support data sections.

Reviewers: sbc100, aardappel, aheejin, dschuff

Subscribers: jgravelle-google, hiraditya, sunfish, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57938

llvm-svn: 354397
2019-02-19 22:56:19 +00:00
Vedant Kumar
c8387a61d8 [llvm-cov] Add support for gcov --hash-filenames option
The patch adds support for --hash-filenames to llvm-cov. This option adds md5
hash of the source path to the name of the generated .gcov file. The option is
crucial for cases where you have multiple files with the same name but can't
use --preserve-paths as resulting filenames exceed the limit.

from gcov(1):

```
-x
--hash-filenames
    By default, gcov uses the full pathname of the source files to to
    create an output filename.  This can lead to long filenames that
    can overflow filesystem limits.  This option creates names of the
    form source-file##md5.gcov, where the source-file component is
    the final filename part and the md5 component is calculated from
    the full mangled name that would have been used otherwise.
```

Patch by Igor Ignatev!

Differential Revision: https://reviews.llvm.org/D58370

llvm-svn: 354379
2019-02-19 20:45:00 +00:00
Matthew Voss
3fb0589d43 Revert "Revert "[llvm-objdump] Allow short options without arguments to be grouped""
- Tests that use multiple short switches now test them grouped and ungrouped.

  - Ensure the output of ungrouped and grouped variants is identical

Differential Revision: https://reviews.llvm.org/D57904

llvm-svn: 354375
2019-02-19 19:46:08 +00:00
George Rimar
efc720024f [yaml2obj][obj2yaml] - Support SHT_GNU_versym (.gnu.version) section.
This patch adds support for parsing dumping the .gnu.version section.
Description of the section is: https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symversion.html#SYMVERTBL

Differential revision: https://reviews.llvm.org/D58280

llvm-svn: 354338
2019-02-19 15:29:07 +00:00
George Rimar
028b8cf90b Recommit r354328, r354329 "[obj2yaml][yaml2obj] - Add support of parsing/dumping of the .gnu.version_r section."
Fix:
Replace
assert(!IO.getContext() && "The IO context is initialized already");
with
assert(IO.getContext() && "The IO context is not initialized");
(this was introduced in r354329, where I tried to quickfix the darwin BB
and seems copypasted the assert from the wrong place).

Original commit message:

The section is described here:
https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symverrqmts.html

Patch just teaches obj2yaml/yaml2obj to dump and parse such sections.

We did the finalization of string tables very late,
and I had to move the logic to make it a bit earlier.
That was needed in this patch since .gnu.version_r adds strings to .dynstr.
This might also be useful for implementing other special sections.

Everything else changed in this patch seems to be straightforward.

Differential revision: https://reviews.llvm.org/D58119

llvm-svn: 354335
2019-02-19 14:53:48 +00:00
George Rimar
44d3bd8fa4 Revert r354328, r354329 "[obj2yaml][yaml2obj] - Add support of parsing/dumping of the .gnu.version_r section."
Something went wrong. Bots are unhappy:
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/44113/steps/test/logs/stdio

llvm-svn: 354332
2019-02-19 14:38:25 +00:00
George Rimar
1625c306b8 [obj2yaml][yaml2obj] - Add support of parsing/dumping of the .gnu.version_r section.
The section is described here:
https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symverrqmts.html

Patch just teaches obj2yaml/yaml2obj to dump and parse such sections.

We did the finalization of string tables very late,
and I had to move the logic to make it a bit earlier.
That was needed in this patch since .gnu.version_r adds strings to .dynstr.
This might also be useful for implementing other special sections.

Everything else changed in this patch seems to be straightforward.

Differential revision: https://reviews.llvm.org/D58119

llvm-svn: 354328
2019-02-19 14:03:14 +00:00
George Rimar
c43ce2fac3 [yaml2obj] - Do not skip zeroes blocks if there are relocations against them.
This is for -D -reloc combination.

With this patch, we do not skip the zero bytes that have a relocation against
them when -reloc is used. If -reloc is not used, then the behavior will be the same.

Differential revision: https://reviews.llvm.org/D58174

llvm-svn: 354319
2019-02-19 12:38:36 +00:00
George Rimar
35adabb3a5 [yaml2obj] - Do not ignore explicit addresses for .dynsym and .dynstr
This fixes https://bugs.llvm.org/show_bug.cgi?id=40339

Previously if the addresses were set in YAML they were ignored for
.dynsym and .dynstr sections. The patch fixes that.

Differential revision: https://reviews.llvm.org/D58168

llvm-svn: 354318
2019-02-19 12:15:04 +00:00
George Rimar
c54720f1f4 [llvm-readobj] - Simplify .gnu.version_d dumping.
This is similar to D58048.

Instead of scanning the dynamic table to read the
DT_VERDEFNUM, we could take it from the sh_info field.
(https://docs.oracle.com/cd/E19683-01/816-1386/chapter6-94076/index.html)

The patch does this.

llvm-svn: 354270
2019-02-18 13:58:12 +00:00
Dave Lee
58afb80dde llvm-nm: Observe -no-llvm-bc for archive members
Summary:
This change fixes the `-no-llvm-bc` flag to work with object files within
archives. Currently the `-no-llvm-bc` flag works for regular object files, but
not static libraries, where it continues to show bitcode symbol info.

Original support was added in D4371.

Reviewers: compnerd, smeenai, pcc

Reviewed By: compnerd

Subscribers: rupprecht, keith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D48798

llvm-svn: 354196
2019-02-16 06:59:49 +00:00
Matt Davis
f5f5656288 [llvm-cxxfilt] Fix a comment typo. NFC.
llvm-svn: 354094
2019-02-15 02:43:14 +00:00
Jordan Rupprecht
e6ec8f30c9 [llvm-ar] Implement the P modifier.
Summary:
GNU ar has a `P` modifier that changes filename comparisons to use full paths instead of the basename. As noted in the GNU docs, regular archives are not created with full path names, so P is used to deal with archives created by other archive programs (e.g. see the updated `absolute-paths.test` test case).

Since thin archives use full path names -- paths are relative to the archive -- it seems very error prone to not imply P when dealing with thin archives, so P is implied in those cases. (I think this is a deviation from GNU ar that makes sense).

This fixes PR37436 via https://github.com/ClangBuiltLinux/linux/issues/33.

Reviewers: mstorsjo, pcc, ruiu, davide, david2050, rnk

Subscribers: tpimh, llvm-commits, nickdesaulniers

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57927

llvm-svn: 354044
2019-02-14 18:35:13 +00:00
Matthew Voss
e79e227d55 Revert "[llvm-objdump] Allow short options without arguments to be grouped"
Reverted due to failures on the llvm-hexagon-elf.

This reverts commit 77e1f27476c89f65eeb496d131065177e6417f23.

llvm-svn: 354002
2019-02-14 01:39:43 +00:00
Matthew Voss
174bace66f [llvm-objdump] Allow short options without arguments to be grouped
Summary:

https://bugs.llvm.org/show_bug.cgi?id=31679

Reviewers: kristina, jhenderson, grimar, jakehehrlich, rupprecht

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57904

llvm-svn: 353998
2019-02-14 00:39:40 +00:00
Jordan Rupprecht
088fd9a6ad [llvm-ar][libObject] Fix relative paths when nesting thin archives.
Summary:
When adding one thin archive to another, we currently chop off the relative path to the flattened members. For instance, when adding `foo/child.a` (which contains `x.txt`) to `parent.a`, when flattening it we should add it as `foo/x.txt` (which exists) instead of `x.txt` (which does not exist).

As a note, this also undoes the `IsNew` parameter of handling relative paths in r288280. The unit test there still passes.

This was reported as part of testing the kernel build with llvm-ar: https://patchwork.kernel.org/patch/10767545/ (see the second point).

Reviewers: mstorsjo, pcc, ruiu, davide, david2050, inglorion

Reviewed By: ruiu

Subscribers: void, jdoerfert, tpimh, mgorny, hans, nickdesaulniers, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57842

llvm-svn: 353995
2019-02-13 23:39:41 +00:00
Fangrui Song
aa6ef5a3d2 [llvm-readobj] Dump GNU_PROPERTY_X86_ISA_1_{NEEDED,USED} notes in .note.gnu.property
Reviewers: grimar, rupprecht

Reviewed By: rupprecht

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58175

llvm-svn: 353991
2019-02-13 23:18:05 +00:00
Fangrui Song
9ae96f09ec [llvm-readobj] Rename pr_data to PrData
As requested by grimar in D58112.

llvm-svn: 353951
2019-02-13 15:58:23 +00:00
Eugene Leviant
29576825e5 [llvm-objcopy] Add --strip-unneeded-symbol(s)
Differential revision: https://reviews.llvm.org/D58027

llvm-svn: 353919
2019-02-13 07:34:54 +00:00