llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

Author	SHA1	Message	Date
Alexey Lapshin	279a200561	[DebugInfo] add SectionedAddress to DebugInfo interfaces. That patch is the fix for https://bugs.llvm.org/show_bug.cgi?id=40703 "wrong line number info for obj file compiled with -ffunction-sections" bug. The problem happened with only .o files. If object file contains several .text sections then line number information showed incorrectly. The reason for this is that DwarfLineTable could not detect section which corresponds to specified address(because address is the local to the section). And as the result it could not select proper sequence in the line table. The fix is to pass SectionIndex with the address. So that it would be possible to differentiate addresses from various sections. With this fix llvm-objdump shows correct line numbers for disassembled code. Differential review: https://reviews.llvm.org/D58194 llvm-svn: 354972	2019-02-27 13:17:36 +00:00
George Rimar	25f42156b9	[llvm-objcopy] - Check for invalidated relocations when removing a section. This is https://bugs.llvm.org/show_bug.cgi?id=40818 Removing a section that is used by relocation is an error we did not report. The patch fixes that. Differential revision: https://reviews.llvm.org/D58625 llvm-svn: 354962	2019-02-27 11:18:27 +00:00
James Henderson	bc60bd01e4	[llvm-readobj]Fix error messages for bad archive members and add testing for archive handling llvm-readobj's error messages were broken for bad archive members. This patch fixes them, and also adds testing for archive and thin archive handling within llvm-readobj. Reviewed by: rupprecht, grimar, higuoxing Differential Revision: https://reviews.llvm.org/D58681 llvm-svn: 354960	2019-02-27 11:07:08 +00:00
Simon Pilgrim	42fad8ef25	Fix Wenum-compare gcc7 warning. NFCI. llvm-svn: 354958	2019-02-27 10:19:53 +00:00
Fangrui Song	c8c777b508	[llvm-readobj] Print DF_1_DISPRELPND The test will be added by D58677. llvm-svn: 354955	2019-02-27 05:37:11 +00:00
Vlad Tsyrklevich	3def5ac311	Revert "[PGO] Context sensitive PGO (part 1)" This reverts commit r354930, it was causing UBSan failures. llvm-svn: 354953	2019-02-27 03:45:28 +00:00
Rong Xu	a0b71feeed	[PGO] Context sensitive PGO (part 1) Current PGO profile counts are not context sensitive. The branch probabilities for the inlined functions are kept the same for all call-sites, and they might be very different from the actual branch probabilities. These suboptimal profiles can greatly affect some downstream optimizations, in particular for the machine basic block placement optimization. In this patch, we propose to have a post-inline PGO instrumentation/use pass, which we called Context Sensitive PGO (CSPGO). For the users who want the best possible performance, they can perform a second round of PGO instrument/use on the top of the regular PGO. They will have two sets of profile counts. The first pass profile will be manly for inline, indirect-call promotion, and CGSCC simplification pass optimizations. The second pass profile is for post-inline optimizations and code-gen optimizations. A typical usage: // Regular PGO instrumentation and generate pass1 profile. > clang -O2 -fprofile-generate source.c -o gen > ./gen > llvm-profdata merge default.profraw -o pass1.profdata // CSPGO instrumentation. > clang -O2 -fprofile-use=pass1.profdata -fcs-profile-generate -o gen2 > ./gen2 // Merge two sets of profiles > llvm-profdata merge default.profraw pass1.profdata -o profile.profdata // Use the combined profile. Pass manager will invoke two PGO use passes. > clang -O2 -fprofile-use=profile.profdata -o use This change touches many components in the compiler. The reviewed patch (D54175) will committed in phrases. Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 354930	2019-02-26 22:37:46 +00:00
Xing GUO	ab7e1b886f	[llvm-objdump] Add `Version Definitions` dumper Summary: `llvm-objdump` needs a `Version Definitions` dumper. Reviewers: grimar, jhenderson Reviewed By: grimar, jhenderson Subscribers: rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58615 llvm-svn: 354871	2019-02-26 13:06:16 +00:00
Igor Kudrin	4ca8c7b7fa	[llvm-objdump] Implement -Mreg-names-raw/-std options. The --disassembler-options, or -M, are used to customize the disassembler and affect its output. The two implemented options allow selecting register names on ARM: * With -Mreg-names-raw, the disassembler uses rNN for all registers. * With -Mreg-names-std it prints sp, lr and pc for r13, r14 and r15, which is the default behavior of llvm-objdump. Differential Revision: https://reviews.llvm.org/D57680 llvm-svn: 354870	2019-02-26 12:15:14 +00:00
Clement Courbet	05fe47d6a1	[llvm-exegesis] Teach llvm-exegesis to handle instructions with multiple tied variables. Reviewers: gchatelet Subscribers: tschuett, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58285 llvm-svn: 354862	2019-02-26 10:54:45 +00:00
Eugene Leviant	1850b79062	[llvm-objcopy] Add --set-start, --change-start and --adjust-start Differential revision: https://reviews.llvm.org/D58173 llvm-svn: 354854	2019-02-26 09:24:22 +00:00
Vlad Tsyrklevich	257ec8cb69	Revert "Improve "llvm-nm -f sysv" output for Elf files" This reverts commit r354833, it was causing ASan test failures on sanitizer-x86_64-linux-fast. llvm-svn: 354849	2019-02-26 07:04:56 +00:00
Sunil Srivastava	7c3181b5bd	Improve "llvm-nm -f sysv" output for Elf files Specifically, compute and Print Type and Section columns. Differential Revision: https://reviews.llvm.org/D58263 llvm-svn: 354833	2019-02-26 00:19:39 +00:00
Eugene Leviant	ea2ba5133c	[llvm-objcopy] Add --add-symbol Differential revision: https://reviews.llvm.org/D58234 llvm-svn: 354787	2019-02-25 14:12:41 +00:00
Xing GUO	8acff0230c	[llvm-objdump] Add `Version References` dumper Summary: Add symbol version dumper for [#30241](https://bugs.llvm.org/show_bug.cgi?id=30241) Reviewers: jhenderson, MaskRay, kristina, emaste, grimar Reviewed By: jhenderson, grimar Subscribers: grimar, rupprecht, jakehehrlich, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D54697 llvm-svn: 354782	2019-02-25 13:13:19 +00:00
James Henderson	841d01b0ae	[yaml2obj]Re-allow dynamic sections to have raw content Recently, support was added to yaml2obj to allow dynamic sections to have a list of entries, to make it easier to write tests with dynamic sections. However, this change also removed the ability to provide custom contents to the dynamic section, making it hard to test malformed contents (e.g. because the section is not a valid size to contain an array of entries). This change reinstates this. An error is emitted if raw content and dynamic entries are both specified. Reviewed by: grimar, ruiu Differential Review: https://reviews.llvm.org/D58543 llvm-svn: 354770	2019-02-25 11:02:24 +00:00
Roman Lebedev	8691a62a19	[llvm-exegesis] Split Epsilon param into two (PR40787) Summary: This eps param is used for two distinct things: * initial point clusterization * checking clusters against the llvm values What if one wants to only look at highly different clusters, without changing the clustering itself? In particular, this helps to weed out noisy measurements (since the clusterization epsilon is still small, so there is a better chance that noisy measurements from the same opcode will go into different clusters) By splitting it into two params it is now possible. This is nearly-free performance-wise: Old: ``` $ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency-1.yaml -analysis-inconsistencies-output-file=/tmp/clusters-old.html no exegesis target for x86_64-unknown-linux-gnu, using default Parsed 10099 benchmark points Printing sched class consistency analysis results to file '/tmp/clusters-old.html' ... Performance counter stats for './bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency-1.yaml -analysis-inconsistencies-output-file=/tmp/clusters-old.html' (25 runs): 390.01 msec task-clock # 0.998 CPUs utilized ( +- 0.25% ) 12 context-switches # 31.735 M/sec ( +- 27.38% ) 0 cpu-migrations # 0.000 K/sec 4745 page-faults # 12183.732 M/sec ( +- 0.54% ) 1562711900 cycles # 4012303.327 GHz ( +- 0.24% ) (82.90%) 185567822 stalled-cycles-frontend # 11.87% frontend cycles idle ( +- 0.52% ) (83.30%) 392106234 stalled-cycles-backend # 25.09% backend cycles idle ( +- 1.31% ) (33.79%) 1839236666 instructions # 1.18 insn per cycle # 0.21 stalled cycles per insn ( +- 0.15% ) (50.37%) 407035764 branches # 1045074878.710 M/sec ( +- 0.12% ) (66.80%) 10896459 branch-misses # 2.68% of all branches ( +- 0.17% ) (83.20%) 0.390629 +- 0.000972 seconds time elapsed ( +- 0.25% ) ``` ``` $ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency.yml -analysis-inconsistencies-output-file=/tmp/clusters-old.html no exegesis target for x86_64-unknown-linux-gnu, using default Parsed 50572 benchmark points Printing sched class consistency analysis results to file '/tmp/clusters-old.html' ... Performance counter stats for './bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency.yml -analysis-inconsistencies-output-file=/tmp/clusters-old.html' (9 runs): 6803.36 msec task-clock # 0.999 CPUs utilized ( +- 0.96% ) 262 context-switches # 38.546 M/sec ( +- 23.06% ) 0 cpu-migrations # 0.065 M/sec ( +- 76.03% ) 13287 page-faults # 1953.206 M/sec ( +- 0.32% ) 27252537904 cycles # 4006024.257 GHz ( +- 0.95% ) (83.31%) 1496314935 stalled-cycles-frontend # 5.49% frontend cycles idle ( +- 0.97% ) (83.32%) 16128404524 stalled-cycles-backend # 59.18% backend cycles idle ( +- 0.30% ) (33.37%) 17611143370 instructions # 0.65 insn per cycle # 0.92 stalled cycles per insn ( +- 0.05% ) (50.04%) 3894906599 branches # 572537147.437 M/sec ( +- 0.03% ) (66.69%) 116314514 branch-misses # 2.99% of all branches ( +- 0.20% ) (83.35%) 6.8118 +- 0.0689 seconds time elapsed ( +- 1.01%) ``` New: ``` $ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency-1.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new.html no exegesis target for x86_64-unknown-linux-gnu, using default Parsed 10099 benchmark points Printing sched class consistency analysis results to file '/tmp/clusters-new.html' ... Performance counter stats for './bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency-1.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new.html' (25 runs): 400.14 msec task-clock # 0.998 CPUs utilized ( +- 0.66% ) 12 context-switches # 29.429 M/sec ( +- 25.95% ) 0 cpu-migrations # 0.100 M/sec ( +-100.00% ) 4714 page-faults # 11796.496 M/sec ( +- 0.55% ) 1603131306 cycles # 4011840.105 GHz ( +- 0.66% ) (82.85%) 199538509 stalled-cycles-frontend # 12.45% frontend cycles idle ( +- 2.40% ) (83.10%) 402249109 stalled-cycles-backend # 25.09% backend cycles idle ( +- 1.19% ) (34.05%) 1847783963 instructions # 1.15 insn per cycle # 0.22 stalled cycles per insn ( +- 0.18% ) (50.64%) 407162722 branches # 1018925730.631 M/sec ( +- 0.12% ) (67.02%) 10932779 branch-misses # 2.69% of all branches ( +- 0.51% ) (83.28%) 0.40077 +- 0.00267 seconds time elapsed ( +- 0.67% ) lebedevri@pini-pini:/build/llvm-build-Clang-release$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency.yml -analysis-inconsistencies-output-file=/tmp/clusters-new.html no exegesis target for x86_64-unknown-linux-gnu, using default Parsed 50572 benchmark points Printing sched class consistency analysis results to file '/tmp/clusters-new.html' ... Performance counter stats for './bin/llvm-exegesis -mode=analysis -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-latency.yml -analysis-inconsistencies-output-file=/tmp/clusters-new.html' (9 runs): 6947.79 msec task-clock # 1.000 CPUs utilized ( +- 0.90% ) 217 context-switches # 31.236 M/sec ( +- 36.16% ) 1 cpu-migrations # 0.096 M/sec ( +- 50.00% ) 13258 page-faults # 1908.389 M/sec ( +- 0.34% ) 27830796523 cycles # 4006032.286 GHz ( +- 0.89% ) (83.30%) 1504554006 stalled-cycles-frontend # 5.41% frontend cycles idle ( +- 2.10% ) (83.32%) 16716574843 stalled-cycles-backend # 60.07% backend cycles idle ( +- 0.65% ) (33.38%) 17755545931 instructions # 0.64 insn per cycle # 0.94 stalled cycles per insn ( +- 0.09% ) (50.04%) 3897255686 branches # 560980426.597 M/sec ( +- 0.06% ) (66.70%) 117045395 branch-misses # 3.00% of all branches ( +- 0.47% ) (83.34%) 6.9507 +- 0.0627 seconds time elapsed ( +- 0.90% ) ``` I.e. it's +2.6% slowdown for one whole sweep, or +2% for 5 whole sweeps. Within noise i'd say. Should help with [[ https://bugs.llvm.org/show_bug.cgi?id=40787 \| PR40787 ]]. Reviewers: courbet, gchatelet Reviewed By: courbet Subscribers: tschuett, RKSimon, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58476 llvm-svn: 354767	2019-02-25 09:36:12 +00:00
Roman Lebedev	0ea4992de1	[XRay][tools] Revert "Use Support/JSON.h in llvm-xray convert" Summary: This reverts D50129 / rL338834: [XRay][tools] Use Support/JSON.h in llvm-xray convert Abstractions are great. Readable code is great. JSON support library is a good idea. However unfortunately, there is an internal detail that one needs to be aware of in `llvm::json::Object` - it uses `llvm::DenseMap`. So for every `llvm::json::Object`, even if you only store a single `int` entry there, you pay the whole price of `llvm::DenseMap`. Unfortunately, it matters for `llvm-xray`. I was trying to analyse the `llvm-exegesis` analysis mode performance, and for that i wanted to view the LLVM X-Ray log visualization in Chrome trace viewer. And the `llvm-xray convert` is sluggish, and sometimes even ended up being killed by OOM. `xray-log.llvm-exegesis.lwZ0sT` was acquired from `llvm-exegesis` (compiled with ` -fxray-instruction-threshold=128`) analysis mode over `-benchmarks-file` with 10099 points (one full latency measurement set), with normal runtime of 0.387s. Timings: Old: (copied from D58580) ``` $ perf stat -r 5 ./bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT Performance counter stats for './bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT' (5 runs): 21346.24 msec task-clock # 1.000 CPUs utilized ( +- 0.28% ) 314 context-switches # 14.701 M/sec ( +- 59.13% ) 1 cpu-migrations # 0.037 M/sec ( +-100.00% ) 2181354 page-faults # 102191.251 M/sec ( +- 0.02% ) 85477442102 cycles # 4004415.019 GHz ( +- 0.28% ) (83.33%) 14526427066 stalled-cycles-frontend # 16.99% frontend cycles idle ( +- 0.70% ) (83.33%) 32371533721 stalled-cycles-backend # 37.87% backend cycles idle ( +- 0.27% ) (33.34%) 67896890228 instructions # 0.79 insn per cycle # 0.48 stalled cycles per insn ( +- 0.03% ) (50.00%) 14592654840 branches # 683631198.653 M/sec ( +- 0.02% ) (66.67%) 212207534 branch-misses # 1.45% of all branches ( +- 0.94% ) (83.34%) 21.3502 +- 0.0585 seconds time elapsed ( +- 0.27% ) ``` New: ``` $ perf stat -r 9 ./bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT Performance counter stats for './bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT' (9 runs): 7178.38 msec task-clock # 1.000 CPUs utilized ( +- 0.26% ) 182 context-switches # 25.402 M/sec ( +- 28.84% ) 0 cpu-migrations # 0.046 M/sec ( +- 70.71% ) 33701 page-faults # 4694.994 M/sec ( +- 0.88% ) 28761053971 cycles # 4006833.933 GHz ( +- 0.26% ) (83.32%) 2028297997 stalled-cycles-frontend # 7.05% frontend cycles idle ( +- 1.61% ) (83.32%) 10773154901 stalled-cycles-backend # 37.46% backend cycles idle ( +- 0.38% ) (33.36%) 36199132874 instructions # 1.26 insn per cycle # 0.30 stalled cycles per insn ( +- 0.03% ) (50.02%) 6434504227 branches # 896420204.421 M/sec ( +- 0.03% ) (66.68%) 73355176 branch-misses # 1.14% of all branches ( +- 1.46% ) (83.33%) 7.1807 +- 0.0190 seconds time elapsed ( +- 0.26% ) ``` So using `llvm::json` nearly triples run-time on that test case. (+3x is times, not percent.) Memory: Old: ``` total runtime: 39.88s. bytes allocated in total (ignoring deallocations): 79.07GB (1.98GB/s) calls to allocation functions: 33267816 (834135/s) temporary memory allocations: 5832298 (146235/s) peak heap memory consumption: 9.21GB peak RSS (including heaptrack overhead): 147.98GB total memory leaked: 1.09MB ``` New: ``` total runtime: 17.42s. bytes allocated in total (ignoring deallocations): 5.12GB (293.86MB/s) calls to allocation functions: 21382982 (1227284/s) temporary memory allocations: 232858 (13364/s) peak heap memory consumption: 350.69MB peak RSS (including heaptrack overhead): 2.55GB total memory leaked: 79.95KB ``` Diff: ``` total runtime: -22.46s. bytes allocated in total (ignoring deallocations): -73.95GB (3.29GB/s) calls to allocation functions: -11884834 (529155/s) temporary memory allocations: -5599440 (249307/s) peak heap memory consumption: -8.86GB peak RSS (including heaptrack overhead): 0B total memory leaked: -1.01MB ``` So using `llvm::json` increases peak memory consumption on this testcase ~+27x. And total allocation count +15x. Both of these numbers are times, not percent. And note that memory usage is clearly unbound with `llvm::json`, it directly depends on the length of the log, so peak memory consumption is always increasing. This isn't so with the dumb code, there is no accumulating memory consumption, peak memory consumption is fixed. Naturally, that means it will handle much larger logs without OOM'ing. Readability is good, but the price is simply unacceptable here. Too bad none of this analysis was done as part of the development/review D50129 itself. Reviewers: dberris, kpw, sammccall Reviewed By: dberris Subscribers: riccibruno, hans, courbet, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58584 llvm-svn: 354764	2019-02-25 07:39:07 +00:00
George Rimar	ec3690fdaf	[obj2yaml] - Do not miss section index for special symbols. This fixes https://bugs.llvm.org/show_bug.cgi?id=40786 ("obj2yaml symbol output missing section index for SHN_ABS and SHN_COMMON symbols") Since SHN_ABS and SHN_COMMON symbols are special, we should preserve the st_shndx for them. The patch does this for them and the other special symbols. The test case is based on the test provided by James Henderson at the bug page! Differential revision: https://reviews.llvm.org/D58498 llvm-svn: 354661	2019-02-22 08:45:21 +00:00
Jordan Rupprecht	b79b906ee1	[llvm-objcopy][NFC] Add std::move() to fix older BB llvm-svn: 354603	2019-02-21 17:24:55 +00:00
Jordan Rupprecht	b93a103dbb	[llvm-objcopy][NFC] More error cleanup Summary: This removes calls to `error()`/`reportError()` in the main driver (llvm-objcopy.cpp) as well as the associated argv-parsing (CopyConfig.cpp). `logAllUnhandledErrors()` is now the main way to print errors. There are still a few uses from within the per-arch drivers, so we can't delete them yet... but almost! Reviewers: jhenderson, alexshap, espindola Reviewed By: jhenderson Subscribers: emaste, arichardson, jakehehrlich, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58316 llvm-svn: 354600	2019-02-21 17:05:19 +00:00
Jordan Rupprecht	d23e0bcdb9	[llvm-objcopy] Make removeSectionReferences batched Summary: Removing a large number of sections from a file with a lot of symbols can have abysmal (i.e. O(n^2)) performance, e.g. when running `--only-section` to extract one section out of a large file. This comes from iterating over all symbols in the symbol table each time we remove a section, to remove references to the section we just removed. Instead, do just one pass of symbol removal by passing a hash set of all the sections we'd like to remove references to. This fixes a regression when running llvm-objcopy -j <one section> on an object file with many sections and symbols -- on my machine, running `objcopy -j .keep_me huge-input.o /tmp/foo.o` takes .3s with GNU objcopy, 1.3s with an updated llvm-objcopy, and 7+ minutes with llvm-objcopy prior to this patch. Reviewers: MaskRay, jhenderson, jakehehrlich, alexshap, espindola Reviewed By: MaskRay, jhenderson Subscribers: echristo, emaste, arichardson, mgrang, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58296 llvm-svn: 354597	2019-02-21 16:45:42 +00:00
George Rimar	b6a05e83ac	[yaml2obj][obj2yaml] - Support SHT_GNU_verdef (.gnu.version_d) section. This patch adds support for parsing/dumping the .gnu.version section. Description of the section is: https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symverdefs.html Differential revision: https://reviews.llvm.org/D58437 llvm-svn: 354574	2019-02-21 12:21:43 +00:00
Fangrui Song	a13d56908a	[llvm-readobj] Change "SHT_MIPS_DWARF" to "MIPS_DWARF" Summary: This is to be consistent with the display of other MIPS section types. This string is also used by binutils-gdb/binutils/readelf.c:get_mips_section_type_name Since we are here, reorder the two enum constatns because SHT_MIPS_DWARF < SHT_MIPS_ABIFLAGS. Reviewers: jhenderson, atanasyan Reviewed By: jhenderson Subscribers: aprantl, sdardis, arichardson, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58496 llvm-svn: 354571	2019-02-21 11:35:41 +00:00
Fangrui Song	6412e8e039	Fix file header issues in fuzzers. NFC llvm-svn: 354551	2019-02-21 07:57:14 +00:00
Fangrui Song	34a4c8f773	Fix some include order and file headers issues. NFC llvm-svn: 354550	2019-02-21 07:42:31 +00:00
Andrea Di Biagio	8cb6c4646d	[MCA][ResourceManager] Add a table that maps processor resource indices to processor resource identifiers. This patch adds a lookup table to speed up resource queries in the ResourceManager. This patch also moves helper function 'getResourceStateIndex()' from ResourceManager.cpp to Support.h, so that we can reuse that logic in the SummaryView (and potentially other views in llvm-mca). No functional change intended. llvm-svn: 354470	2019-02-20 14:53:18 +00:00
Hans Wennborg	25bdbd561c	Fix the build with gcc/libstdc++ 4.8.2 after r354441 llvm-svn: 354469	2019-02-20 14:50:08 +00:00
George Rimar	4a031d3273	[yaml2elf] - Rename a variable. NFC. Was suggested during review of D58441. llvm-svn: 354463	2019-02-20 14:01:02 +00:00
George Rimar	5e2d8305b6	[yaml2obj] - Simplify implementation. NFCI. Knowing about how types are declared for 32/64 bit platforms: https://github.com/llvm-mirror/llvm/blob/master/include/llvm/BinaryFormat/ELF.h#L28 it is possible to simplify code that writes a binary a bit. The patch does that. Differential revision: https://reviews.llvm.org/D58441 llvm-svn: 354462	2019-02-20 13:58:43 +00:00
Roman Lebedev	2a88b46050	[llvm-exegesis] Opcode stabilization / reclusterization (PR40715) Summary: Given an instruction `Opcode`, we can make benchmarks (measurements) of the instruction characteristics/performance. Then, to facilitate further analysis we group the benchmarks with similar characteristics into clusters. Now, this is all not entirely deterministic. Some instructions have variable characteristics, depending on their arguments. And thus, if we do several benchmarks of the same instruction `Opcode`, we may end up with different performance characteristics measurements. And when we then do clustering, these several benchmarks of the same instruction `Opcode` may end up being clustered into different clusters. This is not great for further analysis. We shall find every `Opcode` with benchmarks not in just one cluster, and move all the benchmarks of said `Opcode` into one new unstable cluster per `Opcode`. I have solved this by making `ClusterId` a bit field, adding a `IsUnstable` bit, and introducing `-analysis-display-unstable-clusters` switch to toggle between displaying stable-only clusters and unstable-only clusters. The reclusterization is deterministically stable, produces identical reports between runs. (Or at least that is what i'm seeing, maybe it isn't) Timings/comparisons: old (current trunk/head) {F8303582} ``` $ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-old.html no exegesis target for x86_64-unknown-linux-gnu, using default Parsed 43970 benchmark points Printing sched class consistency analysis results to file '/tmp/clusters-old.html' ... no exegesis target for x86_64-unknown-linux-gnu, using default Parsed 43970 benchmark points Printing sched class consistency analysis results to file '/tmp/clusters-old.html' Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-old.html' (25 runs): 6624.73 msec task-clock # 0.999 CPUs utilized ( +- 0.53% ) 172 context-switches # 25.965 M/sec ( +- 29.89% ) 0 cpu-migrations # 0.042 M/sec ( +- 56.54% ) 31073 page-faults # 4690.754 M/sec ( +- 0.08% ) 26538711696 cycles # 4006230.292 GHz ( +- 0.53% ) (83.31%) 2017496807 stalled-cycles-frontend # 7.60% frontend cycles idle ( +- 0.93% ) (83.32%) 13403650062 stalled-cycles-backend # 50.51% backend cycles idle ( +- 0.33% ) (33.37%) 19770706799 instructions # 0.74 insn per cycle # 0.68 stalled cycles per insn ( +- 0.04% ) (50.04%) 4419821812 branches # 667207369.714 M/sec ( +- 0.03% ) (66.69%) 121741669 branch-misses # 2.75% of all branches ( +- 0.28% ) (83.34%) 6.6283 +- 0.0358 seconds time elapsed ( +- 0.54% ) ``` patch, with reclustering but without filtering (i.e. outputting all the stable and unstable clusters) {F8303586} ``` $ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-all.html no exegesis target for x86_64-unknown-linux-gnu, using default Parsed 43970 benchmark points Printing sched class consistency analysis results to file '/tmp/clusters-new-all.html' ... no exegesis target for x86_64-unknown-linux-gnu, using default Parsed 43970 benchmark points Printing sched class consistency analysis results to file '/tmp/clusters-new-all.html' Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-all.html' (25 runs): 6475.29 msec task-clock # 0.999 CPUs utilized ( +- 0.31% ) 213 context-switches # 32.952 M/sec ( +- 23.81% ) 1 cpu-migrations # 0.130 M/sec ( +- 43.84% ) 31287 page-faults # 4832.057 M/sec ( +- 0.08% ) 25939086577 cycles # 4006160.279 GHz ( +- 0.31% ) (83.31%) 1958812858 stalled-cycles-frontend # 7.55% frontend cycles idle ( +- 0.68% ) (83.32%) 13218961512 stalled-cycles-backend # 50.96% backend cycles idle ( +- 0.29% ) (33.37%) 19752995402 instructions # 0.76 insn per cycle # 0.67 stalled cycles per insn ( +- 0.04% ) (50.04%) 4417079244 branches # 682195472.305 M/sec ( +- 0.03% ) (66.70%) 121510065 branch-misses # 2.75% of all branches ( +- 0.19% ) (83.34%) 6.4832 +- 0.0229 seconds time elapsed ( +- 0.35% ) ``` Funnily, this measurement shows that said reclustering actually improved performance. patch, with reclustering, only the stable clusters {F8303594} ``` $ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-stable.html no exegesis target for x86_64-unknown-linux-gnu, using default Parsed 43970 benchmark points Printing sched class consistency analysis results to file '/tmp/clusters-new-stable.html' ... no exegesis target for x86_64-unknown-linux-gnu, using default Parsed 43970 benchmark points Printing sched class consistency analysis results to file '/tmp/clusters-new-stable.html' Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-stable.html' (25 runs): 6387.71 msec task-clock # 0.999 CPUs utilized ( +- 0.13% ) 133 context-switches # 20.792 M/sec ( +- 23.39% ) 0 cpu-migrations # 0.063 M/sec ( +- 61.24% ) 31318 page-faults # 4903.256 M/sec ( +- 0.08% ) 25591984967 cycles # 4006786.266 GHz ( +- 0.13% ) (83.31%) 1881234904 stalled-cycles-frontend # 7.35% frontend cycles idle ( +- 0.25% ) (83.33%) 13209749965 stalled-cycles-backend # 51.62% backend cycles idle ( +- 0.16% ) (33.36%) 19767554347 instructions # 0.77 insn per cycle # 0.67 stalled cycles per insn ( +- 0.04% ) (50.03%) 4417480305 branches # 691618858.046 M/sec ( +- 0.03% ) (66.68%) 118676358 branch-misses # 2.69% of all branches ( +- 0.07% ) (83.33%) 6.3954 +- 0.0118 seconds time elapsed ( +- 0.18% ) ``` Performance improved even further?! Makes sense i guess, less clusters to print. patch, with reclustering, only the unstable clusters {F8303601} ``` $ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-unstable.html -analysis-display-unstable-clusters no exegesis target for x86_64-unknown-linux-gnu, using default Parsed 43970 benchmark points Printing sched class consistency analysis results to file '/tmp/clusters-new-unstable.html' ... no exegesis target for x86_64-unknown-linux-gnu, using default Parsed 43970 benchmark points Printing sched class consistency analysis results to file '/tmp/clusters-new-unstable.html' Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-unstable.html -analysis-display-unstable-clusters' (25 runs): 6124.96 msec task-clock # 1.000 CPUs utilized ( +- 0.20% ) 194 context-switches # 31.709 M/sec ( +- 20.46% ) 0 cpu-migrations # 0.039 M/sec ( +- 49.77% ) 31413 page-faults # 5129.261 M/sec ( +- 0.06% ) 24536794267 cycles # 4006425.858 GHz ( +- 0.19% ) (83.31%) 1676085087 stalled-cycles-frontend # 6.83% frontend cycles idle ( +- 0.46% ) (83.32%) 13035595603 stalled-cycles-backend # 53.13% backend cycles idle ( +- 0.16% ) (33.36%) 18260877653 instructions # 0.74 insn per cycle # 0.71 stalled cycles per insn ( +- 0.05% ) (50.03%) 4112411983 branches # 671484364.603 M/sec ( +- 0.03% ) (66.68%) 114066929 branch-misses # 2.77% of all branches ( +- 0.11% ) (83.32%) 6.1278 +- 0.0121 seconds time elapsed ( +- 0.20% ) ``` This tells us that the actual `-analysis-inconsistencies-output-file=` outputting only takes ~0.4 sec for 43970 benchmark points (3 whole sweeps) (Also, wow this is fast, it used to take several minutes originally) Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=40715 \| PR40715 ]]. Reviewers: courbet, gchatelet Reviewed By: courbet Subscribers: tschuett, jdoerfert, llvm-commits, RKSimon Tags: #llvm Differential Revision: https://reviews.llvm.org/D58355 llvm-svn: 354441	2019-02-20 09:14:04 +00:00
Thomas Lively	065031a098	[WebAssembly] Update MC for bulk memory Summary: Rename MemoryIndex to InitFlags and implement logic for determining data segment layout in ObjectYAML and MC. Also adds a "passive" flag for the .section assembler directive although this cannot be assembled yet because the assembler does not support data sections. Reviewers: sbc100, aardappel, aheejin, dschuff Subscribers: jgravelle-google, hiraditya, sunfish, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57938 llvm-svn: 354397	2019-02-19 22:56:19 +00:00
Vedant Kumar	c8387a61d8	[llvm-cov] Add support for gcov --hash-filenames option The patch adds support for --hash-filenames to llvm-cov. This option adds md5 hash of the source path to the name of the generated .gcov file. The option is crucial for cases where you have multiple files with the same name but can't use --preserve-paths as resulting filenames exceed the limit. from gcov(1): ``` -x --hash-filenames By default, gcov uses the full pathname of the source files to to create an output filename. This can lead to long filenames that can overflow filesystem limits. This option creates names of the form source-file##md5.gcov, where the source-file component is the final filename part and the md5 component is calculated from the full mangled name that would have been used otherwise. ``` Patch by Igor Ignatev! Differential Revision: https://reviews.llvm.org/D58370 llvm-svn: 354379	2019-02-19 20:45:00 +00:00
Matthew Voss	3fb0589d43	Revert "Revert "[llvm-objdump] Allow short options without arguments to be grouped"" - Tests that use multiple short switches now test them grouped and ungrouped. - Ensure the output of ungrouped and grouped variants is identical Differential Revision: https://reviews.llvm.org/D57904 llvm-svn: 354375	2019-02-19 19:46:08 +00:00
George Rimar	efc720024f	[yaml2obj][obj2yaml] - Support SHT_GNU_versym (.gnu.version) section. This patch adds support for parsing dumping the .gnu.version section. Description of the section is: https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symversion.html#SYMVERTBL Differential revision: https://reviews.llvm.org/D58280 llvm-svn: 354338	2019-02-19 15:29:07 +00:00
George Rimar	028b8cf90b	Recommit r354328, r354329 "[obj2yaml][yaml2obj] - Add support of parsing/dumping of the .gnu.version_r section." Fix: Replace assert(!IO.getContext() && "The IO context is initialized already"); with assert(IO.getContext() && "The IO context is not initialized"); (this was introduced in r354329, where I tried to quickfix the darwin BB and seems copypasted the assert from the wrong place). Original commit message: The section is described here: https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symverrqmts.html Patch just teaches obj2yaml/yaml2obj to dump and parse such sections. We did the finalization of string tables very late, and I had to move the logic to make it a bit earlier. That was needed in this patch since .gnu.version_r adds strings to .dynstr. This might also be useful for implementing other special sections. Everything else changed in this patch seems to be straightforward. Differential revision: https://reviews.llvm.org/D58119 llvm-svn: 354335	2019-02-19 14:53:48 +00:00
George Rimar	44d3bd8fa4	Revert r354328, r354329 "[obj2yaml][yaml2obj] - Add support of parsing/dumping of the .gnu.version_r section." Something went wrong. Bots are unhappy: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/44113/steps/test/logs/stdio llvm-svn: 354332	2019-02-19 14:38:25 +00:00
George Rimar	1625c306b8	[obj2yaml][yaml2obj] - Add support of parsing/dumping of the .gnu.version_r section. The section is described here: https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symverrqmts.html Patch just teaches obj2yaml/yaml2obj to dump and parse such sections. We did the finalization of string tables very late, and I had to move the logic to make it a bit earlier. That was needed in this patch since .gnu.version_r adds strings to .dynstr. This might also be useful for implementing other special sections. Everything else changed in this patch seems to be straightforward. Differential revision: https://reviews.llvm.org/D58119 llvm-svn: 354328	2019-02-19 14:03:14 +00:00
George Rimar	c43ce2fac3	[yaml2obj] - Do not skip zeroes blocks if there are relocations against them. This is for -D -reloc combination. With this patch, we do not skip the zero bytes that have a relocation against them when -reloc is used. If -reloc is not used, then the behavior will be the same. Differential revision: https://reviews.llvm.org/D58174 llvm-svn: 354319	2019-02-19 12:38:36 +00:00
George Rimar	35adabb3a5	[yaml2obj] - Do not ignore explicit addresses for .dynsym and .dynstr This fixes https://bugs.llvm.org/show_bug.cgi?id=40339 Previously if the addresses were set in YAML they were ignored for .dynsym and .dynstr sections. The patch fixes that. Differential revision: https://reviews.llvm.org/D58168 llvm-svn: 354318	2019-02-19 12:15:04 +00:00
George Rimar	c54720f1f4	[llvm-readobj] - Simplify .gnu.version_d dumping. This is similar to D58048. Instead of scanning the dynamic table to read the DT_VERDEFNUM, we could take it from the sh_info field. (https://docs.oracle.com/cd/E19683-01/816-1386/chapter6-94076/index.html) The patch does this. llvm-svn: 354270	2019-02-18 13:58:12 +00:00
Dave Lee	58afb80dde	llvm-nm: Observe -no-llvm-bc for archive members Summary: This change fixes the `-no-llvm-bc` flag to work with object files within archives. Currently the `-no-llvm-bc` flag works for regular object files, but not static libraries, where it continues to show bitcode symbol info. Original support was added in D4371. Reviewers: compnerd, smeenai, pcc Reviewed By: compnerd Subscribers: rupprecht, keith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D48798 llvm-svn: 354196	2019-02-16 06:59:49 +00:00
Matt Davis	f5f5656288	[llvm-cxxfilt] Fix a comment typo. NFC. llvm-svn: 354094	2019-02-15 02:43:14 +00:00
Jordan Rupprecht	e6ec8f30c9	[llvm-ar] Implement the P modifier. Summary: GNU ar has a `P` modifier that changes filename comparisons to use full paths instead of the basename. As noted in the GNU docs, regular archives are not created with full path names, so P is used to deal with archives created by other archive programs (e.g. see the updated `absolute-paths.test` test case). Since thin archives use full path names -- paths are relative to the archive -- it seems very error prone to not imply P when dealing with thin archives, so P is implied in those cases. (I think this is a deviation from GNU ar that makes sense). This fixes PR37436 via https://github.com/ClangBuiltLinux/linux/issues/33. Reviewers: mstorsjo, pcc, ruiu, davide, david2050, rnk Subscribers: tpimh, llvm-commits, nickdesaulniers Tags: #llvm Differential Revision: https://reviews.llvm.org/D57927 llvm-svn: 354044	2019-02-14 18:35:13 +00:00
Matthew Voss	e79e227d55	Revert "[llvm-objdump] Allow short options without arguments to be grouped" Reverted due to failures on the llvm-hexagon-elf. This reverts commit 77e1f27476c89f65eeb496d131065177e6417f23. llvm-svn: 354002	2019-02-14 01:39:43 +00:00
Matthew Voss	174bace66f	[llvm-objdump] Allow short options without arguments to be grouped Summary: https://bugs.llvm.org/show_bug.cgi?id=31679 Reviewers: kristina, jhenderson, grimar, jakehehrlich, rupprecht Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57904 llvm-svn: 353998	2019-02-14 00:39:40 +00:00
Jordan Rupprecht	088fd9a6ad	[llvm-ar][libObject] Fix relative paths when nesting thin archives. Summary: When adding one thin archive to another, we currently chop off the relative path to the flattened members. For instance, when adding `foo/child.a` (which contains `x.txt`) to `parent.a`, when flattening it we should add it as `foo/x.txt` (which exists) instead of `x.txt` (which does not exist). As a note, this also undoes the `IsNew` parameter of handling relative paths in r288280. The unit test there still passes. This was reported as part of testing the kernel build with llvm-ar: https://patchwork.kernel.org/patch/10767545/ (see the second point). Reviewers: mstorsjo, pcc, ruiu, davide, david2050, inglorion Reviewed By: ruiu Subscribers: void, jdoerfert, tpimh, mgorny, hans, nickdesaulniers, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57842 llvm-svn: 353995	2019-02-13 23:39:41 +00:00
Fangrui Song	aa6ef5a3d2	[llvm-readobj] Dump GNU_PROPERTY_X86_ISA_1_{NEEDED,USED} notes in .note.gnu.property Reviewers: grimar, rupprecht Reviewed By: rupprecht Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58175 llvm-svn: 353991	2019-02-13 23:18:05 +00:00
Fangrui Song	9ae96f09ec	[llvm-readobj] Rename pr_data to PrData As requested by grimar in D58112. llvm-svn: 353951	2019-02-13 15:58:23 +00:00
Eugene Leviant	29576825e5	[llvm-objcopy] Add --strip-unneeded-symbol(s) Differential revision: https://reviews.llvm.org/D58027 llvm-svn: 353919	2019-02-13 07:34:54 +00:00

1 2 3 4 5 ...

10264 Commits