llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 20:23:11 +01:00

Author	SHA1	Message	Date
wlei	c4ed78c10b	[CSSPGO] Aggregation by the last K context frames for cold profiles This change provides the option to merge and aggregate cold context by the last k frames instead of context-less name. By default K = 1 means the context-less one. This is for better perf tuning. The more selective merging and trimming will rely on llvm-profgen's preinliner. Reviewed By: wenlei, hoy Differential Revision: https://reviews.llvm.org/D104131	2021-06-14 10:33:43 -07:00
David Blaikie	d3ee11b29a	llvm-objcopy: fix section size truncation/extension when dumping sections Since this only comes up with inputs containing sections at least 4GB large (I guess I could use a bzero section or something, so the input file doesn't have to be 4GB, but even then the output file would have to be 4GB, right?) I've skipped testing this. Is there a nice way to test this without needing 4GB inputs or output files? The subtlety here is demonstrated by this code: struct t { operator uint64_t(); }; static_assert(std::is_same_v<int, decltype(std::declval<bool>() ? 0 : std::declval<t>())>); static_assert(std::is_same_v<uint64_t, decltype(std::declval<bool>() ? 0 : std::declval<uint64_t>())>); Because of this difference, the original source code was getting an int type (truncating the actual size) and then extending it again, resulting in bogus values. (I haven't thought through this hard enough to explain why the resulting value was 0xffff...: sign extension, possible UB, but in any case it's the wrong answer.) In the particular case I was looking at, that resulted in a size so large that we couldn't open a file large enough to write to, and we ended up with a rather vague: error: 'file_name.o': Invalid argument	2021-06-12 19:00:10 -07:00
Ian McIntyre	78819ccd55	[llvm-objcopy] Exclude empty sections in IHexWriter output IHexWriter was evaluating a section's physical address when deciding if that section should be written to an output. This approach does not account for a zero-sized section that has the same physical address as a sized section. The behavior varies from GNU objcopy, and may result in a HEX file that does not include all program sections. The IHexWriter now excludes zero-sized sections when deciding what should be written to the output. This affects the contents of the writer's `Sections` collection; we will not try to insert multiple sections that could have the same physical address. The behavior seems consistent with GNU objcopy, which always excludes empty sections, no matter the address. The new test case evaluates the IHexWriter behavior when provided a variety of empty sections that overlap or append a filled section. See the input file's comments for more information. Given that test input, and the change to the IHexWriter, GNU objcopy and llvm-objcopy produce the same output. Reviewed By: jhenderson, MaskRay, evgeny777 Differential Revision: https://reviews.llvm.org/D101332	2021-06-12 12:23:07 -07:00
Alexander Shaposhnikov	29e2b0118b	[llvm-objcopy][MachO] Do not strip symbols with the flag REFERENCED_DYNAMICALLY set Do not strip symbols having the flag REFERENCED_DYNAMICALLY set. Test plan: make check-all Differential revision: https://reviews.llvm.org/D104092	2021-06-11 16:34:59 -07:00
Andrew Litteken	b0b7d897d8	Revert "[IRSim] Adding basic implementation of llvm-sim." This reverts commit f47d00c54b52bd8adf9b8725912ea1cd0f1873d5.	2021-06-11 15:44:19 -05:00
Andrew Litteken	f5356265d8	[IRSim] Adding basic implementation of llvm-sim. This is a similarity visualization tool that accepts a Module and passes it to the IRSimilarityIdentifier. The resulting SimilarityGroups are output in a JSON file. Tests are found in test/tools/llvm-sim and check for the file not found, a bad module, and that the JSON is created correctly. Reviewers: paquette, jroelofs, MaskRay Recommit of: 15645d044bcfe2a0f63156048b302f997a717688 to fix linking errors. Differential Revision: https://reviews.llvm.org/D86974	2021-06-11 14:56:41 -05:00
Simon Pilgrim	165132af1b	[ADT] Remove APInt/APSInt toString() std::string variants <string> is currently the highest impact header in a clang+llvm build: https://commondatastorage.googleapis.com/chromium-browser-clang/llvm-include-analysis.html One of the most common places this is being included is the APInt.h header, which needs it for an old toString() implementation that returns std::string - an inefficient method compared to the SmallString versions that it actually wraps. This patch replaces these APInt/APSInt methods with a pair of llvm::toString() helpers inside StringExtras.h, adjusts users accordingly and removes the <string> from APInt.h - I was hoping that more of these users could be converted to use the SmallString methods, but it appears that most end up creating a std::string anyhow. I avoided trying to use the raw_ostream << operators as well as I didn't want to lose having the integer radix explicit in the code. Differential Revision: https://reviews.llvm.org/D103888	2021-06-11 13:19:15 +01:00
Simon Pilgrim	bb3852411e	[llvm-stress] Fix dead code preventing us generating per-element vector selects This has been reported several times by the PVS Studio team as well as coming up in some static analysis. getRandom() % 1 always returns 0 so we never actually test this codepath, (git blame suggests this has always been like this) - given that we have plenty of other "getRandom() & 1" the typo is pretty obvious, and matches the intention in the comment above - with this change we generate a nice mixture of scalar/vector condition selects of vectors. I don't know llvm-stress that well - but I don't think we guarantee that the same seed value will always generate the same IR for later versions of the program - just that the same binary would. Differential Revision: https://reviews.llvm.org/D104022	2021-06-11 10:56:19 +01:00
Simon Pilgrim	2e422e4f72	Fix implicit dependency on <string> header. NFCI.	2021-06-11 10:24:14 +01:00
David Tenty	dcf97ff292	[AIX] Build libLTO as MODULE rather than SHARED On CMake versions >= 3.16 on AIX, shared libraries are created as archives (which is the normal form for the platform). However, plugin libraries which are passed directly to an executable, like libLTO to the linker, are usually built as plain `.so`, so this patch restores this behaviour for libLTO on AIX (and adjusts the name if need be to account for the fact that llvm_add_library likes to force an empty name prefix on modules), so we end up with the expected libLTO.so Reviewed By: w2yehia Differential Revision: https://reviews.llvm.org/D103824	2021-06-10 12:08:59 -04:00
Sam Powell	edcfd20e4e	Reland "[llvm] llvm-tapi-diff" This is relanding commit d1d36f7ad2ae82bea8a6fcc40d6c42a72e21f096 . This patch additionally addresses failures found in buildbots due to unstable build ordering & post review comments. This patch introduces a new tool, llvm-tapi-diff, that compares and returns the diff of two TBD files. Reviewed By: ributzka, JDevlieghere Differential Revision: https://reviews.llvm.org/D101835	2021-06-09 21:17:34 -07:00
Eric Astor	a76f35c86b	[ms] [llvm-ml] Add support for INCLUDE environment variable Also adds support for the ML.exe command-line flag /X, which ignores the INCLUDE environment variable. This relands commit c43f413b01b021a8f7b6fce013296114fa92a245 using lit's cross-platform `env` support. Differential Revision: https://reviews.llvm.org/D103989	2021-06-09 17:54:40 -04:00
Cyndy Ishida	1ad231f52e	Revert "Reland "[llvm] llvm-tapi-diff"" This reverts commit 20126c9fd4afe2fe11510becccaa769332da302f. The sorting fixes failed to have stable output on different platforms.	2021-06-09 13:48:09 -07:00
Cyndy Ishida	69fb005298	Revert "[llvm-tapi-diff] Apply stable sorting to output" This reverts commit 90a26a41e9ce16a4d471d25c2f7b36b5538fb4ce. This failed to fix ubuntu failures.	2021-06-09 13:48:09 -07:00
Sam Powell	ca95ab645d	[llvm-tapi-diff] Apply stable sorting to output * For the output, the attributes within the target slice should be grouped by the input order, then sorted by value ordering. This is to fix current ubuntu buildbot inconsistencies.	2021-06-09 13:09:47 -07:00
Eric Astor	c35099a5e4	Revert "[ms] [llvm-ml] Add support for INCLUDE environment variable" This reverts commit c43f413b01b021a8f7b6fce013296114fa92a245 due to Windows environment build breaks	2021-06-09 15:49:51 -04:00
Eric Astor	e4ec17b56e	[ms] [llvm-ml] Add support for INCLUDE environment variable Also adds support for the ML.exe command-line flag /X, which ignores the INCLUDE environment variable.	2021-06-09 15:25:26 -04:00
Sam Powell	6baaf00ed9	Reland "[llvm] llvm-tapi-diff" This is relanding commit d1d36f7ad2ae82bea8a6fcc40d6c42a72e21f096 . This patch additionally addresses failures found in buildbots & post review comments. This patch introduces a new tool, llvm-tapi-diff, that compares and returns the diff of two TBD files. Reviewed By: ributzka, JDevlieghere Differential Revision: https://reviews.llvm.org/D101835	2021-06-09 10:35:41 -07:00
Florian Hahn	37c3bfd1ce	[LTO] Support new PM in ThinLTOCodeGenerator. This patch adds initial support for using the new pass manager when doing ThinLTO via libLTO. Reviewed By: steven_wu Differential Revision: https://reviews.llvm.org/D102627	2021-06-09 10:05:14 +01:00
Brendon Cahoon	3a664dba6e	Reland "[AMDGPU] Add gfx1013 target" This reverts commit 211e584fa2a4c032e4d573e7cdbffd622aad0a8f. Fixed a use-after-free error that caused the sanitizers to fail.	2021-06-08 21:15:35 -04:00
Brendon Cahoon	8238dc695f	Revert "[AMDGPU] Add gfx1013 target" This reverts commit ea10a86984ea73fcec3b12d22404a15f2f59b219. A sanitizer buildbot reports an error.	2021-06-08 16:29:41 -04:00
Brendon Cahoon	c9fa68e102	[AMDGPU] Add gfx1013 target Differential Revision: https://reviews.llvm.org/D103663	2021-06-08 12:49:49 -04:00
David Blaikie	7aeebb1a7c	NFC: .clang-tidy: Inherit configs from parents to improve maintainability In the interests of disabling misc-no-recursion across LLVM (this seems like a stylistic choice that is not consistent with LLVM's style/development approach) this NFC preliminary change adjusts all the .clang-tidy files to inherit from their parents as much as possible. This change specifically preserves all the quirks of the current configs in order to make it easier to review as NFC. I validated the change is NFC as follows: for X in `cat ../files.txt`; do mkdir -p ../tmp/$(dirname $X); touch $(dirname $X)/blaikie.cpp; clang-tidy -dump-config $(dirname $X)/blaikie.cpp > ../tmp/$(dirname $X)/after; rm $(dirname $X)/blaikie.cpp; done (similarly for the "before" state, without this patch applied) for X in `cat ../files.txt`; do echo $X; diff ../tmp/$(dirname $X)/before <(cat ../tmp/$(dirname $X)/after | sed -e "s/,readability-identifier-naming\(.*\),-readability-identifier-naming/\1/" | sed -e "s/,-llvm-include-order\(.*\),llvm-include-order/\1/" | sed -e "s/,-misc-no-recursion\(.*\),misc-no-recursion/\1/" | sed -e "s/,-clang-diagnostic-\*\(.*\),clang-diagnostic-\*/\1/"); done (using sed to strip some add/remove pairs to reduce the diff and make it easier to read) The resulting report is: .clang-tidy clang/.clang-tidy 2c2 < Checks: 'clang-diagnostic-*,clang-analyzer-*,-*,clang-diagnostic-*,llvm-*,misc-*,-misc-unused-parameters,-misc-non-private-member-variables-in-classes,-readability-identifier-naming,-misc-no-recursion' --- > Checks: 'clang-diagnostic-*,clang-analyzer-*,-*,clang-diagnostic-*,llvm-*,misc-*,-misc-unused-parameters,-misc-non-private-member-variables-in-classes,-misc-no-recursion' compiler-rt/.clang-tidy 2c2 < Checks: 'clang-diagnostic-*,clang-analyzer-*,-*,clang-diagnostic-*,llvm-*,-llvm-header-guard,misc-*,-misc-unused-parameters,-misc-non-private-member-variables-in-classes' --- > Checks: 'clang-diagnostic-*,clang-analyzer-*,-*,clang-diagnostic-*,llvm-*,misc-*,-misc-unused-parameters,-misc-non-private-member-variables-in-classes,-llvm-header-guard' flang/.clang-tidy 2c2 < Checks: 'clang-diagnostic-*,clang-analyzer-*,-*,llvm-*,-llvm-include-order,misc-*,-misc-no-recursion,-misc-unused-parameters,-misc-non-private-member-variables-in-classes' --- > Checks: 'clang-diagnostic-*,clang-analyzer-*,-*,llvm-*,misc-*,-misc-unused-parameters,-misc-non-private-member-variables-in-classes,-llvm-include-order,-misc-no-recursion' flang/include/flang/Lower/.clang-tidy flang/include/flang/Optimizer/.clang-tidy flang/lib/Lower/.clang-tidy flang/lib/Optimizer/.clang-tidy lld/.clang-tidy lldb/.clang-tidy llvm/tools/split-file/.clang-tidy mlir/.clang-tidy The `clang/.clang-tidy` change is a no-op, disabling an option that was never enabled. The compiler-rt and flang changes are no-op reorderings of the same flags. (side note, the .clang-tidy file in parallel-libs is broken and crashes clang-tidy because it uses "lowerCase" as the style instead of "lower_case" - so I'll deal with that separately) Differential Revision: https://reviews.llvm.org/D103842	2021-06-08 08:25:59 -07:00
jasonliu	c03aed5d1b	[XCOFF][AIX] Enable tooling support for 64 bit symbol table parsing Add the ability to parse the symbol table for 64-bit objects. Reviewed By: jhenderson, DiggerLin Differential Revision: https://reviews.llvm.org/D85774	2021-06-07 17:24:13 +00:00
Simon Pilgrim	4ab5e47c13	xray-color-helper.cpp - add missing implicit cmath header dependency. NFCI. Noticed while investigating if we can remove an unnecessary MathExtras.h include from SmallVector.h (necessary for gcc builds but not MSVC)	2021-06-05 21:33:24 +01:00
Simon Pilgrim	385d538097	xray-color-helper.h - sort includes. NFCI.	2021-06-05 21:33:23 +01:00
Rong Xu	559805b594	[SampleFDO] New hierarchical discriminator for FS SampleFDO (llvm-profdata part) This patch was split from https://reviews.llvm.org/D102246 [SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This is for llvm-profdata part of change. It sets the bit masks for the profile reader in llvm-profdata. Also add an internal option "-fs-discriminator-pass" for show and merge command to process the profile offline. This patch also moved setDiscriminatorMaskedBitFrom() to SampleProfileReader::create() to simplify the interface. Differential Revision: https://reviews.llvm.org/D103550	2021-06-04 11:22:06 -07:00
Cyndy Ishida	35b9c30842	Revert "[llvm] llvm-tapi-diff" This reverts commit d1d36f7ad2ae82bea8a6fcc40d6c42a72e21f096. Reverting this patch to investigate linux bot failures + fix with author offline	2021-06-03 21:10:51 -07:00
Wenlei He	34de16e5aa	[CSSPGO][llvm-profgen] Make extended binary the default output format Make extended binary the default output format for CSSPGO. This avoids having to pass flag every time when generating profile. It also matches llvm-profdata where binary profile is the default (should we switch to extbinary as default for llvm-profdata?). We plan to compress name table for context profile, which depends on the built-in compression of extbinary. Differential Revision: https://reviews.llvm.org/D103650	2021-06-03 17:58:16 -07:00
Sam Powell	270d9dda87	[llvm] llvm-tapi-diff This patch introduces a new tool, llvm-tapi-diff, that compares and returns the diff of two TBD files. Reviewed By: ributzka, JDevlieghere Differential Revision: https://reviews.llvm.org/D101835	2021-06-03 11:38:00 -07:00
Nikita Popov	1c866d4e4f	[ADT] Move DenseMapInfo for ArrayRef/StringRef into respective headers (NFC) This is a followup to D103422. The DenseMapInfo implementations for ArrayRef and StringRef are moved into the ArrayRef.h and StringRef.h headers, which means that these two headers no longer need to be included by DenseMapInfo.h. This required adding a few additional includes, as many files were relying on various things pulled in by ArrayRef.h. Differential Revision: https://reviews.llvm.org/D103491	2021-06-03 18:34:36 +02:00
Kim-Anh Tran	c835b7a765	[llvm-dwp] Add support for rnglists and loclists This patch updates llvm-dwp to include rnglists and loclists when parsing debug sections. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101894	2021-06-02 12:31:35 -07:00
Kim-Anh Tran	8448b16d77	[llvm-dwp] Add support for DWARFv5 type units ... This patch adds support for DWARFv5 type units: parsing from the .debug_info section, and writing index to the type unit index. Previously, the type units were part of the .debug_types section which is no longer used in DWARFv5. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101818	2021-06-02 12:24:08 -07:00
Kim-Anh Tran	98cb2ee214	[llvm-dwp] Adding support for v5 index writing This patch adds general support for DWARFv5 index writing. In particular, this means only allowing inputs with one version, either DWARFv5 or DWARFv4. This patch adds the .debug_macro section as an example, but the DWARFv5 type support and loc and rangelists are still missing (and upcoming). Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102315	2021-06-02 12:21:31 -07:00
Kim-Anh Tran	73a355d20e	[llvm-dwp] Skip type unit debug info sections This patch makes llvm-dwp skip debug info sections that may not be encoding a compile unit. In DWARF5, debug info sections are also used for type units. In preparation for supporting type units, make llvm-dwp aware of other uses of debug info sections but skip them for now. The patch first records all .debug_info sections, then goes through them one by one, records the CU debug info section for writing the index unit, and copies that section to the final dwp output info section. If a section is not a compile unit, it is skipped. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102312	2021-06-02 11:48:10 -07:00
Rahman Lavaee	2769abb764	[llvm-readobj] Print function names with `--bb-addr-map`. This patch uses the `getSymbolIndexForFunctionAddress` helper function to print function names for BB address map entries. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D102900	2021-06-01 18:40:42 -07:00
gbreynoo	081a4bd35e	[llvm-dwarfdump][test] Add missing dedicated tests for some options This change adds tests specifically for --parent-recurse-depth, --quiet and -o. The test for -o found a typo in an error message which is also fixed in this change. Differential Revision: https://reviews.llvm.org/D103250	2021-06-01 14:57:00 +01:00
Andrea Di Biagio	44ebb7579c	[MCA][NFCI] Minor changes to InstrBuilder and Instruction. This is based on the assumption that most simulated instructions don't define more than one or two registers. This is true for example on x86, where most instruction definitions don't declare more than one register write. The default code region size has been increased from 8 to 16. This is based on the assumption that, for small microbenchmarks, the typical code snippet size is often less than 16 instructions. mca::Instruction now uses bitfields to pack flags. No functional change intended.	2021-05-31 17:05:13 +01:00
Alexey Lapshin	7ec130df24	[llvm-objcopy][NFC] Refactor CopyConfig structure - remove lazy options processing. During reviewing D102277 it was decided to remove lazy options processing from llvm-objcopy CopyConfig structure. This patch transforms processing of ELF lazy options into the in-place processing. Differential Revision: https://reviews.llvm.org/D103260	2021-05-31 14:40:27 +03:00
Andrea Di Biagio	8de4c75a32	[MCA] Refactor the InOrderIssueStage stage. NFCI Moved the logic that checks for RAW hazards from the InOrderIssueStage to the RegisterFile. Changed how the InOrderIssueStage keeps track of backend stalls. Stall events are now generated from method notifyStallEvent(). No functional change intended.	2021-05-27 22:28:04 +01:00
Simon Giesecke	f35a5b956e	Add --quiet option to llvm-gsymutil to suppress output of warnings. Differential Revision: https://reviews.llvm.org/D102829	2021-05-27 12:36:34 +00:00
Esme-Yi	54468e15fa	[llvm-objdump] Print the DEBUG type under `--section-headers`. Summary: Under the option --section-headers, we can only print the section types of TEXT, DATA, and BSS for now. This patch adds the DEBUG type. Reviewed By: jhenderson, Higuoxing Differential Revision: https://reviews.llvm.org/D102603	2021-05-27 04:53:14 +00:00
Rahman Lavaee	b75ea85ad5	[llvm-readobj] Optimize printing stack sizes to linear time. Currently, each function name lookup is a linear iteration over all symbols defined in the object file which makes the total running time quadratic. This patch optimizes the function name lookup by populating an address to index map upon the first function name lookup which is used to lookup each function name in O(1). impact: For the clang binary built with `-fstack-size-section`, this improves the running time of `llvm-readobj --stack-size` from 7 minutes to 0.25 seconds. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D103072	2021-05-26 13:14:33 -07:00
Fangrui Song	c2d4999f01	[llvm-mc] Add -M to replace -riscv-no-aliases and -riscv-arch-reg-names In objdump, many targets support `-M no-aliases`. Instead of having a `-*-no-aliases` for each target when LLVM adds the support, it makes more sense to introduce objdump style `-M`. -riscv-arch-reg-names is removed. -riscv-no-aliases has too many uses and thus is retained for now. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D103004	2021-05-26 10:43:32 -07:00
Esme-Yi	b08bc2bd30	[NFC][object] Change the input parameter of the method isDebugSection. Summary: This is a NFC patch to change the input parameter of the method SectionRef::isDebugSection(), by replacing the StringRef SectionName with DataRefImpl Sec. This allows us to determine if a section is debug type in more ways than just by section name. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D102601	2021-05-26 08:47:53 +00:00
Wenlei He	e250a71651	[CSSPGO][llvm-profgen] Change default cold threshold for context merging llvm-profgen uses a profile-summary-based cold threshold to merge and trim cold context profiles. This is to strike a good balance between profile size and performance. We've been using 99.9% as the cutoff to save profile size without affecting performance. This change switches to 99.9% instead of 99.9999% as the default cold threshold cutoff for llvm-profgen. The redundant switch csprof-cold-thres is also removed and tests are cleaned up. Differential Revision: https://reviews.llvm.org/D103071	2021-05-25 10:41:10 -07:00
Langston Barrett	874da3772d	[llvm-reduce] Exit when input module is malformed The parseInputFile function returns an empty unique_ptr to signal an error, like when the input file doesn't exist, or is malformed. In this case, the tool should exit immediately rather than segfault by dereferencing the unique_ptr later. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D102891	2021-05-25 10:01:12 -07:00
Roman Lebedev	5d534d8259	[llvm-exegesis] Loop unrolling for loop snippet repetitor mode I really needed this, like, factually, yesterday, when verifying dependency breaking idioms for AMD Zen 3 scheduler model. Consider the following example: ``` $ ./bin/llvm-exegesis --mode=inverse_throughput --snippets-file=/tmp/snippet.s --num-repetitions=1000000 --repetition-mode=duplicate Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-4a7e50.o --- mode: inverse_throughput key: instructions: - 'VPXORYrr YMM0 YMM0 YMM0' config: '' register_initial_values: [] cpu_name: znver3 llvm_triple: x86_64-unknown-linux-gnu num_repetitions: 1000000 measurements: - { key: inverse_throughput, value: 0.31025, per_snippet_value: 0.31025 } error: '' info: '' assembled_snippet: C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C3 ... ``` What does it tell us? So wait, it can only execute ~3 x86 AVX YMM PXOR zero-idioms per cycle? That doesn't seem right. That's even less than there are pipes supporting this type of op. Now, second example: ``` $ ./bin/llvm-exegesis --mode=inverse_throughput --snippets-file=/tmp/snippet.s --num-repetitions=1000000 --repetition-mode=loop Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-2418b5.o --- mode: inverse_throughput key: instructions: - 'VPXORYrr YMM0 YMM0 YMM0' config: '' register_initial_values: [] cpu_name: znver3 llvm_triple: x86_64-unknown-linux-gnu num_repetitions: 1000000 measurements: - { key: inverse_throughput, value: 1.00011, per_snippet_value: 1.00011 } error: '' info: '' assembled_snippet: 49B80800000000000000C5FDEFC0C5FDEFC04983C0FF75F2C3 ... ``` Now that's just worse. Due to the looping, the throughput completely plummeted, and now we can only do a single instruction/cycle!? That's not great. 
And final example: ``` $ ./bin/llvm-exegesis --mode=inverse_throughput --snippets-file=/tmp/snippet.s --num-repetitions=1000000 --repetition-mode=loop --loop-body-size=1000 Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-c402e2.o --- mode: inverse_throughput key: instructions: - 'VPXORYrr YMM0 YMM0 YMM0' config: '' register_initial_values: [] cpu_name: znver3 llvm_triple: x86_64-unknown-linux-gnu num_repetitions: 1000000 measurements: - { key: inverse_throughput, value: 0.167087, per_snippet_value: 0.167087 } error: '' info: '' assembled_snippet: 49B80800000000000000C5FDEFC0C5FDEFC04983C0FF75F2C3 ... ``` So if we merge the previous two approaches, do duplicate this single-instruction snippet 1000x (loop-body-size/instruction count in snippet), and run a loop with 1000 iterations over that duplicated/unrolled snippet, the measured throughput goes through the roof, up to 5.9 instructions/cycle, which finally tells us that this idiom is zero-cycle! Reviewed By: courbet Differential Revision: https://reviews.llvm.org/D102522	2021-05-25 12:08:27 +03:00
Jonas Devlieghere	8e74c099a0	[dsymutil] Emit an error when the Mach-O exceeds the 4GB limit. The Mach-O object file format is limited to 4GB because of its use of 32-bit offsets in the header. It is possible for dsymutil to (silently) emit an invalid binary. Instead of having consumers deal with this, emit an error.	2021-05-24 16:29:06 -07:00
Jonas Devlieghere	8f5c83a0a1	[dsymutil] Use EXIT_SUCCESS and EXIT_FAILURE (NFC)	2021-05-24 16:29:05 -07:00
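
Note on d3ee11b29a (llvm-objcopy section size truncation): the conditional-operator subtlety quoted in that commit message can be reproduced in isolation. The sketch below is not the llvm-objcopy code. The struct `t` mirrors the one in the commit message but carries a value for illustration, and `sizeViaTernary` and `sizeFixed` are hypothetical names. It assumes C++17 and a 32-bit `int`.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// Stand-in for a class with a conversion operator, as in the commit message.
// The stored value V is an addition for illustration only.
struct t {
  uint64_t V;
  operator uint64_t() const { return V; }
};

// With a class-type operand, the conditional operator converts that operand
// to the type of the other operand, so the result type is int.
static_assert(std::is_same_v<int, decltype(std::declval<bool>() ? 0 : std::declval<t>())>);
// With two arithmetic operands, the usual arithmetic conversions give uint64_t.
static_assert(std::is_same_v<uint64_t, decltype(std::declval<bool>() ? 0 : std::declval<uint64_t>())>);

// Hypothetical helper showing the problem: the ternary has type int, so a
// size of 4GB or more is narrowed to int before being widened again.
inline uint64_t sizeViaTernary(bool Empty, t Size) { return Empty ? 0 : Size; }

// Hypothetical helper showing one way to avoid it: make both operands uint64_t.
inline uint64_t sizeFixed(bool Empty, t Size) {
  return Empty ? uint64_t(0) : uint64_t(Size);
}
```

Calling `sizeFixed(false, t{0x100000000ULL})` returns the full 64-bit value, while `sizeViaTernary` cannot, because the intermediate `int` has no room for it.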

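
Note on bb3852411e (llvm-stress dead code): the typo described in that commit message comes down to `x % 1` versus `x & 1`. The sketch below is not the llvm-stress source. `coinFlipBuggy` and `coinFlipFixed` are hypothetical names, and the random value is passed in as a parameter so the behaviour is easy to check. It assumes C++17.

```cpp
#include <cassert>
#include <cstdint>

// Any integer modulo 1 is 0, so this "coin flip" is always false and the
// branch it guards is dead code.
constexpr bool coinFlipBuggy(uint32_t Random) { return Random % 1; }

// Testing the low bit yields true for odd values and false for even ones.
constexpr bool coinFlipFixed(uint32_t Random) { return Random & 1; }

static_assert(!coinFlipBuggy(0) && !coinFlipBuggy(1) && !coinFlipBuggy(12345));
static_assert(!coinFlipFixed(0) && coinFlipFixed(1) && coinFlipFixed(12345));
```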

12794 Commits
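
Note on b75ea85ad5 (llvm-readobj stack sizes in linear time): the commit message describes populating an address-to-index map on the first function name lookup so that later lookups take O(1). A minimal sketch of that idea follows. It is not the llvm-readobj implementation: the `Symbol` struct and the `FunctionNameLookup` class are hypothetical, and `std::unordered_map` stands in for whatever container the real code uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified symbol record.
struct Symbol {
  uint64_t Address;
  std::string Name;
};

class FunctionNameLookup {
public:
  explicit FunctionNameLookup(std::vector<Symbol> Syms)
      : Symbols(std::move(Syms)) {}

  // Returns the name of the symbol at Address, if any. The map is built
  // once, on the first call, in a single pass over all symbols.
  std::optional<std::string> lookup(uint64_t Address) {
    if (!Built) {
      for (std::size_t I = 0; I < Symbols.size(); ++I)
        AddressToIndex.emplace(Symbols[I].Address, I); // first symbol wins
      Built = true;
    }
    auto It = AddressToIndex.find(Address);
    if (It == AddressToIndex.end())
      return std::nullopt;
    return Symbols[It->second].Name;
  }

private:
  std::vector<Symbol> Symbols;
  std::unordered_map<uint64_t, std::size_t> AddressToIndex;
  bool Built = false;
};
```

Building the map lazily keeps the cost at zero for runs that never ask for a function name. Scanning all symbols on every lookup is what made the total running time quadratic.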