llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 20:23:11 +01:00


Roman Lebedev 5d534d8259 [llvm-exegesis] Loop unrolling for loop snippet repetitor mode

I really needed this, like, factually, yesterday, when verifying dependency breaking idioms for AMD Zen 3 scheduler model.

Consider the following example:
```
$ ./bin/llvm-exegesis --mode=inverse_throughput --snippets-file=/tmp/snippet.s --num-repetitions=1000000 --repetition-mode=duplicate
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-4a7e50.o
---
mode: inverse_throughput
key:
  instructions:
    - 'VPXORYrr YMM0 YMM0 YMM0'
  config: ''
  register_initial_values: []
cpu_name: znver3
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 1000000
measurements:
  - { key: inverse_throughput, value: 0.31025, per_snippet_value: 0.31025 }
error: ''
info: ''
assembled_snippet: C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C3
...
```
What does it tell us? So wait, it can only execute ~3 x86 AVX YMM PXOR zero-idioms per cycle? That doesn't seem right. That's even less than there are pipes supporting this type of op.

Now, second example:
```
$ ./bin/llvm-exegesis --mode=inverse_throughput --snippets-file=/tmp/snippet.s --num-repetitions=1000000 --repetition-mode=loop
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-2418b5.o
---
mode: inverse_throughput
key:
  instructions:
    - 'VPXORYrr YMM0 YMM0 YMM0'
  config: ''
  register_initial_values: []
cpu_name: znver3
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 1000000
measurements:
  - { key: inverse_throughput, value: 1.00011, per_snippet_value: 1.00011 }
error: ''
info: ''
assembled_snippet: 49B80800000000000000C5FDEFC0C5FDEFC04983C0FF75F2C3
...
```
Now that's just worse. Due to the looping, the throughput completely plummeted, and now we can only do a single instruction/cycle!? That's not great.

And final example:
```
$ ./bin/llvm-exegesis --mode=inverse_throughput --snippets-file=/tmp/snippet.s --num-repetitions=1000000 --repetition-mode=loop --loop-body-size=1000
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-c402e2.o
---
mode: inverse_throughput
key:
  instructions:
    - 'VPXORYrr YMM0 YMM0 YMM0'
  config: ''
  register_initial_values: []
cpu_name: znver3
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 1000000
measurements:
  - { key: inverse_throughput, value: 0.167087, per_snippet_value: 0.167087 }
error: ''
info: ''
assembled_snippet: 49B80800000000000000C5FDEFC0C5FDEFC04983C0FF75F2C3
...
```
So if we merge the previous two approaches, do duplicate this single-instruction snippet 1000x (loop-body-size/instruction count in snippet), and run a loop with 1000 iterations over that duplicated/unrolled snippet, the measured throughput goes through the roof, up to 5.9 instructions/cycle, which finally tells us that this idiom is zero-cycle!

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D102522

2021-05-25 12:08:27 +03:00
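
The instructions-per-cycle figures quoted in the commit message are the reciprocals of the reported `inverse_throughput` values. The sketch below redoes that arithmetic in Python. The helper names are hypothetical and not part of llvm-exegesis, and the iteration count is a plain division chosen to match the "1000 iterations" stated in the message, not a description of how the tool computes it internally.

```python
# Hypothetical helpers (not llvm-exegesis APIs) that redo the arithmetic
# from the commit message above.

def instructions_per_cycle(inverse_throughput: float) -> float:
    """Convert cycles-per-instruction into instructions-per-cycle."""
    return 1.0 / inverse_throughput


def unrolled_loop_shape(num_repetitions: int, loop_body_size: int,
                        snippet_instructions: int) -> tuple[int, int]:
    """Return (snippet copies per loop body, loop iterations).

    Assumption: copies = loop_body_size / snippet instruction count (as the
    commit message states), and iterations = num_repetitions / loop_body_size
    (plain division, consistent with the message's "1000 iterations").
    """
    copies = loop_body_size // snippet_instructions
    iterations = num_repetitions // loop_body_size
    return copies, iterations


# inverse_throughput values reported in the three runs above.
measurements = {
    "duplicate": 0.31025,
    "loop": 1.00011,
    "loop, --loop-body-size=1000": 0.167087,
}

for mode, value in measurements.items():
    print(f"{mode}: {instructions_per_cycle(value):.2f} instructions/cycle")

print(unrolled_loop_shape(1_000_000, 1000, 1))
```

Read this way, the three runs correspond to roughly 3, 1, and just under 6 instructions per cycle, which is the progression the commit message describes.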
..
bugpoint	Avoid shuffle self-assignment in EXPENSIVE_CHECKS builds	2021-03-10 11:17:34 +00:00
bugpoint-passes
dsymutil	[dsymutil] Emit an error when the Mach-O exceeds the 4GB limit.	2021-05-24 16:29:06 -07:00
gold	[gold] Match lld WPD behavior for shared library symbols and add test	2021-02-17 15:28:49 -08:00
llc	Recommit "[VP,Integer,#2] ExpandVectorPredication pass"	2021-05-04 11:47:52 +02:00
lli	[lli] Honor the --entry-function flag in orc and orc-lazy modes.	2021-04-13 11:33:24 -07:00
llvm-ar	[NFC] Reordering parameters in getFile and getFileOrSTDIN	2021-03-25 09:47:49 -04:00
llvm-as	llvmbuildectomy - replace llvm-build by plain cmake	2020-11-13 10:35:24 +01:00
llvm-as-fuzzer
llvm-bcanalyzer	llvmbuildectomy - replace llvm-build by plain cmake	2020-11-13 10:35:24 +01:00
llvm-c-test	LLVM-C: Allow LLVM{Get/Set}Alignment on an atomicrmw/cmpxchg instruction.	2021-02-12 18:31:18 -05:00
llvm-cat	[tools] Use llvm::append_range (NFC)	2021-01-05 21:15:56 -08:00
llvm-cfi-verify	[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo	2021-05-23 14:15:23 -07:00
llvm-config	[MinGW] Use lib prefix for libraries	2020-09-12 22:01:29 +03:00
llvm-cov	[Coverage] Support overriding compilation directory	2021-05-11 15:26:45 -07:00
llvm-cvtres	[llvm-cvtres] Reduce the set of dependencies of llvm-cvtres. NFC.	2021-04-21 11:50:10 +03:00
llvm-cxxdump	llvmbuildectomy - replace llvm-build by plain cmake	2020-11-13 10:35:24 +01:00
llvm-cxxfilt	[demangler] Initial support for the new Rust mangling scheme	2021-05-03 16:44:30 -07:00
llvm-cxxmap	[Support] Don't include VirtualFileSystem.h in CommandLine.h	2021-04-21 10:19:01 -04:00
llvm-diff	Switch from llvm::is_trivially_copyable to std::is_trivially_copyable	2020-12-02 22:02:48 -08:00
llvm-dis	Allow llvm-dis to disassemble multiple files	2021-05-06 11:08:55 -07:00
llvm-dwarfdump	[NFC][llvm-dwarfdump] Avoid passing std::string by value in collectStatsForDie()	2021-05-12 01:29:37 -07:00
llvm-dwp	[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo	2021-05-23 14:15:23 -07:00
llvm-elfabi	[llvm-elfabi] Add flag to preserve timestamp when output is the same	2020-12-29 20:27:06 -08:00
llvm-exegesis	[llvm-exegesis] Loop unrolling for loop snippet repetitor mode	2021-05-25 12:08:27 +03:00
llvm-extract	llvmbuildectomy - replace llvm-build by plain cmake	2020-11-13 10:35:24 +01:00
llvm-go
llvm-gsymutil	Add option to llvm-gsymutil to read addresses from stdin.	2021-05-20 06:10:35 +00:00
llvm-ifs	[SystemZ][z/OS] Add IsText Argument to GetFile and GetFileOrSTDIN	2021-04-16 10:08:36 -04:00
llvm-isel-fuzzer	[AIX] Turn -fdata-sections on by default in Clang	2020-10-14 15:58:31 +00:00
llvm-itanium-demangle-fuzzer
llvm-jitlink	[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo	2021-05-23 14:15:23 -07:00
llvm-jitlistener	[MCJIT] Profile the code generated by MCJIT engine using Intel VTune profiler	2020-11-16 19:28:14 +11:00
llvm-libtool-darwin	[Support] Don't include VirtualFileSystem.h in CommandLine.h	2021-04-21 10:19:01 -04:00
llvm-link	NFC: Run clang-format over llvm-link.	2021-04-28 14:33:00 -07:00
llvm-lipo	[TextAPI] move source code files out of subdirectory, NFC	2021-04-05 10:24:42 -07:00
llvm-lto	Recommit "[LTO] Use lto::backend for code generation."	2021-02-15 10:05:42 +00:00
llvm-lto2	Don't use $ as suffix for symbol names in ThinLTOBitcodeWriter and other places	2021-03-29 13:03:52 +02:00
llvm-mc	[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo	2021-05-23 14:15:23 -07:00
llvm-mc-assemble-fuzzer	[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo	2021-05-23 14:15:23 -07:00
llvm-mc-disassemble-fuzzer
llvm-mca	[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo	2021-05-23 14:15:23 -07:00
llvm-microsoft-demangle-fuzzer
llvm-ml	[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo	2021-05-23 14:15:23 -07:00
llvm-modextract	llvmbuildectomy - replace llvm-build by plain cmake	2020-11-13 10:35:24 +01:00
llvm-mt	llvmbuildectomy - replace llvm-build by plain cmake	2020-11-13 10:35:24 +01:00
llvm-nm	[llvm-nm] Support the -V option, print that the tool is compatible with GNU nm	2021-05-13 22:36:25 +03:00
llvm-objcopy	[llvm-strip] Add support for '--' for delimiting options from input files	2021-05-20 03:33:51 -07:00
llvm-objdump	[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo	2021-05-23 14:15:23 -07:00
llvm-opt-fuzzer	[NewPM] Hide pass manager debug logging behind -debug-pass-manager-verbose	2021-05-07 21:51:47 -07:00
llvm-opt-report	[SystemZ][z/OS][Windows] Add new OF_TextWithCRLF flag and use this flag instead of OF_Text	2021-04-06 07:23:31 -04:00
llvm-pdbutil	Removed redundant code.	2021-04-07 05:37:46 +04:00
llvm-profdata	[CSSPGO][llvm-profdata] Support trimming cold context when merging profiles	2021-04-22 00:42:37 -07:00
llvm-profgen	[NFC][CSSPGO]llvm-profge] Fix Build warning dueo to an attrbute usage.	2021-05-24 12:59:02 -07:00
llvm-rc	[llvm-rc] Add a GNU windres-like frontend to llvm-rc	2021-04-26 22:04:29 +03:00
llvm-readobj	[AMDGPU] Add gfx1034 target	2021-05-13 14:25:18 -04:00
llvm-reduce	[llvm-reduce] Don't unset dso_local on implicitly dso_local GVs	2021-04-30 11:57:22 -07:00
llvm-rtdyld	[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo	2021-05-23 14:15:23 -07:00
llvm-rust-demangle-fuzzer	Add fuzzer for Rust demangler	2021-05-05 12:50:50 -07:00
llvm-shlib	[CMake][ELF] Link libLLVM.so and libclang-cpp.so with -Bsymbolic-functions	2021-05-13 13:44:57 -07:00
llvm-size	[llvm-cov] Use is_contained (NFC)	2020-12-27 09:57:25 -08:00
llvm-special-case-list-fuzzer
llvm-split	[LTO] Update splitCodeGen to take a reference to the module. (NFC)	2021-01-29 11:53:11 +00:00
llvm-stress	Avoid shuffle self-assignment in EXPENSIVE_CHECKS builds	2021-03-10 11:17:34 +00:00
llvm-strings	llvmbuildectomy - replace llvm-build by plain cmake	2020-11-13 10:35:24 +01:00
llvm-symbolizer	[llvm-symbolizer] Place Mach-O options into the Mach-O option group.	2021-05-12 12:04:54 +01:00
llvm-undname	llvmbuildectomy - replace llvm-build by plain cmake	2020-11-13 10:35:24 +01:00
llvm-xray	[SystemZ][z/OS][Windows] Add new OF_TextWithCRLF flag and use this flag instead of OF_Text	2021-04-06 07:23:31 -04:00
llvm-yaml-numeric-parser-fuzzer	[llvm] NFC: Cleanup llvm-yaml-numeric-parser-fuzzer	2021-02-15 14:52:53 +01:00
llvm-yaml-parser-fuzzer	[llvm] Use llvm::erase_value and llvm::erase_if (NFC)	2021-01-02 09:24:15 -08:00
lto	[LTO][Legacy] Decouple option parsing from LTOCodeGenerator	2021-03-31 16:43:26 +00:00
msbuild
obj2yaml	Reland: "[lld][WebAssembly] Initial support merging string data"	2021-05-10 16:03:38 -07:00
opt	[NewPM] Add options to PrintPassInstrumentation	2021-05-18 20:59:35 -07:00
opt-viewer
remarks-shlib	[tools][remarks-shlib] Don't build libRemarks.so without PIC	2020-09-20 12:40:21 +02:00
sancov	[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo	2021-05-23 14:15:23 -07:00
sanstats	[NFC] Reordering parameters in getFile and getFileOrSTDIN	2021-03-25 09:47:49 -04:00
split-file	[Support] Don't include VirtualFileSystem.h in CommandLine.h	2021-04-21 10:19:01 -04:00
verify-uselistorder	[SystemZ][z/OS][Windows] Add new OF_TextWithCRLF flag and use this flag instead of OF_Text	2021-04-06 07:23:31 -04:00
vfabi-demangle-fuzzer
xcode-toolchain
yaml2obj	[llvm] Make obj2yaml and yaml2obj LLVM utilities instead of tools	2020-10-19 10:21:21 -07:00
CMakeLists.txt