llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00

Author	SHA1	Message	Date
Zachary Turner	9560bf0466	Update the LLVM VS integration to sign the assembly. llvm-svn: 338740	2018-08-02 17:20:31 +00:00
Ben Dunbobbin	ede5cc86c7	[llvm-ar] Correct help text Corrected and simplified the help text. It was clearly too difficult to maintain before (see e.g. @227296) making it simpler and more consistent it should help people keep it up to date. Differential Revision: https://reviews.llvm.org/D48577 llvm-svn: 338703	2018-08-02 11:27:38 +00:00
Andrea Di Biagio	8b13141f92	[llvm-mca] Use a vector to store ResourceState objects in the ResourceManager. We don't need to use a map to store ResourceState objects. The number of processor resources is known statically from the scheduling model. We can therefore use a vector, and reserve a slot for each processor resource that we want to simulate. Every time the ResourceManager queries the ResourceState vector, the index to the vector of ResourceState objects can be easily computed from the processor resource mask. This drastically reduces the time complexity of method ResourceManager::use() and method ResourceManager::release(). This patch gives an average speedup of 12%. llvm-svn: 338702	2018-08-02 11:12:35 +00:00
Guillaume Chatelet	4c5dcde9ea	[llvm-exegesis] Rename InstructionInstance into InstructionBuilder. Summary: Non functional change. Subscribers: tschuett, courbet, llvm-commits Differential Revision: https://reviews.llvm.org/D50176 llvm-svn: 338701	2018-08-02 11:12:02 +00:00
Jordan Rupprecht	edc06f5bd6	[llvm-objcopy] Add missing -I command line flag alias for --input-target llvm-svn: 338635	2018-08-01 20:59:39 +00:00
Zachary Turner	5d0ac84a37	[llvm-undname Add an option to dump back references. This is useful for understanding how our demangler processes back references and for investigating issues related to back references. But it's a feature only useful for debugging the demangling process itself, so I'm marking it hidden. llvm-svn: 338609	2018-08-01 18:33:04 +00:00
Jordan Rupprecht	5fd6d96e6e	[llvm-objcopy] Add support for --rename-section flags from gnu objcopy Summary: Add support for --rename-section flags from gnu objcopy. Not all flags appear to have an effect for ELF objects, but allowing them would allow easier drop-in replacement. Other unrecognized flags are rejected. This was only tested by comparing flags printed by "readelf -e <.o>" against the output of gnu vs llvm objcopy, it hasn't been tested to be valid beyond that. Reviewers: jakehehrlich, alexshap Subscribers: llvm-commits, paulsemel, alexshap Differential Revision: https://reviews.llvm.org/D49870 llvm-svn: 338582	2018-08-01 16:23:22 +00:00
Andrea Di Biagio	692173433e	[llvm-mca] Correctly update the rank in `Scheduler::select()`. Found by inspection. llvm-svn: 338579	2018-08-01 16:06:33 +00:00
Guillaume Chatelet	4d60477cbd	[llvm-exegesis] Provide a way to handle memory instructions. Summary: And implement memory instructions on X86. This fixes PR36906. Reviewers: gchatelet Reviewed By: gchatelet Subscribers: lebedev.ri, filcab, mgorny, tschuett, RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D48935 llvm-svn: 338567	2018-08-01 14:41:45 +00:00
Jonas Devlieghere	ef964ac532	[dsymutil] Convert recursion in lookForDIEsToKeep into worklist. The functions `lookForDIEsToKeep` and `keepDIEAndDependencies` can have some very deep recursion. This tackles part of this problem by removing the recursion from `lookForDIEsToKeep` by turning it into a worklist. The difficulty in doing so is the computation of incompleteness, which depends on the incompleteness of its children. To compute this, we insert "continuation markers" into the worklist. This informs the work loop to (re)compute the incompleteness property of the DIE associated with it (i.e. the parent of the previously processed DIE). This patch should generate byte-identical output. Unfortunately it also has some impact of performance, regressing by about 4% when processing clang on my machine. Differential revision: https://reviews.llvm.org/D48899 llvm-svn: 338536	2018-08-01 13:24:39 +00:00
Andrea Di Biagio	e75426e0b8	[llvm-mca] Improve code comments. NFC. llvm-svn: 338513	2018-08-01 10:49:01 +00:00
Fangrui Song	f4cc6ca32d	[llvm-objcopy] Make --strip-debug strip .gdb_index Summary: See binutils-gdb/bfd/elf.c, GNU objcopy also strips .stab* (STABS) .line* (DWARF 1) .gnu.linkonce.wi.* (linkonce section for .debug_info) but I'm not sure we need to be compatible with it. Reviewers: dblaikie, alexshap, jakehehrlich, jhenderson Reviewed By: alexshap, jakehehrlich Subscribers: aprantl, JDevlieghere, jakehehrlich, llvm-commits Differential Revision: https://reviews.llvm.org/D50100 llvm-svn: 338443	2018-07-31 21:26:35 +00:00
Matt Davis	d0dbf0dd79	[llvm-mca] Update the help text to reflect "physical" registers. NFC. llvm-svn: 338430	2018-07-31 20:05:08 +00:00
Alexandre Ganea	0f20d5d6a0	[CodeView] Minimal support for S_UNAMESPACE records Differential Revision: https://reviews.llvm.org/D50007 llvm-svn: 338417	2018-07-31 19:15:50 +00:00
Andrea Di Biagio	ec4329b38b	[llvm-mca] Remove README.txt A detailed description of the tool has been recently added by Matt to CommandGuide/llvm-mca.rst. File README.txt is now redundant and can be removed; all the relevant user-guide information has been improved and then moved to llvm-mca.rst. In future, we should add another .rst for the "llvm-mca developer manual" to provide infromation about: - llvm-mca internals. - How to add custom stages to the simulated pipeline. - How to provide extra processor info in the scheduling model to improve the analysis performed by llvm-mca. llvm-svn: 338386	2018-07-31 14:23:49 +00:00
Andrea Di Biagio	0e53532aeb	[llvm-mca][BtVer2] Teach how to identify dependency-breaking idioms. This patch teaches llvm-mca how to identify dependency breaking instructions on btver2. An example of dependency breaking instructions is the zero-idiom XOR (example: `XOR %eax, %eax`), which always generates zero regardless of the actual value of the input register operands. Dependency breaking instructions don't have to wait on their input register operands before executing. This is because the computation is not dependent on the inputs. Not all dependency breaking idioms are also zero-latency instructions. For example, `CMPEQ %xmm1, %xmm1` is independent on the value of XMM1, and it generates a vector of all-ones. That instruction is not eliminated at register renaming stage, and its opcode is issued to a pipeline for execution. So, the latency is not zero. This patch adds a new method named isDependencyBreaking() to the MCInstrAnalysis interface. That method takes as input an instruction (i.e. MCInst) and a MCSubtargetInfo. The default implementation of isDependencyBreaking() conservatively returns false for all instructions. Targets may override the default behavior for specific CPUs, and return a value which better matches the subtarget behavior. In future, we should teach to Tablegen how to automatically generate the body of isDependencyBreaking from scheduling predicate definitions. This would allow us to expose the knowledge about dependency breaking instructions to the machine schedulers (and, potentially, other codegen passes). Differential Revision: https://reviews.llvm.org/D49310 llvm-svn: 338372	2018-07-31 13:21:43 +00:00
Jonas Devlieghere	a35775300e	[dsymutil] Simplify temporary file handling. Dsymutil's update functionality was broken on Windows because we tried to rename a file while we're holding open handles to that file. TempFile provides a solution for this through its keep(Twine) method. This patch changes dsymutil to make use of that functionality. Differential revision: https://reviews.llvm.org/D49860 llvm-svn: 338216	2018-07-29 14:56:15 +00:00
Fangrui Song	553090ebe0	[llvm-objcopy] Make --strip-debug strip .zdebug* (zlib-gnu) sections This behavior matches GNU objcopy. llvm-svn: 338173	2018-07-27 22:51:36 +00:00
Stephen Hines	c3d2618634	Handle the lack of a symbol table correctly. Summary: These two cases will trigger a dereference on a nullptr, since the SymbolTable can be nonexistent for a given library, in addition to just being empty. Reviewers: alexshap Reviewed By: alexshap Subscribers: meikeb, kongyi, chh, jakehehrlich, llvm-commits, pirama Differential Revision: https://reviews.llvm.org/D49534 llvm-svn: 338062	2018-07-26 20:05:31 +00:00
Michael Kruse	8fc32bf8f9	[ADT] Replace std::isprint by llvm::isPrint. The standard library functions ::isprint/std::isprint have platform- and locale-dependent behavior which makes LLVM's output less predictable. In particular, regression tests my fail depending on the implementation of these functions. Implement llvm::isPrint in StringExtras.h with a standard behavior and replace all uses of ::isprint/std::isprint by a call it llvm::isPrint. The function is inlined and does not look up language settings so it should perform better than the standard library's version. Such a replacement has already been done for isdigit, isalpha, isxdigit in r314883. gtest does the same in gtest-printers.cc using the following justification: // Returns true if c is a printable ASCII character. We test the // value of c directly instead of calling isprint(), which is buggy on // Windows Mobile. inline bool IsPrintableAscii(wchar_t c) { return 0x20 <= c && c <= 0x7E; } Similar issues have also been encountered by Julia: https://github.com/JuliaLang/julia/issues/7416 I noticed the problem myself when on Windows isprint('\t') started to evaluate to true (see https://stackoverflow.com/questions/51435249) and thus caused several unit tests to fail. The result of isprint doesn't seem to be well-defined even for ASCII characters. Therefore I suggest to replace isprint by a platform-independent version. Differential Revision: https://reviews.llvm.org/D49680 llvm-svn: 338034	2018-07-26 15:31:41 +00:00
Dean Michael Berris	79adceb6c6	[MCA] Avoid an InstrDesc copy in mca::LSUnit::reserve. Summary: InstrDesc contains 4 vectors (as well as some other data), so it's expensive to copy. Authored By: orodley Reviewers: andreadb, mattd, dberris Reviewed By: mattd, dberris Subscribers: dberris, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D49775 llvm-svn: 337985	2018-07-26 00:02:54 +00:00
Jonas Devlieghere	8dc3169d33	[dsymutil] Add support for generating DWARF5 accelerator tables. This patch add support for emitting DWARF5 accelerator tables (.debug_names) from dsymutil. Just as with the Apple style accelerator tables, it's possible to update existing dSYMs. This patch includes a test that show how you can convert back and forth between the two types. If no kind of table is specified, dsymutil will default to generating Apple-style accelerator tables whenever it finds those in its input. The same is true when there are no accelerator tables at all. Finally, in the remaining case, where there's at least one DWARF v5 table and no Apple ones, the output will contains a DWARF accelerator tables (.debug_names). Differential revision: https://reviews.llvm.org/D49137 llvm-svn: 337980	2018-07-25 23:01:38 +00:00
Paul Semel	3560e7c201	[llvm-objdump] Add dynamic section printing to private-headers option Differential Revision: https://reviews.llvm.org/D49016 llvm-svn: 337902	2018-07-25 11:09:20 +00:00
Paul Semel	e5caf328a5	[llvm-readobj] Generic hex-dump option Helpers are available to make this option file format independant. This patch adds the feature for Wasm file format. It doesn't change the behavior of the other file format handling. Differential Revision: https://reviews.llvm.org/D49545 llvm-svn: 337896	2018-07-25 10:04:37 +00:00
Dean Michael Berris	7952e860b2	llvm-xray: Broken chrome trace event format output Summary: Missing comma separator for EXIT and TAIL_EXIT RecordTypes emit invalid JSON output for Chrome Trace Event Format. Reviewers: dberris Reviewed By: dberris Subscribers: sammccall, kpw, llvm-commits Differential Revision: https://reviews.llvm.org/D49687 llvm-svn: 337795	2018-07-24 01:45:34 +00:00
Andres Freund	7602e1153a	Add PerfJITEventListener for perf profiling support. This new JIT event listener supports generating profiling data for the linux 'perf' profiling tool, allowing it to generate function and instruction level profiles. Currently this functionality is not enabled by default, but must be enabled with LLVM_USE_PERF=yes. Given that the listener has no dependencies, it might be sensible to enable by default once the initial issues have been shaken out. I followed existing precedent in registering the listener by default in lli. Should there be a decision to enable this by default on linux, that should probably be changed. Please note that until https://reviews.llvm.org/D47343 is resolved, using this functionality with mcjit rather than orcjit will not reliably work. Disregarding the previous comment, here's an example: $ cat /tmp/expensive_loop.c bool stupid_isprime(uint64_t num) { if (num == 2) return true; if (num < 1 \|\| num % 2 == 0) return false; for(uint64_t i = 3; i < num / 2; i+= 2) { if (num % i == 0) return false; } return true; } int main(int argc, char **argv) { int numprimes = 0; for (uint64_t num = argc; num < 100000; num++) { if (stupid_isprime(num)) numprimes++; } return numprimes; } $ clang -ggdb -S -c -emit-llvm /tmp/expensive_loop.c -o /tmp/expensive_loop.ll $ perf record -o perf.data -g -k 1 ./bin/lli -jit-kind=mcjit /tmp/expensive_loop.ll 1 $ perf inject --jit -i perf.data -o perf.jit.data $ perf report -i perf.jit.data - 92.59% lli jitted-5881-2.so [.] stupid_isprime stupid_isprime main llvm::MCJIT::runFunction llvm::ExecutionEngine::runFunctionAsMain main __libc_start_main 0x4bf6258d4c544155 + 0.85% lli ld-2.27.so [.] do_lookup_x And line-level annotations also work: │ for(uint64_t i = 3; i < num / 2; i+= 2) { │1 30: movq $0x3,-0x18(%rbp) 0.03 │1 38: mov -0x18(%rbp),%rax 0.03 │ mov -0x10(%rbp),%rcx │ shr $0x1,%rcx 3.63 │ ┌──cmp %rcx,%rax │ ├──jae 6f │ │ if (num % i == 0) 0.03 │ │ mov -0x10(%rbp),%rax │ │ xor %edx,%edx 89.00 │ │ divq -0x18(%rbp) │ │ cmp $0x0,%rdx 0.22 │ │↓ jne 5f │ │ return false; │ │ movb $0x0,-0x1(%rbp) │ │↓ jmp 73 │ │ } 3.22 │1 5f:│↓ jmp 61 │ │ for(uint64_t i = 3; i < num / 2; i+= 2) { Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D44892 llvm-svn: 337789	2018-07-24 00:54:06 +00:00
Vedant Kumar	523601c4da	[Debugify] Export per-pass debug info loss statistics Add a -debugify-export option to opt. This exports per-pass `debugify` loss statistics to a file in CSV format. For some interesting numbers on debug value loss during an -O2 build of the sqlite3 amalgamation, see the review thread. Differential Revision: https://reviews.llvm.org/D49003 llvm-svn: 337787	2018-07-24 00:41:29 +00:00
Vedant Kumar	069adf4d6b	[Debugify] Move interface definitions to a header, NFC This is a minor cleanup in preparation for a change to export DI statistics from -check-debugify. To do that, it would be cleaner to have a dedicated header for the debugify interface. llvm-svn: 337786	2018-07-24 00:41:28 +00:00
Paul Semel	d20cb8772f	[yaml2obj] Add default sh_entsize for dynamic sections Dynamic section holds a table, so the sh_entsize might be set. As the dynamic section entry size never changes, we can default it to the size of a dynamic entry. Differential Revision: https://reviews.llvm.org/D49619 llvm-svn: 337725	2018-07-23 18:49:04 +00:00
Aaron Ballman	389f26f768	Fixing a typo; NFC. llvm-svn: 337719	2018-07-23 18:09:43 +00:00
Zachary Turner	9ef1f44d7b	[llvm-undname] Flush output before demangling. If an error occurs and we write it to stderr, it could appear before we wrote the mangled name which we're undecorating. By flushing stdout first, we ensure that the messages are always sequenced in the correct order. llvm-svn: 337645	2018-07-21 15:39:05 +00:00
Martin Storsjo	5788d90da0	[llvm-undname] Remove a superfluous semicolon. NFC. llvm-svn: 337615	2018-07-20 20:48:36 +00:00
Jordan Rupprecht	760914730c	[llvm-objcopy] Add basic support for --rename-section Summary: Add basic support for --rename-section=old=new to llvm-objcopy. A full replacement for GNU objcopy requires also modifying flags (i.e. --rename-section=old=new,flag1,flag2); I'd like to keep that in a separate change to keep this simple. Reviewers: jakehehrlich, alexshap Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49576 llvm-svn: 337604	2018-07-20 19:54:24 +00:00
Zachary Turner	0bef0a2efc	Add a Microsoft Demangler. This adds initial support for a demangling library (LLVMDemangle) and tool (llvm-undname) for demangling Microsoft names. This doesn't cover 100% of cases and there are some known limitations which I intend to address in followup patches, at least until such time that we have (near) 100% test coverage matching up with all of the test cases in clang/test/CodeGenCXX/mangle-ms-*. Differential Revision: https://reviews.llvm.org/D49552 llvm-svn: 337584	2018-07-20 17:27:48 +00:00
Zachary Turner	2add682270	Rewrite the VS integration scripts. This is a new modernized VS integration installer. It adds a Visual Studio .sln file which, when built, outputs a VSIX that can be used to install ourselves as a "real" Visual Studio Extension. We can even upload this extension to the visual studio marketplace. This fixes a longstanding problem where we didn't support installing into VS 2017 and higher. In addition to supporting VS 2017, due to the way this is written we now longer need to do anything special to support future versions of VS as well. Everything should "just work". This also fixes several bugs with our old integration, such as MSBuild triggering full rebuilds when /Zi was used. Finally, we add a new UI page called "LLVM" which becomes visible when the LLVM toolchain is selected. For now this only contains one option which is the path to clang-cl.exe, but in the future we can add more things here. Differential Revision: https://reviews.llvm.org/D42762 llvm-svn: 337572	2018-07-20 16:30:02 +00:00
Reid Kleckner	3e2b59107f	Fix -Wsign-compare in llvm-readobj llvm-svn: 337490	2018-07-19 19:58:22 +00:00
George Rimar	4f4ea10917	[llvm-readobj] - Do not report invalid amount of sections. When output style is GNU and amount of sections is >= SHN_LORESERVE, llvm-readobj reports zero number of sections instead of actual value. The patch fixes that. Differential revision: https://reviews.llvm.org/D49544 llvm-svn: 337462	2018-07-19 14:52:57 +00:00
Paul Semel	e4ae3a86ed	[llvm-readobj] Generic -string-dump option Differential Revision: https://reviews.llvm.org/D49470 llvm-svn: 337408	2018-07-18 18:00:41 +00:00
Paul Semel	d8d82c148d	[llvm-objdump] Add -demangle (-C) option Differential Revision: https://reviews.llvm.org/D49043 llvm-svn: 337401	2018-07-18 16:39:21 +00:00
George Rimar	863e12ad16	[llvm-objdump] - An attempt to fix BB after r337361. Seems r337361 is the reason of the following ARM BB failures: http://lab.llvm.org:8011/builders/clang-cmake-armv8-quick http://lab.llvm.org:8011/builders/clang-cmake-armv8-full/builds/4633 Reason is unclear to me, other bots are OK. If this will not help, I'll revert r337361. llvm-svn: 337371	2018-07-18 09:25:36 +00:00
George Rimar	98853328e2	[llvm-objdump] - Stop reporting bogus section IDs. Imagine we have a file with few sections, and one of them is .foo with index N != 0. Problem is that when llvm-objdump is given a -section=.foo parameter it lists .foo as a section at index 0. That makes impossible to write test cases which needs to find the index of the particular section, while ignoring dumping of others. The patch fixes that. Differential revision: https://reviews.llvm.org/D49372 llvm-svn: 337361	2018-07-18 08:34:35 +00:00
George Rimar	691d967bcc	[llvm-readobj] - Teach tool to dump objects with >= SHN_LORESERVE of sections. http://www.sco.com/developers/gabi/2003-12-17/ch4.eheader.html says that e_shnum and/or e_shstrndx may have special values if "the number of sections is greater than or equal to SHN_LORESERVE" or "the section name string table section index is greater than or equal to SHN_LORESERVE (0xff00)" Previously llvm-readobj was unable to dump such files, patch changes that. I had to add a precompiled test case because it does not seem possible to prepare a test using yaml2obj or llvm-mc (not clear how to make .shstrtab to have index >= SHN_LORESERVE). Differential revision: https://reviews.llvm.org/D49369 llvm-svn: 337360	2018-07-18 08:19:58 +00:00
Puyan Lotfi	e6a54bf29b	[NFC][llvm-objcopy] Cleanup namespace usage in llvm-objcopy. Nest any classes not used outside of a file into anon. Nest any classes used across files in llvm-objcopy into namespace llvm::objcopy. Differential Revision: https://reviews.llvm.org/D49449 llvm-svn: 337337	2018-07-18 00:10:51 +00:00
Peter Collingbourne	ec56d20419	MC: Implement support for new .addrsig and .addrsig_sym directives. Part of the address-significance tables proposal: http://lists.llvm.org/pipermail/llvm-dev/2018-May/123514.html Differential Revision: https://reviews.llvm.org/D47744 llvm-svn: 337328	2018-07-17 22:17:18 +00:00
Sam Clegg	5c92980d35	[WebAssembly] Remove ELF file support. This support was partial and temporary. Now that we have wasm object file support its no longer needed. Differential Revision: https://reviews.llvm.org/D48744 llvm-svn: 337222	2018-07-16 23:09:29 +00:00
Puyan Lotfi	52b2cd224d	[NFC][llvm-objcopy] Make helper functions static Anywhere in tools/llvm-objcopy where functions or classes are not referenced outside of a given file, we change things to make the function or class static or put inside an anonymous namespace. llvm-svn: 337220	2018-07-16 22:17:05 +00:00
Jake Ehrlich	ad730cd0df	[llvm-objcopy] Add support for large indexes This patch is an update of an older patch that never landed (see here: https://reviews.llvm.org/D42516) Recently various users have run into this issue and it just 100% has to be solved at this point. The main difference in this patch is that I use gunzip instead of unzip which should hopefully allow tests to pass. Please review this as if it is a new patch however. I found some issues along the way and made some minor modifications. The binary used in this patch for testing (a zip file to make it small) can be found here: https://drive.google.com/file/d/1UjsnTO9edLttZibbr-2T1bJl92KEQFAO/view?usp=sharing Differential Revision: https://reviews.llvm.org/D49206 llvm-svn: 337204	2018-07-16 19:48:52 +00:00
Teresa Johnson	fe40f71ee6	Restore "[ThinLTO] Ensure we always select the same function copy to import" This reverts commit r337081, therefore restoring r337050 (and fix in r337059), with test fix for bot failure described after the original description below. In order to always import the same copy of a linkonce function, even when encountering it with different thresholds (a higher one then a lower one), keep track of the summary we decided to import. This ensures that the backend only gets a single definition to import for each GUID, so that it doesn't need to choose one. Move the largest threshold the GUID was considered for import into the current module out of the ImportMap (which is part of a larger map maintained across the whole index), and into a new map just maintained for the current module we are computing imports for. This saves some memory since we no longer have the thresholds maintained across the whole index (and throughout the in-process backends when doing a normal non-distributed ThinLTO build), at the cost of some additional information being maintained for each invocation of ComputeImportForModule (the selected summary pointer for each import). There is an additional map lookup for each callee being considered for importing, however, this was able to subsume a map lookup in the Worklist iteration that invokes computeImportForFunction. We also are able to avoid calling selectCallee if we already failed to import at the same or higher threshold. I compared the run time and peak memory for the SPEC2006 471.omnetpp benchmark (running in-process ThinLTO backends), as well as for a large internal benchmark with a distributed ThinLTO build (so just looking at the thin link time/memory). Across a number of runs with and without this change there was no significant change in the time and memory. (I tried a few other variations of the change but they also didn't improve time or peak memory). The new commit removes a test that no longer makes sense (Transforms/FunctionImport/hotness_based_import2.ll), as exposed by the reverse-iteration bot. The test depends on the order of processing the summary call edges, and actually depended on the old problematic behavior of selecting more than one summary for a given GUID when encountered with different thresholds. There was no guarantee even before that we would eventually pick the linkonce copy with the hottest call edges, it just happened to work with the test and the old code, and there was no guarantee that we would end up importing the selected version of the copy that had the hottest call edges (since the backend would effectively import only one of the selected copies). Reviewers: davidxl Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D48670 llvm-svn: 337184	2018-07-16 15:30:27 +00:00
Joel Galenson	36676cf9c5	[cfi-verify] Abort on unsupported targets As suggested in the review for r337007, this makes cfi-verify abort on unsupported targets instead of producing incorrect results. It also updates the design document to reflect this. Differential Revision: https://reviews.llvm.org/D49304 llvm-svn: 337181	2018-07-16 15:26:44 +00:00
Andrea Di Biagio	c19db3b1d5	[llvm-mca][BtVer2] teach how to identify false dependencies on partially written registers. The goal of this patch is to improve the throughput analysis in llvm-mca for the case where instructions perform partial register writes. On x86, partial register writes are quite difficult to model, mainly because different processors tend to implement different register merging schemes in hardware. When the code contains partial register writes, the IPC (instructions per cycles) estimated by llvm-mca tends to diverge quite significantly from the observed IPC (using perf). Modern AMD processors (at least, from Bulldozer onwards) don't rename partial registers. Quoting Agner Fog's microarchitecture.pdf: " The processor always keeps the different parts of an integer register together. For example, AL and AH are not treated as independent by the out-of-order execution mechanism. An instruction that writes to part of a register will therefore have a false dependence on any previous write to the same register or any part of it." This patch is a first important step towards improving the analysis of partial register updates. It changes the semantic of RegisterFile descriptors in tablegen, and teaches llvm-mca how to identify false dependences in the presence of partial register writes (for more details: see the new code comments in include/Target/TargetSchedule.h - class RegisterFile). This patch doesn't address the case where a write to a part of a register is followed by a read from the whole register. On Intel chips, high8 registers (AH/BH/CH/DH)) can be stored in separate physical registers. However, a later (dirty) read of the full register (example: AX/EAX) triggers a merge uOp, which adds extra latency (and potentially affects the pipe usage). This is a very interesting article on the subject with a very informative answer from Peter Cordes: https://stackoverflow.com/questions/45660139/how-exactly-do-partial-registers-on-haswell-skylake-perform-writing-al-seems-to In future, the definition of RegisterFile can be extended with extra information that may be used to identify delays caused by merge opcodes triggered by a dirty read of a partial write. Differential Revision: https://reviews.llvm.org/D49196 llvm-svn: 337123	2018-07-15 11:01:38 +00:00

1 2 3 4 5 ...

9483 Commits