llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 13:33:37 +02:00

History

Daniel Berlin 23bf9929d1 NewGVN: Sort Dominator Tree in RPO order, and use that for generating order.

Summary:
The optimal iteration order for this problem is RPO order. We want to process as many preds of a backedge as we can before we process the backedge.

At the same time, as we add predicate handling, we want to be able to touch instructions that are dominated by a given block by ranges (because a change in value numbering a predicate possibly affects all users we dominate that are using that predicate). If we don't do it this way, we can't do value inference over backedges (the paper covers this in depth).

The newgvn branch currently overshoots the last part, and guarantees that it will touch at least the right set of instructions, but it does touch more. This is because the bitvector instruction ranges are currently generated in RPO order (so we take the max and the min of the ranges of dominated blocks, which means there are some in the middle we didn't have to touch that we did).

We can do better by sorting the dominator tree, and then just using dominator tree order.

As a preliminary, the dominator tree has some RPO guarantees, but not enough. It guarantees that for a given node, your idom must come before you in the RPO ordering. It guarantees no relative RPO ordering for siblings. We add siblings in whatever order they appear in the module.

So that is what we fix. We sort the children array of the domtree into RPO order, and then use the dominator tree for ordering, instead of RPO, since the dominator tree is now a valid RPO ordering.

Note: This would help any other pass that iterates a forward problem in dominator tree order. Most of them are single pass. It will still maximize whatever result they compute. We could also build the dominator tree in this order, but our incremental updates would still put it out of sort order, and recomputing the sort order is almost as hard as general incremental updates of the domtree.

Also note that the sorting does not affect any tests, etc. Nothing depends on domtree order, including the verifier, the equals functions for domtree nodes, etc.

How much could this matter, you ask? Here are the current numbers. This is generated by running NewGVN over all files in LLVM. Note that once we propagate equalities, the differences go up by an order of magnitude or two (IE instead of 29, the max ends up in the thousands, since the worst case we add a factor of N, where N is the number of branch predicates). So while it doesn't look that stark for the default ordering, it gets much much worse. There are also programs in the wild where the difference is already pretty stark (2 iterations vs hundreds).

RPO ordering:
 759040 Number of iterations is 1
 112908 Number of iterations is 2

Default dominator tree ordering:
 755081 Number of iterations is 1
 116234 Number of iterations is 2
    603 Number of iterations is 3
     27 Number of iterations is 4
      2 Number of iterations is 5
      1 Number of iterations is 7

Dominator tree sorted:
 759040 Number of iterations is 1
 112908 Number of iterations is 2
<yay!>

Really bad ordering (sort domtree siblings in postorder. not quite the worst possible, but yeah):
 754008 Number of iterations is 1
     21 Number of iterations is 10
      8 Number of iterations is 11
      6 Number of iterations is 12
      5 Number of iterations is 13
      2 Number of iterations is 14
      2 Number of iterations is 15
      3 Number of iterations is 16
      1 Number of iterations is 17
      2 Number of iterations is 18
  96642 Number of iterations is 2
      1 Number of iterations is 20
      2 Number of iterations is 21
      1 Number of iterations is 22
      1 Number of iterations is 29
  17266 Number of iterations is 3
   2598 Number of iterations is 4
    798 Number of iterations is 5
    273 Number of iterations is 6
    186 Number of iterations is 7
     80 Number of iterations is 8
     42 Number of iterations is 9

Reviewers: chandlerc, davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28129
llvm-svn: 290699
2016-12-29 01:12:36 +00:00
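
The sibling-sorting idea in the commit message above can be illustrated with a small standalone sketch. It does not use LLVM's DominatorTree or NewGVN code: blocks are plain integer ids, the dominator tree is given as per-node child lists, and every name here (`Graph`, `computeRPONumbers`, `sortDomTreeChildren`, `domTreePreOrder`) is hypothetical and chosen only for illustration. It assumes every block is reachable from the entry block.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical toy representation: blocks are integer ids, and a Graph maps
// each block to a list of other blocks (CFG successors or domtree children).
using Graph = std::vector<std::vector<int>>;

// Post-order DFS over the CFG: a block is appended after its successors.
static void postOrder(const Graph &Succs, int Block, std::vector<bool> &Seen,
                      std::vector<int> &Out) {
  Seen[Block] = true;
  for (int S : Succs[Block])
    if (!Seen[S])
      postOrder(Succs, S, Seen, Out);
  Out.push_back(Block);
}

// Returns RPONumber[Block], the position of Block in reverse post-order.
std::vector<int> computeRPONumbers(const Graph &Succs, int Entry) {
  std::vector<bool> Seen(Succs.size(), false);
  std::vector<int> PO;
  postOrder(Succs, Entry, Seen, PO);
  std::vector<int> RPONumber(Succs.size(), -1);
  int Next = 0;
  for (auto It = PO.rbegin(); It != PO.rend(); ++It)
    RPONumber[*It] = Next++;
  return RPONumber;
}

// Sorts each node's child list in the dominator tree by RPO number.
void sortDomTreeChildren(Graph &DomChildren,
                         const std::vector<int> &RPONumber) {
  assert(DomChildren.size() == RPONumber.size() && "one RPO number per block");
  for (auto &Children : DomChildren)
    std::sort(Children.begin(), Children.end(),
              [&](int A, int B) { return RPONumber[A] < RPONumber[B]; });
}

// Pre-order walk of the dominator tree, visiting children in stored order.
void domTreePreOrder(const Graph &DomChildren, int Node,
                     std::vector<int> &Out) {
  Out.push_back(Node);
  for (int C : DomChildren[Node])
    domTreePreOrder(DomChildren, C, Out);
}
```

For a diamond CFG (0 branches to 1 and 2, both of which jump to 3), block 0 immediately dominates 1, 2, and 3. If the children are stored as 3, 1, 2, a pre-order walk visits the join block 3 before either of its predecessors. After `sortDomTreeChildren`, the walk visits 3 only after both 1 and 2, which is the property the commit relies on.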
..
Analysis	[PM] Teach the CGSCC's CG update utility to more carefully invalidate	2016-12-28 10:34:50 +00:00
AsmParser	ASMParser: use range-based for loops (NFC)	2016-12-27 18:35:22 +00:00
Bitcode	Change Metadata Index emission in the bitcode to use 2x32 bits for the placeholder	2016-12-28 23:45:54 +00:00
CodeGen	[COFF] Use 32-bit jump table entries in .rdata for Win64	2016-12-29 00:12:39 +00:00
DebugInfo	[ObjectYAML] Support for DWARF debug_info section	2016-12-22 22:44:27 +00:00
Demangle	Demangle: remove references to allocator for default allocator	2016-11-20 00:20:27 +00:00
ExecutionEngine	RuntimeDyldELF: refactor AArch64 relocations. NFC.	2016-12-27 13:33:32 +00:00
Fuzzer	[libFuzzer] add an experimental flag -experimental_len_control=1 that sets max_len to 1M and tries to increases the actual max sizes of mutations very gradually (second attempt)	2016-12-27 23:24:55 +00:00
IR	Add a static_assert about the sizeof(GlobalValue)	2016-12-29 00:55:51 +00:00
IRReader
LibDriver	LibDriver: Allow resource files to be archive members.	2016-12-15 19:37:46 +00:00
LineEditor
Linker	[ThinLTO] Import only necessary DICompileUnit fields	2016-12-12 16:09:30 +00:00
LTO	[ThinLTO] Honor -O{0,1,2,4} passed through the libLTO interface for ThinLTO	2016-12-28 19:37:16 +00:00
MC	This is a large patch for X86 AVX-512 of an optimization for reducing code size by encoding EVEX AVX-512 instructions using the shorter VEX encoding when possible.	2016-12-28 10:12:48 +00:00
Object	Fix a bugs with using some Mach-O command line flags like "-arch armv7m".	2016-12-16 22:54:02 +00:00
ObjectYAML	[ObjectYAML] Support for DWARF debug_info section	2016-12-22 22:44:27 +00:00
Option
Passes	[PM] Introduce a devirtualization iteration layer for the new PM.	2016-12-28 11:07:33 +00:00
ProfileData
Support	Attempt to fix build bot after r290597	2016-12-27 10:24:58 +00:00
TableGen	[TableGen] Centralize/Unify error handling.	2016-12-05 22:58:01 +00:00
Target	[COFF] Use 32-bit jump table entries in .rdata for Win64	2016-12-29 00:12:39 +00:00
Transforms	NewGVN: Sort Dominator Tree in RPO order, and use that for generating order.	2016-12-29 01:12:36 +00:00
CMakeLists.txt
LLVMBuild.txt