llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

Author	SHA1	Message	Date
Eli Friedman	8b737bdfec	[SCCP] Reduce the number of times ResolvedUndefsIn is called for large modules. If a module has many values that need to be resolved by ResolvedUndefsIn, compilation takes quadratic time overall. Solve should do a small amount of work, since not much is added to the worklists each time markOverdefined is called. But ResolvedUndefsIn is linear over the length of the function/module, so resolving one undef at a time is quadratic in general. To solve this, make ResolvedUndefsIn resolve every undef value at once, instead of resolving them one at a time. This loses a little optimization power, but can be a lot faster. We still need a loop around ResolvedUndefsIn because markOverdefined could change the set of blocks that are live. That should be uncommon, hopefully. We could optimize it by tracking which blocks transition from dead to live, instead of iterating over the whole module to find them. But I'll leave that for later. (The whole function will become a lot simpler once we start pruning branches on undef.) The regression test changes seem minor. The specific cases in question could probably be optimized with a bit more work, but they seem like edge cases that don't really matter. Fixes an "infinite" compile issue my team found on an internal workload. Differential Revision: https://reviews.llvm.org/D89080	2020-10-09 15:24:16 -07:00
Nikita Popov	74624d8649	Reapply [SCCP] Directly remove non-feasible edges Reapply with DTU update moved after CFG update, which is a requirement of the API. ----- Non-feasible control-flow edges are currently removed by replacing the branch condition with a constant and then calling ConstantFoldTerminator. This happens in a rather roundabout manner, by inspecting the users (effectively: predecessors) of unreachable blocks, and further complicated by the need to explicitly materialize the condition for "forced" edges. I would like to extend SCCP to discard switch conditions that are non-feasible based on range information, but this is incompatible with the current approach (as there is no single constant we could use.) Instead, this patch explicitly removes non-feasible edges. It currently only needs to handle the case where there is a single feasible edge. The llvm_unreachable() branch will need to be implemented for the aforementioned switch improvement. Differential Revision: https://reviews.llvm.org/D84264	2020-07-25 14:52:35 +02:00
Fangrui Song	ff71233589	Revert D84264 "[SCCP] Directly remove non-feasible edges" & 5db5b4bc4394ca247c9eb665e03b851848aa2fbf It breaks stage-2 build. Clang crashed when compiling llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp llvm/Support/GenericDomTree.h eraseNode: Node is not a leaf node	2020-07-23 17:51:48 -07:00
Nikita Popov	a7f6e63f5c	[SCCP] Directly remove non-feasible edges Non-feasible control-flow edges are currently removed by replacing the branch condition with a constant and then calling ConstantFoldTerminator. This happens in a rather roundabout manner, by inspecting the users (effectively: predecessors) of unreachable blocks, and further complicated by the need to explicitly materialize the condition for "forced" edges. I would like to extend SCCP to discard switch conditions that are non-feasible based on range information, but this is incompatible with the current approach (as there is no single constant we could use.) Instead, this patch explicitly removes non-feasible edges. It currently only needs to handle the case where there is a single feasible edge. The llvm_unreachable() branch will need to be implemented for the aforementioned switch improvement. Differential Revision: https://reviews.llvm.org/D84264	2020-07-23 20:32:57 +02:00
Florian Hahn	c5b8c59aa2	[SCCP] Switch to widen at PHIs, stores and call edges. Currently SCCP does not widen PHIs, stores or along call edges (arguments/return values), but on operations that directly extend ranges (like binary operators). This means PHIs, stores and call edges are not pessimized by widening currently, while binary operators are. The main reason for widening operators initially was that opting out for certain operations was more straightforward in the initial implementation (and it did not matter too much, as range support initially was only implemented for a very limited set of operations). During the discussion in D78391, it was suggested to consider flipping widening to PHIs, stores and along call edges. After adding support for tracking the number of range extensions in ValueLattice, limiting the number of range extensions per value is straightforward. This patch introduces a MaxWidenSteps option to the MergeOptions, limiting the number of range extensions per value. For PHIs, it seems natural to allow an extension for each (active) incoming value plus 1. For the other cases, an arbitrary limit of 10 has been chosen initially. It would potentially make sense to set it depending on the users of a function/global, but that still needs investigating. This potentially leads to more state-changes and longer compile-times. 
The results look quite promising (MultiSource, SPEC):

Same hash: 179 (filtered out)
Remaining: 58
Metric: sccp.IPNumInstRemoved

Program                                         base  widen-phi    diff
test-suite...ks/Prolangs-C/agrep/agrep.test    58.00      82.00   41.4%
test-suite...marks/SciMark2-C/scimark2.test    32.00      43.00   34.4%
test-suite...rks/FreeBench/mason/mason.test     6.00       8.00   33.3%
test-suite...langs-C/football/football.test   104.00     128.00   23.1%
test-suite...cations/hexxagon/hexxagon.test    36.00      42.00   16.7%
test-suite...CFP2000/177.mesa/177.mesa.test   214.00     249.00   16.4%
test-suite...ngs-C/assembler/assembler.test    14.00      16.00   14.3%
test-suite...arks/VersaBench/dbms/dbms.test    10.00      11.00   10.0%
test-suite...oxyApps-C++/miniFE/miniFE.test    43.00      47.00    9.3%
test-suite...ications/JM/ldecod/ldecod.test   179.00     195.00    8.9%
test-suite...CFP2006/433.milc/433.milc.test   249.00     265.00    6.4%
test-suite.../CINT2000/175.vpr/175.vpr.test    98.00     104.00    6.1%
test-suite...peg2/mpeg2dec/mpeg2decode.test    70.00      74.00    5.7%
test-suite...CFP2000/188.ammp/188.ammp.test    71.00      75.00    5.6%
test-suite...ce/Benchmarks/PAQ8p/paq8p.test   111.00     117.00    5.4%
test-suite...ce/Applications/Burg/burg.test    41.00      43.00    4.9%
test-suite...000/197.parser/197.parser.test    66.00      69.00    4.5%
test-suite...tions/lambda-0.1.3/lambda.test    23.00      24.00    4.3%
test-suite...urce/Applications/lua/lua.test   301.00     313.00    4.0%
test-suite...TimberWolfMC/timberwolfmc.test    76.00      79.00    3.9%
test-suite...lications/ClamAV/clamscan.test   991.00    1030.00    3.9%
test-suite...plications/d/make_dparser.test    53.00      55.00    3.8%
test-suite...fice-ispell/office-ispell.test    83.00      86.00    3.6%
test-suite...lications/obsequi/Obsequi.test    28.00      29.00    3.6%
test-suite.../Prolangs-C/bison/mybison.test    56.00      58.00    3.6%
test-suite.../CINT2000/254.gap/254.gap.test   170.00     176.00    3.5%
test-suite.../Applications/lemon/lemon.test    30.00      31.00    3.3%
test-suite.../CINT2000/176.gcc/176.gcc.test  1202.00    1240.00    3.2%
test-suite...pplications/treecc/treecc.test    79.00      81.00    2.5%
test-suite...chmarks/MallocBench/gs/gs.test   357.00     366.00    2.5%
test-suite...eeBench/analyzer/analyzer.test   103.00     105.00    1.9%
test-suite...T2006/445.gobmk/445.gobmk.test  1697.00    1724.00    1.6%
test-suite...006/453.povray/453.povray.test  1812.00    1839.00    1.5%
test-suite.../Benchmarks/Bullet/bullet.test   337.00     342.00    1.5%
test-suite.../CINT2000/252.eon/252.eon.test   426.00     432.00    1.4%
test-suite...T2000/300.twolf/300.twolf.test   214.00     217.00    1.4%
test-suite...pplications/oggenc/oggenc.test   244.00     247.00    1.2%
test-suite.../CINT2006/403.gcc/403.gcc.test  4008.00    4055.00    1.2%
test-suite...T2006/456.hmmer/456.hmmer.test   175.00     177.00    1.1%
test-suite...nal/skidmarks10/skidmarks.test   430.00     434.00    0.9%
test-suite.../Applications/sgefa/sgefa.test   115.00     116.00    0.9%
test-suite...006/447.dealII/447.dealII.test  1082.00    1091.00    0.8%
test-suite...6/482.sphinx3/482.sphinx3.test   141.00     142.00    0.7%
test-suite...ocBench/espresso/espresso.test   152.00     153.00    0.7%
test-suite...3.xalancbmk/483.xalancbmk.test  4003.00    4025.00    0.5%
test-suite...lications/sqlite3/sqlite3.test   548.00     551.00    0.5%
test-suite...marks/7zip/7zip-benchmark.test  5522.00    5551.00    0.5%
test-suite...nsumer-lame/consumer-lame.test   208.00     209.00    0.5%
test-suite...:: External/Povray/povray.test  1556.00    1563.00    0.4%
test-suite...000/186.crafty/186.crafty.test   298.00     299.00    0.3%
test-suite.../Applications/SPASS/SPASS.test  2019.00    2025.00    0.3%
test-suite...ications/JM/lencod/lencod.test  8427.00    8449.00    0.3%
test-suite...6/464.h264ref/464.h264ref.test  6797.00    6813.00    0.2%
test-suite...6/471.omnetpp/471.omnetpp.test   431.00     430.00   -0.2%
test-suite...006/450.soplex/450.soplex.test   446.00     447.00    0.2%
test-suite...0.perlbench/400.perlbench.test  1729.00    1727.00   -0.1%
test-suite...000/255.vortex/255.vortex.test  3815.00    3819.00    0.1%

Reviewers: efriedma, nikic, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D79036	2020-05-29 11:59:17 +01:00
Florian Hahn	8286c97bd8	Recommit "[SCCP] Use ValueLatticeElement instead of LatticeVal (NFCI)" This patch should fix the cause of the stage2 failures and PR45185. This reverts the revert commit c52f839e723ee288db2a3e21860b011f6a9d707e.	2020-03-13 17:03:22 +00:00
Florian Hahn	f18b2707d7	Revert "[SCCP] Use ValueLatticeElement instead of LatticeVal (NFCI)" This commit is likely causing clang-with-lto-ubuntu to fail http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/16052 Also causes PR45185. This reverts commit f1ac5d2263f8419b865cc78ba1f5c8694970fb6b.	2020-03-12 18:49:11 +00:00
Florian Hahn	b4dc084389	[SCCP] Use ValueLatticeElement instead of LatticeVal (NFCI) This patch switches SCCP to use ValueLatticeElement for lattice values, instead of the local LatticeVal, as first step to enable integer range support. This patch does not make use of constant ranges for additional operations and the only difference for now is that integer constants are represented by single element ranges. To preserve the existing behavior, the following helpers are used * isConstant(LV): returns true when LV is either a constant or a constant range with a single element. This should return true in the same cases where LV.isConstant() returned true previously. * getConstant(LV): returns a constant if LV is either a constant or a constant range with a single element. This should return a constant in the same cases as LV.getConstant() previously. * getConstantInt(LV): same as getConstant, but additionally casted to ConstantInt. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D60582	2020-03-12 12:03:06 +00:00
Florian Hahn	0338a099fe	[SCCP] Re-generate check lines using --function-signature. (NFC)	2020-02-16 20:34:54 +01:00
Florian Hahn	61a834b1e0	Recommit "[SCCP] Remove forcedconstant, go to overdefined instead" This includes a fix for cases where things get marked as overdefined in ResolvedUndefsIn, but we later discover a constant. To avoid crashing, we consistently bail out on overdefined values in the visitors. This is similar to the previous behavior with forcedconstant. This reverts the revert commit 02b72f564c8be0b4f4337d5c4a3fcf7e8018a818.	2020-02-15 18:36:44 +01:00
Vedant Kumar	1cb48ac1e6	Revert "Recommit "[SCCP] Remove forcedconstant, go to overdefined instead"" This reverts commit bb310b3f73dde5551bc2a0d564e88f7c831dfdb3. This breaks the stage2 ASan build, see: https://bugs.llvm.org/show_bug.cgi?id=44898 rdar://59431448	2020-02-13 11:55:18 -08:00
Florian Hahn	5badf3826e	Recommit "[SCCP] Remove forcedconstant, go to overdefined instead" This version includes a fix for a set of crashes caused by marking values depending on a yet unknown & tracked call as overdefined. In some cases, we would later discover that the call has a constant result and try to mark a user of it as constant, although it was already marked as overdefined. Most instruction handlers bail out early if the instruction is already overdefined. But that is not necessary for CastInsts for example. By skipping values that depend on skipped calls, we resolve the crashes and also improve the precision in some cases (see resolvedundefsin-tracked-fn.ll). Note that we may not skip PHI nodes that may depend on a skipped call, but they can be safely marked as overdefined, as we bail out early if the PHI node is overdefined. This reverts the revert commit a74b31a3e9cd844c7ce2087978568e3f5ec8519.	2020-02-12 18:02:18 +00:00
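
The complexity argument in 8b737bdfec (a linear scan repeated once per resolved undef is quadratic overall, while resolving every undef in one scan is linear) can be illustrated with a toy cost model. This is only a sketch in Python, not LLVM code: the function names and the list-of-flags representation are hypothetical, and "cost" is simply the number of elements visited.

```python
def resolve_one_at_a_time(undef_flags):
    """Toy model: restart the scan after each resolved undef.

    Returns the total number of elements visited.
    """
    flags = list(undef_flags)
    visited = 0
    while True:
        found = False
        for i, is_undef in enumerate(flags):
            visited += 1
            if is_undef:
                flags[i] = False  # resolve a single undef, then rescan
                found = True
                break
        if not found:
            return visited


def resolve_all_at_once(undef_flags):
    """Toy model: resolve every undef seen during a single scan.

    The outer loop mirrors the "we still need a loop" remark in the
    commit message: scanning repeats until a pass changes nothing.
    Returns the total number of elements visited.
    """
    flags = list(undef_flags)
    visited = 0
    while True:
        changed = False
        for i, is_undef in enumerate(flags):
            visited += 1
            if is_undef:
                flags[i] = False  # resolve and keep scanning
                changed = True
        if not changed:
            return visited
```

With 100 undef values, the first strategy visits 100*101/2 + 100 = 5150 elements in this model, while the second visits 200 (one resolving pass plus one confirming pass).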

12 Commits
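
The MaxWidenSteps idea from c5b8c59aa2 (limit how many times a value's range may be extended, then give up and go to overdefined) can be sketched as follows. This is a simplified Python model under stated assumptions, not the ValueLattice implementation: `RangeState`, `merge_in`, and `phi_widen_limit` are hypothetical names, and ranges are plain closed integer intervals.

```python
from dataclasses import dataclass


@dataclass
class RangeState:
    """Hypothetical stand-in for a lattice value holding an integer range."""
    lo: int
    hi: int
    widen_steps: int = 0
    overdefined: bool = False


def merge_in(state, lo, hi, max_widen_steps):
    """Merge [lo, hi] into state; return True if the state changed.

    Each merge that extends the range counts as one widening step.
    Once max_widen_steps extensions have been used, the next extension
    marks the value overdefined instead of growing the range further.
    """
    if state.overdefined:
        return False
    new_lo, new_hi = min(state.lo, lo), max(state.hi, hi)
    if (new_lo, new_hi) == (state.lo, state.hi):
        return False  # incoming range already covered
    if state.widen_steps >= max_widen_steps:
        state.overdefined = True
        return True
    state.lo, state.hi = new_lo, new_hi
    state.widen_steps += 1
    return True


def phi_widen_limit(num_active_incoming):
    """Limit described for PHIs: one extension per active incoming value, plus 1."""
    return num_active_incoming + 1
```

For a PHI with two active incoming values the limit is 3, so a range that keeps growing (as in a loop counter) is extended three times and then becomes overdefined, which bounds the number of state changes per value.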