mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2025-01-31 20:51:52 +01:00
2088bfe3c4
Apparently, we don't do this, neither in EarlyCSE, nor in InstSimplify, nor in (old) GVN, but do in NewGVN and SimplifyCFG of all places.. While i could teach EarlyCSE how to hash PHI nodes, we can't really do much (anything?) even if we find two identical PHI nodes in different basic blocks, same-BB case is the interesting one, and if we teach InstSimplify about it (which is what i wanted originally, https://reviews.llvm.org/D86530), we get EarlyCSE support for free. So i would think this is pretty uncontroversial. On vanilla llvm test-suite + RawSpeed, this has the following effects: ``` | statistic name | baseline | proposed | Δ | % | \|%\| | |----------------------------------------------------|-----------|-----------|-------:|---------:|---------:| | instsimplify.NumPHICSE | 0 | 23779 | 23779 | 0.00% | 0.00% | | asm-printer.EmittedInsts | 7942328 | 7942392 | 64 | 0.00% | 0.00% | | assembler.ObjectBytes | 273069192 | 273084704 | 15512 | 0.01% | 0.01% | | correlated-value-propagation.NumPhis | 18412 | 18539 | 127 | 0.69% | 0.69% | | early-cse.NumCSE | 2183283 | 2183227 | -56 | 0.00% | 0.00% | | early-cse.NumSimplify | 550105 | 542090 | -8015 | -1.46% | 1.46% | | instcombine.NumAggregateReconstructionsSimplified | 73 | 4506 | 4433 | 6072.60% | 6072.60% | | instcombine.NumCombined | 3640264 | 3664769 | 24505 | 0.67% | 0.67% | | instcombine.NumDeadInst | 1778193 | 1783183 | 4990 | 0.28% | 0.28% | | instcount.NumCallInst | 1758401 | 1758799 | 398 | 0.02% | 0.02% | | instcount.NumInvokeInst | 59478 | 59502 | 24 | 0.04% | 0.04% | | instcount.NumPHIInst | 330557 | 330533 | -24 | -0.01% | 0.01% | | instcount.TotalInsts | 8831952 | 8832286 | 334 | 0.00% | 0.00% | | simplifycfg.NumInvokes | 4300 | 4410 | 110 | 2.56% | 2.56% | | simplifycfg.NumSimpl | 1019808 | 999607 | -20201 | -1.98% | 1.98% | ``` I.e. it fires ~24k times, causes +110 (+2.56%) more `invoke` -> `call` transforms, and counter-intuitively results in *more* instructions total. That being said, the PHI count doesn't decrease that much, and looking at some examples, it seems at least some of them were previously getting PHI CSE'd in SimplifyCFG of all places.. I'm adjusting `Instruction::isIdenticalToWhenDefined()` at the same time. As a comment in `InstCombinerImpl::visitPHINode()` already stated, there are no guarantees on the ordering of the operands of a PHI node, so if we just naively compare them, we may false-negatively say that the nodes are not equal when the only difference is operand order, which is especially important since the fold is in InstSimplify, so we can't rely on InstCombine sorting them beforehand. Fixing this for the general case is costly (geomean +0.02%), and does not appear to catch anything in test-suite, but for the same-BB case, it's trivial, so let's fix at least that. As per http://llvm-compile-time-tracker.com/compare.php?from=04879086b44348cad600a0a1ccbe1f7776cc3cf9&to=82bdedb888b945df1e9f130dd3ac4dd3c96e2925&stat=instructions this appears to cause geomean +0.03% compile time increase (regression), but geomean -0.01%..-0.04% code size decrease (improvement).