llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00

Author	SHA1	Message	Date
Hongtao Yu	db4396f62a	[CSSPGO] IR intrinsic for pseudo-probe block instrumentation This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to be persisting. The LLVM PGO instrumentation also instruments in similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. While these operations are very persisting or optimization-resilient, in theory we can borrow the atomic read/write implementation from PGO counters and cut it off at the end of compilation with all the atomics converted into binary data. This was our initial design and we’ve seen promising sample correlation quality with it. However, the atomics approach has a couple issues: 1. IR Optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they are on the IR till very end of compilation, they can still prevent certain IR optimizations and result in lower code quality. 2. The counter atomics may not be fully cleaned up from the code stream eventually. 3. Extra work is needed for re-targeting. We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that comes with an atomic operation but does not block desired optimizations as much as possible. More specifically the semantics associated with the new intrinsic enforces a pseudo probe to be virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places they are probing are not going to be messed up by the optimizer while most of the IR optimizations still work. 
The core flag given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and does have a side effect so that it is not removable, but it does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus it does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to an intrinsic so that they are uniquely identified and not merged in order to achieve good correlation quality. Let's now look at an example. Given the following LLVM IR:
```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
  %cmp = icmp eq i32 %x, 0
  br i1 %cmp, label %bb1, label %bb2
bb1:
  br label %bb3
bb2:
  br label %bb3
bb3:
  ret void
}
```
The instrumented IR will look like below. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID.
```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
  %cmp = icmp eq i32 %x, 0
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 1)
  br i1 %cmp, label %bb1, label %bb2
bb1:
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 2)
  br label %bb3
bb2:
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 3)
  br label %bb3
bb3:
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 4)
  ret void
}
```
Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86490	2020-11-20 10:39:24 -08:00
Serge Pavlov	e062f1d0ad	[IR] Merge metadata manipulation code into Value Now there are two main classes in Value hierarchy, which support metadata, these are Instruction and GlobalObject. They implement different APIs for metadata manipulation, which however overlap. This change moves metadata manipulation code into Value, so descendant classes can use this code for their operations on metadata. No functional changes intended. Differential Revision: https://reviews.llvm.org/D67626	2020-10-23 11:08:26 +07:00
Sanjay Patel	251968146e	[IR][GVN] allow intrinsics in Instruction's isCommutative query (2nd try) The 1st try was reverted because I missed an assert that needed softening. As discussed in D86798 / rG09652721 , we were potentially returning a different result for whether an Instruction is commutable depending on if we call the base class or derived class method. This requires relaxing asserts in GVN, but that pass seems to be working otherwise. NewGVN requires more work because it uses different code paths for numbering binops and calls.	2020-08-31 16:01:19 -04:00
Sanjay Patel	56bc7f03f4	Revert "[IR][GVN] allow intrinsics in Instruction's isCommutative query" This reverts commit 25597f7783e7038b8a2ee88bb49ac605b211b564. It is causing crashes on bots such as: http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10523/steps/ninja-build/logs/stdio	2020-08-30 17:02:51 -04:00
Sanjay Patel	d58c2f282d	[IR][GVN] allow intrinsics in Instruction's isCommutative query As discussed in D86798 / rG09652721 , we were potentially returning a different result for whether an Instruction is commutable depending on if we call the base class or derived class method. This requires relaxing an assert in GVN, but that pass seems to be working otherwise. NewGVN requires more work because it uses different code paths for numbering binops and calls.	2020-08-30 16:49:22 -04:00
Roman Lebedev	bf4def296e	[Instruction] Speculatively undo isIdenticalToWhenDefined() PHI handling changes The stage2-stage3 differences persist even without instcombine-based PHI CSE, so this is the only possible reason.	2020-08-29 19:38:57 +03:00
Roman Lebedev	c05db21f5c	[NFC] Instruction::isIdenticalToWhenDefined(): s/nessesairly/necessarily/	2020-08-29 15:10:13 +03:00
Roman Lebedev	2bf651df9f	[InstCombine] Take 2: Perform trivial PHI CSE The original take was 6102310d814ad73eab60a88b21dd70874f7a056f, which taught InstSimplify to do that, which seemed better at time, since we got EarlyCSE support for free. However, it was proven that we can not do that there, the simplified-to PHI would not be reachable from the original PHI, and that is not something InstSimplify is allowed to do, as noted in the commit ed90f15efb40d26b5d3ead3bb8e9e284218e0186 that reverted it : > It appears to cause compilation non-determinism and caused stage3 mismatches. However InstCombine already does many different optimizations, so it should be a safe place to do it here. Note that we still can't just compare incoming values ranges, because there is no guarantee that these PHI's we'd simplify to were already re-visited and sorted. However coming up with a test is problematic. Effects on vanilla llvm test-suite + RawSpeed:
```
| statistic name                                     | baseline  | proposed  |      Δ |        % |    \|%\| |
|----------------------------------------------------|-----------|-----------|-------:|---------:|---------:|
| instcombine.NumPHICSEs                             | 0         | 22228     |  22228 |    0.00% |    0.00% |
| asm-printer.EmittedInsts                           | 7942329   | 7942456   |    127 |    0.00% |    0.00% |
| assembler.ObjectBytes                              | 254295632 | 254313792 |  18160 |    0.01% |    0.01% |
| early-cse.NumCSE                                   | 2183283   | 2183272   |    -11 |    0.00% |    0.00% |
| early-cse.NumSimplify                              | 550105    | 541842    |  -8263 |   -1.50% |    1.50% |
| instcombine.NumAggregateReconstructionsSimplified  | 73        | 4506      |   4433 | 6072.60% | 6072.60% |
| instcombine.NumCombined                            | 3640311   | 3666911   |  26600 |    0.73% |    0.73% |
| instcombine.NumDeadInst                            | 1778204   | 1783318   |   5114 |    0.29% |    0.29% |
| instcount.NumCallInst                              | 1758395   | 1758804   |    409 |    0.02% |    0.02% |
| instcount.NumInvokeInst                            | 59478     | 59502     |     24 |    0.04% |    0.04% |
| instcount.NumPHIInst                               | 330557    | 330549    |     -8 |    0.00% |    0.00% |
| instcount.TotalBlocks                              | 1077138   | 1077221   |     83 |    0.01% |    0.01% |
| instcount.TotalFuncs                               | 101442    | 101441    |     -1 |    0.00% |    0.00% |
| instcount.TotalInsts                               | 8831946   | 8832611   |    665 |    0.01% |    0.01% |
| simplifycfg.NumInvokes                             | 4300      | 4410      |    110 |    2.56% |    2.56% |
| simplifycfg.NumSimpl                               | 1019813   | 999740    | -20073 |   -1.97% |    1.97% |
```
So it fires ~22k times, which is less than ~24k the take 1 did. It allows foldAggregateConstructionIntoAggregateReuse() to actually work after PHI-of-extractvalue folds did their thing. Previously SimplifyCFG would have done this PHI CSE, of all places. Additionally, allows some more `invoke`->`call` folds to happen (+110, +2.56%). All in all, expectedly, this catches less things overall, but all the motivational cases are still caught, so all good.	2020-08-29 13:13:06 +03:00
Owen Anderson	df34423d50	Revert "[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block" This reverts commit 6102310d814ad73eab60a88b21dd70874f7a056f. It appears to cause compilation non-determinism and caused stage3 mismatches.	2020-08-28 23:43:42 +00:00
Roman Lebedev	2088bfe3c4	[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block Apparently, we don't do this, neither in EarlyCSE, nor in InstSimplify, nor in (old) GVN, but do in NewGVN and SimplifyCFG of all places.. While i could teach EarlyCSE how to hash PHI nodes, we can't really do much (anything?) even if we find two identical PHI nodes in different basic blocks, same-BB case is the interesting one, and if we teach InstSimplify about it (which is what i wanted originally, https://reviews.llvm.org/D86530), we get EarlyCSE support for free. So i would think this is pretty uncontroversial. On vanilla llvm test-suite + RawSpeed, this has the following effects:
```
| statistic name                                     | baseline  | proposed  |      Δ |        % |    \|%\| |
|----------------------------------------------------|-----------|-----------|-------:|---------:|---------:|
| instsimplify.NumPHICSE                             | 0         | 23779     |  23779 |    0.00% |    0.00% |
| asm-printer.EmittedInsts                           | 7942328   | 7942392   |     64 |    0.00% |    0.00% |
| assembler.ObjectBytes                              | 273069192 | 273084704 |  15512 |    0.01% |    0.01% |
| correlated-value-propagation.NumPhis               | 18412     | 18539     |    127 |    0.69% |    0.69% |
| early-cse.NumCSE                                   | 2183283   | 2183227   |    -56 |    0.00% |    0.00% |
| early-cse.NumSimplify                              | 550105    | 542090    |  -8015 |   -1.46% |    1.46% |
| instcombine.NumAggregateReconstructionsSimplified  | 73        | 4506      |   4433 | 6072.60% | 6072.60% |
| instcombine.NumCombined                            | 3640264   | 3664769   |  24505 |    0.67% |    0.67% |
| instcombine.NumDeadInst                            | 1778193   | 1783183   |   4990 |    0.28% |    0.28% |
| instcount.NumCallInst                              | 1758401   | 1758799   |    398 |    0.02% |    0.02% |
| instcount.NumInvokeInst                            | 59478     | 59502     |     24 |    0.04% |    0.04% |
| instcount.NumPHIInst                               | 330557    | 330533    |    -24 |   -0.01% |    0.01% |
| instcount.TotalInsts                               | 8831952   | 8832286   |    334 |    0.00% |    0.00% |
| simplifycfg.NumInvokes                             | 4300      | 4410      |    110 |    2.56% |    2.56% |
| simplifycfg.NumSimpl                               | 1019808   | 999607    | -20201 |   -1.98% |    1.98% |
```
I.e. it fires ~24k times, causes +110 (+2.56%) more `invoke` -> `call` transforms, and counter-intuitively results in more instructions total. That being said, the PHI count doesn't decrease that much, and looking at some examples, it seems at least some of them were previously getting PHI CSE'd in SimplifyCFG of all places.. I'm adjusting `Instruction::isIdenticalToWhenDefined()` at the same time. As a comment in `InstCombinerImpl::visitPHINode()` already stated, there are no guarantees on the ordering of the operands of a PHI node, so if we just naively compare them, we may false-negatively say that the nodes are not equal when the only difference is operand order, which is especially important since the fold is in InstSimplify, so we can't rely on InstCombine sorting them beforehand. Fixing this for the general case is costly (geomean +0.02%), and does not appear to catch anything in test-suite, but for the same-BB case, it's trivial, so let's fix at least that. As per http://llvm-compile-time-tracker.com/compare.php?from=04879086b44348cad600a0a1ccbe1f7776cc3cf9&to=82bdedb888b945df1e9f130dd3ac4dd3c96e2925&stat=instructions this appears to cause geomean +0.03% compile time increase (regression), but geomean -0.01%..-0.04% code size decrease (improvement).	2020-08-27 18:47:04 +03:00
Yevgeny Rouban	e9f41fac71	[Instruction] Remove setProfWeight() Remove the function Instruction::setProfWeight() and make use of Instruction::copyMetadata(.., {LLVMContext::MD_prof}). This is correct for all use cases of setProfWeight() as it is applied to CallBase instructions only. This change results in prof metadata copied intact even if the source has "VP". The old pair of calls extractProfTotalWeight() + setProfWeight() resulted in setting branch_weights if the source had "VP" data. Reviewers: yamauchi, davidxl Tags: #llvm Differential Revision: https://reviews.llvm.org/D80987	2020-06-04 15:10:55 +07:00
Sanjay Patel	763ee90b20	[IR] add set function for FMF 'contract' This was missed when the flag was added with D31164.	2020-05-27 09:14:51 -04:00
Vedant Kumar	fa1b88c3f1	[Instruction] Set metadata uses to undef on deletion Summary: Replace any extant metadata uses of a dying instruction with undef to preserve debug info accuracy. Some alternatives include: - Treat Instruction like any other Value, and point its extant metadata uses to an empty ValueAsMetadata node. This makes extant dbg.value uses trivially dead (i.e. fair game for deletion in many passes), leading to stale dbg.values being in effect for too long. - Call salvageDebugInfoOrMarkUndef. Not needed to make instruction removal correct. OTOH results in wasted work in some common cases (e.g. when all instructions in a BasicBlock are deleted). This came up while discussing some basic cases in https://reviews.llvm.org/D80052. Reviewers: jmorse, TWeaver, aprantl, dexonsmith, jdoerfert Subscribers: jholewinski, qcolombet, hiraditya, jfb, sstefan1, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80264	2020-05-21 15:58:12 -07:00
Eli Friedman	db20f1e2c5	Remove "mask" operand from shufflevector. Instead, represent the mask as out-of-line data in the instruction. This should be more efficient in the places that currently use getShuffleVector(), and paves the way for further changes to add new shuffles for scalable vectors. This doesn't change the syntax in textual IR. And I don't currently plan to change the bitcode encoding in this patch, although we'll probably need to do something once we extend shufflevector for scalable types. I expect that once this is finished, we can then replace the raw "mask" with something more appropriate for scalable vectors. Not sure exactly what this looks like at the moment, but there are a few different ways we could handle it. Maybe we could try to describe specific shuffles. Or maybe we could define it in terms of a function to convert a fixed-length array into an appropriate scalable vector, using a "step", or something like that. Differential Revision: https://reviews.llvm.org/D72467	2020-03-31 13:08:59 -07:00
Reid Kleckner	2a197a86b4	[IR] Lazily number instructions for local dominance queries Essentially, fold OrderedBasicBlock into BasicBlock, and make it auto-invalidate the instruction ordering when new instructions are added. Notably, we don't need to invalidate it when removing instructions, which is helpful when a pass mostly deletes dead instructions rather than transforming them. The downside is that Instruction grows from 56 bytes to 64 bytes. The resulting LLVM code is substantially simpler and automatically handles invalidation, which makes me think that this is the right speed and size tradeoff. The important change is in SymbolTableTraitsImpl.h, where the numbering is invalidated. Everything else should be straightforward. We probably want to implement a fancier re-numbering scheme so that local updates don't invalidate the ordering, but I plan for that to be future work, maybe for someone else. Reviewed By: lattner, vsk, fhahn, dexonsmith Differential Revision: https://reviews.llvm.org/D51664	2020-02-18 14:44:24 -08:00
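Illustration for the entry above (not part of the commit): a minimal, self-contained sketch of lazy instruction numbering, assuming a toy `Block` built on `std::list`. The names `Inst`, `Block`, `comesBefore`, and `renumberCount` are hypothetical stand-ins chosen for this sketch, not LLVM's classes. It shows the one property the message describes: insertion invalidates the cached ordering, removal does not.

```cpp
#include <cassert>
#include <list>

// Toy stand-in for an instruction: only carries its cached position.
struct Inst { int order = -1; };

class Block {
  std::list<Inst> insts;
  bool orderValid = false;

  // Assign increasing numbers to every instruction and mark the cache valid.
  void renumber() {
    int n = 0;
    for (Inst &i : insts) i.order = n++;
    orderValid = true;
    ++renumberCount;
  }

public:
  using iterator = std::list<Inst>::iterator;
  int renumberCount = 0;  // exposed only so the sketch can be checked

  // Adding an instruction may break the numbering, so invalidate it.
  iterator insert(iterator pos) {
    orderValid = false;
    return insts.insert(pos, Inst{});
  }

  // Removing an instruction leaves the relative order of the rest intact,
  // so the cached numbers stay usable.
  iterator erase(iterator pos) { return insts.erase(pos); }

  iterator end() { return insts.end(); }

  // Local ordering query: renumber only if the cache is stale.
  bool comesBefore(iterator a, iterator b) {
    if (!orderValid) renumber();
    return a->order < b->order;
  }
};
```

In this model a pass that only erases instructions never pays for a renumbering after the first query, which is the tradeoff the commit message argues for.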
aqjune	8a733b9297	[IR] Redefine Freeze instruction Summary: This patch redefines freeze instruction from being UnaryOperator to a subclass of UnaryInstruction. ConstantExpr freeze is removed, as discussed in the previous review. FreezeOperator is not added because there's no ConstantExpr freeze. `freeze i8* null` test is added to `test/Bindings/llvm-c/freeze.ll` as well, because the null pointer-related bug in `tools/llvm-c/echo.cpp` is now fixed. InstVisitor has visitFreeze now because freeze is not unaryop anymore. Reviewers: whitequark, deadalnix, craig.topper, jdoerfert, lebedev.ri Reviewed By: craig.topper, lebedev.ri Subscribers: regehr, nlopes, mehdi_amini, hiraditya, steven_wu, dexonsmith, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69932	2019-11-12 10:49:00 +09:00
aqjune	37bbfa1895	[IR] Add Freeze instruction Summary: - Define Instruction::Freeze, let it be UnaryOperator - Add support for freeze to LLLexer/LLParser/BitcodeReader/BitcodeWriter The format is `%x = freeze <ty> %v` - Add support for freeze instruction to llvm-c interface. - Add m_Freeze in PatternMatch. - Erase freeze when lowering IR to SelDag. Reviewers: deadalnix, hfinkel, efriedma, lebedev.ri, nlopes, jdoerfert, regehr, filcab, delcypher, whitequark Reviewed By: lebedev.ri, jdoerfert Subscribers: jfb, kristof.beyls, hiraditya, lebedev.ri, steven_wu, dexonsmith, xbolva00, delcypher, spatel, regehr, trentxintong, vsk, filcab, nlopes, mehdi_amini, deadalnix, llvm-commits Differential Revision: https://reviews.llvm.org/D29011	2019-11-05 15:54:56 +09:00
Yevgeny Rouban	7fbbd670e7	[IR] Fix mayReadFromMemory() for writeonly calls Current implementation of Instruction::mayReadFromMemory() returns !doesNotAccessMemory() which is !ReadNone. This does not take into account that the writeonly attribute also indicates that the call does not read from memory. The patch changes the predicate to !doesNotReadMemory() that reflects the intended behavior. Differential Revision: https://reviews.llvm.org/D69086 llvm-svn: 375389	2019-10-21 06:52:08 +00:00
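Illustration for the entry above (not part of the commit): a small model of the predicate change, assuming a hypothetical `CallAttrs` struct that holds the three memory attributes as booleans. The helper names mirror the ones quoted in the message, but the struct and the `mayReadOld` / `mayReadNew` functions are invented for this sketch.

```cpp
#include <cassert>

// Hypothetical attribute bag for a call site.
struct CallAttrs {
  bool readnone = false;   // call does not access memory at all
  bool readonly = false;   // call may read but never writes
  bool writeonly = false;  // call may write but never reads

  bool doesNotAccessMemory() const { return readnone; }
  bool doesNotReadMemory() const { return readnone || writeonly; }
};

// Predicate as described before the fix: only readnone rules out a read.
inline bool mayReadOld(const CallAttrs &c) { return !c.doesNotAccessMemory(); }

// Predicate as described after the fix: writeonly also rules out a read.
inline bool mayReadNew(const CallAttrs &c) { return !c.doesNotReadMemory(); }
```

The two predicates differ only for `writeonly` calls, which the old form conservatively reported as possibly reading.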
Philip Reames	8fea1df72a	Add a transform pass to make the executable semantics of poison explicit in the IR Implements a transform pass which instruments IR such that poison semantics are made explicit. That is, it provides a (possibly partial) executable semantics for every instruction w.r.t. poison as specified in the LLVM LangRef. There are obvious parallels to the sanitizer tools, but this pass is focused purely on the semantics of LLVM IR, not any particular source language. The target audience for this tool is developers working on or targeting LLVM from a frontend. The idea is to be able to take arbitrary IR (with the assumption of known inputs), and evaluate it concretely after having made poison semantics explicit to detect cases where either a) the original code executes UB, or b) a transform pass introduces UB which didn't exist in the original program. At the moment, this is mostly the framework and still needs to be fleshed out. By reusing existing code we have decent coverage, but there are a lot of cases not yet handled. What's here is good enough to handle interesting cases though; for instance, one of the recent LFTR bugs, which involved UB being triggered by integer induction variables with nsw/nuw flags, would be reported by the current code. (See comment in PoisonChecking.cpp for full explanation and context) Differential Revision: https://reviews.llvm.org/D64215 llvm-svn: 365536	2019-07-09 18:49:29 +00:00
Roman Lebedev	b74f0977d3	[NFC] Instruction: introduce replaceSuccessorWith() function, use it Summary: There is `Instruction::getNumSuccessors()`, `Instruction::getSuccessor()` and `Instruction::setSuccessor()`, but no function to replace every specified `BasicBlock` successor with some other specified `BasicBlock`. I've found one place where it should clearly be used. Reviewers: chandlerc, craig.topper, spatel, danielcdh Reviewed By: craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61010 llvm-svn: 359994	2019-05-05 18:59:22 +00:00
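Illustration for the entry above (not part of the commit): a sketch of what a "replace every matching successor" helper can look like when built from the three accessors the message names. `BasicBlock` and `Terminator` here are toy types invented for the example, and the loop body is an assumption about the shape of the helper, not a copy of the LLVM source.

```cpp
#include <cassert>
#include <vector>

struct BasicBlock { int id; };

// Toy terminator holding its successor blocks in a vector.
struct Terminator {
  std::vector<BasicBlock *> successors;

  unsigned getNumSuccessors() const {
    return static_cast<unsigned>(successors.size());
  }
  BasicBlock *getSuccessor(unsigned i) const { return successors[i]; }
  void setSuccessor(unsigned i, BasicBlock *bb) { successors[i] = bb; }

  // Rewrite every successor slot that points at oldBB to point at newBB.
  void replaceSuccessorWith(BasicBlock *oldBB, BasicBlock *newBB) {
    for (unsigned i = 0, e = getNumSuccessors(); i != e; ++i)
      if (getSuccessor(i) == oldBB)
        setSuccessor(i, newBB);
  }
};
```

Note that all matching slots are updated, which matters for terminators that list the same block more than once.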
Wei Mi	99b0c5f8c6	[PGO/SamplePGO][NFC] Move the function updateProfWeight from Instruction to CallInst. The issue was raised here: https://reviews.llvm.org/D60903#1472783 The function Instruction::updateProfWeight is only used for CallInst in profile update. From the current interface, it is very easy to think that the function can also be used for branch instructions. However, a branch instruction doesn't need the scaling the function provides for branch_weights and VP (value profile); in addition, scaling may introduce inaccuracy for branch probability. The patch moves the function updateProfWeight from Instruction class to CallInst to remove the confusion. The patch also changes the scaling of branch_weights from a loop to a block because we know that ProfileData for branch_weights of CallInst will only have two operands at most. Differential Revision: https://reviews.llvm.org/D60911 llvm-svn: 358900	2019-04-22 17:04:51 +00:00
Craig Topper	ea7e6b3857	Implementation of asm-goto support in LLVM This patch accompanies the RFC posted here: http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html This patch adds a new CallBr IR instruction to support asm-goto inline assembly like gcc as used by the linux kernel. This instruction is both a call instruction and a terminator instruction with multiple successors. Only inline assembly usage is supported today. This also adds a new INLINEASM_BR opcode to SelectionDAG and MachineIR to represent an INLINEASM block that is also considered a terminator instruction. There will likely be more bug fixes and optimizations to follow this, but we felt it had reached a point where we would like to switch to an incremental development model. Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii Differential Revision: https://reviews.llvm.org/D53765 llvm-svn: 353563	2019-02-08 20:48:56 +00:00
Craig Topper	7b4a99f436	[IR] Use CallBase to reduce code duplication. NFC Noticed in the asm-goto patch. Callbr needs to go here too. One cast and call is better than 3. Differential Revision: https://reviews.llvm.org/D57295 llvm-svn: 352563	2019-01-29 23:31:54 +00:00
Chandler Carruth	ae65e281f3	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Vedant Kumar	0705274c67	[IR] Add Instruction::isLifetimeStartOrEnd, NFC Instruction::isLifetimeStartOrEnd() checks whether an Instruction is an llvm.lifetime.start or an llvm.lifetime.end intrinsic. This was suggested as a cleanup in D55967. Differential Revision: https://reviews.llvm.org/D56019 llvm-svn: 349964	2018-12-21 21:49:40 +00:00
Cameron McInally	3a2064d3a0	[IR] Add a dedicated FNeg IR Instruction The IEEE-754 Standard makes it clear that fneg(x) and fsub(-0.0, x) are two different operations. The former is a bitwise operation, while the latter is an arithmetic operation. This patch creates a dedicated FNeg IR Instruction to model that behavior. Differential Revision: https://reviews.llvm.org/D53877 llvm-svn: 346774	2018-11-13 18:15:47 +00:00
Carlos Alberto Enciso	2629fa33cc	[DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG. In SimplifyCFG when given a conditional branch that goes to BB1 and BB2, the hoisted common terminator instruction in the two blocks, caused debug line records associated with subsequent select instructions to become ambiguous. It causes the debugger to display unreachable source lines. Differential Revision: https://reviews.llvm.org/D53390 llvm-svn: 346481	2018-11-09 09:42:10 +00:00
Chandler Carruth	5fa7afa32f	[IR] Replace `isa<TerminatorInst>` with `isTerminator()`. This is a bit awkward in a handful of places where we didn't even have an instruction and now we have to see if we can build one. But on the whole, this seems like a win and at worst a reasonable cost for removing `TerminatorInst`. All of this is part of the removal of `TerminatorInst` from the `Instruction` type hierarchy. llvm-svn: 340701	2018-08-26 09:51:22 +00:00
Chandler Carruth	7f564cda33	[IR] Begin removal of TerminatorInst by removing successor manipulation. The core get and set routines move to the `Instruction` class. These routines are only valid to call on instructions which are terminators. The iterator and generic range based access move to `CFG.h` where all the other generic successor and predecessor access lives. While moving the iterator here, simplify it using the iterator utilities LLVM provides and updates coding style as much as reasonable. The APIs remain pointer-heavy when they could better use references, and retain the odd behavior of `operator*` and `operator->` that is common in LLVM iterators. Adjusting this API, if desired, should be a follow-up step. Non-generic range iteration is added for the two instructions where there is an especially easy mechanism and where there was code attempting to use the range accessor from a specific subclass: `indirectbr` and `br`. In both cases, the successors are contiguous operands and can be easily iterated via the operand list. This is the first major patch in removing the `TerminatorInst` type from the IR's instruction type hierarchy. This change was discussed in an RFC here and was pretty clearly positive: http://lists.llvm.org/pipermail/llvm-dev/2018-May/123407.html There will be a series of much more mechanical changes following this one to complete this move. Differential Revision: https://reviews.llvm.org/D47467 llvm-svn: 340698	2018-08-26 08:41:15 +00:00
Vedant Kumar	eb47c07b14	[IR] Introduce helpers to skip debug instructions (NFC) This patch introduces two helpers to make it easier to ignore debug intrinsics: - Instruction::getNextNonDebugInstruction() This is just like Instruction::getNextNode(), except that it skips debug info. - skipDebugInfo(BasicBlock::iterator) A free function which advances a BasicBlock iterator past any debug info. This is a no-op when the iterator already points to a non-debug instruction. Part of: llvm.org/PR37728 Related to: https://reviews.llvm.org/D47874 Differential Revision: https://reviews.llvm.org/D48305 llvm-svn: 335083	2018-06-19 23:42:17 +00:00
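Illustration for the entry above (not part of the commit): a sketch of the two helpers over a toy instruction list. `Inst`, `Block`, and `nextNonDebug` are hypothetical names, and unlike the free function described in the message, this version takes an explicit `end` iterator so that it stays safe on a plain `std::list`.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Toy instruction: a name plus a flag marking debug-info pseudo instructions.
struct Inst {
  std::string name;
  bool isDebug;
};
using Block = std::list<Inst>;

// Advance past any debug instructions; a no-op if `it` is already non-debug.
inline Block::iterator skipDebugInfo(Block::iterator it, Block::iterator end) {
  while (it != end && it->isDebug) ++it;
  return it;
}

// Like "next instruction", but skipping debug instructions along the way.
// Returns `end` when no further non-debug instruction exists.
inline Block::iterator nextNonDebug(Block::iterator it, Block::iterator end) {
  if (it == end) return end;
  return skipDebugInfo(std::next(it), end);
}
```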
Warren Ristow	74ca133d49	[InstCombine] Enable more reassociations using FMF 'reassoc' + 'nsz' Reassociation of math ops in some contexts (especially vector contexts) has generally only been happening when the 'fast' FMF was set. This enables reassociation when only the finer grained controls 'reassoc' and 'nsz' are set. Differential Revision: https://reviews.llvm.org/D47335 llvm-svn: 333221	2018-05-24 20:16:43 +00:00
Chris Bieneman	152dec707a	[IPSCCP] Remove calls without side effects Summary: When performing constant propagation for call instructions we have historically replaced all uses of the return from a call, but not removed the call itself. This is required for correctness if the calls have side effects, however the compiler should be able to safely remove calls that don't have side effects. This allows the compiler to completely fold away calls to functions that have no side effects if the inputs are constant and the output can be determined at compile time. Reviewers: davide, sanjoy, bruno, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38856 llvm-svn: 322125	2018-01-09 21:58:46 +00:00
Michael Zolotukhin	a9ce469914	Remove redundant includes from lib/IR. llvm-svn: 320622	2017-12-13 21:30:52 +00:00
Mandeep Singh Grang	32947f7b72	[llvm] Remove redundant return [NFC] Reviewers: davidxl, olista01, Eugene.Zelenko Reviewed By: Eugene.Zelenko Subscribers: sdardis, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D39917 llvm-svn: 317995	2017-11-12 03:47:50 +00:00
Sanjay Patel	fd69991264	[IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html and again more recently: http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html ...this is a step in cleaning up our fast-math-flags implementation in IR to better match the capabilities of both clang's user-visible flags and the backend's flags for SDNode. As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the 'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic reassociation - 'AllowReassoc'. We're also adding a bit to allow approximations for library functions called 'ApproxFunc' (this was initially proposed as 'libm' or similar). ...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits), but that's apparently already used for other purposes. Also, I don't think we can just add a field to FPMathOperator because Operator is not intended to be instantiated. We'll defer movement of FMF to another day. We keep the 'fast' keyword. I thought about removing that, but seeing IR like this: %f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2 ...made me think we want to keep the shortcut synonym. Finally, this change is binary incompatible with existing IR as seen in the compatibility tests. This statement: "Newer releases can ignore features from older releases, but they cannot miscompile them. For example, if nsw is ever replaced with something else, dropping it would be a valid way to upgrade the IR." ( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility ) ...provides the flexibility we want to make this change without requiring a new IR version. Ie, we're not loosening the FP strictness of existing IR. 
At worst, we will fail to optimize some previously 'fast' code because it's no longer recognized as 'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'. Note: an inter-dependent clang commit to use the new API name should closely follow commit. Differential Revision: https://reviews.llvm.org/D39304 llvm-svn: 317488	2017-11-06 16:27:15 +00:00
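Illustration for the entry above (not part of the commit): a sketch of seven fast-math flags packed into one byte, with 'fast' meaning that all seven are set. The enumerator names follow the flags named in the message, but the bit positions and the `isFast` helper are assumptions made for this example and should not be read as the bitcode encoding.

```cpp
#include <cassert>
#include <cstdint>

// One bit per flag; positions are assumed for the sketch.
enum FastMathFlag : std::uint8_t {
  AllowReassoc    = 1u << 0,  // reassoc
  NoNaNs          = 1u << 1,  // nnan
  NoInfs          = 1u << 2,  // ninf
  NoSignedZeros   = 1u << 3,  // nsz
  AllowReciprocal = 1u << 4,  // arcp
  AllowContract   = 1u << 5,  // contract
  ApproxFunc      = 1u << 6,  // afn
};

constexpr std::uint8_t AllFlags = 0x7F;  // all seven bits

// 'fast' is a shortcut for "every individual flag is set".
inline bool isFast(std::uint8_t flags) {
  return (flags & AllFlags) == AllFlags;
}
```

Under this model, an operation carrying only `reassoc` and `nsz` is not 'fast', which is why code that tests `isFast()` can miss optimizations that need just a subset of the flags.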
Sanjay Patel	4aeffc1bf9	[Instruction] add moveAfter() convenience function; NFCI As suggested in D37121, here's a wrapper for removeFromParent() + insertAfter(), but implemented using moveBefore() for symmetry/efficiency. Differential Revision: https://reviews.llvm.org/D37239 llvm-svn: 312001	2017-08-29 14:07:48 +00:00
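Illustration for the entry above (not part of the commit): a sketch of expressing `moveAfter` through `moveBefore` on a `std::list`, where a single `splice` unlinks and relinks the node. `Block` and both free functions are toy stand-ins invented for the example, not the LLVM member functions.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

using Block = std::list<std::string>;  // stand-in for an instruction list

// Unlink `inst` and relink it immediately before `pos` in one step.
inline void moveBefore(Block &bb, Block::iterator inst, Block::iterator pos) {
  if (inst == pos) return;  // already in place
  bb.splice(pos, bb, inst);
}

// moveAfter is moveBefore on the position that follows `pos`.
inline void moveAfter(Block &bb, Block::iterator inst, Block::iterator pos) {
  assert(inst != pos && "cannot move an instruction after itself");
  moveBefore(bb, inst, std::next(pos));
}
```

Compared with a separate remove followed by an insert, the splice form touches the list once and keeps iterators to the moved element valid.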
Konstantin Zhuravlyov	d382d6f3fc	Enhance synchscope representation OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use syncscope("<scope>"), where <scope> can be "singlethread" (this replaces singlethread keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 llvm-svn: 307722	2017-07-11 22:23:00 +00:00
George Burgess IV	5cb9a3f362	[LoopVectorize] Don't preserve nsw/nuw flags on shrunken ops. If we're shrinking a binary operation, it may be the case that the new operation wraps where the old didn't. If this happens, the behavior should be well-defined. So, we can't always carry wrapping flags with us when we shrink operations. If we do, we get incorrect optimizations in cases like: void foo(const unsigned char *from, unsigned char *to, int n) { for (int i = 0; i < n; i++) to[i] = from[i] - 128; } which gets optimized to: void foo(const unsigned char *from, unsigned char *to, int n) { for (int i = 0; i < n; i++) to[i] = from[i] | 128; } Because: - InstCombine turned `sub i32 %from.i, 128` into `add nuw nsw i32 %from.i, 128`. - LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a vector full of `i8 128`s - InstCombine took advantage of the fact that the newly-shrunken add "couldn't wrap", and changed the `add` to an `or`. InstCombine seems happy to figure out whether we can add nuw/nsw on its own, so I just decided to drop the flags. There are already a number of places in LoopVectorize where we rely on InstCombine to clean up. llvm-svn: 305053	2017-06-09 03:56:15 +00:00
Chandler Carruth	eb66b33867	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787	2017-06-06 11:49:48 +00:00
Reid Kleckner	73e1a13fdc	[IR] De-virtualize ~Value to save a vptr Summary: Implements PR889 Removing the virtual table pointer from Value saves 1% of RSS when doing LTO of llc on Linux. The impact on time was positive, but too noisy to conclusively say that performance improved. Here is a link to the spreadsheet with the original data: https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing This change makes it invalid to directly delete a Value, User, or Instruction pointer. Instead, such code can be rewritten to a null check and a call Value::deleteValue(). Value objects tend to have their lifetimes managed through iplist, so for the most part, this isn't a big deal. However, there are some places where LLVM deletes values, and those places had to be migrated to deleteValue. I have also created llvm::unique_value, which has a custom deleter, so it can be used in place of std::unique_ptr<Value>. I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which derives from User outside of lib/IR. Code in IR cannot include MemorySSA headers or call the MemoryAccess object destructors without introducing a circular dependency, so we need some level of indirection. Unfortunately, no class derived from User may have any virtual methods, because adding a virtual method would break User::getHungOffOperands(), which assumes that it can find the use list immediately prior to the User object. I've added a static_assert to the appropriate OperandTraits templates to help people avoid this trap. Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv Reviewed By: chandlerc Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D31261 llvm-svn: 303362	2017-05-18 17:24:10 +00:00
Tim Shen	5f6285f048	[Atomic] Remove IsStore/IsLoad in the interface, and pass the instruction instead. NFC. Now both emitLeadingFence and emitTrailingFence take the instruction itself, instead of taking IsLoad/IsStore pairs. Instruction::mayReadFromMemory and Instruction::mayWriteToMemory are used for determining those two booleans. The instruction argument is also useful for later D32763, in emitTrailingFence. For emitLeadingFence, it seems to have a cleaner interface with the proposed change. Differential Revision: https://reviews.llvm.org/D32762 llvm-svn: 302539	2017-05-09 15:27:17 +00:00
Dehao Chen	ad41a3b98f	Update VP prof metadata during inlining. Summary: r298270 added profile update logic for branch_weights. This patch implements profile update logic for VP prof metadata too. Reviewers: eraman, tejohnson, davidxl Reviewed By: eraman Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32773 llvm-svn: 302209	2017-05-05 00:47:34 +00:00
Adam Nemet	2cc291ac33	[IR] Add AllowContract to FastMathFlags -ffp-contract=fast does not currently work with LTO because it's passed as a TargetOption to the backend rather than in the IR. This adds it to FastMathFlags. This is toward fixing PR25721 Differential Revision: https://reviews.llvm.org/D31164 llvm-svn: 298939	2017-03-28 20:11:52 +00:00
Craig Topper	7a92a7ca52	[IR] Share implementation for pairs of const and non-const methods using const_cast. NFCI llvm-svn: 298830	2017-03-27 05:46:58 +00:00
Craig Topper	1e83366ec7	[IR] Make Instruction::isAssociative method inline. Add LLVM_READONLY to the static version. llvm-svn: 298826	2017-03-26 23:23:29 +00:00
Dehao Chen	42ca2b084a	Set the prof weight correctly for call instructions in DeadArgumentElimination. Summary: In DeadArgumentElimination, the call instructions will be replaced. We also need to set the prof weights so that function inlining can find the correct profile. Reviewers: eraman Reviewed By: eraman Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31143 llvm-svn: 298660	2017-03-23 23:26:00 +00:00
Dehao Chen	359491dd62	Updates branch_weights annotation for call instructions during inlining. Summary: Inliner should update the branch_weights annotation to scale it to proper value. Reviewers: davidxl, eraman Reviewed By: eraman Subscribers: zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D30767 llvm-svn: 298270	2017-03-20 16:40:44 +00:00
Craig Topper	336d3faa34	[IR] Move a few static functions in Instruction class inline. They just check for certain opcodes and opcode enums are available in Instruction.h. llvm-svn: 298237	2017-03-20 06:40:39 +00:00
Sanjoy Das	c1d9ef40b5	[IR] Add a Instruction::dropPoisonGeneratingFlags helper Summary: The helper will be used in a later change. This change itself is NFC since the only user of this new function is its unit test. Reviewers: majnemer, efriedma Reviewed By: efriedma Subscribers: efriedma, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D30184 llvm-svn: 296035	2017-02-23 22:50:52 +00:00
Sanjay Patel	5e8e039088	fix documentation comments; NFC llvm-svn: 283361	2016-10-05 18:51:12 +00:00
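
The miscompile described in 5cb9a3f362 (LoopVectorize keeping nsw/nuw on shrunken ops) can be checked with plain 8-bit arithmetic. The following is a minimal sketch, not code from the commit: the function names `sub128_narrow`, `or128`, and `count_mismatches` are hypothetical and chosen only for illustration. It compares the original loop body (`from[i] - 128`, narrowed to 8 bits) with the body produced after the `add` was rewritten to an `or`, over every possible input byte.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the original loop body, `from[i] - 128`,
// truncated to 8 bits. The arithmetic wraps modulo 256.
inline std::uint8_t sub128_narrow(std::uint8_t x) {
    return static_cast<std::uint8_t>(x - 128);
}

// Hypothetical helper: the loop body after the shrunken add was
// turned into an `or` on the assumption that it could not wrap.
inline std::uint8_t or128(std::uint8_t x) {
    return static_cast<std::uint8_t>(x | 128);
}

// Counts the input bytes for which the two versions disagree.
inline int count_mismatches() {
    int mismatches = 0;
    for (int v = 0; v < 256; ++v) {
        const auto x = static_cast<std::uint8_t>(v);
        if (sub128_narrow(x) != or128(x)) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

The two versions agree only while the top bit of the input is clear, which is exactly the case where the 8-bit add does not wrap. For inputs of 128 and above the narrowed add wraps, so a no-wrap flag carried over from the 32-bit operation no longer holds.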


99 Commits