llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 02:33:06 +01:00

Author	SHA1	Message	Date
Nikita Popov	2369372c85	[IR] Lift attribute handling for assume bundles into CallBase Rather than special-casing assume in BasicAA getModRefBehavior(), do this one level higher, in the attribute handling of CallBase. For assumes with operand bundles, the inaccessiblememonly attribute applies regardless of operand bundles.	2021-03-25 21:15:39 +01:00
Mircea Trofin	f394727369	[NFC] Module::getInstructionCount() is const	2021-03-25 12:29:19 -07:00
Philip Reames	006e14ce80	[deref] Implement initial set of inference rules for deref-at-point This implements a subset of the initial set of inference rules proposed in the llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree". The nolias one got moved to a separate review as there was some concerns raised which require further discussion. Differential Revision: https://reviews.llvm.org/D99135	2021-03-24 16:20:41 -07:00
Nick Lewycky	40cba34858	Verify that MDNodes belong to the same context as the Module. Differential Revision: https://reviews.llvm.org/D99289	2021-03-24 12:38:05 -07:00
David Sherwood	42a72164a2	[IR][SVE] Add new llvm.experimental.stepvector intrinsic This patch adds a new llvm.experimental.stepvector intrinsic, which takes no arguments and returns a linear integer sequence of values of the form <0, 1, ...>. It is primarily intended for scalable vectors, although it will work for fixed width vectors too. It is intended that later patches will make use of this new intrinsic when vectorising induction variables, currently only supported for fixed width. I've added a new CreateStepVector method to the IRBuilder, which will generate a call to this intrinsic for scalable vectors and fall back on creating a ConstantVector for fixed width. For scalable vectors this intrinsic is lowered to a new ISD node called STEP_VECTOR, which takes a single constant integer argument as the step. During lowering this argument is set to a value of 1. The reason for this additional argument at the codegen level is because in future patches we will introduce various generic DAG combines such as mul step_vector(1), 2 -> step_vector(2) add step_vector(1), step_vector(1) -> step_vector(2) shl step_vector(1), 1 -> step_vector(2) etc. that encourage a canonical format for all targets. This hopefully means all other targets supporting scalable vectors can benefit from this too. I've added cost model tests for both fixed width and scalable vectors: llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll as well as codegen lowering tests for fixed width and scalable vectors: llvm/test/CodeGen/AArch64/neon-stepvector.ll llvm/test/CodeGen/AArch64/sve-stepvector.ll See this thread for discussion of the intrinsic: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147943.html	2021-03-23 10:43:35 +00:00
Bradley Smith	839304c777	[IR] Add vscale_range IR function attribute This attribute represents the minimum and maximum values vscale can take. For now this attribute is not hooked up to anything during codegen, this will be added in the future when such codegen is considered stable. Additionally hook up the -msve-vector-bits=<x> clang option to emit this attribute. Differential Revision: https://reviews.llvm.org/D98030	2021-03-22 12:05:06 +00:00
Nikita Popov	4766db4e38	Reapply [ConstantFold] Handle vectors in ConstantFoldLoadThroughBitcast() There seems to be an impedance mismatch between what the type system considers an aggregate (structs and arrays) and what constants consider an aggregate (structs, arrays and vectors). Adjust the type check to consider vectors as well. The previous version of the patch dropped the type check entirely, but it turns out that getAggregateElement() does require the constant to be an aggregate in some edge cases: For Poison/Undef the getNumElements() API is called, without checking in advance that we're dealing with an aggregate. Possibly the implementation should avoid doing that, but for now I'm adding an assert so the next person doesn't fall into this trap.	2021-03-21 17:48:21 +01:00
Christoffer Lernö	3f24585141	Add type attributes to LLVM C API The LLVM C API is missing type attributes as is needed by attributes such as sret and byval. This patch adds three missing wrapper functions. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=48249 https://reviews.llvm.org/D97763	2021-03-19 19:07:04 -04:00
Philip Reames	eaf092af50	Update basic deref API to account for possiblity of free [NFC] This patch is plumbing to support work towards the goal outlined in the recent llvm-dev post "[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree". The point of this change is purely to simplify iteration on other pieces on way to making the switch. Rebuilding with a change to Value.h is slow and painful, so I want to get the API change landed. Once that's done, I plan to more closely audit each caller, add the inference rules in their own patch, then post a patch with the langref changes and test diffs. The value of the command line flag is that we can exercise the inference logic in standalone patches without needing the whole switch ready to go just yet. Differential Revision: https://reviews.llvm.org/D98908	2021-03-19 11:17:19 -07:00
Jeroen Dobbelaere	13605b24cd	Support intrinsic overloading on unnamed types This patch adds support for intrinsic overloading on unnamed types. This fixes PR38117 and PR48340 and will also be needed for the Full Restrict Patches (D68484). The main problem is that the intrinsic overloading name mangling is using 's_s' for unnamed types. This can result in identical intrinsic mangled names for different function prototypes. This patch changes this by adding a '.XXXXX' to the intrinsic mangled name when at least one of the types is based on an unnamed type, ensuring that we get a unique name. Implementation details: - The mapping is created on demand and kept in Module. - It also checks for existing clashes and recycles potentially existing prototypes and declarations. - Because of extra data in Module, Intrinsic::getName needs an extra Module* argument and, for speed, an optional FunctionType* argument. - I still kept the original two-argument 'Intrinsic::getName' around which keeps the original behavior (providing the base name). -- Main reason is that I did not want to change the LLVMIntrinsicGetName version, as I don't know how acceptable such a change is -- The current situation already has a limitation. So that should not get worse with this patch. - Intrinsic::getDeclaration and the verifier are now using the new version. Other notes: - As far as I see, this should not suffer from stability issues. The count is only added for prototypes depending on at least one anonymous struct - The initial count starts from 0 for each intrinsic mangled name. - In case of name clashes, existing prototypes are remembered and reused when that makes sense. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D91250	2021-03-19 14:34:25 +01:00
Philip Reames	4a9f71baa0	Add a couple of missing attribute query methods [NFC]	2021-03-18 17:33:20 -07:00
Stephen Tozer	fea97b90a1	Reapply "[DebugInfo] Handle multiple variable location operands in IR" Fixed section of code that iterated through a SmallDenseMap and added instructions in each iteration, causing non-deterministic code; replaced SmallDenseMap with MapVector to prevent non-determinism. This reverts commit 01ac6d1587e8613ba4278786e8341f8b492ac941.	2021-03-17 16:45:25 +00:00
Hans Wennborg	d0e43622c0	Revert "[DebugInfo] Handle multiple variable location operands in IR" This caused non-deterministic compiler output; see comment on the code review. > This patch updates the various IR passes to correctly handle dbg.values with a > DIArgList location. This patch does not actually allow DIArgLists to be produced > by salvageDebugInfo, and it does not affect any pass after codegen-prepare. > Other than that, it should cover every IR pass. > > Most of the changes simply extend code that operated on a single debug value to > operate on the list of debug values in the style of any_of, all_of, for_each, > etc. Instances of setOperand(0, ...) have been replaced with with > replaceVariableLocationOp, which takes the value that is being replaced as an > additional argument. In places where this value isn't readily available, we have > to track the old value through to the point where it gets replaced. > > Differential Revision: https://reviews.llvm.org/D88232 This reverts commit df69c69427dea7f5b3b3a4d4564bc77b0926ec88.	2021-03-17 13:36:48 +01:00
Bradley Smith	85ceade375	[AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE Previously NEON used a target specific intrinsic for frintn, given that the FROUNDEVEN ISD node now exists, move over to that instead and add codegen support for that node for both NEON and fixed length SVE. Differential Revision: https://reviews.llvm.org/D98487	2021-03-17 11:41:22 +00:00
Adrian Prantl	7fb4439f85	Support !heapallocsite attachments in StripDebugInfo(). They point into the DIType type system, so they need to be stripped as well. rdar://75341300 Differential Revision: https://reviews.llvm.org/D98668	2021-03-16 10:05:13 -07:00
Adrian Prantl	2a7cba934d	Support !heapallocsite attachments in stripNonLineTableDebugInfo(). They point into the DIType type system, so they need to be stripped as well. rdar://75341300 Differential Revision: https://reviews.llvm.org/D98667	2021-03-16 10:05:12 -07:00
serge-sans-paille	ee3e917269	[NFC] Use SmallString instead of std::string for the AttrBuilder This avoids a few unnecessary conversion from StringRef to std::string, and a bunch of extra allocation thanks to the SmallString. Differential Revision: https://reviews.llvm.org/D98190	2021-03-16 13:34:14 +01:00
Caroline Concatto	cb0c62b65c	[SVE][LoopVectorize] Add support for scalable vectorization of loops with vector reverse This patch adds support for reverse loop vectorization. It is possible to vectorize the following loop: ``` for (int i = n-1; i >= 0; --i) a[i] = b[i] + 1.0; ``` with fixed or scalable vector. The loop-vectorizer will use 'reverse' on the loads/stores to make sure the lanes themselves are also handled in the right order. This patch adds support for scalable vector on IRBuilder interface to create a reverse vector. The IR function CreateVectorReverse lowers to experimental.vector.reverse for scalable vector and keedp the original behavior for fixed vector using shuffle reverse. Differential Revision: https://reviews.llvm.org/D95363	2021-03-16 07:51:59 +00:00
Luo, Yuanke	6f29193e28	[X86][AMX] Prevent transforming load pointer from <256 x i32>* to x86_amx*. The load/store instruction will be transformed to amx intrinsics in the pass of AMX type lowering. Prohibiting the pointer cast make that pass happy. Differential Revision: https://reviews.llvm.org/D98247	2021-03-14 09:24:56 +08:00
Stephen Tozer	617750cbc9	Revert "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" This reverts commit c0f3dfb9f119bb5f22dd8846f5502b6abaf026d3. Reverted due to an error on the clang-x64-windows-msvc buildbot.	2021-03-11 14:48:01 +00:00
gbtozers	3b8eba58df	[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands This patch improves salvageDebugInfoImpl by allowing it to salvage arithmetic operations with two or more non-const operands; this includes the GetElementPtr instruction, and most Binary Operator instructions. These salvages produce DIArgList locations and are only valid for dbg.values, as currently variadic DIExpressions must use DW_OP_stack_value. This functionality is also only added for salvageDebugInfoForDbgValues; other functions that directly call salvageDebugInfoImpl (such as in ISel or Coroutine frame building) can be updated in a later patch. Differential Revision: https://reviews.llvm.org/D91722	2021-03-11 13:33:49 +00:00
Leonard Chan	bf4eac0c1d	[llvm] Change DSOLocalEquivalent type if the underlying global value type changes We encountered an issue where LTO running on IR that used the DSOLocalEquivalent constant would result in bad codegen. The underlying issue was ValueMapper wasn't properly handling DSOLocalEquivalent, so this just adds the machinery for handling it. This code path is triggered by a fix to DSOLocalEquivalent::handleOperandChangeImpl where DSOLocalEquivalent could potentially not have the same type as its underlying GV. This updates DSOLocalEquivalent::handleOperandChangeImpl to change the type if the GV type changes and handles this constant in ValueMapper. Differential Revision: https://reviews.llvm.org/D97978	2021-03-09 15:09:48 -08:00
gbtozers	3089bda8b6	[DebugInfo] Add replaceArg function to simplify DBG_VALUE_LIST expressions The LiveDebugValues and LiveDebugVariables implementations for handling DBG_VALUE_LIST instructions can be simplified significantly if they do not have to deal with any duplicated operands, such as a DBG_VALUE_LIST that uses the same register multiple times in its expression. This patch adds a function, replaceArg, that can be used to simplify a DIExpression in the case of duplicated operands. Differential Revision: https://reviews.llvm.org/D83896	2021-03-09 17:41:04 +00:00
gbtozers	9b334d3086	[DebugInfo] Handle multiple variable location operands in IR This patch updates the various IR passes to correctly handle dbg.values with a DIArgList location. This patch does not actually allow DIArgLists to be produced by salvageDebugInfo, and it does not affect any pass after codegen-prepare. Other than that, it should cover every IR pass. Most of the changes simply extend code that operated on a single debug value to operate on the list of debug values in the style of any_of, all_of, for_each, etc. Instances of setOperand(0, ...) have been replaced with with replaceVariableLocationOp, which takes the value that is being replaced as an additional argument. In places where this value isn't readily available, we have to track the old value through to the point where it gets replaced. Differential Revision: https://reviews.llvm.org/D88232	2021-03-09 16:44:38 +00:00
Akira Hatanaka	5a7b95b490	Move ObjCARCUtil.h back to llvm/Analysis Instead of adding the header to llvm/IR, just duplicate the marker string in the auto upgrader.	2021-03-08 16:35:24 -08:00
Philip Reames	fec6c4d338	[gvn] Precisely propagate equalities to phi operands The code used for propagating equalities (e.g. assume facts) was conservative in two ways - one of which this patch fixes. Specifically, it shifts the code reasoning about whether a use is dominated by the end of the assume block to consider phi uses to exist on the predecessor edge. This matches the dominator tree handling for dominates(Edge, Use), and simply extends it to dominates(BB, Use). Note that the decision to use the end of the block is itself a conservative choice. The more precise option would be to use the later of the assume and the value, and replace all uses after that. GVN handles that case separately (with the replace operand mechanism) because it used to be expensive to ask dominator questions within blocks. With the new instruction ordering support, we should probably rewrite this code at some point to simplify. Differential Revision: https://reviews.llvm.org/D98082	2021-03-08 08:59:00 -08:00
Nikita Popov	7eb1e2d813	[ConstantFold] Handle icmp of global and null consistently Return UGT rather than NE for icmp @g, null, which is slightly stronger. This is consistent with what we do for more complex folds. It is somewhat silly that @g ugt null does not get folded while (gep @g) ugt null does.	2021-03-08 17:18:01 +01:00
Nikita Popov	756bf6934b	[ConstProp] Fix folding of pointer icmp with signed predicates While @g ugt null is always true (ignoring weak symbols), @g sgt null is not necessarily the case -- that would imply that it is forbidden to place globals in the high half of the address space.	2021-03-08 17:12:12 +01:00
gbtozers	bfb6dad2ae	[DebugInfo] Support DIArgList in DbgVariableIntrinsic This patch updates DbgVariableIntrinsics to support use of a DIArgList for the location operand, resulting in a significant change to its interface. This patch does not update all IR passes to support multiple location operands in a dbg.value; the only change is to update the DbgVariableIntrinsic interface and its uses. All code outside of the intrinsic classes assumes that an intrinsic will always have exactly one location operand; they will still support DIArgLists, but only if they contain exactly one Value. Among other changes, the setOperand and setArgOperand functions in DbgVariableIntrinsic have been made private. This is to prevent code from setting the operands of these intrinsics directly, which could easily result in incorrect/invalid operands being set. This does not prevent these functions from being called on a debug intrinsic at all, as they can still be called on any CallInst pointer; it is assumed that any code directly setting the operands on a generic call instruction is doing so safely. The intention for making these functions private is to prevent DIArgLists from being overwritten by code that's naively trying to replace one of the Values it points to, and also to fail fast if a DbgVariableIntrinsic is updated to use a DIArgList without a valid corresponding DIExpression.	2021-03-08 14:36:13 +00:00
Sanjay Patel	3f3b31f5ae	[ConstantFold] allow folding icmp of null and constexpr I noticed that we were not folding expressions like this: icmp ult (constexpr), null in https://llvm.org/PR49355, so we end up with extremely large icmp instructions as the constant expressions pile up on each other. There is no potential to mis-fold an unsigned boundary condition with a zero/null, so this is just falling through a crack in the pattern matching. The more general case of comparisons of non-zero constants and constexpr are more tricky and may require the datalayout to know how to cast to different types, etc. Negative tests verify that we are only changing a subset of potential patterns. Differential Revision: https://reviews.llvm.org/D98150	2021-03-08 08:53:59 -05:00
Matt Arsenault	82d8dae819	IR: Fix assert string message referring to the wrong attribute	2021-03-07 13:14:17 -05:00
gbtozers	c52cf11f42	[DebugInfo] Add DIArgList MD to store multple values in DbgVariableIntrinsics This patch adds a new metadata node, DIArgList, which contains a list of SSA values. This node is in many ways similar in function to the existing ValueAsMetadata node, with the difference being that it tracks a list instead of a single value. Internally, it uses ValueAsMetadata to track the individual values, but there is also a reasonable amount of DIArgList-specific value-tracking logic on top of that. Similar to ValueAsMetadata, it is a special case in parsing and printing due to the fact that it requires a function state (as it may reference function-local values). This patch should not result in any immediate functional change; it allows for DIArgLists to be parsed and printed, but debug variable intrinsics do not yet recognize them as a valid argument (outside of parsing). Differential Revision: https://reviews.llvm.org/D88175	2021-03-05 17:02:24 +00:00
Stephen Tozer	e0cb677eb6	Reapply "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values" Rewrites test to use correct architecture triple; fixes incorrect reference in SourceLevelDebugging doc; simplifies `spillReg` behaviour so as to not be dependent on changes elsewhere in the patch stack. This reverts commit d2000b45d033c06dc7973f59909a0ad12887ff51.	2021-03-05 12:32:05 +00:00
David Blaikie	2677513543	Move llvm/Analysis/ObjCARCUtil.h to IR to fix layering. This is included from IR files, and IR doesn't/can't depend on Analysis (because Analysis depends on IR). Also fix the implementation - don't use non-member static in headers, as it leads to ODR violations, inaccurate "unused function" warnings, etc. And fix the header protection macro name (we don't generally include "LIB" in the names, so far as I can tell).	2021-03-04 16:14:53 -08:00
Akira Hatanaka	4055195f29	[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR This reapplies ed4718eccb12bd42214ca4fb17d196d49561c0c7, which was reverted because it was causing a miscompile. The bug that was causing the miscompile has been fixed in 75805dce5ff874676f3559c069fcd6737838f5c0. Original commit message: Background: This fixes a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end adds operand bundle "clang.arc.attachedcall" to calls, which indicates the call is implicitly followed by a marker instruction and an implicit retainRV/claimRV call that consumes the call result. In addition, it emits a call to @llvm.objc.clang.arc.noop.use, which consumes the call result, to prevent the middle-end passes from changing the return type of the called function. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the calls with the operand bundle in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the call with the operand bundle. It doesn't remove the operand bundle on the call since the backend needs it to emit the marker instruction. The retainRV and claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes an autoreleaseRV call in the callee if nothing in the callee prevents it from being paired up with the retainRV/claimRV call in the caller. It then inserts a release call if claimRV is attached to the call since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the operand bundle to a function call in the callee. This is important since the ARC optimizer can remove the autoreleaseRV returning the callee result, which makes it impossible to pair it up with the retainRV/claimRV call in the caller. If that fails, it simply emits a retain call in the IR if retainRV is attached to the call and does nothing if claimRV is attached to it. - SCCP refrains from replacing the return value of a call with a constant value if the call has the operand bundle. This ensures the call always has at least one user (the call to @llvm.objc.clang.arc.noop.use). - This patch also fixes a bug in replaceUsesOfNonProtoConstant where multiple operand bundles of the same kind were being added to a call. Future work: - Use the operand bundle on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls with the operand bundles. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-03-04 11:22:30 -08:00
Stephen Tozer	977ffc2c60	Revert "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values" This reverts commit d07f106f4a48b6e941266525b6f7177834d7b74e.	2021-03-04 11:59:21 +00:00
gbtozers	7cf2776667	[DebugInfo] Add new instruction and DIExpression operator for variadic debug values This patch adds a new instruction that can represent variadic debug values, DBG_VALUE_VAR. This patch alone covers the addition of the instruction and a set of basic code changes in MachineInstr and a few adjacent areas, but does not correctly handle variadic debug values outside of these areas, nor does it generate them at any point. The new instruction is similar to the existing DBG_VALUE instruction, with the following differences: the operands are in a different order, any number of values may be used in the instruction following the Variable and Expression operands (these are referred to in code as “debug operands”) and are indexed from 0 so that getDebugOperand(X) == getOperand(X+2), and the Expression in a DBG_VALUE_VAR must use the DW_OP_LLVM_arg operator to pass arguments into the expression. The new DW_OP_LLVM_arg operator is only valid in expressions appearing in a DBG_VALUE_VAR; it takes a single argument and pushes the debug operand at the index given by the argument onto the Expression stack. For example the sub-expression `DW_OP_LLVM_arg, 0` has the meaning “Push the debug operand at index 0 onto the expression stack.” Differential Revision: https://reviews.llvm.org/D82363	2021-03-04 11:45:35 +00:00
Hongtao Yu	d0f29b816e	[CSSPGO] Deduplicating dangling pseudo probes. Same dangling probes are redundant since they all have the same semantic that is to rely on the counts inference tool to get reasonable count for the same original block. Therefore, there's no need to keep multiple copies of them. I've seen jump threading created tons of redundant dangling probes that slowed down the compiler dramatically. Other optimization passes can also result in redundant probes though without an observed impact so far. This change removes block-wise redundant dangling probes specifically introduced by jump threading. To support removing redundant dangling probes caused by all other passes, a final function-wise deduplication is also added. An 18% size win of the .pseudo_probe section was seen for SPEC2017. No performance difference was observed. Differential Revision: https://reviews.llvm.org/D97482	2021-03-03 22:44:42 -08:00
Hongtao Yu	f2b87eabed	[CSSPGO] Unblocking optimizations by dangling pseudo probes. This change fixes a couple places where the pseudo probe intrinsic blocks optimizations because they are not naturally removable. To unblock those optimizations, the blocking pseudo probes are moved out of the original blocks and tagged dangling, instead of allowing pseudo probes to be literally removed. The reason is that when the original block is removed, we won't be able to sample it. Instead of assigning it a zero weight, moving all its pseudo probes into another block and marking them dangling should allow the counts inference a chance to assign them a more reasonable weight. We have not seen counts quality degradation from our experiments. The optimizations being unblocked are: 1. Removing conditional probes for if-converted branches. Conditional probes are tagged dangling when their homing branch arms are folded so that they will not be over-counted. 2. Unblocking jump threading from removing empty blocks. Pseudo probe prevents jump threading from removing logically empty blocks that only has one unconditional jump instructions. 3. Unblocking SimplifyCFG and MIR tail duplicate to thread empty blocks and blocks with redundant branch checks. Since dangling probes are logically deleted, they should not consume any samples in LTO postLink. This can be achieved by setting their distribution factors to zero when dangled. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D97481	2021-03-03 22:44:42 -08:00
Philip Reames	5c3ba57c9e	Sink routine for replacing a operand bundle to CallBase [NFC] We had equivalent code for both CallInst and InvokeInst, but never cared about the result type.	2021-03-03 12:07:55 -08:00
Hans Wennborg	c2ea8c4219	Revert "[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR" This caused miscompiles of Chromium tests for iOS due clobbering of live registers. See discussion on the code review for details. > Background: > > This fixes a longstanding problem where llvm breaks ARC's autorelease > optimization (see the link below) by separating calls from the marker > instructions or retainRV/claimRV calls. The backend changes are in > https://reviews.llvm.org/D92569. > > https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue > > What this patch does to fix the problem: > > - The front-end adds operand bundle "clang.arc.attachedcall" to calls, > which indicates the call is implicitly followed by a marker > instruction and an implicit retainRV/claimRV call that consumes the > call result. In addition, it emits a call to > @llvm.objc.clang.arc.noop.use, which consumes the call result, to > prevent the middle-end passes from changing the return type of the > called function. This is currently done only when the target is arm64 > and the optimization level is higher than -O0. > > - ARC optimizer temporarily emits retainRV/claimRV calls after the calls > with the operand bundle in the IR and removes the inserted calls after > processing the function. > > - ARC contract pass emits retainRV/claimRV calls after the call with the > operand bundle. It doesn't remove the operand bundle on the call since > the backend needs it to emit the marker instruction. The retainRV and > claimRV calls are emitted late in the pipeline to prevent optimization > passes from transforming the IR in a way that makes it harder for the > ARC middle-end passes to figure out the def-use relationship between > the call and the retainRV/claimRV calls (which is the cause of > PR31925). > > - The function inliner removes an autoreleaseRV call in the callee if > nothing in the callee prevents it from being paired up with the > retainRV/claimRV call in the caller. It then inserts a release call if > claimRV is attached to the call since autoreleaseRV+claimRV is > equivalent to a release. If it cannot find an autoreleaseRV call, it > tries to transfer the operand bundle to a function call in the callee. > This is important since the ARC optimizer can remove the autoreleaseRV > returning the callee result, which makes it impossible to pair it up > with the retainRV/claimRV call in the caller. If that fails, it simply > emits a retain call in the IR if retainRV is attached to the call and > does nothing if claimRV is attached to it. > > - SCCP refrains from replacing the return value of a call with a > constant value if the call has the operand bundle. This ensures the > call always has at least one user (the call to > @llvm.objc.clang.arc.noop.use). > > - This patch also fixes a bug in replaceUsesOfNonProtoConstant where > multiple operand bundles of the same kind were being added to a call. > > Future work: > > - Use the operand bundle on x86-64. > > - Fix the auto upgrader to convert call+retainRV/claimRV pairs into > calls with the operand bundles. > > rdar://71443534 > > Differential Revision: https://reviews.llvm.org/D92808 This reverts commit ed4718eccb12bd42214ca4fb17d196d49561c0c7.	2021-03-03 15:51:40 +01:00
Arthur Eubanks	83c7bddf7d	[opt] Error if -debug-pass is specified alongside the new PM Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D97810	2021-03-02 15:59:28 -08:00
Kazu Hirata	99066de18c	[IR] Use range-based for loops (NFC)	2021-03-01 23:40:33 -08:00
Yuanfang Chen	afa87767f4	[Diagnose] Unify MCContext and LLVMContext diagnosing The situation with inline asm/MC error reporting is kind of messy at the moment. The errors from MC layout are not reliably propagated and users have to specify an inlineasm handler separately to get inlineasm diagnose. The latter issue is not a correctness issue but could be improved. * Kill LLVMContext inlineasm diagnose handler and migrate it to use DiagnoseInfo/DiagnoseHandler. * Introduce `DiagnoseInfoSrcMgr` to diagnose SourceMgr backed errors. This covers use cases like inlineasm, MC, and any clients using SourceMgr. * Move AsmPrinter::SrcMgrDiagInfo and its instance to MCContext. The next step is to combine MCContext::SrcMgr and MCContext::InlineSrcMgr because in all use cases, only one of them is used. * If LLVMContext is available, let MCContext uses LLVMContext's diagnose handler; if LLVMContext is not available, MCContext uses its own default diagnose handler which just prints SMDiagnostic. * Change a few clients(Clang, llc, lldb) to use the new way of reporting. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D97449	2021-03-01 15:58:37 -08:00
Kazu Hirata	a3564b4137	[IR] Use range-based for loops (NFC)	2021-02-28 10:59:23 -08:00
Kazu Hirata	cf6f3cec61	[IR] Use range-based for loops (NFC)	2021-02-27 10:09:25 -08:00
James Y Knight	a0f2b285c4	Add Alignment argument to IRBuilder CreateAtomicRMW and CreateAtomicCmpXchg. And then push those change throughout LLVM. Keep the old signature in Clang's CGBuilder for now -- that will be updated in a follow-on patch (D97224). The MLIR LLVM-IR dialect is not updated to support the new alignment attribute, but preserves its existing behavior. Differential Revision: https://reviews.llvm.org/D97223	2021-02-25 18:29:42 -05:00
Nicolas Guillemot	22858cc076	[PM] Show the pass argument in pre/post-pass IR dumps This patch adds each pass' pass argument in the header for IR dumps. For example: Before: ``` * IR Dump Before InstructionSelect * ``` After: ``` * IR Dump Before InstructionSelect (instruction-select) * ``` The goal is to make it easier to know what argument to pass to command line options like `debug-only` or `run-pass` to further investigate a given pass.	2021-02-25 14:02:00 -08:00
Stanislav Mekhanoshin	3c6f5332f7	Option to ignore llvm[.compiler].used uses in hasAddressTaken() Differential Revision: https://reviews.llvm.org/D96087	2021-02-25 10:06:24 -08:00
Stanislav Mekhanoshin	71a0b16dc0	Option to ignore assume like intrinsic uses in hasAddressTaken() Differential Revision: https://reviews.llvm.org/D96081	2021-02-25 09:48:29 -08:00
Yaxun (Sam) Liu	a55327fc56	[HIP] Fix managed variable linkage Currently managed variables are emitted as undefined symbols, which causes difficulty for diagnosing undefined symbols for non-managed variables. This patch transforms managed variables in device compilation so that they can be emitted as normal variables. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D96195	2021-02-23 22:34:45 -05:00
Fangrui Song	d3b888c43c	collectUsedGlobalVariables: migrate SmallPtrSetImpl overload to SmallVecImpl overload after D97128 And delete the SmallPtrSetImpl overload. While here, decrease inline element counts from 8 to 4. See D97128 for the choice. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D97257	2021-02-23 16:09:06 -08:00
Fangrui Song	7d33c2a5f8	[ThinLTO] Make cloneUsedGlobalVariables deterministic Iterating on `SmallPtrSet<GlobalValue *, 8>` with more than 8 elements is not deterministic. Use a SmallVector instead because `Used` is guaranteed to contain unique elements. While here, decrease inline element counts from 8 to 4. The number of `llvm.used`/`llvm.compiler.used` elements is usually 0 or 1. For full LTO/hybrid LTO, the number may be large, so we need to be careful. According to tejohnson's analysis https://reviews.llvm.org/D97128#2582399 , 4 is good for a large project with WholeProgramDevirt, when available_externally vtables are placed in the llvm.compiler.used set. Differential Revision: https://reviews.llvm.org/D97128	2021-02-23 16:09:05 -08:00
Andy Kaylor	771be1b79d	Add auto-upgrade support for annotation intrinsics The llvm.ptr.annotation and llvm.var.annotation intrinsics were changed since the 11.0 release to add an additional parameter. This patch auto-upgrades IR containing the four-parameter versions of these intrinsics, adding a null pointer as the fifth argument. Differential Revision: https://reviews.llvm.org/D95993	2021-02-22 15:42:16 -08:00
Sanjay Patel	3d791c7666	[IR] restrict vector reduction intrinsic types The arguments in all cases should be vectors of exactly one of integer or FP. All of the tests currently pass the verifier because we check for any vector type regardless of the type of reduction. This obviously can't work if we mix up integer and FP, and based on current LangRef text it was not intended to work for pointers either. The pointer case from https://llvm.org/PR49215 is what led me here. That example was avoided with 5b250a27ec. Differential Revision: https://reviews.llvm.org/D96904	2021-02-21 12:37:00 -05:00
Nikita Popov	81952c68f5	[ConstantRange] Handle wrapping ranges in min/max (PR48643) When one of the inputs is a wrapping range, intersect with the union of the two inputs. The union of the two inputs corresponds to the result we would get if we treated the min/max as a simple select. This fixes PR48643.	2021-02-20 22:52:09 +01:00
Nikita Popov	ca3345ac4e	[ConstantRange] Handle wrapping range in binaryNot() We don't need any special handling for wrapping ranges (or empty ranges for that matter). The sub() call will already compute a correct and precise range. We only need to adjust the test expectation: We're now computing an optimal result, rather than an unsigned envelope.	2021-02-20 21:45:59 +01:00
Tim Shen	e909c229f8	Patch by @wecing (Chenguang Wang). The current getFoldedSizeOf() implementation uses naive recursion, which could be really slow when the input structure type is too complex. This issue was first brought up in http://llvm.org/bugs/show_bug.cgi?id=8281; this change fixes it by adding memoization. Differential Revision: https://reviews.llvm.org/D6594	2021-02-19 12:44:17 -08:00
Sanjay Patel	eb2508cda0	[Verifier] remove dead code for saturating intrinsics; NFC Test coverage shows that we assert with the string from the tablegen defs file for these intrinsics, so these cases should never be live.	2021-02-19 14:58:25 -05:00
Nikita Popov	56e22ed4c5	[IR] Move willReturn() to Instruction This moves the willReturn() helper from CallBase to Instruction, so that it can be used in a more generic manner. This will make it easier to fix additional passes (ADCE and BDCE), and will give us one place to change if additional instructions should become non-willreturn (e.g. there has been talk about handling volatile operations this way). I have also included the IntrinsicInst workaround directly in here, so that it gets applied consistently. (As such this change is not entirely NFC -- FuncAttrs will now use this as well.) Differential Revision: https://reviews.llvm.org/D96992	2021-02-19 11:56:01 +01:00
Leonard Chan	b854304d0e	[llvm][IR] Do not place constants with static relocations in a mergeable section This patch provides two major changes: 1. Add getRelocationInfo to check if a constant will have static, dynamic, or no relocations. (Also rename the original needsRelocation to needsDynamicRelocation.) 2. Only allow a constant with no relocations (static or dynamic) to be placed in a mergeable section. This will allow unused symbols that contain static relocations and happen to fit in mergeable constant sections (.rodata.cstN) to instead be placed in unique-named sections if -fdata-sections is used and subsequently garbage collected by --gc-sections. See https://lists.llvm.org/pipermail/llvm-dev/2021-February/148281.html. Differential Revision: https://reviews.llvm.org/D95960	2021-02-18 15:39:00 -08:00
Nikita Popov	9fa93a3e70	[BasicAA] Always strip single-argument phi nodes We can always look through single-argument (LCSSA) phi nodes when performing alias analysis. getUnderlyingObject() already does this, but stripPointerCastsAndInvariantGroups() does not. We still look through these phi nodes with the usual aliasPhi() logic, but sometimes get sub-optimal results due to the restrictions on value equivalence when looking through arbitrary phi nodes. I think it's generally beneficial to keep the underlying object logic and the pointer cast stripping logic in sync, insofar as it is possible. With this patch we get marginally better results: aa.NumMayAlias \| 5010069 \| 5009861 aa.NumMustAlias \| 347518 \| 347674 aa.NumNoAlias \| 27201336 \| 27201528 ... licm.NumPromoted \| 1293 \| 1296 I've renamed the relevant strip method to stripPointerCastsForAliasAnalysis(), as we're past the point where we can explicitly spell out everything that's getting stripped. Differential Revision: https://reviews.llvm.org/D96668	2021-02-18 23:07:50 +01:00
William S. Moses	08661b0074	[SROA] Propagate correct TBAA/TBAA Struct offsets SROA does not correctly account for offsets in TBAA/TBAA struct metadata. This patch creates functionality for generating new MD with the corresponding offset and updates SROA to use this functionality. Differential Revision: https://reviews.llvm.org/D95826	2021-02-17 11:59:00 -05:00
Igor Kudrin	305dfa701a	[DebugInfo] Keep the DWARF64 flag in the module metadata This allows the option to affect the LTO output. Module::Max helps to generate debug info for all modules in the same format. Differential Revision: https://reviews.llvm.org/D96597	2021-02-17 17:03:34 +07:00
Wei Wang	58f68f472f	[LTO] Perform DSOLocal propagation in combined index Perform DSOLocal propagation within summary list of every GV. This avoids the repeated query of this information during function importing. Differential Revision: https://reviews.llvm.org/D96398	2021-02-12 22:58:26 -08:00
James Y Knight	de99742eb0	LLVM-C: Allow LLVM{Get/Set}Alignment on an atomicrmw/cmpxchg instruction. (Now that these can have alignment specified.)	2021-02-12 18:31:18 -05:00
James Y Knight	91b7513f11	Fix layering after ed4718eccb12. That commit added a dependency from IR to Analysis, which isn't allowed. Fix it by duplicating a string constant.	2021-02-12 14:54:59 -05:00
Xun Li	fcb89a3296	[NFC][Coroutine] Fix an error message on coro.id verification The error message should be about coro.id, not coro.begin Differential Revision: https://reviews.llvm.org/D96447	2021-02-12 10:44:03 -08:00
Akira Hatanaka	078c441e76	[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR Background: This fixes a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end adds operand bundle "clang.arc.attachedcall" to calls, which indicates the call is implicitly followed by a marker instruction and an implicit retainRV/claimRV call that consumes the call result. In addition, it emits a call to @llvm.objc.clang.arc.noop.use, which consumes the call result, to prevent the middle-end passes from changing the return type of the called function. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the calls with the operand bundle in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the call with the operand bundle. It doesn't remove the operand bundle on the call since the backend needs it to emit the marker instruction. The retainRV and claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes an autoreleaseRV call in the callee if nothing in the callee prevents it from being paired up with the retainRV/claimRV call in the caller. It then inserts a release call if claimRV is attached to the call since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the operand bundle to a function call in the callee. This is important since the ARC optimizer can remove the autoreleaseRV returning the callee result, which makes it impossible to pair it up with the retainRV/claimRV call in the caller. If that fails, it simply emits a retain call in the IR if retainRV is attached to the call and does nothing if claimRV is attached to it. - SCCP refrains from replacing the return value of a call with a constant value if the call has the operand bundle. This ensures the call always has at least one user (the call to @llvm.objc.clang.arc.noop.use). - This patch also fixes a bug in replaceUsesOfNonProtoConstant where multiple operand bundles of the same kind were being added to a call. Future work: - Use the operand bundle on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls with the operand bundles. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-02-12 09:51:57 -08:00
Guillaume Chatelet	d83a14c77d	Encode alignment attribute for `cmpxchg` This is a follow up patch to D83136 adding the align attribute to `cmpxchg`. See also D83465 for `atomicrmw`. Differential Revision: https://reviews.llvm.org/D87443	2021-02-11 15:17:50 -05:00
Guillaume Chatelet	1d36d2af13	Encode alignment attribute for `atomicrmw` This is a follow up patch to D83136 adding the align attribute to `atomicwmw`. Differential Revision: https://reviews.llvm.org/D83465	2021-02-11 15:17:37 -05:00
Duncan P. N. Exon Smith	65e9e80474	ValueMapper: Rename RF_MoveDistinctMDs => RF_ReuseAndMutateDistinctMDs, NFC Rename the `RF_MoveDistinctMDs` flag passed into `MapValue` and `MapMetadata` to `RF_ReuseAndMutateDistinctMDs` in order to more precisely describe its effect and clarify the header documentation. Found this while helping to investigate PR48841, which pointed out an unsound use of the flag in `CloneModule()`. For now I've just added a FIXME there, but I'm hopeful that the new (more precise) name will prevent other similar errors.	2021-02-10 16:53:21 -08:00
Hongtao Yu	20454adfae	[CSSPGO] Unblock optimizations with pseudo probe instrumentation. The IR/MIR pseudo probe intrinsics don't get materialized into real machine instructions and therefore they don't incur runtime cost directly. However, they come with indirect cost by blocking certain optimizations. Some of the blocking are intentional (such as blocking code merge) for better counts quality while the others are accidental. This change unblocks perf-critical optimizations that do not affect counts quality. They include: 1. IR InstCombine, sinking load operation to shorten lifetimes. 2. MIR LiveRangeShrink, similar to #1 3. MIR TwoAddressInstructionPass, i.e, opeq transform 4. MIR function argument copy elision 5. IR stack protection. (though not perf-critical but nice to have). Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D95982	2021-02-10 12:43:17 -08:00
Arthur Eubanks	be513286ab	Specify that some flags are legacy PM-specific Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D96100	2021-02-10 10:53:04 -08:00
Nico Weber	7e84b293f2	Revert "[ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly" This reverts commit 4a64d8fe392449b205e59031aad5424968cf7446. Makes clang crash when buildling trivial iOS programs, see comment after https://reviews.llvm.org/D92808#2551401	2021-02-09 11:06:32 -05:00
Fangrui Song	02db0b26fe	[Verifier] Allow DW_TAG_class_type/DW_TAG_union_type to have no filename `clang/lib/CodeGen/CGOpenMPRuntime.cpp` synthesized union (`distinct !DICompositeType(tag: DW_TAG_union_type, name: "kmp_cmplrdata_t", size: 64, elements: <0x62b690>)`) does not have meaningful filename/line number. D94735 dropped the previously arbitrary and untested filename/line from the union and caused a verifier error here. This fixes `check-libarcher` failures. Differential Revision: https://reviews.llvm.org/D96212	2021-02-08 13:31:05 -08:00
Adrian Prantl	fa4635d9f3	Have stripDebugInfo() also strip !llvm.loop annotations from all instructions. The !llvm.loop annotations consist of pointers into the debug info, so when stripping the debug info (particularly important when it is malformed!) !llvm.loop annotations need to be stripped as well, or else the malformed debug info stays around. This patch applies the stripping to all instructions, not just terminator instructions. rdar://73687049 Differential Revision: https://reviews.llvm.org/D96181	2021-02-05 17:22:41 -08:00
Akira Hatanaka	52b9cd0c02	[ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly emitting retainRV or claimRV calls in the IR This reapplies 3fe3946d9a958b7af6130241996d9cfcecf559d4 without the changes made to lib/IR/AutoUpgrade.cpp, which was violating layering. Original commit message: Background: This patch makes changes to the front-end and middle-end that are needed to fix a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end adds operand bundle "clang.arc.rv" to calls, which indicates the call is implicitly followed by a marker instruction and an implicit retainRV/claimRV call that consumes the call result. In addition, it emits a call to @llvm.objc.clang.arc.noop.use, which consumes the call result, to prevent the middle-end passes from changing the return type of the called function. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the calls with the operand bundle in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the call with the operand bundle. It doesn't remove the operand bundle on the call since the backend needs it to emit the marker instruction. The retainRV and claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes an autoreleaseRV call in the callee if nothing in the callee prevents it from being paired up with the retainRV/claimRV call in the caller. It then inserts a release call if the call is annotated with claimRV since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the operand bundle to a function call in the callee. This is important since ARC optimizer can remove the autoreleaseRV returning the callee result, which makes it impossible to pair it up with the retainRV/claimRV call in the caller. If that fails, it simply emits a retain call in the IR if the implicit call is a call to retainRV and does nothing if it's a call to claimRV. Future work: - Use the operand bundle on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls annotated with the operand bundles. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-02-05 06:09:42 -08:00
Akira Hatanaka	9b804d4398	Revert "[ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly" This reverts commit 3fe3946d9a958b7af6130241996d9cfcecf559d4. The commit violates layering by including a header from Analysis in lib/IR/AutoUpgrade.cpp.	2021-02-05 06:00:05 -08:00
Akira Hatanaka	b1b35f383e	[ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly emitting retainRV or claimRV calls in the IR Background: This patch makes changes to the front-end and middle-end that are needed to fix a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end adds operand bundle "clang.arc.rv" to calls, which indicates the call is implicitly followed by a marker instruction and an implicit retainRV/claimRV call that consumes the call result. In addition, it emits a call to @llvm.objc.clang.arc.noop.use, which consumes the call result, to prevent the middle-end passes from changing the return type of the called function. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the calls with the operand bundle in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the call with the operand bundle. It doesn't remove the operand bundle on the call since the backend needs it to emit the marker instruction. The retainRV and claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes an autoreleaseRV call in the callee if nothing in the callee prevents it from being paired up with the retainRV/claimRV call in the caller. It then inserts a release call if the call is annotated with claimRV since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the operand bundle to a function call in the callee. This is important since ARC optimizer can remove the autoreleaseRV returning the callee result, which makes it impossible to pair it up with the retainRV/claimRV call in the caller. If that fails, it simply emits a retain call in the IR if the implicit call is a call to retainRV and does nothing if it's a call to claimRV. Future work: - Use the operand bundle on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls annotated with the operand bundles. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-02-05 05:55:18 -08:00
Kazu Hirata	70b89a598e	[IR] Drop unnecessary const from return types (NFC) Identified with const-return-type.	2021-02-04 21:18:02 -08:00
Adrian Prantl	81c0c6d2aa	Remove overzealous verifier check on DW_OP_LLVM_entry_value and improve the documentation Based on the comments in the code, the idea is that AsmPrinter is unable to produce entry value blocks of arbitrary length, such as DW_OP_entry_value [DW_OP_reg5 DW_OP_lit1 DW_OP_plus]. But the way the Verifier check is written it also disallows DW_OP_entry_value [DW_OP_reg5] DW_OP_lit1 DW_OP_plus which seems to overshoot the target. Note that this patch does not change any of the safety guards in LiveDebugValues — there is zero behavior change for clang. It just allows us to legalize more complex expressions in future patches. rdar://73907559 Differential Revision: https://reviews.llvm.org/D95990	2021-02-04 10:58:35 -08:00
Juneyoung Lee	bda396ca51	Revert "[ConstantFold] Fold more operations to poison" This reverts commit 53040a968dc2ff20931661e55f05da2ef8b964a0 due to its bad interaction with select i1 -> and/or i1 transformation. This fixes: https://bugs.llvm.org/show_bug.cgi?id=49005 https://bugs.llvm.org/show_bug.cgi?id=48435	2021-02-04 00:24:02 +09:00
Hongtao Yu	3594bb4c1b	[CSSPGO] Introducing distribution factor for pseudo probe. Sample re-annotation is required in LTO time to achieve a reasonable post-inline profile quality. However, we have seen that such LTO-time re-annotation degrades profile quality. This is mainly caused by preLTO code duplication that is done by passes such as loop unrolling, jump threading, indirect call promotion etc, where samples corresponding to a source location are aggregated multiple times due to the duplicates. In this change we are introducing a concept of distribution factor for pseudo probes so that samples can be distributed for duplicated probes scaled by a factor. We hope that optimizations duplicating code well-maintain the branch frequency information (BFI) based on which probe distribution factors are calculated. Distribution factors are updated at the end of preLTO pipeline to reflect an estimated portion of the real execution count. This change also introduces a pseudo probe verifier that can be run after each IR passes to detect duplicated pseudo probes. A saturated distribution factor stands for 1.0. A pesudo probe will carry a factor with the value ranged from 0.0 to 1.0. A 64-bit integral distribution factor field that represents [0.0, 1.0] is associated to each block probe. Unfortunately this cannot be done for callsite probes due to the size limitation of a 32-bit Dwarf discriminator. A 7-bit distribution factor is used instead. Changes are also needed to the sample profile inliner to deal with prorated callsite counts. Call sites duplicated by PreLTO passes, when later on inlined in LTO time, should have the callees’s probe prorated based on the Prelink-computed distribution factors. The distribution factors should also be taken into account when computing hotness for inline candidates. Also, Indirect call promotion results in multiple callisites. The original samples should be distributed across them. This is fixed by adjusting the callisites' distribution factors. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D93264	2021-02-02 11:55:01 -08:00
David Sherwood	81a8f9f3aa	[SVE][LoopVectorize] Add masked load/store and gather/scatter support for SVE This patch updates IRBuilder::CreateMaskedGather/Scatter to work with ScalableVectorType and adds isLegalMaskedGather/Scatter functions to AArch64TargetTransformInfo. In addition I've fixed up isLegalMaskedLoad/Store to return true for supported scalar types, since this is what the vectorizer asks for. In LoopVectorize.cpp I've changed LoopVectorizationCostModel::getInterleaveGroupCost to return an invalid cost for scalable vectors, since currently this relies upon using shuffle vector for reversing vectors. In addition, in LoopVectorizationCostModel::setCostBasedWideningDecision I have assumed that the cost of scalarising memory ops is infinitely expensive. I have added some simple masked load/store and gather/scatter tests, including cases where we use gathers and scatters for conditional invariant loads and stores. Differential Revision: https://reviews.llvm.org/D95350	2021-02-02 09:52:39 +00:00
Jeroen Dobbelaere	2a8989c820	Revert "[Verifier] enable llvm.experimental.noalias.scope.decl dominance check." the 'clang-with-lto-ubuntu' buildbot triggers the assertion. This reverts commit b43c395e60d2636ab5afc9b60a2046978c71e366.	2021-02-01 14:38:33 +01:00
Jeroen Dobbelaere	7618d9f77d	[Verifier] enable llvm.experimental.noalias.scope.decl dominance check. Now that Loop Peeling has been fixed (80cdd30eb90c3509bf315f1fa1369483e2448bbd), enable the dominance check by default. This reverts commit 3b5d36ece21f9baf96d82944b0165cb352443bee.	2021-02-01 11:53:01 +01:00
Yang Fan	e552570018	[NFC][IR][AsmWriter] Fix Wreturn-type gcc warning GCC warning: ``` /llvm-project/llvm/lib/IR/AsmWriter.cpp:3175:1: warning: control reaches end of non-void function [-Wreturn-type] 3175 \| } \| ^ ```	2021-01-28 16:42:30 +08:00
Kazu Hirata	2c90b2760d	[llvm] Use append_range (NFC)	2021-01-27 23:25:41 -08:00
Fangrui Song	b3b970744d	[ThinLTO] Add Visibility bits to GlobalValueSummary::GVFlags Imported functions and variable get the visibility from the module supplying the definition. However, non-imported definitions do not get the visibility from (ELF) the most constraining visibility among all modules (Mach-O) the visibility of the prevailing definition. This patch * adds visibility bits to GlobalValueSummary::GVFlags * computes the result visibility and propagates it to all definitions Protected/hidden can imply dso_local which can enable some optimizations (this is stronger than GVFlags::DSOLocal because the implied dso_local can be leveraged for ELF -shared while default visibility dso_local has to be cleared for ELF -shared). Note: we don't have summaries for declarations, so for ELF if a declaration has the most constraining visibility, the result visibility may not be that one. Differential Revision: https://reviews.llvm.org/D92900	2021-01-27 10:43:51 -08:00
Petr Hosek	3cfd18b793	Support for instrumenting only selected files or functions This change implements support for applying profile instrumentation only to selected files or functions. The implementation uses the sanitizer special case list format to select which files and functions to instrument, and relies on the new noprofile IR attribute to exclude functions from instrumentation. Differential Revision: https://reviews.llvm.org/D94820	2021-01-26 17:13:34 -08:00
Petr Hosek	ef4906bd9c	Revert "Support for instrumenting only selected files or functions" This reverts commit 4edf35f11a9e20bd5df3cb47283715f0ff38b751 because the test fails on Windows bots.	2021-01-26 12:25:28 -08:00
Petr Hosek	a8839c989d	Support for instrumenting only selected files or functions This change implements support for applying profile instrumentation only to selected files or functions. The implementation uses the sanitizer special case list format to select which files and functions to instrument, and relies on the new noprofile IR attribute to exclude functions from instrumentation. Differential Revision: https://reviews.llvm.org/D94820	2021-01-26 11:11:39 -08:00
Richard Smith	b9cfb936a1	Revert "[ObjC][ARC] Annotate calls with attributes instead of emitting retainRV" This reverts commit 53176c168061d6f26dcf3ce4fa59288b7d67255e, which introduceed a layering violation. LLVM's IR library can't include headers from Analysis.	2021-01-25 13:53:38 -08:00
Akira Hatanaka	96195f7442	[ObjC][ARC] Annotate calls with attributes instead of emitting retainRV or claimRV calls in the IR Background: This patch makes changes to the front-end and middle-end that are needed to fix a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end annotates calls with attribute "clang.arc.rv"="retain" or "clang.arc.rv"="claim", which indicates the call is implicitly followed by a marker instruction and a retainRV/claimRV call that consumes the call result. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the annotated calls in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the annotated calls. It doesn't remove the attribute on the call since the backend needs it to emit the marker instruction. The retainRV/claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes the autoreleaseRV call in the callee that returns the result if nothing in the callee prevents it from being paired up with the calls annotated with "clang.arc.rv"="retain/claim" in the caller. If the call is annotated with "claim", a release call is inserted since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the attributes to a function call in the callee. This is important since ARC optimizer can remove the autoreleaseRV call returning the callee result, which makes it impossible to pair it up with the retainRV or claimRV call in the caller. If that fails, it simply emits a retain call in the IR if the call is annotated with "retain" and does nothing if it's annotated with "claim". - This patch teaches dead argument elimination pass not to change the return type of a function if any of the calls to the function are annotated with attribute "clang.arc.rv". This is necessary since the pass can incorrectly determine nothing in the IR uses the function return, which can happen since the front-end no longer explicitly emits retainRV/claimRV calls in the IR, and change its return type to 'void'. Future work: - Use the attribute on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls annotated with the attributes. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-01-25 11:57:08 -08:00
Jeroen Dobbelaere	099d05c508	[Verifier] disable llvm.experimental.noalias.scope.decl dominance check. This was enabled in https://reviews.llvm.org/D95335 but it breaks the stage2 fuchsia build (See http://lab.llvm.org:8011/#/builders/98/builds/4105/steps/9/logs/stdio)	2021-01-25 16:43:08 +01:00
Jeroen Dobbelaere	8d875453b4	[Verifier] enable and limit llvm.experimental.noalias.scope.decl dominance checking Checking the llvm.experimental.noalias.scope.decl dominance can be worstcase O(N^2). Limit the dominance check to N=32. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D95335	2021-01-25 16:19:12 +01:00
Michael Kruse	d945273b52	[OpenMPIRBuilder] Implement tileLoops. The tileLoops method implements the code generation part of the tile directive introduced in OpenMP 5.1. It takes a list of loops forming a loop nest, tiles it, and returns the CanonicalLoopInfo representing the generated loops. The implementation takes n CanonicalLoopInfos, n tile size Values and returns 2*n new CanonicalLoopInfos. The input CanonicalLoopInfos are invalidated and BBs not reused in the new loop nest removed from the function. In a modified version of D76342, I was able to correctly compile and execute a tiled loop nest. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D92974	2021-01-23 19:39:29 -06:00
Kazu Hirata	0069e576c3	[llvm] Use isAlpha/isAlnum (NFC)	2021-01-22 23:25:03 -08:00
Yaxun (Sam) Liu	2e16ddcdc8	[HIP] Support __managed__ attribute This patch implements codegen for __managed__ variable attribute for HIP. Diagnostics will be added later. Differential Revision: https://reviews.llvm.org/D94814	2021-01-22 11:43:58 -05:00
Nikita Popov	8c54b2d6f0	[IR] Optimize adding attribute to AttributeList (NFC) When adding an enum attribute to an AttributeList, avoid going through an AttrBuilder and instead directly add the attribute to the correct set. Going through AttrBuilder is expensive, because it requires all string attributes to be reconstructed. This can be further improved by inserting the attribute at the right position and using the AttributeSetNode::getSorted() API. This recovers the small compile-time regression from D94633.	2021-01-22 11:30:21 +01:00
Jay Foad	eee5207f2a	[LegacyPM] Update InversedLastUser on the fly. NFC. This speeds up setLastUser enough to give a 5% to 10% speed up on trivial invocations of opt and llc, as measured by: perf stat -r 100 opt -S -o /dev/null -O3 /dev/null perf stat -r 100 llc -march=amdgcn /dev/null -filetype null Don't dump last use information unless -debug-pass=Details to avoid printing lots of spam that will break some existing lit tests. Before this patch, dumping last use information was broken anyway, because it used InversedLastUser before it had been populated. Differential Revision: https://reviews.llvm.org/D92309	2021-01-22 09:48:54 +00:00
Kazu Hirata	69d44af40a	[llvm] Use isDigit (NFC)	2021-01-21 19:59:50 -08:00
Kazu Hirata	ff389465ca	[llvm] Don't include StringSwitch.h where unnecessary (NFC)	2021-01-21 19:59:48 -08:00
Bjorn Pettersson	be875842f9	[PM] Avoid duplicates in the Used/Preserved/Required sets The pass analysis uses "sets" implemented using a SmallVector type to keep track of Used, Preserved, Required and RequiredTransitive passes. When having nested analyses we could end up with duplicates in those sets, as there was no checks to see if a pass already existed in the "set" before pushing to the vectors. This idea with this patch is to avoid such duplicates by avoiding pushing elements that already is contained when adding elements to those sets. To align with the above PMDataManager::collectRequiredAndUsedAnalyses is changed to skip adding both the Required and RequiredTransitive passes to its result vectors (since RequiredTransitive always is a subset of Required we ended up with duplicates when traversing both sets). Main goal with this is to avoid spending time verifying the same analysis mulitple times in PMDataManager::verifyPreservedAnalysis when iterating over the Preserved "set". It is assumed that removing duplicates from a "set" shouldn't have any other negative impact (I have not seen any problems so far). If this ends up causing problems one could do some uniqueness filtering of the vector being traversed in verifyPreservedAnalysis instead. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D94416	2021-01-20 13:55:18 +01:00
Juneyoung Lee	c70101b156	Allow nonnull/align attribute to accept poison Currently LLVM is relying on ValueTracking's `isKnownNonZero` to attach `nonnull`, which can return true when the value is poison. To make the semantics of `nonnull` consistent with the behavior of `isKnownNonZero`, this makes the semantics of `nonnull` to accept poison, and return poison if the input pointer isn't null. This makes many transformations like below legal: ``` %p = gep inbounds %x, 1 ; % p is non-null pointer or poison call void @f(%p) ; instcombine converts this to call void @f(nonnull %p) ``` Instead, this semantics makes propagation of `nonnull` to caller illegal. The reason is that, passing poison to `nonnull` does not immediately raise UB anymore, so such program is still well defined, if the callee does not use the argument. Having `noundef` attribute there re-allows this. ``` define void @f(i8* %p) { ; functionattr cannot mark %p nonnull here anymore call void @g(i8* nonnull %p) ; .. because @g never raises UB if it never uses %p. ret void } ``` Another attribute that needs to be updated is `align`. This patch updates the semantics of align to accept poison as well. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D90529	2021-01-20 11:31:23 +09:00
Craig Topper	4097ff94d8	[IR] Allow scalable vectors in structs to support intrinsics returning multiple values. RISC-V would like to use a struct of scalable vectors to return multiple values from intrinsics. This woud also be needed for target independent intrinsics like llvm.sadd.overflow. This patch removes the existing restriction for this. I've modified StructType::isSized to consider a struct containing scalable vectors as unsized so the verifier won't allow loads/stores/allocas of these structs. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D94142	2021-01-17 23:29:51 -08:00
Kazu Hirata	ceea2ccdb5	[IRBuilder] "Zero"-initialize SmallVector (NFC)	2021-01-17 10:39:47 -08:00
Kazu Hirata	8b192f274e	[llvm] Construct SmallVector with iterator ranges (NFC)	2021-01-16 09:40:53 -08:00
Jeroen Dobbelaere	4d4e2b930d	Introduce llvm.noalias.decl intrinsic The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a noalias scope is declared. When the intrinsic is duplicated, a decision must also be made about the scope: depending on the reason of the duplication, the scope might need to be duplicated as well. Reviewed By: nikic, jdoerfert Differential Revision: https://reviews.llvm.org/D93039	2021-01-16 09:20:45 +01:00
Alok Kumar Sharma	5ffb093588	[Debuginfo][DW_OP_implicit_pointer] (1/7) Support for DW_OP_LLVM_implicit_pointer New dwarf operator DW_OP_LLVM_implicit_pointer is introduced (present only in LLVM IR) This operator is required as it is different than DWARF operator DW_OP_implicit_pointer in representation and specification (number and types of operands) and later can not be used as multiple level. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D84113	2021-01-15 14:45:04 +05:30
Kazu Hirata	39185b091b	[llvm] Remove redundant return and continue statements (NFC) Identified with readability-redundant-control-flow.	2021-01-14 20:30:34 -08:00
Kazu Hirata	04ea28f569	[llvm] Use llvm::find_if (NFC)	2021-01-11 18:48:06 -08:00
Quentin Colombet	16cd617ccf	[NFC][LICM] Minor improvements to debug output Added a utility function in Value class to print block name and use block labels for unnamed blocks. Changed LICM to call this function in its debug output. Patch by Xiaoqing Wu <xiaoqing_wu@apple.com> Differential Revision: https://reviews.llvm.org/D93577	2021-01-11 18:02:49 -08:00
Florian Hahn	867bd6d8b8	[STLExtras] Use return type from operator* of the wrapped iter. Currently make_early_inc_range cannot be used with iterators with operator* implementations that do not return a reference. Most notably in the LLVM codebase, this means the User iterator ranges cannot be used with make_early_inc_range, which slightly simplifies iterating over ranges while elements are removed. Instead of directly using BaseT::reference as return type of operator, this patch uses decltype to get the actual return type of the operator implementation in WrappedIteratorT. This patch also updates a few places to use make use of make_early_inc_range. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D93992	2021-01-10 14:41:13 +00:00
Fangrui Song	7e4e2696fe	[IR] Delete unused ReplaceLast in DebugLoc::appendInlineAt Not used after r304488	2021-01-08 23:28:22 -08:00
Heejin Ahn	50147487de	[WebAssembly] Rename wasm_rethrow_in_catch intrinsic/builtin `wasm_rethrow_in_catch` intrinsic and builtin are used in order to rethrow an exception when the exception is caught but there is no matching clause within the current `catch`. For example, ``` try { foo(); } catch (int n) { ... } ``` If the caught exception does not correspond to C++ `int` type, it should be rethrown. These intrinsic/builtin were renamed `rethrow_in_catch` because at the time I thought there would be another intrinsic for C++'s `throw` keyword, which rethrows an exception. It turned out that `throw` keyword doesn't require wasm's `rethrow` instruction, so we rename `rethrow_in_catch` to just `rethrow` here. Reviewed By: dschuff, tlively Differential Revision: https://reviews.llvm.org/D94038	2021-01-08 06:55:04 -08:00
Arthur Eubanks	17f2d36a06	[NewPM] Don't error when there's an unrecognized pass name This currently blocks --print-before/after with a legacy PM pass, for example when we use the new PM for the optimization pipeline but the legacy PM for the codegen pipeline. Also in the future when the codegen pipeline works with the new PM there will be multiple places to specify passes, so even when everything is using the new PM, there will still be multiple places that can accept different pass names. Reviewed By: hoy, ychen Differential Revision: https://reviews.llvm.org/D94283	2021-01-07 22:33:32 -08:00
Oliver Stannard	88767efa6e	Revert "[llvm] Use BasicBlock::phis() (NFC)" Reverting because this causes crashes on the 2-stage buildbots, for example http://lab.llvm.org:8011/#/builders/7/builds/1140. This reverts commit 9b228f107d43341ef73af92865f73a9a076c5a76.	2021-01-07 09:43:33 +00:00
Kazu Hirata	8fd273682c	[llvm] Use llvm::all_of (NFC)	2021-01-06 18:27:36 -08:00
Kazu Hirata	745ad84858	[llvm] Use BasicBlock::phis() (NFC)	2021-01-06 18:27:35 -08:00
Kazu Hirata	3109d596ee	[llvm] Use llvm::append_range (NFC)	2021-01-06 18:27:33 -08:00
Juneyoung Lee	0b995628a9	[Constant] Update ConstantVector::get to return poison if all input elems are poison The diff was reviewed at D93994	2021-01-07 09:26:07 +09:00
Juneyoung Lee	691497c4e5	[Constant] Add containsPoisonElement This patch - Adds containsPoisonElement that checks existence of poison in constant vector elements, - Renames containsUndefElement to containsUndefOrPoisonElement to clarify its behavior & updates its uses properly With this patch, isGuaranteedNotToBeUndefOrPoison's tests w.r.t constant vectors are added because its analysis is improved. Thanks! Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D94053	2021-01-06 12:10:33 +09:00
Simon Pilgrim	c5a8d5ee85	[IR] Add ConstantInt::getBool helpers to wrap getTrue/getFalse.	2021-01-05 11:01:10 +00:00
Kazu Hirata	08389fa62e	[llvm] Construct SmallVector with iterator ranges (NFC)	2021-01-04 11:42:44 -08:00
Simon Pilgrim	c41a85c760	[IR] CallBase::getBundleOpInfoForOperand - ensure Current iterator is defined. NFCI. Fix clang static analyzer undefined pointer warning in the case Begin == End.	2021-01-04 15:30:15 +00:00
Fangrui Song	c7c5340c75	Refactor how -fno-semantic-interposition sets dso_local on default visibility external linkage definitions The idea is that the CC1 default for ELF should set dso_local on default visibility external linkage definitions in the default -mrelocation-model pic mode (-fpic/-fPIC) to match COFF/Mach-O and make output IR similar. The refactoring is made available by 2820a2ca3a0e69c3f301845420e0067ffff2251b. Currently only x86 supports local aliases. We move the decision to the driver. There are three CC1 states: * -fsemantic-interposition: make some linkages interposable and make default visibility external linkage definitions dso_preemptable. * (default): selected if the target supports .Lfoo$local: make default visibility external linkage definitions dso_local * -fhalf-no-semantic-interposition: if neither option is set or the target does not support .Lfoo$local: like -fno-semantic-interposition but local aliases are not used. So references can be interposed if not optimized out. Add -fhalf-no-semantic-interposition to a few tests using the half-based semantic interposition behavior.	2020-12-31 13:59:45 -08:00
Andrew Litteken	0c19daeb5f	[IROutliner] Adding consistent function attribute merging When combining extracted functions, they may have different function attributes. We want to make sure that we do not make any assumptions, or lose any information. This attempts to make sure that we consolidate function attributes to their most general case. Tests: llvm/test/Transforms/IROutliner/outlining-compatible-and-attribute-transfer.ll llvm/test/Transforms/IROutliner/outlining-compatible-or-attribute-transfer.ll Reviewers: jdoefert, paquette Differential Revision: https://reviews.llvm.org/D87301	2020-12-31 12:30:23 -06:00
Sanjay Patel	6e879de8bb	[IR] remove 'NoNan' param when creating FP reductions This is no-functional-change-intended (AFAIK, we can't isolate this difference in a regression test). That's because the callers should be setting the IRBuilder's FMF field when creating the reduction and/or setting those flags after creating. It doesn't make sense to override this one flag alone. This is part of a multi-step process to clean up the FMF setting/propagation. See PR35538 for an example.	2020-12-30 09:51:23 -05:00
Juneyoung Lee	1ec361c6b1	clang-format, address warnings	2020-12-30 23:05:07 +09:00
Juneyoung Lee	46421cee58	Use unary CreateShuffleVector if possible As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`. Let's update them. Actually, it would have been more natural if the patches were made in this order: (1) let them use unary CreateShuffleVector first (2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793) The order is swapped, but in terms of correctness it is still fine. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93923	2020-12-30 22:36:08 +09:00
Luo, Yuanke	4ef6280b52	[X86] Add x86_amx type for intel AMX. The x86_amx is used for AMX intrisics. <256 x i32> is bitcast to x86_amx when it is used by AMX intrinsics, and x86_amx is bitcast to <256 x i32> when it is used by load/store instruction. So amx intrinsics only operate on type x86_amx. It can help to separate amx intrinsics from llvm IR instructions (+-*/). Thank Craig for the idea. This patch depend on https://reviews.llvm.org/D87981. Differential Revision: https://reviews.llvm.org/D91927	2020-12-30 13:52:13 +08:00
Craig Topper	a4e63dffe1	[Verifier] Remove declaration of method that was removed 8.5 years ago. NFC	2020-12-29 21:04:19 -08:00
Kazu Hirata	d6be2b089a	[Analysis, IR] Use *Map::lookup (NFC)	2020-12-29 19:23:24 -08:00
Juneyoung Lee	e6839b113d	[IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93793	2020-12-30 04:21:04 +09:00
Samuel Eubanks	939f38fb2b	Make NPM OptBisectInstrumentation use global singleton OptBisect Currently there is an issue where the legacy pass manager uses a different OptBisect counter than the new pass manager. This fix makes the npm OptBisectInstrumentation use the global OptBisect. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D92897	2020-12-20 13:47:56 -08:00
Kazu Hirata	0abf9f34bf	[Analysis, IR, CodeGen] Use llvm::erase_if (NFC)	2020-12-20 09:19:35 -08:00
Kazu Hirata	22a8c864f7	[Analysis, CodeGen, IR] Use contains (NFC)	2020-12-18 19:08:17 -08:00
Chih-Ping Chen	c44b393235	[DebugInfo] Support Fortran 'use <external module>' statement. The main change is to add a 'IsDecl' field to DIModule so that when IsDecl is set to true, the debug info entry generated for the module would be marked as a declaration. That way, the debugger would look up the definition of the module in the gloabl scope. Please see the comments in llvm/test/DebugInfo/X86/dimodule.ll for what the debug info entries would look like. Differential Revision: https://reviews.llvm.org/D93462	2020-12-18 13:10:57 -05:00
Whitney Tsang	0ac56aa46f	Ensure SplitEdge to return the new block between the two given blocks This PR implements the function splitBasicBlockBefore to address an issue that occurred during SplitEdge(BB, Succ, ...), inside splitBlockBefore. The issue occurs in SplitEdge when the Succ has a single predecessor and the edge between the BB and Succ is not critical. This produces the result ‘BB->Succ->New’. The new function splitBasicBlockBefore was added to splitBlockBefore to handle the issue and now produces the correct result ‘BB->New->Succ’. Below is an example of splitting the block bb1 at its first instruction. /// Original IR bb0: br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlock bb0: br bb1 bb1: br bb1.split bb1.split: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore bb0: br bb1.split bb1.split br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2: Differential Revision: https://reviews.llvm.org/D92200	2020-12-18 17:37:17 +00:00
Rong Xu	92b9137fcb	[IR][PGO] Add hot func attribute and use hot/cold attribute in func section Clang FE currently has hot/cold function attribute. But we only have cold function attribute in LLVM IR. This patch adds support of hot function attribute to LLVM IR. This attribute will be used in setting function section prefix/suffix. Currently .hot and .unlikely suffix only are added in PGO (Sample PGO) compilation (through isFunctionHotInCallGraph and isFunctionColdInCallGraph). This patch changes the behavior. The new behavior is: (1) If the user annotates a function as hot or isFunctionHotInCallGraph is true, this function will be marked as hot. Otherwise, (2) If the user annotates a function as cold or isFunctionColdInCallGraph is true, this function will be marked as cold. The changes are: (1) user annotated function attribute will used in setting function section prefix/suffix. (2) hot attribute overwrites profile count based hotness. (3) profile count based hotness overwrite user annotated cold attribute. The intention for these changes is to provide the user a way to mark certain function as hot in cases where training input is hard to cover all the hot functions. Differential Revision: https://reviews.llvm.org/D92493	2020-12-17 18:41:12 -08:00
Bangtian Liu	33b4e1043e	Revert "Ensure SplitEdge to return the new block between the two given blocks" This reverts commit d20e0c3444ad9ada550d9d6d1d56fd72948ae444.	2020-12-17 21:00:37 +00:00
Bangtian Liu	a2ec1d8ec2	Ensure SplitEdge to return the new block between the two given blocks This PR implements the function splitBasicBlockBefore to address an issue that occurred during SplitEdge(BB, Succ, ...), inside splitBlockBefore. The issue occurs in SplitEdge when the Succ has a single predecessor and the edge between the BB and Succ is not critical. This produces the result ‘BB->Succ->New’. The new function splitBasicBlockBefore was added to splitBlockBefore to handle the issue and now produces the correct result ‘BB->New->Succ’. Below is an example of splitting the block bb1 at its first instruction. /// Original IR bb0: br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlock bb0: br bb1 bb1: br bb1.split bb1.split: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore bb0: br bb1.split bb1.split br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2: Differential Revision: https://reviews.llvm.org/D92200	2020-12-17 16:00:15 +00:00
Jun Ma	2d3aa79604	[InstCombine] Remove scalable vector restriction in InstCombineCasts Differential Revision: https://reviews.llvm.org/D93389	2020-12-17 22:02:33 +08:00
Barry Revzin	2fc9f32ca3	Make LLVM build in C++20 mode Part of the <=> changes in C++20 make certain patterns of writing equality operators ambiguous with themselves (sorry!). This patch goes through and adjusts all the comparison operators such that they should work in both C++17 and C++20 modes. It also makes two other small C++20-specific changes (adding a constructor to a type that cases to be an aggregate, and adding casts from u8 literals which no longer have type const char*). There were four categories of errors that this review fixes. Here are canonical examples of them, ordered from most to least common: // 1) Missing const namespace missing_const { struct A { #ifndef FIXED bool operator==(A const&); #else bool operator==(A const&) const; #endif }; bool a = A{} == A{}; // error } // 2) Type mismatch on CRTP namespace crtp_mismatch { template <typename Derived> struct Base { #ifndef FIXED bool operator==(Derived const&) const; #else // in one case changed to taking Base const& friend bool operator==(Derived const&, Derived const&); #endif }; struct D : Base<D> { }; bool b = D{} == D{}; // error } // 3) iterator/const_iterator with only mixed comparison namespace iter_const_iter { template <bool Const> struct iterator { using const_iterator = iterator<true>; iterator(); template <bool B, std::enable_if_t<(Const && !B), int> = 0> iterator(iterator<B> const&); #ifndef FIXED bool operator==(const_iterator const&) const; #else friend bool operator==(iterator const&, iterator const&); #endif }; bool c = iterator<false>{} == iterator<false>{} // error \|\| iterator<false>{} == iterator<true>{} \|\| iterator<true>{} == iterator<false>{} \|\| iterator<true>{} == iterator<true>{}; } // 4) Same-type comparison but only have mixed-type operator namespace ambiguous_choice { enum Color { Red }; struct C { C(); C(Color); operator Color() const; bool operator==(Color) const; friend bool operator==(C, C); }; bool c = C{} == C{}; // error bool d = C{} == Red; } Differential revision: https://reviews.llvm.org/D78938	2020-12-17 10:44:10 +00:00
Hongtao Yu	66121fdf6a	[CSSPGO] Consume pseudo-probe-based AutoFDO profile This change enables pseudo-probe-based sample counts to be consumed by the sample profile loader under the regular `-fprofile-sample-use` switch with minimal adjustments to the existing sample file formats. After the counts are imported, a probe helper, aka, a `PseudoProbeManager` object, is automatically launched to verify the CFG checksum of every function in the current compilation against the corresponding checksum from the profile. Mismatched checksums will cause a function profile to be slipped. A `SampleProfileProber` pass is scheduled before any of the `SampleProfileLoader` instances so that the CFG checksums as well as probe mappings are available during the profile loading time. The `PseudoProbeManager` object is set up right after the profile reading is done. In the future a CFG-based fuzzy matching could be done in `PseudoProbeManager`. Samples will be applied only to pseudo probe instructions as well as probed callsites once the checksum verification goes through. Those instructions are processed in the same way that regular instructions would be processed in the line-number-based scenario. In other words, a function is processed in a regular way as if it was reduced to just containing pseudo probes (block probes and callsites). Adjustment to profile format A CFG checksum field is being added to the existing AutoFDO profile formats. So far only the text format and the extended binary format are supported. For the text format, a new line like ``` !CFGChecksum: 12345 ``` is added to the end of the body sample lines. For the extended binary profile format, we introduce a metadata section to store the checksum map from function names to their CFG checksums. Differential Revision: https://reviews.llvm.org/D92347	2020-12-16 15:57:18 -08:00
Bangtian Liu	e7d3773d91	Revert "Ensure SplitEdge to return the new block between the two given blocks" This reverts commit cf638d793c489632bbcf0ee0fbf9d0f8c76e1f48.	2020-12-16 11:52:30 +00:00
Bangtian Liu	e77001771a	Ensure SplitEdge to return the new block between the two given blocks This PR implements the function splitBasicBlockBefore to address an issue that occurred during SplitEdge(BB, Succ, ...), inside splitBlockBefore. The issue occurs in SplitEdge when the Succ has a single predecessor and the edge between the BB and Succ is not critical. This produces the result ‘BB->Succ->New’. The new function splitBasicBlockBefore was added to splitBlockBefore to handle the issue and now produces the correct result ‘BB->New->Succ’. Below is an example of splitting the block bb1 at its first instruction. /// Original IR bb0: br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlock bb0: br bb1 bb1: br bb1.split bb1.split: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore bb0: br bb1.split bb1.split br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2: Differential Revision: https://reviews.llvm.org/D92200	2020-12-15 23:32:29 +00:00
Fangrui Song	5dd827ba56	[IR] Delete deprecated DebugLoc::get	2020-12-15 14:53:12 -08:00
Johannes Doerfert	6230d2a2a4	[Clang][Attr] Introduce the `assume` function attribute The `assume` attribute is a way to provide additional, arbitrary information to the optimizer. For now, assumptions are restricted to strings which will be accumulated for a function and emitted as comma separated string function attribute. The key of the LLVM-IR function attribute is `llvm.assume`. Similar to `llvm.assume` and `__builtin_assume`, the `assume` attribute provides a user defined assumption to the compiler. A follow up patch will introduce an LLVM-core API to query the assumptions attached to a function. We also expect to add more options, e.g., expression arguments, to the `assume` attribute later on. The `omp [begin] asssumes` pragma will leverage this attribute and expose the functionality in the absence of OpenMP. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D91979	2020-12-15 16:51:34 -06:00
Kazu Hirata	49434d3cd6	[IR] Remove isPowerOf2ByteWidth The predicate used to be used with the C backend, which was removed on Mar 23, 2012 in commit 64a232343aa649fdacf78698da3e4d5737dee56a. It seems to be unused since then.	2020-12-14 23:00:17 -08:00
Gulfem Savrun Yeniceri	196c46aa87	[clang][IR] Add support for leaf attribute This patch adds support for leaf attribute as an optimization hint in Clang/LLVM. Differential Revision: https://reviews.llvm.org/D90275	2020-12-14 14:48:17 -08:00
Matt Arsenault	4c16866a59	OpaquePtr: Require byval on x86_intrcc parameter 0 Currently the backend special cases x86_intrcc and treats the first parameter as byval. Make the IR require byval for this parameter to remove this special case, and avoid the dependence on the pointee element type. Fixes bug 46672. I'm not sure the IR is enforcing all the calling convention constraints. clang seems to ignore the attribute for empty parameter lists, but the IR tolerates it.	2020-12-14 16:34:37 -05:00
Fangrui Song	0140db71cc	Migrate deprecated DebugLoc::get to DILocation::get This migrates all LLVM (except Kaleidoscope and CodeGen/StackProtector.cpp) DebugLoc::get to DILocation::get. The CodeGen/StackProtector.cpp usage may have a nullptr Scope and can trigger an assertion failure, so I don't migrate it. Reviewed By: #debug-info, dblaikie Differential Revision: https://reviews.llvm.org/D93087	2020-12-11 12:45:22 -08:00
David Sherwood	e693cbf9f5	[Support] Introduce a new InstructionCost class This is the first in a series of patches that attempts to migrate existing cost instructions to return a new InstructionCost class in place of a simple integer. This new class is intended to be as light-weight and simple as possible, with a full range of arithmetic and comparison operators that largely mirror the same sets of operations on basic types, such as integers. The main advantage to using an InstructionCost is that it can encode a particular cost state in addition to a value. The initial implementation only has two states - Normal and Invalid - but these could be expanded over time if necessary. An invalid state can be used to represent an unknown cost or an instruction that is prohibitively expensive. This patch adds the new class and changes the getInstructionCost interface to return the new class. Other cost functions, such as getUserCost, etc., will be migrated in future patches as I believe this to be less disruptive. One benefit of this new class is that it provides a way to unify many of the magic costs in the codebase where the cost is set to a deliberately high number to prevent optimisations taking place, e.g. vectorization. It also provides a route to represent the extremely high, and unknown, cost of scalarization of scalable vectors, which is not currently supported. Differential Revision: https://reviews.llvm.org/D91174	2020-12-11 08:12:54 +00:00
Hongtao Yu	85e4f6f241	[CSSPGO] Pseudo probe encoding and emission. This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead. ELF object emission The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission. Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool. The format of `.pseudo_probe_desc` section looks like: ``` .section .pseudo_probe_desc,"",@progbits .quad 6309742469962978389 // Func GUID .quad 4294967295 // Func Hash .byte 9 // Length of func name .ascii "_Z5funcAi" // Func name .quad 7102633082150537521 .quad 138828622701 .byte 12 .ascii "_Z8funcLeafi" .quad 446061515086924981 .quad 4294967295 .byte 9 .ascii "_Z5funcBi" .quad -2016976694713209516 .quad 72617220756 .byte 7 .ascii "_Z3fibi" ``` For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format : ``` FUNCTION BODY (one for each outlined function present in the text section) GUID (uint64) GUID of the function NPROBES (ULEB128) Number of probes originating from this function. NUM_INLINED_FUNCTIONS (ULEB128) Number of callees inlined into this function, aka number of first-level inlinees PROBE RECORDS A list of NPROBES entries. Each entry contains: INDEX (ULEB128) TYPE (uint4) 0 - block probe, 1 - indirect call, 2 - direct call ATTRIBUTE (uint3) reserved ADDRESS_TYPE (uint1) 0 - code address, 1 - address delta CODE_ADDRESS (uint64 or ULEB128) code address or address delta, depending on ADDRESS_TYPE INLINED FUNCTION RECORDS A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined callees. Each record contains: INLINE SITE GUID of the inlinee (uint64) ID of the callsite probe (ULEB128) FUNCTION BODY A FUNCTION BODY entry describing the inlined function. ``` To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index. Assembling Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis. A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file. A example assembly looks like: ``` foo2: # @foo2 # %bb.0: # %bb0 pushq %rax testl %edi, %edi .pseudoprobe 837061429793323041 1 0 0 je .LBB1_1 # %bb.2: # %bb2 .pseudoprobe 837061429793323041 6 2 0 callq foo .pseudoprobe 837061429793323041 3 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq .LBB1_1: # %bb1 .pseudoprobe 837061429793323041 5 1 0 callq %rsi .pseudoprobe 837061429793323041 2 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq # -- End function .section .pseudo_probe_desc,"",@progbits .quad 6699318081062747564 .quad 72617220756 .byte 3 .ascii "foo" .quad 837061429793323041 .quad 281547593931412 .byte 4 .ascii "foo2" ``` With inlining turned on, the assembly may look different around %bb2 with an inlined probe: ``` # %bb.2: # %bb2 .pseudoprobe 837061429793323041 3 0 .pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6 .pseudoprobe 837061429793323041 4 0 popq %rax retq ``` Disassembling* We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file. An example disassembly looks like: ``` 00000000002011a0 <foo2>: 2011a0: 50 push rax 2011a1: 85 ff test edi,edi [Probe]: FUNC: foo2 Index: 1 Type: Block 2011a3: 74 02 je 2011a7 <foo2+0x7> [Probe]: FUNC: foo2 Index: 3 Type: Block [Probe]: FUNC: foo2 Index: 4 Type: Block [Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6 2011a5: 58 pop rax 2011a6: c3 ret [Probe]: FUNC: foo2 Index: 2 Type: Block 2011a7: bf 01 00 00 00 mov edi,0x1 [Probe]: FUNC: foo2 Index: 5 Type: IndirectCall 2011ac: ff d6 call rsi [Probe]: FUNC: foo2 Index: 4 Type: Block 2011ae: 58 pop rax 2011af: c3 ret ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91878	2020-12-10 17:29:28 -08:00
Mitch Phillips	7c847657fe	Revert "[CSSPGO] Pseudo probe encoding and emission." This reverts commit b035513c06d1cba2bae8f3e88798334e877523e1. Reason: Broke the ASan buildbots: http://lab.llvm.org:8011/#/builders/5/builds/2269	2020-12-10 15:53:39 -08:00
Hongtao Yu	91873af129	[CSSPGO] Pseudo probe encoding and emission. This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead. ELF object emission The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission. Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool. The format of `.pseudo_probe_desc` section looks like: ``` .section .pseudo_probe_desc,"",@progbits .quad 6309742469962978389 // Func GUID .quad 4294967295 // Func Hash .byte 9 // Length of func name .ascii "_Z5funcAi" // Func name .quad 7102633082150537521 .quad 138828622701 .byte 12 .ascii "_Z8funcLeafi" .quad 446061515086924981 .quad 4294967295 .byte 9 .ascii "_Z5funcBi" .quad -2016976694713209516 .quad 72617220756 .byte 7 .ascii "_Z3fibi" ``` For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format : ``` FUNCTION BODY (one for each outlined function present in the text section) GUID (uint64) GUID of the function NPROBES (ULEB128) Number of probes originating from this function. NUM_INLINED_FUNCTIONS (ULEB128) Number of callees inlined into this function, aka number of first-level inlinees PROBE RECORDS A list of NPROBES entries. Each entry contains: INDEX (ULEB128) TYPE (uint4) 0 - block probe, 1 - indirect call, 2 - direct call ATTRIBUTE (uint3) reserved ADDRESS_TYPE (uint1) 0 - code address, 1 - address delta CODE_ADDRESS (uint64 or ULEB128) code address or address delta, depending on ADDRESS_TYPE INLINED FUNCTION RECORDS A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined callees. Each record contains: INLINE SITE GUID of the inlinee (uint64) ID of the callsite probe (ULEB128) FUNCTION BODY A FUNCTION BODY entry describing the inlined function. ``` To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index. Assembling Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis. A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file. A example assembly looks like: ``` foo2: # @foo2 # %bb.0: # %bb0 pushq %rax testl %edi, %edi .pseudoprobe 837061429793323041 1 0 0 je .LBB1_1 # %bb.2: # %bb2 .pseudoprobe 837061429793323041 6 2 0 callq foo .pseudoprobe 837061429793323041 3 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq .LBB1_1: # %bb1 .pseudoprobe 837061429793323041 5 1 0 callq %rsi .pseudoprobe 837061429793323041 2 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq # -- End function .section .pseudo_probe_desc,"",@progbits .quad 6699318081062747564 .quad 72617220756 .byte 3 .ascii "foo" .quad 837061429793323041 .quad 281547593931412 .byte 4 .ascii "foo2" ``` With inlining turned on, the assembly may look different around %bb2 with an inlined probe: ``` # %bb.2: # %bb2 .pseudoprobe 837061429793323041 3 0 .pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6 .pseudoprobe 837061429793323041 4 0 popq %rax retq ``` Disassembling* We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file. An example disassembly looks like: ``` 00000000002011a0 <foo2>: 2011a0: 50 push rax 2011a1: 85 ff test edi,edi [Probe]: FUNC: foo2 Index: 1 Type: Block 2011a3: 74 02 je 2011a7 <foo2+0x7> [Probe]: FUNC: foo2 Index: 3 Type: Block [Probe]: FUNC: foo2 Index: 4 Type: Block [Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6 2011a5: 58 pop rax 2011a6: c3 ret [Probe]: FUNC: foo2 Index: 2 Type: Block 2011a7: bf 01 00 00 00 mov edi,0x1 [Probe]: FUNC: foo2 Index: 5 Type: IndirectCall 2011ac: ff d6 call rsi [Probe]: FUNC: foo2 Index: 4 Type: Block 2011ae: 58 pop rax 2011af: c3 ret ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91878	2020-12-10 09:50:08 -08:00
Florian Hahn	11dfe26f5c	[CallBase] Add hasRetAttr version that takes StringRef. This makes it slightly easier to deal with custom attributes and CallBase already provides hasFnAttr versions that support both AttrKind and StringRef arguments in a similar fashion. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D92567	2020-12-10 17:00:16 +00:00
Sander de Smalen	ca602d5346	[LoopVectorizer][SVE] Vectorize a simple loop with with a scalable VF. * Steps are scaled by `vscale`, a runtime value. * Changes to circumvent the cost-model for now (temporary) so that the cost-model can be implemented separately. This can vectorize the following loop [1]: void loop(int N, double a, double b) { #pragma clang loop vectorize_width(4, scalable) for (int i = 0; i < N; i++) { a[i] = b[i] + 1.0; } } [1] This source-level example is based on the pragma proposed separately in D89031. This patch only implements the LLVM part. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D91077	2020-12-09 11:25:21 +00:00
Joe Ellis	eb525ef991	[SelectionDAG] Add llvm.vector.{extract,insert} intrinsics This commit adds two new intrinsics. - llvm.experimental.vector.insert: used to insert a vector into another vector starting at a given index. - llvm.experimental.vector.extract: used to extract a subvector from a larger vector starting from a given index. The codegen work for these intrinsics has already been completed; this commit is simply exposing the existing ISD nodes to LLVM IR. Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D91362	2020-12-09 11:08:41 +00:00
Cullen Rhodes	d85b4494d3	[IR] Support scalable vectors in CastInst::CreatePointerCast Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D92482	2020-12-09 10:39:36 +00:00
Simon Moll	358f33d2af	[VP] Build VP SDNodes Translate VP intrinsics to VP_* SDNodes. The tests check whether a matching vp_* SDNode is emitted. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D91441	2020-12-09 11:36:51 +01:00
Roman Lebedev	65652d512a	[NFC][Instructions] Refactor CmpInst::getFlippedStrictnessPredicate() in terms of is{,Non}StrictPredicate()/get{Non,}StrictPredicate() In particular, this creates getStrictPredicate() method, to be symmetrical with already-existing getNonStrictPredicate().	2020-12-09 12:43:08 +03:00
Kazu Hirata	a7ca5fc6a4	[IR] Use llvm::is_contained (NFC)	2020-12-08 19:06:37 -08:00
Yuanfang Chen	c9bcfeec6a	[Time-report] Add a flag -ftime-report={per-pass,per-pass-run} to control the pass timing aggregation Currently, -ftime-report + new pass manager emits one line of report for each pass run. This potentially causes huge output text especially with regular LTO or large single file (Obeserved in private tests and was reported in D51276). The behaviour of -ftime-report + legacy pass manager is emitting one line of report for each pass object which has relatively reasonable text output size. This patch adds a flag `-ftime-report=` to control time report aggregation for new pass manager. The flag is for new pass manager only. Using it with legacy pass manager gives an error. It is a driver and cc1 flag. `per-pass` is the new default so `-ftime-report` is aliased to `-ftime-report=per-pass`. Before this patch, functionality-wise `-ftime-report` is aliased to `-ftime-report=per-pass-run`. * Adds an boolean variable TimePassesHandler::PerRun to control per-pass vs per-pass-run. * Adds a new clang CodeGen flag CodeGenOptions::TimePassesPerRun to work with the existing CodeGenOptions::TimePasses. * Remove FrontendOptions::ShowTimers, its uses are replaced by the existing CodeGenOptions::TimePasses. * Remove FrontendTimesIsEnabled (It was introduced in D45619 which was largely reverted.) Differential Revision: https://reviews.llvm.org/D92436	2020-12-08 10:13:19 -08:00
Cullen Rhodes	73794ea227	[IR] Remove CastInst::isCastable since it is not used It was removed back in 2013 (f63dfbb) by Matt Arsenault but then reverted since DragonEgg used it, but that project is no longer maintained. Reviewed By: ldionne, dexonsmith Differential Revision: https://reviews.llvm.org/D92571	2020-12-08 10:31:53 +00:00
Cullen Rhodes	143e05ecbb	[IR] Bail out for scalable vectors in ShuffleVectorInst::isConcat Shuffle mask for concat can't be expressed for scalable vectors, so we should bail out. A test has been added that previously crashed, also tested isIdentityWithPadding and isIdentityWithExtract where we already bail out. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D92475	2020-12-07 10:48:35 +00:00
Arthur Eubanks	4fce57098e	[NewPM] Make pass adaptors less templatey Currently PassBuilder.cpp is by far the file that takes longest to compile. This is due to tons of templates being instantiated per pass. Follow PassManager by using wrappers around passes to avoid making the adaptors templated on the pass type. This allows us to move various adaptors' run methods into .cpp files. This reduces the compile time of PassBuilder.cpp on my machine from 66 to 39 seconds. It also reduces the size of opt from 685M to 676M. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D92616	2020-12-04 08:30:50 -08:00
Arthur Eubanks	5bc5d5ec44	[NewPM] Support --print-before/after in NPM This changes --print-before/after to be a list of strings rather than legacy passes. (this also has the effect of not showing the entire list of passes in --help-hidden after --print-before/after, which IMO is great for making it less verbose). Currently PrintIRInstrumentation passes the class name rather than pass name to llvm::shouldPrintBeforePass(), meaning llvm::shouldPrintBeforePass() never functions as intended in the NPM. There is no easy way of converting class names to pass names outside of within an instance of PassBuilder. This adds a map of pass class names to their short names in PassRegistry.def within PassInstrumentationCallbacks. It is populated inside the constructor of PassBuilder, which takes a PassInstrumentationCallbacks. Add a pointer to PassInstrumentationCallbacks inside PrintIRInstrumentation and use the newly created map. This is a bit hacky, but I can't think of a better way since the short id to class name only exists within PassRegistry.def. This also doesn't handle passes not in PassRegistry.def but rather added via PassBuilder::registerPipelineParsingCallback(). llvm/test/CodeGen/Generic/print-after.ll doesn't seem very useful now with this change. Reviewed By: ychen, jamieschmeiser Differential Revision: https://reviews.llvm.org/D87216	2020-12-03 16:52:14 -08:00
Fangrui Song	972a573aa5	[Metadata] Fix layer violation in D91576 There is a library layering issue. LLVMAnalysis provides llvm/Analysis/ScopedNoAliasAA.h and depends on LLVMCore. LLVMCore provides llvm/IR/Metadata.cpp and it should not include a header file in LLVMAnalysis	2020-12-03 10:58:46 -08:00
modimo	6b81c5b99d	[MemCpyOpt] Correctly merge alias scopes during call slot optimization When MemCpyOpt performs call slot optimization it will concatenate the `alias.scope` metadata between the function call and the memcpy. However, scoped AA relies on the domains in metadata to be maintained in a caller-callee relationship. Naive concatenation breaks this assumption leading to bad AA results. The fix is to take the intersection of domains then union the scopes within those domains. The original bug came from a case of rust bad codegen which uses this bad aliasing to perform additional memcpy optimizations. As show in the added test case `%src` got forwarded past its lifetime leading to a dereference of garbage data. Testing ninja check-llvm Reviewed By: jeroen.dobbelaere Differential Revision: https://reviews.llvm.org/D91576	2020-12-03 09:23:37 -08:00
Xun Li	5243362b90	Small improvements to Intrinsic::getName While I was adding a new intrinsic instruction (not overloaded), I accidentally used CreateUnaryIntrinsic to create the intrinsics, which turns out to be passing the type list to getName, and ended up naming the intrinsics function with type suffix, which leads to wierd bugs latter on. It took me a long time to debug. It seems a good idea to add an assertion in getName so that it fails if types are passed but it's not a overloaded function. Also, the overloade version of getName is less efficient because it creates an std::string. We should avoid calling it if we know that there are no types provided. Differential Revision: https://reviews.llvm.org/D92523	2020-12-02 16:49:12 -08:00
Nick Desaulniers	5d27c8ae50	[Inline] prevent inlining on stack protector mismatch It's common for code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) would like to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an attribute((no_stack_protector)) caller, which generally breaks the caller's assumptions about not having a stack protector. LTO exacerbates the issue. While developers can avoid this by putting all no_stack_protector functions in one translation unit together and compiling those with -fno-stack-protector, it's generally not very ergonomic or as ergonomic as a function attribute, and still doesn't work for LTO. See also: https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/ https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u SSP attributes can be ordered by strength. Weakest to strongest, they are: ssp, sspstrong, sspreq. Callees with differing SSP attributes may be inlined into each other, and the strongest attribute will be applied to the caller. (No change) After this change: * A callee with no SSP attributes will no longer be inlined into a caller with SSP attributes. * The reverse is also true: a callee with an SSP attribute will not be inlined into a caller with no SSP attributes. * The alwaysinline attribute overrides these rules. Functions that get synthesized by the compiler may not get inlined as a result if they are not created with the same stack protector function attribute as their callers. Alternative approach to https://reviews.llvm.org/D87956. Fixes pr/47479. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed By: rnk, MaskRay Differential Revision: https://reviews.llvm.org/D91816	2020-12-02 11:00:16 -08:00
Leonard Chan	6d2c502fb1	[llvm] Fix for failing test from fdbd84c6c819d4462546961f6086c1524d5d5ae8 When handling a DSOLocalEquivalent operand change: - Remove assertion checking that the `To` type and current type are the same type. This is not always a requirement. - Add a missing bitcast from an old DSOLocalEquivalent to the type of the new one.	2020-12-01 15:47:55 -08:00
Wei Wang	5e0aabe083	[Remarks][2/2] Expand remarks hotness threshold option support in more tools This is the #2 of 2 changes that make remarks hotness threshold option available in more tools. The changes also allow the threshold to sync with hotness threshold from profile summary with special value 'auto'. This change expands remarks hotness threshold option -fdiagnostics-hotness-threshold in clang and *-remarks-hotness-threshold in other tools to utilize hotness threshold from profile summary. Remarks hotness filtering relies on several driver options. Table below lists how different options are correlated and affect final remarks outputs: \| profile \| hotness \| threshold \| remarks printed \| \|---------\|---------\|-----------\|-----------------\| \| No \| No \| No \| All \| \| No \| No \| Yes \| None \| \| No \| Yes \| No \| All \| \| No \| Yes \| Yes \| None \| \| Yes \| No \| No \| All \| \| Yes \| No \| Yes \| None \| \| Yes \| Yes \| No \| All \| \| Yes \| Yes \| Yes \| >=threshold \| In the presence of profile summary, it is often more desirable to directly use the hotness threshold from profile summary. The new argument value 'auto' indicates threshold will be synced with hotness threshold from profile summary during compilation. The "auto" threshold relies on the availability of profile summary. In case of missing such information, no remarks will be generated. Differential Revision: https://reviews.llvm.org/D85808	2020-11-30 21:55:50 -08:00
Wei Wang	d0b74589e5	[Remarks][1/2] Expand remarks hotness threshold option support in more tools This is the #1 of 2 changes that make remarks hotness threshold option available in more tools. The changes also allow the threshold to sync with hotness threshold from profile summary with special value 'auto'. This change modifies the interface of lto::setupLLVMOptimizationRemarks() to accept remarks hotness threshold. Update all the tools that use it with remarks hotness threshold options: * lld: '--opt-remarks-hotness-threshold=' * llvm-lto2: '--pass-remarks-hotness-threshold=' * llvm-lto: '--lto-pass-remarks-hotness-threshold=' * gold plugin: '-plugin-opt=opt-remarks-hotness-threshold=' Differential Revision: https://reviews.llvm.org/D85809	2020-11-30 21:55:49 -08:00
Leonard Chan	03ffcb1a94	[llvm] Fix for failing test from cf8ff75bade763b054476321dcb82dcb2e7744c7 Handle null values when handling operand changes for DSOLocalEquivalent.	2020-11-30 17:22:28 -08:00
Nikita Popov	1f71c3e563	[DL] Inline getAlignmentInfo() implementation (NFC) Apart from getting the entry in the table (which is already a separate function), the remaining logic is different for all alignment types and is better combined with getAlignment(). This is a minor efficiency improvement, and should make further improvements like using separate storage for different alignment types simpler.	2020-11-30 20:56:15 +01:00
Nick Lewycky	25d19be185	Creating a named struct requires only a Context and a name, but looking up a struct by name requires a Module. The method on Module merely accesses the LLVMContextImpl and no data from the module itself, so this patch moves getTypeByName to a static method on StructType that takes a Context and a name. There's a small number of users of this function, they are all updated. This updates the C API adding a new method LLVMGetTypeByName2 that takes a context and a name. Differential Revision: https://reviews.llvm.org/D78793	2020-11-30 11:34:12 -08:00
Sanjay Patel	26ba573719	[IR][LoopRotate] remove assertion that phi must have at least one operand This was suggested in D92247 - I initially committed an alternate fix ( bfd2c216ea ) to avoid the crash/assert shown in https://llvm.org/PR48296 , but that was reverted because it caused msan failures on other tests. We can try to revive that patch using the test included here, but I do not have an immediate plan to isolate that problem.	2020-11-30 11:32:42 -05:00
Sanjay Patel	187ef7e7e0	[IR] improve code comment/logic in removePredecessor(); NFC This was suggested in the post-commit review of ce134da4b1.	2020-11-30 10:51:30 -05:00
Sanjay Patel	ac871b528f	Revert "[IR][LoopRotate] avoid leaving phi with no operands (PR48296)" This reverts commit bfd2c216ea8ef09f8fb1f755ca2b89f86f74acbb. This appears to be causing stage2 msan failures on buildbots: FAIL: LLVM :: Transforms/SimplifyCFG/X86/bug-25299.ll (65872 of 71835) ****************** TEST 'LLVM :: Transforms/SimplifyCFG/X86/bug-25299.ll' FAILED ****************** Script: -- : 'RUN: at line 1'; /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SimplifyCFG/X86/bug-25299.ll -simplifycfg -S \| /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SimplifyCFG/X86/bug-25299.ll -- Exit Code: 2 Command Output (stderr): -- ==87374==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x9de47b6 in getBasicBlockIndex /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/IR/Instructions.h:2749:5 #1 0x9de47b6 in simplifyCommonResume /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:4112:23 #2 0x9de47b6 in simplifyResume /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:4039:12 #3 0x9de47b6 in (anonymous namespace)::SimplifyCFGOpt::simplifyOnce(llvm::BasicBlock) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:6330:16 #4 0x9dcca13 in run /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:6358:16 #5 0x9dcca13 in llvm::simplifyCFG(llvm::BasicBlock, llvm::TargetTransformInfo const&, llvm::SimplifyCFGOptions const&, llvm::SmallPtrSetImpl<llvm::BasicBlock>) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:6369:8 #6 0x974643d in iterativelySimplifyCFG(	2020-11-30 10:15:42 -05:00
Sanjay Patel	6b1415dcf0	[IR][LoopRotate] avoid leaving phi with no operands (PR48296) https://llvm.org/PR48296 shows an example where we delete all of the operands of a phi without actually deleting the phi, and that is currently considered invalid IR. The reduced test included here would crash for that reason. A suggested follow-up is to loosen the assert to allow 0-operand phis in unreachable blocks. Differential Revision: https://reviews.llvm.org/D92247	2020-11-30 09:28:45 -05:00
Juneyoung Lee	26954b5028	[ConstantFold] Don't fold and/or i1 poison to poison (NFC) .. because it causes miscompilation when combined with select i1 -> and/or. It is the select fold which is incorrect; but it is costly to disable the fold, so hack this one. D92270	2020-11-30 22:58:31 +09:00
Jay Foad	e98f4ccb3f	[LegacyPM] Simplify PMTopLevelManager::collectLastUses. NFC.	2020-11-30 10:36:19 +00:00
Nikita Popov	9af4629d4a	[DL] Optimize address space zero lookup (NFC) Information for pointer size/alignment/etc is queried a lot, but the binary search based implementation makes this fairly slow. Add an explicit check for address space zero and skip the search in that case -- we need to specially handle the zero address space anyway, as it serves as the fallback for all address spaces that were not explicitly defined. I initially wanted to simply replace the binary search with a linear search, which would handle both address space zero and the general case efficiently, but I was not sure whether there are any degenerate targets that use more than a handful of declared address spaces (in-tree, even AMDGPU only declares six).	2020-11-29 22:49:55 +01:00
Sanjay Patel	1e8aaec6b3	[IR] simplify code in removePredecessor(); NFCI As suggested in D92247 (and independent of whatever we decide to do there), this code is confusing as-is. Hopefully, this is at least mildly better. We might be able to do better still, but we have a function called "removePredecessor" with this behavior: "Note that this function does not actually remove the predecessor." (!)	2020-11-29 09:55:04 -05:00
Sanjay Patel	ce313ba7f9	[IR] remove redundant code comments; NFC As noted in D92247 (and independent of that patch): http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments "Don’t duplicate the documentation comment in the header file and in the implementation file. Put the documentation comments for public APIs into the header file."	2020-11-29 09:29:59 -05:00
Juneyoung Lee	45b0ec5d7b	[ConstantFold] Fold more operations to poison This patch folds more operations to poison. Alive2 proof: https://alive2.llvm.org/ce/z/mxcb9G (it does not contain tests about div/rem because they fold to poison when raising UB) Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D92270	2020-11-29 21:19:48 +09:00
Juneyoung Lee	9bed1bd10d	[ConstantFold] Fold operations to poison if possible This patch updates ConstantFold, so operations are folded into poison if possible. <alive2 proofs> casts: https://alive2.llvm.org/ce/z/WSj7rw binary operations (arithmetic): https://alive2.llvm.org/ce/z/_7dEyJ binary operations (bitwise): https://alive2.llvm.org/ce/z/cezjVN vector/aggregate operations: https://alive2.llvm.org/ce/z/BQ7hWz unary ops: https://alive2.llvm.org/ce/z/yBRs4q other ops: https://alive2.llvm.org/ce/z/iXbcFD Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D92203	2020-11-29 02:28:40 +09:00
Francesco Petrogalli	4a2f3f7420	[AllocaInst] Update `getAllocationSizeInBits` to return `TypeSize`. Reviewed By: peterwaller-arm, sdesmalen Differential Revision: https://reviews.llvm.org/D92020	2020-11-27 16:39:10 +00:00
Jay Foad	59eb6dddd5	[LegacyPM] Avoid a redundant map lookup in setLastUser. NFC. As a bonus this makes it (IMO) obvious that the iterator is not invalidated, so remove the comment explaining that.	2020-11-27 10:42:01 +00:00
Jay Foad	6fe64ab931	[LegacyPM] Remove unused undocumented parameter. NFC. The Direction parameter to AnalysisResolver::getAnalysisIfAvailable has never been documented or used for anything.	2020-11-27 10:41:38 +00:00
Zhengyang Liu	fdfc9baedb	Fix use-of-uninitialized-value in rG75f50e15bf8f Differential Revision: https://reviews.llvm.org/D71126	2020-11-26 01:39:22 -07:00
Zhengyang Liu	f2658edb7a	Adding PoisonValue for representing poison value explicitly in IR Define ConstantData::PoisonValue. Add support for poison value to LLLexer/LLParser/BitcodeReader/BitcodeWriter. Add support for poison value to llvm-c interface. Add support for poison value to OCaml binding. Add m_Poison in PatternMatch. Differential Revision: https://reviews.llvm.org/D71126	2020-11-25 17:33:51 -07:00
Arthur Eubanks	cb9b83342f	Make CallInst::updateProfWeight emit i32 weights instead of i64 Typically branch_weights are i32, not i64. This fixes entry_counts_cold.ll under NPM. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D90539	2020-11-24 18:13:59 -08:00
Simon Pilgrim	b1b0060ec0	[IR] Constant::getAggregateElement - early-out for ScalableVectorType We can't call getNumElements() for ScalableVectorType types - just bail for now, although ConstantAggregateZero/UndefValue could return a reasonable value. Fixes crash shown in OSS-Fuzz #25272 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=25272	2020-11-24 12:03:27 +00:00
Matt Arsenault	ec50a05ed6	Verifier: Fix assert when verifying non-pointer byval or preallocated This would fail on a cast<PointerType> when verifying the attribute if these attributes were incorrectly used with a non-pointer type.	2020-11-20 20:08:43 -05:00
Hongtao Yu	db4396f62a	[CSSPGO] IR intrinsic for pseudo-probe block instrumentation This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to be persisting. The LLVM PGO instrumentation also instruments in similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. While these operations are very persisting or optimization-resilient, in theory we can borrow the atomic read/write implementation from PGO counters and cut it off at the end of compilation with all the atomics converted into binary data. This was our initial design and we’ve seen promising sample correlation quality with it. However, the atomics approach has a couple issues: 1. IR Optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they are on the IR till very end of compilation, they can still prevent certain IR optimizations and result in lower code quality. 2. The counter atomics may not be fully cleaned up from the code stream eventually. 3. Extra work is needed for re-targeting. We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that comes with an atomic operation but does not block desired optimizations as much as possible. More specifically the semantics associated with the new intrinsic enforces a pseudo probe to be virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places they are probing are not going to be messed up by the optimizer while most of the IR optimizations still work. The core flags given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and does have a side effect so that it is not removable, but is does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus it does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to an intrinsic so that they are uniquely identified and not merged in order to achieve good correlation quality. Let's now look at an example. Given the following LLVM IR: ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 br i1 %cmp, label %bb1, label %bb2 bb1: br label %bb3 bb2: br label %bb3 bb3: ret void } ``` The instrumented IR will look like below. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID. ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86490	2020-11-20 10:39:24 -08:00
Alex Richardson	775dd2a2a2	[AMDGPU] Set the default globals address space to 1 This will ensure that passes that add new global variables will create them in address space 1 once the passes have been updated to no longer default to the implicit address space zero. This also changes AutoUpgrade.cpp to add -G1 to the DataLayout if it wasn't already to present to ensure bitcode backwards compatibility. Reviewed by: arsenm Differential Revision: https://reviews.llvm.org/D84345	2020-11-20 15:46:53 +00:00
Alex Richardson	9c96f39f77	Add a default address space for globals to DataLayout This is similar to the existing alloca and program address spaces (D37052) and should be used when creating/accessing global variables. We need this in our CHERI fork of LLVM to place all globals in address space 200. This ensures that values are accessed using CHERI load/store instructions instead of the normal MIPS/RISC-V ones. The problem this is trying to fix is that most of the time the type of globals is created using a simple PointerType::getUnqual() (or ::get() with the default address-space value of 0). This does not work for us and we get assertion/compilation/instruction selection failures whenever a new call is added that uses the default value of zero. In our fork we have removed the default parameter value of zero for most address space arguments and use DL.getProgramAddressSpace() or DL.getGlobalsAddressSpace() whenever possible. If this change is accepted, I will upstream follow-up patches to use DL.getGlobalsAddressSpace() instead of relying on the default value of 0 for PointerType::get(), etc. This patch and the follow-up changes will not have any functional changes for existing backends with the default globals address space of zero. A follow-up commit will change the default globals address space for AMDGPU to 1. Reviewed By: dylanmckay Differential Revision: https://reviews.llvm.org/D70947	2020-11-20 15:46:52 +00:00
Leonard Chan	c24d9d2b01	[llvm][IR] Add dso_local_equivalent Constant The `dso_local_equivalent` constant is a wrapper for functions that represents a value which is functionally equivalent to the global passed to this. That is, if this accepts a function, calling this constant should have the same effects as calling the function directly. This could be a direct reference to the function, the `@plt` modifier on X86/AArch64, a thunk, or anything that's equivalent to the resolved function as a call target. When lowered, the returned address must have a constant offset at link time from some other symbol defined within the same binary. The address of this value is also insignificant. The name is leveraged from `dso_local` where use of a function or variable is resolved to a symbol in the same linkage unit. In this patch: - Addition of `dso_local_equivalent` and handling it - Update Constant::needsRelocation() to strip constant inbound GEPs and take advantage of `dso_local_equivalent` for relative references This is useful for the [Relative VTables C++ ABI](https://reviews.llvm.org/D72959) which makes vtables readonly. This works by replacing the dynamic relocations for function pointers in them with static relocations that represent the offset between the vtable and virtual functions. If a function is externally defined, `dso_local_equivalent` can be used as a generic wrapper for the function to still allow for this static offset calculation to be done. See [RFC](http://lists.llvm.org/pipermail/llvm-dev/2020-August/144469.html) for more details. Differential Revision: https://reviews.llvm.org/D77248	2020-11-19 10:26:17 -08:00
Nick Desaulniers	b2b1b97849	Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch" This reverts commit b7926ce6d7a83cdf70c68d82bc3389c04009b841. Going with a simpler approach.	2020-11-17 17:27:14 -08:00
Christopher Tetreault	97d1986ad4	[SVE] Take constant fold fast path for splatted vscale vectors This should be a perfectly reasonable operation for scalable vectors. Currently, it only works for zeroinitializer values of ScalableVectorType, but the fundamental operation is sound and it should be possible to make it work for other splats Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D77442	2020-11-17 12:45:31 -08:00
Simon Pilgrim	62500c8769	[IR] ShuffleVectorInst::isIdentityWithPadding - bail on non-fixed-type vector shuffles. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=27416	2020-11-17 16:16:51 +00:00
Arthur Eubanks	535a7a1f46	[Debugify] Skip debugifying on special/immutable passes With a function pass manager, it would insert debuginfo metadata before getting to function passes while processing the pass manager, causing debugify to skip while running the function passes. Skip special passes + verifier + printing passes. Compared to the legacy implementation of -debugify-each, this additionally skips verifier passes. Probably no need to update the legacy version since it will be obsolete soon. This fixes 2 instcombine tests using -debugify-each under NPM. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D91558	2020-11-16 20:39:46 -08:00
Florian Hahn	e2fe6ad000	[IRGen] Add !annotation metadata for auto-init stores. This patch updates Clang's IRGen to add !annotation nodes with an "auto-init" annotation to all stores for auto-initialization. As discussed in 'RFC: Combining Annotation Metadata and Remarks' (http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html) this allows using optimization remarks to track down where auto-init code was inserted (and not removed by optimizations). There are a few cases in the tests where !annotation gets dropped by optimizations. Those optimizations will be updated in subsequent patches. This patch is based on a patch by Francis Visoiu Mistrih. Reviewed By: thegameg, paquette Differential Revision: https://reviews.llvm.org/D91417	2020-11-16 10:37:02 +00:00
Simon Moll	28f80af0cf	[VP][NFC] Rename to HANDLE_VP_TO_OPC Use the less surprising shorthand OPC instead of OC.	2020-11-16 10:24:18 +01:00
Kazu Hirata	a80ed1a2ff	[IR] Use llvm::is_contained in BasicBlock::removePredecessor (NFC)	2020-11-15 21:15:31 -08:00
Roman Lebedev	f948b65a66	Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM" See discussion in https://bugs.llvm.org/show_bug.cgi?id=45073 / https://reviews.llvm.org/D66324#2334485 the implementation is known-broken for certain inputs, the bugreport was up for a significant amount of timer, and there has been no activity to address it. Therefore, just completely rip out all of misexpect handling. I suspect, fixing it requires redesigning the internals of MD_misexpect. Should anyone commit to fixing the implementation problem, starting from clean slate may be better anyways. This reverts commit 7bdad08429411e7d0ecd58cd696b1efe3cff309e, and some of it's follow-ups, that don't stand on their own.	2020-11-14 13:12:38 +03:00
Yuanfang Chen	06613d74b4	[CGProfile] allows bitcast in metadata node storing function pointers For example, during RAUW in IRMover, the `Function` ValueAsMetadata in "CG Profile" could become bitcast. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D88433	2020-11-13 09:28:21 -08:00
Florian Hahn	041da6277f	Add !annotation metadata and remarks pass. This patch adds a new !annotation metadata kind which can be used to attach annotation strings to instructions. It also adds a new pass that emits summary remarks per function with the counts for each annotation kind. The intended uses cases for this new metadata is annotating 'interesting' instructions and the remarks should provide additional insight into transformations applied to a program. To motivate this, consider these specific questions we would like to get answered: * How many stores added for automatic variable initialization remain after optimizations? Where are they? * How many runtime checks inserted by a frontend could be eliminated? Where are the ones that did not get eliminated? Discussed on llvm-dev as part of 'RFC: Combining Annotation Metadata and Remarks' (http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html) Reviewed By: thegameg, jdoerfert Differential Revision: https://reviews.llvm.org/D91188	2020-11-13 13:24:10 +00:00
serge-sans-paille	82b6e6053d	llvmbuildectomy - replace llvm-build by plain cmake No longer rely on an external tool to build the llvm component layout. Instead, leverage the existing `add_llvm_componentlibrary` cmake function and introduce `add_llvm_component_group` to accurately describe component behavior. These function store extra properties in the created targets. These properties are processed once all components are defined to resolve library dependencies and produce the header expected by llvm-config. Differential Revision: https://reviews.llvm.org/D90848	2020-11-13 10:35:24 +01:00
Kazushi (Jam) Marukawa	aa1acddbb3	[VE] Support vld intrinsics Add intrinsics for vector load instructions. Add a regression test also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D91332	2020-11-13 07:34:42 +09:00
Sebastian Neubauer	7e4be9501b	[AMDGPU] Add amdgpu_gfx calling convention Add a calling convention called amdgpu_gfx for real function calls within graphics shaders. For the moment, this uses the same calling convention as other calls in amdgpu, with registers excluded for return address, stack pointer and stack buffer descriptor. Differential Revision: https://reviews.llvm.org/D88540	2020-11-09 16:51:44 +01:00
Roman Lebedev	57330778e3	[IR] CmpInst: Add getFlippedSignednessPredicate() And refactor a few places to use it	2020-11-06 11:31:09 +03:00
Roman Lebedev	05a5fb18a8	[IR] CmpInst: add isEquality(Pred) Currently there is only a member version of isEquality(), which requires an actual [IF]CmpInst to be avaliable, which isn't always possible, and is inconsistent with the general pattern here. I wanted to use it in a new patch, but it wasn't there..	2020-11-06 11:31:09 +03:00
Roman Lebedev	a6b210b265	[IR] CmpInst: add getUnsignedPredicate() There's already getSignedPredicate(), it is not symmetrical to not have it's opposite. I wanted to use it in new code, but it wasn't there..	2020-11-06 11:31:08 +03:00
David Green	41688b499e	[CostModel] Make target intrinsics cheap by default This patch changes the intrinsics cost model to assume that by default target intrinsics are cheap. This didn't seem to be the case for all intrinsics, and is potentially an MVE problem due to our scalarization overheads. Cheap seems to be a good default in general though. Differential Revision: https://reviews.llvm.org/D90597	2020-11-03 09:58:28 +00:00
Arthur Eubanks	bb84082e59	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit 10f2a0d662d8d72eaac48d3e9b31ca8dc90df5a4. More uint64_t overflows.	2020-10-31 00:25:32 -07:00
Arthur Eubanks	f52f1e83f5	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-30 10:03:46 -07:00
Craig Disselkoen	fe85e24882	C API: support scalable vectors This adds support for scalable vector types in the C API and in llvm-c-test, and also adds a test to ensure that llvm-c-test can properly roundtrip operations involving scalable vectors. While creating this diff, I discovered that the C API cannot properly roundtrip _constant expressions_ involving shufflevector / scalable vectors, but that seems to be a separate enough issue that I plan to address it in a future diff (unless reviewers feel it should be addressed here). Differential Revision: https://reviews.llvm.org/D89816	2020-10-28 18:19:34 -04:00
Adrian Prantl	4b8e1c57dc	[DebugInfo] Expose Fortran array debug info attributes through DIBuilder. The support of a few debug info attributes specifically for Fortran arrays have been added to LLVM recently, but there's no way to take advantage of them through DIBuilder. This patch extends DIBuilder::createArrayType to enable the settings of those attributes. Patch by Chih-Ping Chen! Differential Review: https://reviews.llvm.org/D90323	2020-10-28 13:13:35 -07:00
Alok Kumar Sharma	0d9afcbbba	[DebugInfo] Support for DW_TAG_generic_subrange This is needed to support fortran assumed rank arrays which have runtime rank. Summary: Fortran assumed rank arrays have dynamic rank. DWARF TAG DW_TAG_generic_subrange is needed to support that. Testing: unit test cases added (hand-written) check llvm check debug-info Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D89218	2020-10-29 01:34:15 +05:30
Mitch Phillips	d27b0097ff	Revert "[DebugInfo] Expose Fortran array debug info attributes through DIBuilder." This reverts commit 5b3bf8b453b8cc00efd5269009a1e63c4442a30e. This caused a regression in the ASan buildbot. See comments at https://reviews.llvm.org/D89817 for more information.	2020-10-27 20:50:51 -07:00
Nicolai Hähnle	84f0cc96e8	Revert multiple patches based on "Introduce CfgTraits abstraction" These logically belong together since it's a base commit plus followup fixes to less common build configurations. The patches are: Revert "CfgInterface: rename interface() to getInterface()" This reverts commit a74fc481588fcea9317cbf1f6c5888a30c9edd2d. Revert "Wrap CfgTraitsFor in namespace llvm to please GCC 5" This reverts commit f2a06875b604c00cbe96e54363f4f5d28935d610. Revert "Try to make GCC5 happy about the CfgTraits thing" This reverts commit 03a5f7ce12e2111c8b7bc5a95cff4c51b516250f. Revert "Introduce CfgTraits abstraction" This reverts commit c0cdd22c72fab47a3c37b5a8401763995cadaa77.	2020-10-27 20:33:30 +01:00
Nico Weber	1ea6033a22	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit e5766f25c62c185632e3a75bf45b313eadab774b. Makes clang assert when building Chromium, see https://crbug.com/1142813 for a repro.	2020-10-27 09:26:21 -04:00
Arthur Eubanks	3273c1a681	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-26 20:24:04 -07:00
Duncan P. N. Exon Smith	6c9c96836e	IR: Simplify two loops walking ConstantDataSequential, NFC Follow-up to b2b7cf39d596b1528cd64015575b3f5d1461c011. Differential Revision: https://reviews.llvm.org/D90198	2020-10-26 21:55:48 -04:00
Duncan P. N. Exon Smith	8696d90356	IR: Add a comment at missing std::make_unique calls from b2b7cf39d596b1528cd64015575b3f5d1461c011, NFC	2020-10-26 21:18:34 -04:00
Adrian Prantl	37ccbdeab3	[DebugInfo] Expose Fortran array debug info attributes through DIBuilder. The support of a few debug info attributes specifically for Fortran arrays have been added to LLVM recently, but there's no way to take advantage of them through DIBuilder. This patch extends DIBuilder::createArrayType to enable the settings of those attributes. Patch by Chih-Ping Chen! Differential Revision: https://reviews.llvm.org/D89817	2020-10-26 16:23:36 -07:00
Duncan P. N. Exon Smith	9ad16f9a7c	IR: Clarify ownership of ConstantDataSequentials, NFC Change `ConstantDataSequential::Next` to a `unique_ptr<ConstantDataSequential>` and update `CDSConstants` to a `StringMap<unique_ptr<ConstantDataSequential>>`, making the ownership more obvious. Differential Revision: https://reviews.llvm.org/D90083	2020-10-26 18:47:25 -04:00
Duncan P. N. Exon Smith	b79d3e2664	Avoid unnecessary uses of `MDNode::getTemporary`, NFC This is a long-delayed follow-up to 5e5b85098dbeaea2cfa5d01695b5d2982634d7dd. `TempMDNode` includes a bunch of machinery for RAUW, and should only be used when necessary. RAUW wasn't being used in any of these cases... it was just a placeholder for a self-reference. Where the real node was using `MDNode::getDistinct`, just replace the temporary argument with `nullptr`. Where the real node was using `MDNode::get`, the `replaceOperandWith` call was "promoting" the node to a distinct one implicitly due to self-reference detection in `MDNode::handleChangedOperand`. The `TempMDNode` was serving a purpose by delaying uniquing, but it's way simpler to just call `MDNode::getDistinct` in the first place. Note that using a self-reference at all in these places is a hold-over from before `distinct` metadata existed. It was an old trick to create distinct nodes. It would be intrusive to change, including bitcode upgrades, etc., and it's harmless so I'm not sure there's much value in removing it from existing schemas. After this commit it still has a tiny memory cost (in the extra metadata operand) but no more overhead in construction. Differential Revision: https://reviews.llvm.org/D90079	2020-10-26 17:03:25 -04:00
Nick Desaulniers	e95a065d26	[IR] add fn attr for no_stack_protector; prevent inlining on mismatch It's currently ambiguous in IR whether the source language explicitly did not want a stack a stack protector (in C, via function attribute no_stack_protector) or doesn't care for any given function. It's common for code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) would like to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an __attribute__((__no_stack_protector__)) caller, which generally breaks the caller's assumptions about not having a stack protector. LTO exacerbates the issue. While developers can avoid this by putting all no_stack_protector functions in one translation unit together and compiling those with -fno-stack-protector, it's generally not very ergonomic or as ergonomic as a function attribute, and still doesn't work for LTO. See also: https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/ https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u Typically, when inlining a callee into a caller, the caller will be upgraded in its level of stack protection (see adjustCallerSSPLevel()). By adding an explicit attribute in the IR when the function attribute is used in the source language, we can now identify such cases and prevent inlining. Block inlining when the callee and caller differ in the case that one contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`. Fixes pr/47479. Reviewed By: void Differential Revision: https://reviews.llvm.org/D87956	2020-10-23 11:55:39 -07:00
Serge Pavlov	e062f1d0ad	[IR] Merge metadata manipulation code into Value Now there are two main classes in Value hierarchy, which support metadata, these are Instruction and GlobalObject. They implement different APIs for metadata manipulation, which however overlap. This change moves metadata manipulation code into Value, so descendant classes can use this code for their operations on metadata. No functional changes intended. Differential Revision: https://reviews.llvm.org/D67626	2020-10-23 11:08:26 +07:00
Nikita Popov	edaf326d53	[DomTree] Make assert more precise Per asbirlea's comment, assert that only instructions, constants and arguments are passed to this API. Simplify returning true would not be correct for special Value subclasses like MemoryAccess.	2020-10-22 22:40:06 +02:00
Nikita Popov	f3f520a3f7	[DomTree] Accept Value as Def (NFC) Non-instruction defs like arguments, constants or global values always dominate all instructions/uses inside the function. This case currently needs to be treated separately by the caller, see https://reviews.llvm.org/D89623#inline-832818 for an example. This patch makes the dominator tree APIs accept a Value instead of an Instruction and always returns true for the non-Instruction case. A complication here is that BasicBlocks are also Values. For that reason we can't support the dominates(Value , BasicBlock ) variant, as it would conflict with dominates(BasicBlock , BasicBlock ), which has different semantics. For the other two APIs we assert that the passed value is not a BasicBlock. Differential Revision: https://reviews.llvm.org/D89632	2020-10-22 18:32:03 +02:00
Artur Pilipenko	b8f630c820	[RS4GC] NFC. Preparatory refactoring to make GC parseable memcpy For GC parseable element atomic memcpy/memmove we'll need to shuffle statepoint arguments. Make it possible by storing the arguments as Value , not Use .	2020-10-21 12:38:20 -07:00
Kazu Hirata	c31e34345d	[AsmWriter] Construct SlotTracker with the function This patch teaches BasicBlock::print to construct an instance of SlotTracker with the containing function. Without this patch, we dump: * IR Dump After LoopInstSimplifyPass * ; Preheader: br label %1 ; Loop: <badref>: ; preds = %1, %0 br label %1 Note "<badref>" above. This happens because BasicBlock::print calls: SlotTracker SlotTable(this->getModule()); Note that this constructor does not add the contents of functions to the slot table. That is, basic blocks are left unnumbered. This patch fixes the problem by switching to: SlotTracker SlotTable(this->getParent()); which does add the contents of the Module and the function, this->getParent(), to the slot table. Differential Revision: https://reviews.llvm.org/D89567	2020-10-20 15:01:40 -07:00
Shimin Cui	e5bb2c0aa5	[ConstantFold] Fold the comparison of bitcasted global values This is to simplify icmp instructions in the form like: %cmp = icmp eq i32 (i8, i8)* bitcast (i32 (i32, i32)* @f32 to i32 %(i8, i8)), bitcast (i32 (i64, i64) @f64 to i32 (i8, i8)*) Here @f32 and @f64 are two functions. Differential Revision: https://reviews.llvm.org/D87850	2020-10-20 12:41:49 -07:00
David Stenberg	5debead5bf	Handle value uses wrapped in metadata for the use-list order When generating the use-list order, also consider value uses that are operands which are wrapped in metadata; e.g. llvm.dbg.value operands. This fixes PR36778. The test case is based on the reproducer from that report. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D53758	2020-10-20 20:05:59 +02:00
Nicolai Hähnle	58c9429d2a	Introduce CfgTraits abstraction The CfgTraits abstraction simplfies writing algorithms that are generic over the type of CFG, and enables writing such algorithms as regular non-template code that operates on opaque references to CFG blocks and values. Implementations of CfgTraits provide operations on the concrete CFG types, e.g. `IrCfgTraits::BlockRef` is `BasicBlock `. CfgInterface is an abstract base class which provides operations on opaque types CfgBlockRef and CfgValueRef. Those opaque types encapsulate a `void `, but the meaning depends on the concrete CFG type. For example, MachineCfgTraits -- for use with MachineIR in SSA form -- encodes a Register inside CfgValueRef. Converting between concrete references and opaque/generic ones is done by CfgTraits::{fromGeneric,toGeneric}. Convenience methods CfgTraits::{un}wrap{Iterator,Range} are available as well. Writing algorithms in terms of CfgInterface adds some overhead (virtual method calls, plus in same cases it removes the opportunity to inline iterators), but can be much more convenient since generic algorithms can be written as non-templates. This patch adds implementations of CfgTraits for all CFGs on which dominator trees are calculated, so that the dominator tree can be ported to this machinery. Only IrCfgTraits (LLVM IR) and MachineCfgTraits (Machine IR in SSA form) are complete, the other implementations are limited to the absolute minimum required to make the upcoming dominator tree changes work. v5: - fix MachineCfgTraits::blockdef_iterator and allow it to iterate over the instructions in a bundle - use MachineBasicBlock::printName v6: - implement predecessors/successors for all CfgTraits implementations - fix error in unwrapRange - rename toGeneric/fromGeneric into wrapRef/unwrapRef to have naming that is consistent with {wrap,unwrap}{Iterator,Range} - use getVRegDef instead of getUniqueVRegDef v7: - std::forward fix in wrapping_iterator - fix typos v8: - cleanup operators on CfgOpaqueType - address other review comments Change-Id: Ia75f4f268fded33fca11218a7d578c9aec1f3f4d Differential Revision: https://reviews.llvm.org/D83088	2020-10-20 13:50:52 +02:00
Atmn Patel	770f362410	[IR] Adds mustprogress as a LLVM IR attribute This adds the LLVM IR attribute `mustprogress` as defined in LangRef through D86233. This attribute will be applied to functions with in languages like C++ where forward progress is guaranteed. Functions without this attribute are not required to make progress. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D85393	2020-10-20 03:09:57 -04:00
Alok Kumar Sharma	b846ffc438	[DebugInfo] Support for DWARF operator DW_OP_over LLVM rejects DWARF operator DW_OP_over. This DWARF operator is needed for Flang to support assumed rank array. Summary: Currently LLVM rejects DWARF operator DW_OP_over. Below error is produced when llvm finds this operator. [..] invalid expression !DIExpression(151, 20, 16, 48, 30, 35, 80, 34, 6) warning: ignoring invalid debug info in over.ll [..] There were some parts missing in support of this operator, which are now completed. Testing -added a unit testcase -check-debuginfo -check-llvm Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D89208	2020-10-17 08:42:28 +05:30
Matt Arsenault	e3bfefd3cc	Reapply "OpaquePtr: Add type to sret attribute" This reverts commit eb9f7c28e5fe6d75fed3587023e17f2997c8024b. Previously this was incorrectly handling linking of the contained type, so this merges the fixes from D88973.	2020-10-16 11:05:02 -04:00
Jameson Nash	b97e5e20ad	fix symbol printing on windows Similar to MCSymbol::print in 3d6c8ebb584375d01b1acead4c2056b3f0c501fc (llvm-svn: 81682, PR4966), these symbols may need to be quoted to be handled by the linker correctly. Reviewed By: compnerd Differential Revision: https://reviews.llvm.org/D87099	2020-10-15 17:14:55 -04:00
Matt Arsenault	781bfb732b	InstCombine: Fix infinite loop in copy-constant-to-alloca transform This was broken by 16295d521e294b27106e51fac29957c1aac8ff89, when instructions started being handled and not just constant expressions. This was re-inserting an equivalent bitcast to the original memcpy operand, which made a non-functional IR change on every iteration. This also fixes a secondary problem where it was inserting addrspacecasts which may not have been legal (i.e. it changed the source address space). Start visiting all pointer users and fail out if we can't process them. Also start handling the relevant memory intrinsic users. These cases can be dealt with by running InferAddressSpaces separately.	2020-10-14 12:55:25 -04:00
Ahsan Saghir	179c563da5	[PowerPC] Add assemble disassemble intrinsics for MMA This patch adds support for assemble disassemble intrinsics for MMA. Reviewed By: bsaleil, #powerpc Differential Revision: https://reviews.llvm.org/D88739	2020-10-13 13:21:58 -05:00

... 3 4 5 6 7 ...

4914 Commits