llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00

Author	SHA1	Message	Date
Simon Pilgrim	a1a1eb4962	CommandLine.h - use auto const reference in ValuesClass::apply for range loop. NFCI.	2020-09-09 14:21:14 +01:00
Ronak Chauhan	7073a2ba14	Revert "[AMDGPU] Support disassembly for AMDGPU kernel descriptors" This reverts commit 487a80531006add8102d50dbcce4b6fd729ab1f6. Tests fail on big endian machines.	2020-09-09 18:01:28 +05:30
Simon Pilgrim	2c3a34bc05	[KnownBits] Move SelectionDAG::computeKnownBits ISD::ABS handling to KnownBits::abs Move the ISD::ABS handling to a KnownBits::abs handler, to simplify future implementations in ValueTracking/GlobalISel.	2020-09-09 13:22:58 +01:00
Simon Pilgrim	558faf7f34	APInt.h - return directly from clearUnusedBits in single word cases. NFCI. Consistently use the same pattern of returning *this from the clearUnusedBits() call to allow us to early out from the isSingleWord() path and avoid an else statement.	2020-09-09 13:22:57 +01:00
David Stenberg	7d4bb5d4ca	[UnifyFunctionExitNodes] Fix Modified status for unreachable blocks If a function had at most one return block, the pass would return false regardless of whether a unified unreachable block was created. This patch fixes that by refactoring runOnFunction into two separate helper functions for handling the unreachable blocks and the return blocks respectively, as suggested by @bjope in a review comment. This was caught using the check introduced by D80916. Reviewed By: serge-sans-paille Differential Revision: https://reviews.llvm.org/D85818	2020-09-09 13:36:03 +02:00
Juneyoung Lee	bd9d25252a	[ValueTracking] Add UndefOrPoison/Poison-only version of relevant functions This patch adds isGuaranteedNotToBePoison and programUndefinedIfUndefOrPoison. isGuaranteedNotToBePoison will be used at D75808. The latter function is used at isGuaranteedNotToBePoison. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D84242	2020-09-09 20:00:26 +09:00
Simon Pilgrim	3a3a752c6d	TrigramIndex.cpp - remove unnecessary includes. NFCI. TrigramIndex.h already includes most of these.	2020-09-09 11:38:31 +01:00
Simon Pilgrim	095c0be790	[APFloat] Fix uninitialized variable in IEEEFloat constructors Some constructors of IEEEFloat do not initialize member variable exponent. Fix it by initializing exponent with the following values: For NaNs, the `exponent` is `maxExponent+1`. For Infinities, the `exponent` is `maxExponent+1`. For Zeroes, the `exponent` is `maxExponent-1`. Patch by: @nullptr.cpp (Yang Fan) Differential Revision: https://reviews.llvm.org/D86997	2020-09-09 11:38:30 +01:00
Florian Hahn	97c9619e22	[DomTree] Use SmallVector<DomTreeNodeBase *, 4> instead of std::vector. Currently DomTreeNodeBase is using std::vector to store its children. Using SmallVector should be more efficient in terms of compile-time. A size of 4 seems to be the sweet-spot in terms of compile-time, according to http://llvm-compile-time-tracker.com/compare.php?from=9933188c90615c9c264ebb69117f09726e909a25&to=d7a801d027648877b20f0e00e822a7a64c58d976&stat=instructions This results in the following geomean improvements ``` geomean insts max rss O3 -0.31 % +0.02 % ReleaseThinLTO -0.35 % -0.12 % ReleaseLTO -0.28 % -0.12 % O0 -0.06 % -0.02 % NewPM O3 -0.36 % +0.05 % ReleaseThinLTO (link only) -0.44 % -0.10 % ReleaseLTO-g (link only): -0.32 % -0.03 % ``` I am not sure if there are any other benefits of using std::vector over SmallVector. Reviewed By: kuhar, asbirlea Differential Revision: https://reviews.llvm.org/D87319	2020-09-09 11:20:13 +01:00
Denis Antrushin	1969b82658	[Statepoints] Properly handle const base pointer. Current code in InstEmitter assumes all GC pointers are either VRegs or stack slots - hence, taking only one operand. But it is possible to have a constant base, in which case it occupies two machine operands. Add a convenience function to StackMaps to get the index of the next meta argument and use it in InstEmitter to properly advance to the next statepoint meta operand. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D87252	2020-09-09 14:07:00 +07:00
Johannes Doerfert	fd33d62d25	[Attributor] Associate the callback callee with a call site argument (if any) If we have a callback, call site arguments were already associated with the callback callee. Now we also associate the function with the callback callee, thus we now ensure that the following holds true (if all return nonnull): `getAssociatedArgument()->getParent() == getAssociatedFunction()` To test this an early exit from `AAMemoryBehaviorCallSiteArgument::initialize` is included as well. Without the change to getAssociatedFunction() this kind of early exit for declarations would cause callback call site arguments to miss out.	2020-09-09 00:52:17 -05:00
Johannes Doerfert	0e508fe501	[Attributor] Cleanup `IRPosition::getArgNo` usages As we handle callback calls we need to disambiguate the call site argument number from the callee argument number. While always equal in non-callback calls, a callback comes with a partial parameter-argument mapping so there is no implicit correspondence. Here we split `IRPosition::getArgNo()` into two public functions, `getCallSiteArgNo()` and `getCalleeArgNo()`. Usages are adjusted to pick the right one for their purpose. This fixed some problems that would have been exposed as we more aggressively optimize callbacks.	2020-09-09 00:52:17 -05:00
Johannes Doerfert	8c1b461de9	[Attributor] Provide a command line option that limits recursion depth In `MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.cpp` we initialized attributes until stack frame ~35k caused space to run out. The initial size 1024 is pretty much random.	2020-09-09 00:47:02 -05:00
Max Kazantsev	e249eaa2a3	[NFC] Move function from IndVarSimplify to SCEV This function can be reused in other places. Differential Revision: https://reviews.llvm.org/D87274 Reviewed By: fhahn, lebedev.ri	2020-09-09 11:20:59 +07:00
Fangrui Song	789e9aff28	[llvm-cov gcov] Compute unmeasured arc counts by Kirchhoff's circuit law For a CFG G=(V,E), Knuth describes that by Kirchhoff's circuit law, the minimum number of counters necessary is |E|-(|V|-1). The emitted edges form a spanning tree. libgcov emitted .gcda files leverage this optimization while clang --coverage's don't. Propagate counts by Kirchhoff's circuit law so that llvm-cov gcov can correctly print line counts of gcc --coverage emitted files and enable the future improvement of clang --coverage.	2020-09-08 18:45:11 -07:00
Puyan Lotfi	cc62cbe131	[NFC] Fixing a gcc compiler warning. warning: type qualifiers ignored on cast result type [-Wignored-qualifiers] Differential Revision: https://reviews.llvm.org/D86952	2020-09-08 19:44:33 -04:00
Sergej Jaskiewicz	16b7e4e602	[llvm] [unittest] Allow getting a C string from the TempDir helper class The TempDir.path() member function returns a StringRef. We've been calling the data() method on that StringRef, which does not guarantee to return a null-terminated string (required by chdir and other POSIX functions). Introduce the c_str() method in the TempDir class, which returns the proper string without the need to create a copy of the path at use site.	2020-09-09 01:53:15 +03:00
David Blaikie	b72d78d1d8	llvm-symbolizer: Add optional "start file" to match "start line" Since a function might have portions of its code coming from multiple different files, "start line" is ambiguous (it can't just be resolved relative to the file/line specified). Add start file to disambiguate it.	2020-09-08 15:40:58 -07:00
Craig Topper	065f5d3388	[SelectionDAGBuilder] Pass fast math flags to getNode calls rather than trying to set them after the fact. This removes the after-the-fact FMF handling from D46854 in favor of passing fast math flags to getNode. This should be a superset of D87130. This required adding a SDNodeFlags to SelectionDAG::getSetCC. Now we manage to constant fold some undefs during the initial getNode that we don't do in later DAG combines. Differential Revision: https://reviews.llvm.org/D87200	2020-09-08 15:27:21 -07:00
David Stenberg	b3f0a94428	[UnifyFunctionExitNodes] Remove unused getters, NFC The get{Return,Unwind,Unreachable}Block functions in UnifyFunctionExitNodes have not been used for many years, so just remove them. Reviewed By: bjope Differential Revision: https://reviews.llvm.org/D87078	2020-09-08 20:42:28 +02:00
Simon Pilgrim	34fd9f9953	CFGUpdate.h - remove unused APInt include. NFCI.	2020-09-08 18:25:25 +01:00
Volkan Keles	c4e63c6e36	GlobalISel: Combine `op undef, x` to 0 https://reviews.llvm.org/D86611	2020-09-08 09:46:38 -07:00
Simon Pilgrim	edc4524189	LiveRegUnits.h - reduce MachineRegisterInfo.h include. NFC. We only need to include MachineInstrBundle.h, but this exposes an implicit dependency in MachineOutliner.h. Also, remove duplicate includes from LiveRegUnits.cpp + MachineOutliner.cpp.	2020-09-08 17:27:00 +01:00
Ronak Chauhan	6a6f27edd0	[AMDGPU] Support disassembly for AMDGPU kernel descriptors Decode AMDGPU Kernel descriptors as assembler directives. Reviewed By: scott.linder, jhenderson, kzhuravl Differential Revision: https://reviews.llvm.org/D80713	2020-09-08 21:26:11 +05:30
Xing GUO	51d61a7d47	[DWARFYAML] Make the debug_ranges section optional. This patch makes the debug_ranges section optional. When we specify an empty debug_ranges section, yaml2obj only emits the section header. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D87263	2020-09-08 19:55:47 +08:00
Roman Lebedev	8406429eae	Reland [SimplifyCFG][LoopRotate] SimplifyCFG: disable common instruction hoisting by default, enable late in pipeline This was reverted in 503deec2183d466dad64b763bab4e15fd8804239 because it caused a gigantic increase (3x) in branch mispredictions in certain benchmarks on certain CPUs, see https://reviews.llvm.org/D84108#2227365. It has since been investigated and here are the results: https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200907/827578.html > It's an amazingly severe regression, but it's also all due to branch > mispredicts (about 3x without this). The code layout looks ok so there's > probably something else to deal with. I'm not sure there's anything we can > reasonably do so we'll just have to take the hit for now and wait for > another code reorganization to make the branch predictor a bit more happy :) > > Thanks for giving us some time to investigate and feel free to recommit > whenever you'd like. > > -eric So let's just reland this. Original commit message: I've been looking at missed vectorizations in one codebase. One particular thing that stands out is that some of the loops reach the vectorizer in a rather mangled form, with weird PHIs, and some of the loops aren't even in a rotated form. After taking a more detailed look, that happened because the loop's headers were too big by then. It is evident that SimplifyCFG's common code hoisting transform is at fault there, because the pattern it handles is precisely the unrotated loop basic block structure. Surprisingly, `SimplifyCFGOpt::HoistThenElseCodeToIf()` is enabled by default, and is always run, unlike its friend, common code sinking transform, `SinkCommonCodeFromPredecessors()`, which is not enabled by default and is only run once very late in the pipeline. I'm proposing to harmonize this, and disable common code hoisting until //late// in pipeline. 
Definition of //late// may vary, here currently i've picked the same one as for code sinking, but i suppose we could enable it as soon as right after loop rotation happens. Experimentation shows that this does indeed unsurprisingly help, more loops got rotated, although other issues remain elsewhere. Now, this undoubtedly seriously shakes phase ordering. This will undoubtedly be a mixed bag in terms of both compile- and run- time performance, codesize. Since we no longer aggressively hoist+deduplicate common code, we don't pay the price of said hoisting (which wasn't big). That may allow more loops to be rotated, so we pay that price. That, in turn, may enable all the transforms that require canonical (rotated) loop form, including but not limited to vectorization, so we pay that too. And in general, no deduplication means more [duplicate] instructions going through the optimizations. But there's still late hoisting, some of them will be caught late. As per benchmarks i've run {F12360204}, this is mostly within the noise, there are some small improvements, some small regressions. One big regression i saw i fixed in rG8d487668d09fb0e4e54f36207f07c1480ffabbfd, but i'm sure this will expose many more pre-existing missed optimizations, as usual :S llvm-compile-time-tracker.com thoughts on this: http://llvm-compile-time-tracker.com/compare.php?from=e40315d2b4ed1e38962a8f33ff151693ed4ada63&to=c8289c0ecbf235da9fb0e3bc052e3c0d6bff5cf9&stat=instructions * this does regress compile-time by +0.5% geomean (unsurprisingly) * size impact varies; for ThinLTO it's actually an improvement The largest fallout appears to be in GVN's load partial redundancy elimination, it spends much more time in `MemoryDependenceResults::getNonLocalPointerDependency()`. Non-local `MemoryDependenceResults` is widely-known to be, uh, costly. 
There does not appear to be a proper solution to this issue, other than silencing the compile-time performance regression by tuning cut-off thresholds in `MemoryDependenceResults`, at the cost of potentially regressing run-time performance. D84609 attempts to move in that direction, but the path is unclear and is going to take some time. If we look at stats before/after diffs, some excerpts: * RawSpeed (the target) {F12360200} * -14 (-73.68%) loops not rotated due to the header size (yay) * -272 (-0.67%) `"Number of live out of a loop variables"` - good for vectorizer * -3937 (-64.19%) common instructions hoisted * +561 (+0.06%) x86 asm instructions * -2 basic blocks * +2418 (+0.11%) IR instructions * vanilla test-suite + RawSpeed + darktable {F12360201} * -36396 (-65.29%) common instructions hoisted * +1676 (+0.02%) x86 asm instructions * +662 (+0.06%) basic blocks * +4395 (+0.04%) IR instructions It is likely to be sub-optimal when optimizing for code size, so one might want to tune the pipeline by enabling sinking/hoisting when optimizing for size. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D84108 This reverts commit 503deec2183d466dad64b763bab4e15fd8804239.	2020-09-08 00:24:03 +03:00
Simon Pilgrim	8ac9bb3910	AntiDepBreaker.h - remove unnecessary ScheduleDAG.h include. NFCI.	2020-09-07 16:39:42 +01:00
Simon Pilgrim	9a57b8dde1	MachineStableHash.h - remove MachineInstr.h include. NFC. Use forward declarations and move the include to MachineStableHash.cpp	2020-09-07 13:33:48 +01:00
Sam Parker	70e9782c55	[SCEV] Refactor isHighCostExpansionHelper To enable the cost of constants, the helper function has been reorganised: - A struct has been introduced to hold SCEV operand information so that we know the user of the operand, as well as the operand index. The Worklist now uses this struct instead of a bare SCEV. - The costing of each SCEV, and collection of its operands, is now performed in a helper function. Differential Revision: https://reviews.llvm.org/D86050	2020-09-07 11:57:46 +01:00
Sam Parker	7c4a7cb063	[SimplifyCFG] Consider cost of combining predicates. Modify FoldBranchToCommonDest to consider the cost of inserting instructions when attempting to combine predicates to fold blocks. The threshold can be controlled via a new option: -simplifycfg-branch-fold-threshold which defaults to '2' to allow the insertion of a not and another logical operator. Differential Revision: https://reviews.llvm.org/D86526	2020-09-07 10:04:50 +01:00
Jay Foad	a04922a28f	[GlobalISel] Extend not_cmp_fold to work on conditional expressions Differential Revision: https://reviews.llvm.org/D86709	2020-09-07 09:31:08 +01:00
Xing GUO	5f500e98a4	[DWARFYAML] Make the debug_addr section optional. This patch makes the debug_addr section optional. When an empty debug_addr section is specified, yaml2obj only emits a section header for it. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D87205	2020-09-07 16:17:18 +08:00
Raphael Isemann	daa93e6992	Add BinaryFormat/ELFRelocs/CSKY.def to LLVM modulemap	2020-09-07 10:14:22 +02:00
Jay Foad	3381408aad	[KnownBits] Implement accurate unsigned and signed max and min Use the new implementation in ValueTracking, SelectionDAG and GlobalISel. Differential Revision: https://reviews.llvm.org/D87034	2020-09-07 09:09:01 +01:00
Zi Xuan Wu	a51dedaf25	[ELF] Add a new e_machine value EM_CSKY and add some CSKY relocation types This is the split part of D86269, which adds a new ELF machine flag called EM_CSKY and related relocations. Some target-specific flags and tests for csky can be added in follow-up patches later. Differential Revision: https://reviews.llvm.org/D86610	2020-09-07 10:42:28 +08:00
Amy Kwan	2477050bd8	[PowerPC] Implement Vector Expand Mask builtins in LLVM/Clang This patch implements the vec_expandm function prototypes in altivec.h in order to utilize the vector expand with mask instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82727	2020-09-06 17:13:21 -05:00
Benjamin Kramer	c3337b0542	[SmallVector] Move error handling out of line This reduces duplication and avoids emitting ice cold code into every instance of grow().	2020-09-06 18:06:44 +02:00
Jonas Paulsson	7897525197	[SelectionDAG] Always intersect SDNode flags during getNode() node memoization. Previously SDNodeFlags::intersectWith(Flags) would do nothing if Flags was in an undefined state, which is very bad given that this is the default when getNode() is called without passing an explicit SDNodeFlags argument. This meant that if an already existing and reused node had a flag which the second caller to getNode() did not set, that flag would remain uncleared. This was exposed by https://bugs.llvm.org/show_bug.cgi?id=47092, where an NSW flag was incorrectly set on an add instruction (which did in fact overflow in one of the two original contexts), so when SystemZElimCompare removed the compare with 0 trusting that flag, wrong-code resulted. There is more that needs to be done in this area as discussed here: Differential Revision: https://reviews.llvm.org/D86871 Review: Ulrich Weigand, Sanjay Patel	2020-09-05 10:30:38 +02:00
Lang Hames	976cbeac7c	[ORC] Fix some bugs in TPCDynamicLibrarySearchGenerator, use in llvm-jitlink. TPCDynamicLibrarySearchGenerator was generating errors on missing symbols, but that doesn't fit the DefinitionGenerator contract: A symbol that isn't generated by a particular generator should not cause an error. This commit fixes the error by using SymbolLookupFlags::WeaklyReferencedSymbol for all elements of the lookup, and switches llvm-jitlink to use TPCDynamicLibrarySearchGenerator.	2020-09-04 13:23:52 -07:00
Teresa Johnson	03ab03eb46	[HeapProf] Address post-review comments in instrumentation code Addresses post-review comments from D85948, which can be found here: https://reviews.llvm.org/rG7ed8124d46f9.	2020-09-04 08:59:00 -07:00
Simon Pilgrim	cf654d4aae	CallingConvLower.h - remove unnecessary MachineFunction.h include. NFC. Reduce to forward declaration, add the Register.h include that we still needed, move CCState::ensureMaxAlignment into CallingConvLower.cpp as it was the only function that needed the full definition of MachineFunction. Fix a few implicit dependencies further down.	2020-09-04 12:16:48 +01:00
Simon Pilgrim	8b766631fa	MIRFormatter.h - remove MachineInstr.h include. NFC. Use forward declarations and include the inner dependencies directly.	2020-09-04 11:17:24 +01:00
David Sherwood	4a83ce1e3c	[SVE][CodeGen] Fix up warnings in sve-split-insert/extract tests I have fixed up some more ElementCount/TypeSize related warnings in the following tests: CodeGen/AArch64/sve-split-extract-elt.ll CodeGen/AArch64/sve-split-insert-elt.ll In SelectionDAG::CreateStackTemporary we were relying upon the implicit cast from TypeSize -> uint64_t when calling MachineFrameInfo::CreateStackObject. I've fixed this by passing in the known minimum size instead, which I believe is fine because the associated stack id indicates whether this is a scalable object or not. I've also fixed up a case in TargetLowering::SimplifyDemandedBits when extracting a vector element from a scalable vector. The result is a scalar, hence it wasn't caught at the start of the function. If the vector is scalable we just bail out for now. Differential Revision: https://reviews.llvm.org/D86431	2020-09-04 09:51:31 +01:00
Florian Hahn	ce6e59900a	[MemCpyOpt] Preserve MemorySSA. This patch updates MemCpyOpt to preserve MemorySSA. It uses the MemoryDef at the insertion point of the builder and inserts the new def after that def. In some cases, we just modify a memory instruction. In that case, get the defining access, then remove the memory access and add a new one. If the defining access is in a different block, insert a new def at the beginning of the current block, otherwise after the defining access. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D86651	2020-09-04 09:05:33 +01:00
Fangrui Song	7b5b3c5b86	[SmallVector] Include stdexcept if LLVM_ENABLE_EXCEPTIONS std::length_error needs stdexcept.	2020-09-03 18:06:08 -07:00
Michael Liao	414ef02710	[codegen] Ensure target flags are cleared/set properly. NFC. - When an operand is changed into an immediate value or like, ensure their target flags being cleared or set properly. Differential Revision: https://reviews.llvm.org/D87109	2020-09-03 18:37:39 -04:00
Puyan Lotfi	de3532220a	[MIRVRegNamer] Experimental MachineInstr stable hashing (Fowler-Noll-Vo) This hashing scheme has been useful out of tree, and I want to start experimenting with it. Specifically I want to experiment on the MIRVRegNamer, MIRCanonicalizer, and eventually the MachineOutliner. This diff is a first step, that optionally brings stable hashing to the MIRVRegNamer (and as a result, the MIRCanonicalizer). We've tested this hashing scheme on a lot of MachineOperand types that llvm::hash_value can not handle in a stable manner. This stable hashing was also the basis for "Global Machine Outliner for ThinLTO" in EuroLLVM 2020 http://llvm.org/devmtg/2020-04/talks.html#TechTalk_58 Credits: Kyungwoo Lee, Nikolai Tillmann Differential Revision: https://reviews.llvm.org/D86952	2020-09-03 16:13:09 -04:00
Arthur Eubanks	9f8be8e69b	[NewPM][Lint] Port -lint to NewPM This also changes -lint from an analysis to a pass. It's similar to -verify, and that is a normal pass, and lives in llvm/IR. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D87057	2020-09-03 13:03:44 -07:00
Wenlei He	9ba54f747b	SVML support for log2 Although LLVM supports vectorization of loops containing log2, it did not support using the SVML implementation of it. Added support so that when clang is invoked with -fveclib=SVML now an appropriate SVML library log2 implementation will be invoked. Follow up on: https://reviews.llvm.org/D77114 Tests: Added unit tests to svml-calls.ll, svml-calls-finite.ll. Can be run with llvm-lint. Created a simple c++ file that tests log2, and used clang++ to build it, and output final assembly. Reviewed By: wenlei, craig.topper Differential Revision: https://reviews.llvm.org/D86730	2020-09-03 11:52:29 -07:00
Jamie Schmeiser	b707c16b9c	Revert "Add new hidden option -print-changed which only reports changes to IR" This reverts commit 7bc9924cb2fbd9f3ae53577607822ace267a04e6 due to failure caused by missing a space between trailing >>, required by some versions of C++.	2020-09-03 18:41:20 +00:00
Simon Pilgrim	6f6a13c34a	SelectionDAG.h - remove unnecessary FunctionLoweringInfo.h include. NFCI. Use forward declarations and move the include down to dependent files that actually use it. This also exposes a number of implicit dependencies on KnownBits.h	2020-09-03 18:33:25 +01:00
Dimitry Andric	79876868be	Eliminate the sizing template parameter N from CoalescingBitVector Since the parameter is not used anywhere, and the default size of 16 apparently causes PR47359, remove it. This ensures that IntervalMap will automatically determine the optimal size, using its NodeSizer struct. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D87044	2020-09-03 18:15:41 +02:00
Jamie Schmeiser	60c5153584	Add new hidden option -print-changed which only reports changes to IR A new hidden option -print-changed is added along with code to support printing the IR as it passes through the opt pipeline in the new pass manager. Only those passes that change the IR are reported, with others only having the banner reported, indicating that they did not change the IR, were filtered out or ignored. Filtering of output via the -filter-print-funcs is supported and a new supporting hidden option -filter-passes is added. The latter takes a comma separated list of pass names and filters the output to only show those passes in the list that change the IR. The output can also be modified via the -print-module-scope function. The code introduces a template base class that generalizes the comparison of IRs that takes an IR representation as template parameter. The constructor takes a series of lambdas that provide an event based API for generalized reporting of IRs as they are changed in the opt pipeline through the new pass manager. The first of several instantiations is provided that prints the IR in a form similar to that produced by -print-after-all with the above mentioned filtering capabilities. This version, and the others to follow will be introduced at the upcoming developer's conference. See https://hotcrp.llvm.org/usllvm2020/paper/29 for more information. Reviewed By: yrouban (Yevgeny Rouban) Differential Revision: https://reviews.llvm.org/D86360	2020-09-03 15:52:35 +00:00
Simon Pilgrim	2fae2eeab3	GlobalISel/Utils.h - remove unused includes. NFCI. Twine is unused, and TargetLowering can be reduced to a forward declaration and moved to Utils.cpp	2020-09-03 15:59:12 +01:00
Sanjay Patel	80dc8a6aaa	[IR][GVN] add/allow commutative intrinsics with >2 args Follow-up to D86798 and rGe25449f.	2020-09-03 10:14:53 -04:00
Florian Hahn	dc272f692f	[GVN] Preserve MemorySSA if it is available. Preserve MemorySSA if it is available before running GVN. DSE with MemorySSA will run closely after GVN. If GVN and 2 other passes preserve MemorySSA, DSE can re-use MemorySSA used by LICM when doing LTO. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D86534	2020-09-03 12:28:13 +01:00
Martin Storsjö	5443644017	[AArch64] Add asm directives for the remaining SEH unwind codes Add support in llvm-readobj for displaying them and support in the asm parser, AArch64TargetStreamer and MCWin64EH for emitting them. The directives for the remaining basic opcodes have names that match the opcode in the documentation. The directives for custom stack cases, that are named MSFT_OP_TRAP_FRAME, MSFT_OP_MACHINE_FRAME, MSFT_OP_CONTEXT and MSFT_OP_CLEAR_UNWOUND_TO_CALL, are given matching assembler directive names that fit into the rest of the opcode naming; .seh_trap_frame, .seh_context, .seh_clear_unwound_to_call The opcode MSFT_OP_MACHINE_FRAME is mapped to the existing opcode enum UOP_PushMachFrame that is used on x86_64, and also uses the corresponding existing x86_64 directive name .seh_pushframe. Differential Revision: https://reviews.llvm.org/D86889	2020-09-03 11:12:01 +03:00
Raphael Isemann	06e7d50498	Fix broken HUGE_VALF macro in llvm-c/DataTypes.h Commit 3a29393b4709d15069130119cf1d136af4a92d77 removes the cmath/math.h includes from the DataTypes.h header to speed up parsing. However the DataTypes.h header was using this header to get the macro `HUGE_VAL` for its own `HUGE_VALF` macro definition. Now the macro instead just expands into a plain `HUGE_VAL` token which leads to compiler errors unless `math.h` was previously included by the including source file. It also leads to compiler warnings with enabled module builds which point out this inconsistency. The correct way to fix this seems to be to just remove HUGE_VALF from the header. llvm-c is not referencing that macro from what I can see and users probably should just include the math headers if they need it (or define it on their own for really old C versions). Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D83761	2020-09-03 09:50:32 +02:00
Jonas Devlieghere	a1a9cb7026	[lldb] Add reproducer verifier Add a reproducer verifier that catches: - Missing or invalid home directory - Missing or invalid working directory - Missing or invalid module/symbol paths - Missing files from the VFS The verifier is enabled by default during replay, but can be skipped by passing --reproducer-no-verify. Differential revision: https://reviews.llvm.org/D86497	2020-09-02 22:00:00 -07:00
Arthur Eubanks	224abd6796	Revert "[NewPM][Lint] Port -lint to NewPM" This reverts commit 883399c8402188520870f99e7d8b3244f000e698.	2020-09-02 21:34:29 -07:00
Arthur Eubanks	bc13e99110	[NewPM][Lint] Port -lint to NewPM This also changes -lint from an analysis to a pass. It's similar to -verify, and that is a normal pass, and lives in llvm/IR. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D87057	2020-09-02 21:13:01 -07:00
Craig Topper	639e60808d	[CodeGenPrepare][X86] Teach optimizeGatherScatterInst to turn a splat pointer into GEP with scalar base and 0 index This helps SelectionDAGBuilder recognize the splat can be used as a uniform base. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D86371	2020-09-02 20:44:12 -07:00
Geoffrey Martin-Noble	60cb217dc7	Improve error handling for SmallVector programming errors This patch changes errors in `SmallVector::grow` that are independent of memory capacity to be reported using report_fatal_error or std::length_error instead of report_bad_alloc_error, which falsely signals an OOM. It also cleans up a few related things: - makes report_bad_alloc_error to print the failure reason passed to it. - fixes the documentation to indicate that report_bad_alloc_error calls `abort()` not "an assertion" - uses a consistent name for the size/capacity argument to `grow` and `grow_pod` Reviewed By: mehdi_amini, MaskRay Differential Revision: https://reviews.llvm.org/D86892	2020-09-02 15:00:26 -07:00
Jay Foad	53df1fb508	[APInt] New member function setBitVal Differential Revision: https://reviews.llvm.org/D87033	2020-09-02 21:40:31 +01:00
Albion Fung	d56113a35e	[PowerPC] Implemented Vector Multiply Builtins This patch implements the builtins for Vector Multiply Builtins (vmulxxd family of instructions), and adds the appropriate test cases for these builtins. The builtins utilize the vector multiply instructions introduced with ISA 3.1. Differential Revision: https://reviews.llvm.org/D83955	2020-09-02 14:16:21 -05:00
Paul Walker	aeb7dd335b	[SVE] Don't reorder subvector/binop sequences when the resulting binop is not legal. When lowering fixed length vector operations for SVE the subvector operations are used extensively to marshall data between scalable and fixed-length vectors. This means that sequences like: extract_subvec(binop(insert_subvec(a), insert_subvec(b))) are very common. DAGCombine only checks if the resulting binop is legal or can be custom lowered when undoing such sequences. When it's custom lowering that is introducing them the result is an infinite legalise->combine->legalise loop. This patch extends the isOperationLegalOr... functions to include a "LegalOnly" parameter to restrict the check to legal operations only. Although isOperationLegal could be used it's common for the affected code paths to be visited pre and post legalisation, so the extra parameter keeps the code tidy. Differential Revision: https://reviews.llvm.org/D86450	2020-09-02 11:01:33 +01:00
Sander de Smalen	40471e4fe8	[AArch64][SVE] Preserve full vector regs over EH edge. Unwinders may only preserve the lower 64bits of Neon and SVE registers, as only the registers in the base ABI are guaranteed to be preserved over the exception edge. The caller will need to preserve additional registers for when the call throws an exception and the unwinder has tried to recover state. For e.g. svint32_t bar(svint32_t); svint32_t foo(svint32_t x, bool err) { try { bar(x); } catch (...) { err = true; } return x; } `z0` needs to be spilled before the call to `bar(x)` and reloaded before returning from foo, as the exception handler may have clobbered z0. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84737	2020-09-02 10:54:18 +01:00
Zi Xuan Wu	ed26c15759	[RFC][Target] Add a new triple called Triple::csky Before upstream a new target called CSKY, make a new triple of that called Triple::csky. For now, it's a 32-bit little endian target and the detail can be referred at D86269. This is the split part of D86269, which add a new target called CSKY. Differential Revision: https://reviews.llvm.org/D86505	2020-09-02 12:46:09 +08:00
Alina Sbirlea	d314cb9743	[MemCpyOptimizer] Preserve analyses and replace use of lambdas to get them. Summary: Analyses are preserved in MemCpyOptimizer. Get analyses before running the pass and store the pointers, instead of using lambdas and getting them every time on demand. Reviewers: lenary, deadalnix, mehdi_amini, nikic, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74494	2020-09-01 17:35:40 -07:00
Varun Gandhi	99d53ed697	[ADT] Make Optional a literal type. This allows returning Optional values from constexpr contexts. Reviewed By: fhahn, dblaikie, rjmccall Differential Revision: https://reviews.llvm.org/D86354	2020-09-01 16:13:40 -07:00
Amy Kwan	2cc81d44bf	[PowerPC] Implement builtins for xvcvspbf16 and xvcvbf16spn This patch adds the builtin implementation for the xvcvspbf16 and xvcvbf16spn instructions. Differential Revision: https://reviews.llvm.org/D86795	2020-09-01 17:16:43 -05:00
Cameron McInally	7b182e9bab	[SVE] Update INSERT_SUBVECTOR DAGCombine to use getVectorElementCount(). A small piece of the project to replace getVectorNumElements() with getVectorElementCount(). Differential Revision: https://reviews.llvm.org/D86894	2020-09-01 16:51:44 -05:00
Sergej Jaskiewicz	0975a7654a	[llvm] [unittests] Remove temporary files after they're not needed Some LLVM unit tests forget to clean up temporary files and directories. Introduce RAII classes for cleaning them up. Refactor the tests to use those classes. Differential Revision: https://reviews.llvm.org/D83228	2020-09-02 00:34:44 +03:00
Amara Emerson	e10dc0379d	Revert "Revert "[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _)" (and dependent patch "Optimize away a Not feeding a brcond by using tbz instead of tbnz.")" This reverts commit 8693ddc74371dedc742c9f3d3e4eda1da72c13ea. Re-committing with the test requiring asserts.	2020-09-01 14:29:04 -07:00
Jordan Rupprecht	deb09864b8	Revert "[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _)" (and dependent patch "Optimize away a Not feeding a brcond by using tbz instead of tbnz.") This reverts commit 8ad8f484b63ca507417b58c9016d2761f2b1a1a8. It causes crashes when running `ninja check-llvm-codegen-aarch64-globalisel`, e.g. http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24132/steps/test-stage1-compiler/logs/stdio. Note that the crash does not seem to reproduce in debug builds. 5ded4442520d3dbb1aa72e6fe03cddef8828c618 depends on this, so revert that too.	2020-09-01 13:31:57 -07:00
Florian Hahn	2248875209	[Loads] Add canReplacePointersIfEqual helper. This patch adds an initial, incomplete and unsound implementation of canReplacePointersIfEqual to check if a pointer value A can be replaced by another pointer value B, that are deemed to be equivalent through some means (e.g. information from conditions). Note that it is in general not sound to blindly replace pointers based on equality, for example if they are based on different underlying objects. LLVM's memory model is not completely settled as of now; see https://bugs.llvm.org/show_bug.cgi?id=34548 for a more detailed discussion. The initial version of canReplacePointersIfEqual only rejects a very specific case: replacing a pointer with a constant expression that is not dereferenceable. Such a replacement is problematic and can be restricted relatively easily without impacting most code. Using it to limit replacements in GVN/SCCP/CVP only results in small differences in 7 programs out of MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto. This patch is supposed to be an initial step to improve the current situation and the helper should be made stricter in the future. But this will require careful analysis of the impact on performance. Reviewed By: aqjune Differential Revision: https://reviews.llvm.org/D85524	2020-09-01 20:57:41 +01:00
Arthur Eubanks	81eaf47f84	[Bindings] Add LLVMAddInstructionSimplifyPass Reviewed By: sroland Differential Revision: https://reviews.llvm.org/D86764	2020-09-01 12:38:49 -07:00
Amara Emerson	399486642d	[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _) This is needed for an upcoming change to how we translate conditional branches which might generate these. Differential Revision: https://reviews.llvm.org/D86383	2020-09-01 10:57:17 -07:00
Matt Arsenault	5e713f621a	GlobalISel: Implement computeNumSignBits for G_SELECT	2020-09-01 12:50:19 -04:00
Matt Arsenault	942cf2892e	GlobalISel: Implement computeKnownBits for G_BSWAP and G_BITREVERSE	2020-09-01 12:49:57 -04:00
Volkan Keles	72df60b8c0	GlobalISel: Add combines for extend operations https://reviews.llvm.org/D86516	2020-09-01 08:50:06 -07:00
Matt Arsenault	fbbea16c6b	GlobalISel: Implement computeKnownBits for G_UNMERGE_VALUES	2020-09-01 11:19:27 -04:00
Matt Arsenault	2c2bfb8ff0	GlobalISel: Artifact combine unmerge of unmerge Unmerges have the same fundamental problem as G_TRUNC, and G_TRUNC could be implemented in terms of G_UNMERGE_VALUES. Reducing the number of elements in unmerge results ends up producing the original unmerge type profile, so the artifact combiner needs to eliminate the intermediate illegal registers. This avoids infinite looping in the legalizer in a future change. Assuming an unmerge has each result unmerged the same way, this ends up producing a new unmerge of the source for every definition. I'm not sure if the artifact combiner should either insert temporary merges here and erase the original merge, or if the combiner should look at uses from defs rather than defs from uses for unmerges. In a few cases this regresses from using 16-bit shifts for 8-bit values to using 32-bit shifts, but I think these can be legalized later (the other legalization rules don't try very hard to use 16-bit shifts either).	2020-09-01 11:01:33 -04:00
Anh Tuyen Tran	c3a3d1596a	[LoopIdiomRecognizePass] Options to disable part or the entire Loop Idiom Recognize Pass Loop Idiom Recognize Pass (LIRP) attempts to transform loops with subscripted arrays into memcpy/memset function calls. In some particular situation, this transformation introduces negative impacts. For example: https://bugs.llvm.org/show_bug.cgi?id=47300 This patch will enable users to disable a particular part of the transformation, while he/she can still enjoy the benefit brought about by the rest of LIRP. The default behavior stays unchanged: no part of LIRP is disabled by default. Reviewed By: etiotto (Ettore Tiotto) Differential Revision: https://reviews.llvm.org/D86262	2020-09-01 13:59:24 +00:00
Raphael Isemann	c3c7a59967	Reland [FileCheck] Move FileCheck implementation out of LLVMSupport into its own library This relands e9a3d1a401b07cbf7b11695637f1b549782a26cd which was originally missing linking LLVMSupport into LLVMFileCheck which broke the SHARED_LIBS build. Original summary: The actual FileCheck logic seems to be implemented in LLVMSupport. I don't see a good reason for having FileCheck implemented there as it has a very specific use while LLVMSupport is a dependency of pretty much every LLVM tool there is. In fact, the only use of FileCheck I could find (outside the FileCheck tool and the FileCheck unit test) is a single call in GISelMITest.h. This moves the FileCheck logic to its own LLVMFileCheck library. This way only FileCheck and the GlobalISelTests now have a dependency on this code. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D86344	2020-09-01 14:59:28 +02:00
Raphael Isemann	02ebf2f3b4	Revert "[lldb] Add reproducer verifier" This reverts commit 297f69afac58fc9dc13897857a5e70131c5adc85. It broke the Fedora 33 x86-64 bot. See the review for more info.	2020-09-01 12:21:44 +02:00
David Sherwood	c4d572ac0e	[SVE][CodeGen] Fix TypeSize/ElementCount related warnings in sve-split-load.ll I have fixed up a number of warnings resulting from TypeSize -> uint64_t casts and calling getVectorNumElements() on scalable vector types. I think most of the changes are fairly trivial except for those in DAGTypeLegalizer::SplitVecRes_MLOAD I've tried to ensure we create the MachineMemoryOperands in a sensible way for scalable vectors. I have added a CHECK line to the following test: CodeGen/AArch64/sve-split-load.ll that ensures no new warnings are added. Differential Revision: https://reviews.llvm.org/D86697	2020-09-01 07:47:59 +01:00
Petr Hosek	11e2ca0270	[CMake] Use find_library for ncurses Currently it is hard to avoid having LLVM link to the system install of ncurses, since it uses check_library_exists to find e.g. libtinfo and not find_library or find_package. With this change the ncurses lib is found with find_library, which also considers CMAKE_PREFIX_PATH. This solves an issue for the spack package manager, where we want to use the zlib installed by spack, and spack provides the CMAKE_PREFIX_PATH for it. This is a similar change as https://reviews.llvm.org/D79219, which just landed in master. Patch By: haampie Differential Revision: https://reviews.llvm.org/D85820	2020-08-31 20:06:21 -07:00
Xing GUO	45321f3e73	[DWARFYAML] Make the debug_str section optional. This patch makes the debug_str section optional. When the debug_str section exists but doesn't contain anything, yaml2obj will emit a section header for it. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D86860	2020-09-01 10:02:09 +08:00
Jonas Devlieghere	7448020ce4	[lldb] Add reproducer verifier Add a reproducer verifier that catches: - Missing or invalid home directory - Missing or invalid working directory - Missing or invalid module/symbol paths - Missing files from the VFS The verifier is enabled by default during replay, but can be skipped by passing --reproducer-no-verify. Differential revision: https://reviews.llvm.org/D86497	2020-08-31 15:14:18 -07:00
Christopher Tetreault	bcc3cadef7	[SVE] Mark VectorType::getNumElements() deprecated getNumElements() is being removed from base VectorType in order to eliminate the class of bugs in which a scalable vector is accidentally treated like a fixed length vector. Clients of this function should either call getElementCount(), and handle the case where getElementCount().isScalable() is true, or they can cast to FixedVectorType and call getNumElements() if they are sure that the vector has fixed width. Deprecated VectorType functions will be removed after the LLVM 12 branch. See: http://lists.llvm.org/pipermail/llvm-dev/2020-March/139811.html Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D78127	2020-08-31 15:13:04 -07:00
Sanjay Patel	251968146e	[IR][GVN] allow intrinsics in Instruction's isCommutative query (2nd try) The 1st try was reverted because I missed an assert that needed softening. As discussed in D86798 / rG09652721 , we were potentially returning a different result for whether an Instruction is commutable depending on if we call the base class or derived class method. This requires relaxing asserts in GVN, but that pass seems to be working otherwise. NewGVN requires more work because it uses different code paths for numbering binops and calls.	2020-08-31 16:01:19 -04:00
Christopher Tetreault	6b50d3055a	[SVE] Remove calls to VectorType::getNumElements from InstCombine Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D82237	2020-08-31 12:59:10 -07:00
Raphael Isemann	20366c891d	Revert "[FileCheck] Move FileCheck implementation out of LLVMSupport into its own library" This reverts commit e9a3d1a401b07cbf7b11695637f1b549782a26cd. Seems the new FileCheck library doesn't link on some bots. Reverting for now.	2020-08-31 11:38:40 +02:00
Raphael Isemann	aecb52031b	[FileCheck] Move FileCheck implementation out of LLVMSupport into its own library The actual FileCheck logic seems to be implemented in LLVMSupport. I don't see a good reason for having FileCheck implemented there as it has a very specific use while LLVMSupport is a dependency of pretty much every LLVM tool there is. In fact, the only use of FileCheck I could find (outside the FileCheck tool and the FileCheck unit test) is a single call in GISelMITest.h. This moves the FileCheck logic to its own LLVMFileCheck library. This way only FileCheck and the GlobalISelTests now have a dependency on this code. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D86344	2020-08-31 11:24:41 +02:00
Sanjay Patel	56bc7f03f4	Revert "[IR][GVN] allow intrinsics in Instruction's isCommutative query" This reverts commit 25597f7783e7038b8a2ee88bb49ac605b211b564. It is causing crashing on bots such as: http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10523/steps/ninja-build/logs/stdio	2020-08-30 17:02:51 -04:00
Sanjay Patel	d58c2f282d	[IR][GVN] allow intrinsics in Instruction's isCommutative query As discussed in D86798 / rG09652721 , we were potentially returning a different result for whether an Instruction is commutable depending on if we call the base class or derived class method. This requires relaxing an assert in GVN, but that pass seems to be working otherwise. NewGVN requires more work because it uses different code paths for numbering binops and calls.	2020-08-30 16:49:22 -04:00
Sanjay Patel	d6a2f460c7	[FastISel] update to use intrinsic's isCommutative(); NFC This requires adding a missing 'const' to the definition because the callers are using const args, but there should be no change in behavior. The intrinsic method was added with D86798 / rG096527214033	2020-08-30 11:36:41 -04:00
David Green	65d99fc840	[LV] Add some const to RecurrenceDescriptor. NFC	2020-08-30 12:27:51 +01:00
sstefan1	34f7bfd3c0	[Attributor] Introduce module slice. Summary: The module slice describes which functions we can analyze and transform while working on an SCC as part of the Attributor-CGSCC pass. So far we simply restricted it to the SCC. Reviewers: jdoerfert Differential Revision: https://reviews.llvm.org/D86319	2020-08-30 10:30:44 +02:00
Lang Hames	aa6cb2b9d1	[ORC] Add getDFSLinkOrder / getReverseDFSLinkOrder methods to JITDylib. DFS and Reverse-DFS linkage orders are used to order execution of deinitializers and initializers respectively. This patch replaces uses of special purpose DFS order functions in MachOPlatform and LLJIT with uses of the new methods.	2020-08-29 15:17:06 -07:00
Benjamin Kramer	82759b9044	[IR] Inline AttrBuilder::addAttribute. It just sets 1 bit. NFC.	2020-08-29 19:13:49 +02:00
Sanjay Patel	c28488b471	[EarlyCSE] fold commutable intrinsics Handling the new min/max intrinsics is the motivation, but it turns out that we have a bunch of other intrinsics with this missing bit of analysis too. The FP min/max tests show that we are intersecting FMF, so that part should be safe too. As noted in https://llvm.org/PR46897 , there is a commutative property specifier for intrinsics, but no corresponding function attribute, and so apparently no uses of that bit. We may want to remove that next. Follow-up patches should wire up the Instruction::isCommutative() to this IntrinsicInst specialization. That requires updating callers to be aware of the more general commutative property (not just binops). Differential Revision: https://reviews.llvm.org/D86798	2020-08-29 12:11:01 -04:00
Martin Storsjö	7550015c23	[MC] [Win64EH] Fill in FuncletOrFuncEnd if missing This can happen e.g. for code that declare .seh_proc/.seh_endproc in assembly, or for code that use .seh_handlerdata (which triggers the unwind info to be emitted before the end of the function). The TextSection field must be made non-const to be able to use it with Streamer.SwitchSection(). Differential Revision: https://reviews.llvm.org/D86528	2020-08-29 15:15:22 +03:00
Roman Lebedev	2a46a04b28	[NFC][InstructionSimplify] Add a warning about not simplifying to not def-reachable See https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824235.html and https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824967.html InstSimply is not allowed to perform simplifications to instructions that are not def-reachable from the original instruction.	2020-08-29 09:58:08 +03:00
Roman Lebedev	5d89fa42b3	[NFC][STLExtras] Add make_first_range(), similar to existing make_second_range() Having just one of the two seems weird. I wanted to use it a few times, but it wasn't there.	2020-08-29 09:58:07 +03:00
Xing GUO	1d4cd95b57	[DWARFYAML] Make the debug_abbrev_offset field optional. This patch helps make the debug_abbrev_offset field optional. We don't need to calculate the value of this field in the future. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D86614	2020-08-29 14:54:52 +08:00
Fangrui Song	311c781d4a	[gcov] Increment counters with atomicrmw if -fsanitize=thread Without this patch, `clang --coverage -fsanitize=thread` may fail spuriously because non-atomic counter increments can be detected as data races.	2020-08-28 16:32:35 -07:00
Matt Arsenault	9e320d7adb	GlobalISel: Combine out redundant sext_inreg The scalar tests don't work yet, since computeNumSignBits apparently doesn't handle sextload yet, and sext folds into the load first.	2020-08-28 17:57:31 -04:00
Craig Topper	09050e4cf7	[Attributes] Add a method to check if an Attribute has AttrKind None. Use instead of hasAttribute(Attribute::None) There's a special case in hasAttribute for None when pImpl is null. If pImpl is not null we dispatch to pImpl->hasAttribute which will always return false for Attribute::None. So if we just want to check for None its sufficient to just check that pImpl is null. Which can even be done inline. This patch adds a helper for that case which I hope will speed up our getSubtargetImpl implementations. Differential Revision: https://reviews.llvm.org/D86744	2020-08-28 13:23:45 -07:00
Arthur Eubanks	82da713351	[ObjCARCOpt] Port objc-arc to NPM Since doInitialization() in the legacy pass modifies the module, the NPM pass is a Module pass. Reviewed By: ahatanak, ychen Differential Revision: https://reviews.llvm.org/D86178	2020-08-28 12:59:33 -07:00
Tyker	70c01602af	[SROA] Improve handling of assume bundles by SROA This patch fixes this crash https://gcc.godbolt.org/z/Ps8d1e and gives SROA the ability to remove assumes if it allows promoting an alloca to register, without removing assumes when it can't promote to register. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86570	2020-08-28 21:55:45 +02:00
Snehasish Kumar	69b963173e	[llvm][CodeGen] Machine Function Splitter We introduce a codegen optimization pass which splits functions into hot and cold parts. This pass leverages the basic block sections feature recently introduced in LLVM from the Propeller project. The pass targets functions with profile coverage, identifies cold blocks and moves them to a separate section. The linker groups all cold blocks across functions together, decreasing fragmentation and improving icache and itlb utilization. We evaluated the Machine Function Splitter pass on clang bootstrap and SPECInt 2017. For clang bootstrap we observe a mean 2.33% runtime improvement with a ~32% reduction in itlb and stlb misses. Additionally, L1 icache misses reduced by 9.5% while L2 instruction misses reduced by 20%. For SPECInt we report the change in IntRate the C/C++ benchmarks. All benchmarks apart from mcf and x264 improve, on average by 0.6% with the max for deepsjeng at 1.6%. Benchmark % Change 500.perlbench_r 0.78 502.gcc_r 0.82 505.mcf_r -0.30 520.omnetpp_r 0.18 523.xalancbmk_r 0.37 525.x264_r -0.46 531.deepsjeng_r 1.61 541.leela_r 0.83 557.xz_r 0.15 Differential Revision: https://reviews.llvm.org/D85368	2020-08-28 11:10:14 -07:00
David Sherwood	56b8c35591	[SVE] Make ElementCount members private This patch changes ElementCount so that the Min and Scalable members are now private and can only be accessed via the get functions getKnownMinValue() and isScalable(). In addition I've added some other member functions for more commonly used operations. Hopefully this makes the class more useful and will reduce the need for calling getKnownMinValue(). Differential Revision: https://reviews.llvm.org/D86065	2020-08-28 14:43:53 +01:00
Sam Parker	710437b36d	[ARM][LowOverheadLoops] Liveouts and reductions Remove the code that tried to look for reduction patterns, since the vectorizer and isel can now produce predicated arithmetic instructions within the loop body. This has required some reorganisation and fixes around live-out and predication checks, as well as looking for cases where an input/output is initialised to zero. Differential Revision: https://reviews.llvm.org/D86613	2020-08-28 13:56:16 +01:00
Martin Storsjö	831184291b	[MC] [Win64EH] Avoid producing malformed xdata records If there's no unwinding opcodes, omit writing the xdata/pdata records. Previously, this generated truncated xdata records, and llvm-readobj would error out when trying to print them. If writing of an xdata record is forced via the .seh_handlerdata directive, skip it if there's no info to make a sensible unwind info structure out of, and clearly error out if such info appeared later in the process. Differential Revision: https://reviews.llvm.org/D86527	2020-08-28 09:05:36 +03:00
serge-sans-paille	a12b4db565	(Expensive) Check for Loop, SCC and Region pass return status This generalizes the logic introduced in https://reviews.llvm.org/D80916 to other passes. It's needed by https://reviews.llvm.org/D86442 to assert passes correctly report their status. Differential Revision: https://reviews.llvm.org/D86589	2020-08-28 07:56:35 +02:00
Valentin Clement	ba3e93ca35	[flang][openacc] Add check for tile clause restriction The tile clause in OpenACC 3.0 imposes some restrictions. Elements in the tile size list are either * or a constant positive integer expression. If there are n tile sizes in the list, the loop construct must be immediately followed by n tightly-nested loops. This patch implements these restrictions and adds some tests. Reviewed By: klausler Differential Revision: https://reviews.llvm.org/D86655	2020-08-27 22:13:46 -04:00
Harmen Stoppels	e6870b67d6	Revert "Use find_library for ncurses" The introduction of find_library for ncurses caused more issues than it solved problems. The current open issue is it makes the static build of LLVM fail. It is better to revert for now, and get back to it later. Revert "[CMake] Fix an issue where get_system_libname creates an empty regex capture on windows" This reverts commit 1ed1e16ab83f55d85c90ae43a05cbe08a00c20e0. Revert "Fix msan build" This reverts commit 34fe9613dda3c7d8665b609136a8c12deb122382. Revert "[CMake] Always mark terminfo as unavailable on Windows" This reverts commit 76bf26236f6fd453343666c3cd91de8f74ffd89d. Revert "[CMake] Fix OCaml build failure because of absolute path in system libs" This reverts commit 8e4acb82f71ad4effec8895b8fc957189ce95933. Revert "[CMake] Don't look for terminfo libs when LLVM_ENABLE_TERMINFO=OFF" This reverts commit 495f91fd33d492941c39424a32cf24bcfe192f35. Revert "Use find_library for ncurses" This reverts commit a52173a3e56553d7b795bcf3cdadcf6433117107. Differential revision: https://reviews.llvm.org/D86521	2020-08-27 17:57:26 -07:00
Matt Arsenault	2cd41c208e	GlobalISel: Implement known bits for min/max	2020-08-27 16:56:17 -04:00
Matt Arsenault	fcf8b40603	MIR: Infer not-SSA for subregister defs It's possible to have a single virtual register def with a subreg index that would pass the previous check, but it's not possible to have a subregister def in SSA. This is in preparation for adding stricter checks for SSA MIR.	2020-08-27 16:56:16 -04:00
Vitaly Buka	4f88f9a474	[NFC][ValueTracking] Add OffsetZero into findAllocaForValue For StackLifetime, after finding the alloca we need to check that values point to the beginning of the alloca. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D86692	2020-08-27 13:46:22 -07:00
Matt Arsenault	cac4e51351	GlobalISel: Add and_trivial_mask to all_combines Also make up a new category of combines.	2020-08-27 16:42:09 -04:00
Eli Friedman	4c050f47e9	[RegisterScavenging] Delete dead function unprocess().	2020-08-27 13:19:32 -07:00
Shinji Okumura	912a13d81d	[Attributor] Do not add AA to dependency graph after the update stage If an AA is registered to the dependency graph in the manifest stage, Attributor aborts in `::manifestAttributes()`. This patch prevents such termination. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86734	2020-08-28 05:16:18 +09:00
Shinji Okumura	6ff8397df1	[Attributor] Guarantee getAAFor not to update AA in the manifestation stage If we query an AA with `Attributor::getAAFor` in `AbstractAttribute::manifest`, the AA may be updated. This patch makes use of the phase flag in Attributor, and handle `getAAFor` behavior according to the flag. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86635	2020-08-28 04:07:42 +09:00
Christopher Tetreault	c849a98c1b	[SVE] Remove calls to VectorType::getNumElements from IR Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D81500	2020-08-27 11:16:10 -07:00
Matt Arsenault	b3488037c4	GlobalISel: Implement known bits for G_MERGE_VALUES	2020-08-27 14:07:18 -04:00
Mikhail Maltsev	f7e914e2c5	[ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics This patch adjusts the following ARM/AArch64 LLVM IR intrinsics: - neon_bfmmla - neon_bfmlalb - neon_bfmlalt so that they take and return bf16 and float types. Previously these intrinsics used <8 x i8> and <4 x i8> vectors (a rudiment from implementation lacking bf16 IR type). The neon_vbfdot[q] intrinsics are adjusted similarly. This change required some additional selection patterns for vbfdot itself and also for vector shuffles (in a previous patch) because of SelectionDAG transformations kicking in and mangling the original code. This patch makes the generated IR cleaner (less useless bitcasts are produced), but it does not affect the final assembly. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D86146	2020-08-27 18:43:16 +01:00
Aditya Nandakumar	90acb0696f	[GISel] Add new GISel combiners for G_SELECT https://reviews.llvm.org/D83833 Patch adds two new GICombinerRules for G_SELECT. The rules include: combining selects with undef comparisons into their first selectee value, and to combine away selects with constant comparisons. Patch additionally adds a new combiner test for the AArch64 target to test these new G_SELECT combiner rules and the existing select_same_val combiner rule. Patch by mkitzan	2020-08-27 09:40:15 -07:00
Shinji Okumura	1054ad3409	[Attributor] Add a phase flag to Attributor Add a new flag that indicates which stage in the process we are in. This flag is introduced for handling behavior of `getAAFor` according to the stage. (discussed in D86635) Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86678	2020-08-28 01:16:38 +09:00
Aditya Nandakumar	46055bcec5	[GISel]: Fix one more CSE Non determinism https://reviews.llvm.org/D86676 Sometimes we can have the following code x:gpr(s32) = G_OP Say we build G_OP2 to the same x and then delete the previous instruction. Using something like Register X = ...; auto NewMIB = CSEBuilder.buildOp2(X, ... args); Currently there's a mismatch in how NewMIB is profiled and inserted into the CSEMap (i.e. it doesn't consider register bank/register class along with type). Unify the profiling by refactoring and calling the common method. This was found by turning on the CSEInfo::verify at the end of each of our GISel passes which turns inconsistent state/non determinism in CSEing into crashes which likely usually indicates missing calls to Observer on mutations (the most common case). Here non determinism usually means not cseing sometimes, but almost never about producing incorrect code. Also this patch adds this verification at the end of the combiners as well.	2020-08-27 09:06:21 -07:00
Teresa Johnson	bce40acc54	[HeapProf] Clang and LLVM support for heap profiling instrumentation See RFC for background: http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html Note that the runtime changes will be sent separately (hopefully this week, need to add some tests). This patch includes the LLVM pass to instrument memory accesses with either inline sequences to increment the access count in the shadow location, or alternatively to call into the runtime. It also changes calls to memset/memcpy/memmove to the equivalent runtime version. The pass is modeled on the address sanitizer pass. The clang changes add the driver option to invoke the new pass, and to link with the upcoming heap profiling runtime libraries. Currently there is no attempt to optimize the instrumentation, e.g. to aggregate updates to the same memory allocation. That will be implemented as follow on work. Differential Revision: https://reviews.llvm.org/D85948	2020-08-27 08:50:35 -07:00
diggerlin	5d038d06f5	Revert "[AIX][XCOFF] emit symbol visibility for xcoff object file." This reverts commit a0818689213234d5a078641432d10eccccf61a13. Based on Hubert Tong's comment https://reviews.llvm.org/D84265#inline-799085	2020-08-27 11:07:58 -04:00
OCHyams	463930d1cb	[DwarfDebug] Improve single location detection in validThroughout (2/4) With this patch we're now accounting for two more cases which should be considered 'valid throughout': First, where RangeEnd is ScopeEnd. Second, where RangeEnd comes before ScopeEnd when including meta instructions, but are both preceded by the same non-meta instruction. CTMark shows a geomean binary size reduction of 1.5% for RelWithDebInfo builds. `llvm-locstats` (using D85636) shows a very small variable location coverage change in 2 of 10 binaries, but it is in the order of 10s of bytes which lines up with my expectations. I've added a test which checks both of these new cases. The first check in the test isn't strictly necessary for this patch. But I'm not sure that it is explicitly tested anywhere else, and is useful for the final patch in the series. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D86151	2020-08-27 11:52:29 +01:00
Shinji Okumura	cc1db81009	[Attributor] Add flag for undef value to the state of AAPotentialValues Currently, an undef value is reduced to 0 when it is added to a set of potential values. This patch introduces a flag for undef values. By this, for example, we can merge two states `{undef}`, `{1}` to `{1}` (because we can reduce the undef to 1). Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D85592	2020-08-27 16:30:29 +09:00
Jianzhou Zhao	0ded22b2ea	Fix an overflow issue at BackpatchWord This happens when generating a huge file by LTO, for example, with -gmlt. When BitNo is > 2^35, ByteNo is overflowed, and an incorrect output offset is overwritten. This generates ill-formed bitcodes. Reviewed-by: tejohnson, vitalybuka Differential Revision: https://reviews.llvm.org/D86645	2020-08-27 04:46:19 +00:00
Amy Kwan	daa2ee7b26	[PowerPC] Implement Vector Multiply High/Divide Extended Builtins in LLVM/Clang This patch implements the function prototypes vec_mulh and vec_dive in order to utilize the vector multiply high (vmulh[s|u][w|d]) and vector divide extended (vdive[s|u][w|d]) instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82609	2020-08-26 23:14:34 -05:00
Matt Arsenault	a03ce058a1	GlobalISel: Add generic instructions for memory intrinsics AArch64, X86 and Mips currently directly consumes these and custom lowering to produce a libcall, but really these should follow the normal legalization process through the libcall/lower action.	2020-08-26 20:08:45 -04:00
Lang Hames	02bae0a8ff	[ORC][JITLink] Switch to unique ownership for EHFrameRegistrars. This will make stateful registrars (e.g. a future TargetProcessControl based registrar) easier to deal with.	2020-08-26 16:59:45 -07:00
Arthur Eubanks	d74ec65308	[ConstProp] Remove ConstantPropagation As discussed in http://lists.llvm.org/pipermail/llvm-dev/2020-July/143801.html. Currently no users outside of unit tests. Replace all instances in tests of -constprop with -instsimplify. Notable changes in tests: * vscale.ll - @llvm.sadd.sat.nxv16i8 is evaluated by instsimplify, use a fake intrinsic instead * InsertElement.ll - insertelement undef is removed by instsimplify in @insertelement_undef llvm/test/Transforms/ConstProp moved to llvm/test/Transforms/InstSimplify/ConstProp Reviewed By: lattner, nikic Differential Revision: https://reviews.llvm.org/D85159	2020-08-26 15:51:30 -07:00
Juneyoung Lee	ecc3e51e55	[IR] Remove noundef from masked store/load/gather/scatter's pointer operands As discussed in D86576, noundef attribute is removed from masked store/load/gather/scatter's pointer operands. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D86656	2020-08-27 06:41:43 +09:00
Wei Mi	d9e172d0fb	[SampleFDO] Enhance profile remapping support for searching inline instance and indirect call promotion candidate. Profile remapping is a feature to match a function in the module with its profile in sample profile if the function name and the name in profile look different but are equivalent using given remapping rules. This is a useful feature to keep the performance stable by specifying some remapping rules when sampleFDO targets are going through some large scale function signature change. However, currently profile remapping support is only valid for outline function profile in SampleFDO. It cannot match a callee with an inline instance profile if they have different but equivalent names. We found that without the support for inline instance profile, remapping is less effective for some large scale change. To add that support, before any remapping lookup happens, all the names in the profile will be inserted into remapper and the Key to the name mapping will be recorded in a map called NameMap in the remapper. During name lookup, a Key will be returned for the given name and it will be used to extract an equivalent name in the profile from NameMap. So with the help of the NameMap, we can translate any given name to an equivalent name in the profile if it exists. Whenever we try to match a name in the module to a name in the profile, we will try the match with the original name first, and if it doesn't match, we will use the equivalent name got from remapper to try the match for another time. In this way, the patch can enhance the profile remapping support for searching inline instance and searching indirect call promotion candidate. In a planned large scale change of int64 type (long long) to int64_t (long), we found the performance of a google internal benchmark degraded by 2% if nothing was done. If existing profile remapping was enabled, the performance degradation dropped to 1.2%. 
If the profile remapping with the current patch was enabled, the performance degradation further dropped to 0.14% (Note the experiment was done before searching indirect call promotion candidate was added. We hope with the remapping support of searching indirect call promotion candidate, the degradation can drop to 0% in the end. It will be evaluated post commit). Differential Revision: https://reviews.llvm.org/D86332	2020-08-26 11:07:35 -07:00
Juneyoung Lee	5d4d13a2ae	[IR] Add NoUndef attribute to Intrinsics.td This patch adds NoUndef to Intrinsics.td. The attribute is attached to llvm.assume's operand, because llvm.assume(undef) is UB. It is attached to pointer operands of several memory accessing intrinsics as well. This change makes ValueTracking::getGuaranteedNonPoisonOps' intrinsic check unnecessary, so it is removed. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86576	2020-08-27 02:54:48 +09:00
Roman Lebedev	b6a0e78067	[Value][InstCombine] Fix one-use checks in PHI-of-op -> Op-of-PHI[s] transforms to be one-user checks As the FIXME said, they really should be checking for a single user, not use, so let's do that. It is not that unusual to have the same value as an incoming value in a PHI node, not unlike how a PHI may have the same incoming basic block more than once. There isn't a nice way to do that: Value::users() isn't uniquified, and Value only tracks its uses, not Users, so the check is potentially costly since it may involve traversing the entire use list of a value.	2020-08-26 20:20:41 +03:00
Roman Lebedev	19662cd96c	[NFC][Value] Fixup comments, "N users" is NOT the same as "N uses". In those cases, it really means "N uses".	2020-08-26 20:20:41 +03:00
Kai Nacke	2ad3f5c93b	[SystemZ/ZOS] Add header file to encapsulate use of <sysexits.h> The non-standard header file `<sysexits.h>` provides some return values. `EX_IOERR` is used as a special value to signal a broken pipe to the clang driver. On z/OS Unix System Services, this header file does not exist. This patch - adds a check for `<sysexits.h>`, removing the dependency on `LLVM_ON_UNIX` - adds a new header file `llvm/Support/ExitCodes`, which either includes `<sysexits.h>` or defines `EX_IOERR` - updates the users of `EX_IOERR` to include the new header file Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D83472	2020-08-26 12:44:30 -04:00
Sjoerd Meijer	20938e0e91	[LV] Fallback strategies if tail-folding fails This implements 2 different vectorisation fallback strategies if tail-folding fails: 1) don't vectorise at all, or 2) vectorise using a scalar epilogue. This can be controlled with option -prefer-predicate-over-epilogue, that has been changed to take a numeric value corresponding to the tail-folding preference and preferred fallback. Patch by: Pierre van Houtryve, Sjoerd Meijer. Differential Revision: https://reviews.llvm.org/D79783	2020-08-26 16:55:25 +01:00
Dibya Ranjan Mishra	ef1ac54803	[Support] Allow printing the stack trace only for a given depth Differential Revision: https://reviews.llvm.org/D85458	2020-08-26 09:27:42 -04:00
Matt Arsenault	0ce967f763	GlobalISel: Combine G_ADD of G_PTRTOINT to G_PTR_ADD This produces less work for addressing mode matching. I think this is safe since I don't think machine IR is supposed to give the same aliasing properties as getelementptr in the IR.	2020-08-26 08:57:15 -04:00
Xing GUO	d25b99b1d8	[DWARFYAML] Make the unit_length and header_length fields optional. This patch makes the unit_length and header_length fields of line tables optional. yaml2obj is able to infer them for us. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D86590	2020-08-26 20:35:10 +08:00
QingShan Zhang	48664955df	[Scheduling] Implement a new way to cluster loads/stores Before calling the target hook to determine if two loads/stores are clusterable, we put them into different groups to avoid fake clusters due to dependency. For now, we are putting the loads/stores into the same group if they have the same predecessor. We assume that, if two loads/stores have the same predecessor, it is likely that they do not depend on each other. However, one SUnit might have several predecessors and for now, we just pick the first predecessor that has a non-data/non-artificial dependency, which is too arbitrary, and we are struggling to fix it. So, I am proposing a better implementation. 1. Collect all the loads/stores that have memory info first to reduce the complexity. 2. Sort these loads/stores so that we can stop the search as early as possible. 3. For each load/store, search for the first non-dependent instruction in the sorted order, and check whether they can cluster. Reviewed By: Jay Foad Differential Revision: https://reviews.llvm.org/D85517	2020-08-26 12:33:59 +00:00
Georgii Rymar	70f1e167ac	[llvm/Object] - Make dyn_cast<XCOFFObjectFile> work as it should. Currently, `dyn_cast<XCOFFObjectFile>` always does cast and returns a pointer, even when we pass `ELF`/`Wasm`/`Mach-O` or `COFF` instead of `XCOFF`. It happens because `XCOFFObjectFile` class does not implement `classof`. I've fixed it and added a unit test. Differential revision: https://reviews.llvm.org/D86542	2020-08-26 15:09:55 +03:00
Gabriel Hjort Åkerlund	6b25df96c2	[GlobalISel] Fix and tidy up documentation for ValueMapping class (NFC) The documentation was missing a '/' in '/<2x32-bit> vadd {0, 64, VPR}', and the example code is now aligned to improve readability. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D86201	2020-08-26 12:09:01 +02:00
sstefan1	d40ac7afef	Reland [IR] Intrinsics default attributes and opt-out flag Intrinsic properties can now be set to default and applied to all intrinsics. If the attributes are not needed, the user can opt-out by setting the DisableDefaultAttributes flag to true. Differential Revision: https://reviews.llvm.org/D70365	2020-08-26 11:37:59 +02:00
Jan Kratochvil	998ea8d0ba	[Support] Speedup llvm-dwarfdump 3.9x Currently `strace llvm-dwarfdump x.debug >/tmp/file`: ioctl(1, TCGETS, 0x7ffd64d7f340) = -1 ENOTTY (Inappropriate ioctl for device) write(1, " DW_AT_decl_line\t(89)\n"..., 4096) = 4096 ioctl(1, TCGETS, 0x7ffd64d7f400) = -1 ENOTTY (Inappropriate ioctl for device) ioctl(1, TCGETS, 0x7ffd64d7f410) = -1 ENOTTY (Inappropriate ioctl for device) ioctl(1, TCGETS, 0x7ffd64d7f400) = -1 ENOTTY (Inappropriate ioctl for device) After this patch: write(1, "0000000000001102 \"strlen\")\n "..., 4096) = 4096 write(1, "site\n DW_AT_low"..., 4096) = 4096 write(1, "d53)\n\n0x000e4d4d: DW_TAG_G"..., 4096) = 4096 The same speedup can be achieved with `--color=0`, but that is not very convenient. This implementation was suggested by Joerg Sonnenberger. Differential Revision: https://reviews.llvm.org/D86406	2020-08-26 10:29:46 +02:00
Shinji Okumura	0e778b809f	[Attributor] Provide an edge-based interface in AAIsDead This patch produces an edge-based interface in AAIsDead. By this, we can query a set of basic blocks that are directly reachable from a given basic block. This is specifically useful for implementation of AAReachability. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D85547	2020-08-26 16:57:52 +09:00
Adrien Guinet	add1ee3a12	[llvm-lipo] Add support for bitcode files A Mach-O universal binary may contain bitcode as a slice. This diff adds proper handling of such binaries to llvm-lipo. Test plan: make check-all Differential revision: https://reviews.llvm.org/D85740	2020-08-25 21:11:18 -07:00
Mircea Trofin	b186c6758c	[MLInliner] Simplify TFUTILS_SUPPORTED_TYPES We only need the C++ type and the corresponding TF Enum. The other parameter was used for the output spec json file, but we can just standardize on the C++ type name there. Differential Revision: https://reviews.llvm.org/D86549	2020-08-25 14:19:39 -07:00
Krzysztof Parzyszek	d984833f21	[SDAG] Improve MemSDNode::getBasePtr It returned getOperand(1), except for STORE for which it returned getOperand(2). Handle MSTORE, MGATHER, and MSCATTER as well.	2020-08-25 15:19:52 -05:00
Juneyoung Lee	65c4cb9e7b	[ValueTracking] Let getGuaranteedNonPoisonOp find multiple non-poison operands This patch helps getGuaranteedNonPoisonOp find multiple non-poison operands. Instead of special-casing llvm.assume, I think it is also a viable option to add noundef to Intrinsics.td. If it makes sense, I'll make a patch for that. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86477	2020-08-26 04:40:21 +09:00
Lang Hames	543a62706c	[ORC] Fix an endif comment.	2020-08-25 11:51:20 -07:00
Jeremy Morse	f5080847e6	[LiveDebugValues] Add switches for using instr-ref variable locations This patch adds the -Xclang option "-fexperimental-debug-variable-locations" and same LLVM CodeGen option, to pick which variable location tracking solution to use. Right now all the switch does is pick which LiveDebugValues implementation to use, the normal VarLoc one or the instruction referencing one in rGae6f78824031. Over time, the aim is to add fragments of support in aid of the value-tracking RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-February/139440.html also controlled by this command line switch. That will slowly move variable locations to be defined by an instruction calculating a value, and a DBG_INSTR_REF instruction referring to that value. Thus, this is going to grow into a "use the new kind of variable locations" switch, rather than just "use the new LiveDebugValues implementation". Differential Revision: https://reviews.llvm.org/D83048	2020-08-25 14:58:48 +01:00
David Sherwood	82c9874179	[SVE] Fix TypeSize related warnings with IR truncates of scalable vectors In getCastInstrCost when the instruction is a truncate we were relying upon the implicit TypeSize -> uint64_t cast when asking if a given type has the same size as a legal integer. I've changed the code to only ask the question if the type is fixed length. I have also changed InstCombinerImpl::SimplifyDemandedUseBits to bail out for now if the type is a scalable vector. I've added the following new tests: Analysis/CostModel/AArch64/sve-trunc.ll Transforms/InstCombine/AArch64/sve-trunc.ll for both of these fixes. Differential revision: https://reviews.llvm.org/D86432	2020-08-25 09:17:56 +01:00
Freddy Ye	3ad559cce3	[X86] Support -march=sapphirerapids Support -march=sapphirerapids for x86. Compared with Icelake Server, it includes 14 new features: amxtile, amxint8, amxbf16, avx512bf16, avx512vp2intersect, cldemote, enqcmd, movdir64b, movdiri, ptwrite, serialize, shstk, tsxldtrk, waitpkg. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D86503	2020-08-25 14:21:21 +08:00
Fangrui Song	9e7d7f3f68	Revert D85812 "[coroutine] should disable inline before calling coro split" This reverts commit 2e43acfed89b1903de473f682c65878bdebc395a. LLVMCoroutines (the library which contains Coroutines.h) depends on LLVMipo (the library which contains SampleProfile.cpp). It is inappropriate for SampleProfile.cpp to depend on Coroutines.h (circular dependency). The test inverted dependencies as well: llvm/test/Transforms/Coroutines/coro-inline.ll uses -sample-profile.	2020-08-24 11:41:05 -07:00
Matt Arsenault	e9313e5937	TableGen/GlobalISel: Allow inst matcher to check multiple opcodes This is to initially handle immAllOnesV, which should match G_BUILD_VECTOR or G_BUILD_VECTOR_TRUNC. In the future, it could be used for other pattern cases that map to multiple G_* instructions, such as G_ADD and G_PTR_ADD.	2020-08-24 13:48:51 -04:00
dongAxis	7a35eee5d4	[coroutine] should disable inline before calling coro split summary: When a callee coroutine function is inlined into a caller coroutine function before the coro-split pass, llvm will emit "coroutine should have exactly one defining @llvm.coro.begin". It seems that the coro-early pass cannot handle this quite well. So we believe that an unsplit coroutine function should not be inlined. This patch fixes the issue by not inlining a function if it has the attribute "coroutine.presplit" (which means the function has not been split). TestPlan: check-llvm Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D85812	2020-08-24 22:22:08 +08:00
Francesco Petrogalli	4b43384841	[llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI] Changes: * Change `ToVectorTy` to deal directly with `ElementCount` instances. * `VF == 1` replaced with `VF.isScalar()`. * `VF > 1` and `VF >=2` replaced with `VF.isVector()`. * `VF <=1` is replaced with `VF.isZero() \|\| VF.isScalar()`. * Replaced the uses of `llvm::SmallSet<ElementCount, ...>` with `llvm::SmallSetVector<ElementCount, ...>`. This avoids the need of an ordering function for the `ElementCount` class. * Bits and pieces around printing the `ElementCount` to string streams. To guarantee that this change is a NFC, `VF.Min` and asserts are used in the following places: 1. When it doesn't make sense to deal with the scalable property, for example: a. When computing unrolling factors. b. When shuffle masks are built for fixed width vector types. In these cases, an assert(!VF.Scalable && "<msg>") has been added to make sure we don't enter codepaths that don't make sense for scalable vectors. 2. When there is a conscious decision to use `FixedVectorType`. These uses of `FixedVectorType` will likely be removed in favour of `VectorType` once the vectorizer is generic enough to deal with both fixed vector types and scalable vector types. 3. When dealing with building constants out of the value of VF, for example when computing the vectorization `step`, or building vectors of indices. These operations _make sense_ for scalable vectors too, but changing the code in these places to be generic and make it work for scalable vectors is to be submitted in a separate patch, as it is a functional change. 4. When building the potential VFs in VPlan. Making the VPlan generic enough to handle scalable vectorization factors is a functional change that needs a separate patch. See for example `void LoopVectorizationPlanner::buildVPlans(unsigned MinVF, unsigned MaxVF)`. 5. 
The class `IntrinsicCostAttribute`: this class still uses `unsigned VF` as updating the field to use `ElementCount` would require changes that could result in changing the behavior of the compiler. Will be done in a separate patch. 6. When dealing with user input for forcing the vectorization factor. In this case, adding support for scalable vectorization is a functional change that might require changes at the command line. Note that in some places the idiom ``` unsigned VF = ... auto VTy = FixedVectorType::get(ScalarTy, VF) ``` has been replaced with ``` ElementCount VF = ... assert(!VF.Scalable && ...); auto VTy = VectorType::get(ScalarTy, VF) ``` The assertion guarantees that the new code is (at least in debug mode) functionally equivalent to the old version. Notice that this change was possible because none of the methods that are specific to `FixedVectorType` were used after the instantiation of `VTy`. Reviewed By: rengolin, ctetreau Differential Revision: https://reviews.llvm.org/D85794	2020-08-24 13:54:03 +00:00
Francesco Petrogalli	0a6aded52a	Revert "[llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI]" Reverting because the commit message doesn't reflect the one agreed on phabricator at https://reviews.llvm.org/D85794. This reverts commit c8d2b065b98fa91139cc7bb1fd1407f032ef252e.	2020-08-24 13:50:55 +00:00
Matt Arsenault	21aca8a3e2	GlobalISel: Reduce G_SHL width if source is extension shl ([sza]ext x, y) => zext (shl x, y). Turns expensive 64 bit shifts into 32 bit if it does not overflow the source type. This is a port of an AMDGPU DAG combine added in 5fa289f0d8ff85b9e14d2f814a90761378ab54ae. InstCombine does this already, but we need to do it again here to apply it to shifts introduced for lowered getelementptrs. This will help matching addressing modes that use 32-bit offsets in a future patch. TableGen annoyingly assumes only a single match data operand, so introduce a reusable struct. However, this still requires defining a separate GIMatchData for every combine, which is still annoying. Adds a morally equivalent function to the existing getShiftAmountTy. Without this, we would have to try to repeatedly query the legalizer info and guess at what type to use for the shift.	2020-08-24 09:42:40 -04:00
Francesco Petrogalli	04c1fe05b7	[llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI] Changes: * Change `ToVectorTy` to deal directly with `ElementCount` instances. * `VF == 1` replaced with `VF.isScalar()`. * `VF > 1` and `VF >=2` replaced with `VF.isVector()`. * `VF <=1` is replaced with `VF.isZero() \|\| VF.isScalar()`. * Add `<` operator to `ElementCount` to be able to use `llvm::SmallSetVector<ElementCount, ...>`. * Bits and pieces around printing the ElementCount to string streams. * Added a static method to `ElementCount` to represent a scalar. To guarantee that this change is a NFC, `VF.Min` and asserts are used in the following places: 1. When it doesn't make sense to deal with the scalable property, for example: a. When computing unrolling factors. b. When shuffle masks are built for fixed width vector types. In these cases, an assert(!VF.Scalable && "<msg>") has been added to make sure we don't enter codepaths that don't make sense for scalable vectors. 2. When there is a conscious decision to use `FixedVectorType`. These uses of `FixedVectorType` will likely be removed in favour of `VectorType` once the vectorizer is generic enough to deal with both fixed vector types and scalable vector types. 3. When dealing with building constants out of the value of VF, for example when computing the vectorization `step`, or building vectors of indices. These operations _make sense_ for scalable vectors too, but changing the code in these places to be generic and make it work for scalable vectors is to be submitted in a separate patch, as it is a functional change. 4. When building the potential VFs in VPlan. Making the VPlan generic enough to handle scalable vectorization factors is a functional change that needs a separate patch. See for example `void LoopVectorizationPlanner::buildVPlans(unsigned MinVF, unsigned MaxVF)`. 5. 
The class `IntrinsicCostAttribute`: this class still uses `unsigned VF` as updating the field to use `ElementCount` would require changes that could result in changing the behavior of the compiler. Will be done in a separate patch. 6. When dealing with user input for forcing the vectorization factor. In this case, adding support for scalable vectorization is a functional change that might require changes at the command line. Differential Revision: https://reviews.llvm.org/D85794	2020-08-24 13:39:42 +00:00
Sam Parker	9295052eb8	[SCEV] Still (again) trying to fix buildbots	2020-08-24 11:24:30 +01:00
Sam Parker	0828b21ed0	[SCEV] Still trying to fix windows buildbots	2020-08-24 10:26:48 +01:00
Sam Parker	83a53e27fa	[SCEV] Attempt to fix windows buildbots	2020-08-24 08:29:22 +01:00
Sam Parker	edc1713733	[SCEV] Add operand methods to Cast and UDiv Add methods to access operands in a similar manner to NAryExpr. Differential Revision: https://reviews.llvm.org/D86083	2020-08-24 06:57:07 +01:00
Craig Topper	31a27aaaa3	[X86] Allow 32-bit mode only CPUs with -mtune on 64-bit targets gcc errors on this, but I'm nervous that since -mtune has been ignored by clang for so long that there may be code bases out there that pass 32-bit cpus to clang.	2020-08-22 16:38:05 -07:00
Matt Arsenault	4175888640	GlobalISel: Merge FewerElements for G_BUILD_VECTOR/G_CONCAT_VECTORS This switches from using G_EXTRACT in odd cases to widen with undef and unmerge.	2020-08-22 10:25:53 -04:00
Florian Hahn	384a619d03	[DSE,MemorySSA] Use BatchAA for AA queries. We can use BatchAA to avoid some repeated AA queries. We only remove stores, so I think we will get away with using a single BatchAA instance for the complete run. The changes in AliasAnalysis.h mirror the changes in D85583. The change improves compile-time by roughly 1%. http://llvm-compile-time-tracker.com/compare.php?from=67ad786353dfcc7633c65de11601d7823746378e&to=10529e5b43809808e8c198f88fffd8f756554e45&stat=instructions This is part of the patches to bring down compile-time to the level referenced in http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D86275	2020-08-22 08:36:35 +01:00
Sourabh Singh Tomar	9241b8151b	[DebugInfo][flang]Added support for representing Fortran assumed length strings This patch adds support for representing Fortran `character(n)`. The patch is primarily based on D54114 with appropriate modifications. Test case IR is generated using our downstream classic-flang. We're in the process of upstreaming flang PRs, but classic-flang has dependencies on llvm, so this has to get in first. The patch includes a functional test case for both IR and the corresponding dwarf; furthermore, it has been manually tested using GDB. Source snippet: ``` program assumedLength call sub('Hello') call sub('Goodbye') contains subroutine sub(string) implicit none character(len=*), intent(in) :: string print *, string end subroutine sub end program assumedLength ``` GDB: ``` (gdb) ptype string type = character (5) (gdb) p string $1 = 'Hello' ``` Reviewed By: aprantl, schweitz Differential Revision: https://reviews.llvm.org/D86305	2020-08-22 10:13:40 +05:30
Alina Sbirlea	d3e372ea8f	[DomTree] Extend update API to allow a post CFG view. Extend the `applyUpdates` in DominatorTree to allow a post CFG view, different from the current CFG. This patch implements the functionality of updating an already up to date DT, to the desired PostCFGView. Combining a set of updates towards an up to date DT and a PostCFGView is not yet supported. Differential Revision: https://reviews.llvm.org/D85472	2020-08-21 17:23:08 -07:00
Roman Lebedev	29a87631f2	Temporarily revert "[SimplifyCFG][LoopRotate] SimplifyCFG: disable common instruction hoisting by default, enable late in pipeline" As discussed in post-commit review starting with https://reviews.llvm.org/D84108#2227365 while this appears to be mostly a win overall, especially code-size-wise, this appears to shake //certain// code patterns in a way that is extremely unfavorable for performance (+30% runtime regression) on certain CPUs (I personally can't reproduce). So until the behaviour is better understood, and a path forward is mapped, let's back this out for now. This reverts commit 1d51dc38d89bd33fb8874e242ab87b265b4dec1c.	2020-08-22 00:33:22 +03:00
Nicolai Hähnle	04c6c33b48	MachineSSAUpdater: Allow initialization with just a register class The register class is required for inserting PHIs, but the "current virtual register" isn't actually used for anything, so let's remove it while we're at it. Differential Revision: https://reviews.llvm.org/D85602 Change-Id: I1e647f31570ef21a7ea8e20db3454178e98a6a8b	2020-08-21 23:04:35 +02:00
Alina Sbirlea	cfc3eafb38	[DomTree] Avoid creating an empty GD to reduce compile time.	2020-08-21 13:45:00 -07:00
Serguei Katkov	facbe0803e	[InstCombine] Remove unused entries in gc-live bundle of statepoint If some of the gc live values are not used in gc.relocate, we can remove them from the gc-live bundle of the statepoint instruction. The CL also removes duplicated Values in the gc-live bundle. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D85959	2020-08-22 01:36:22 +07:00
Kamau Bridgeman	710eb84124	[PowerPC][PCRelative] Thread Local Storage Support for Initial Exec This patch is the initial support for the Initial Exec Thread Local Storage model to produce code sequence and relocations correct to the ABI for the model when using PC relative memory operations. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D81947	2020-08-21 10:13:11 -05:00
diggerlin	c4598749ca	[AIX][XCOFF] emit symbol visibility for xcoff object file. SUMMARY: Reviewers: Jason liu Differential Revision: https://reviews.llvm.org/D84265	2020-08-21 11:00:56 -04:00
Florian Hahn	bd231a8fec	Recommit "[SCEVExpander] Add helper to clean up instrs inserted while expanding." Recommit the patch after fixing an issue reported caused by the fact that re-used values are also added to InsertedValues. Additional tests have been added in 88818491b9dea64ec65c92ce5652bc45bef337a4 This reverts the revert commit 38884641f28e373ce291dc5ea93416756216e536.	2020-08-21 15:04:17 +01:00
Xing GUO	54320a5f65	Recommit: [DWARFYAML] Add support for referencing different abbrev tables. The original commit (f7ff0ace96db9164dcde232c36cab6519ea4fce8) was causing a build failure and was reverted in 6d242a73264ef1e3e128547f00e0fe2d20d3ada0 ==================== Original Commit Message ==================== This patch adds support for referencing different abbrev tables. We use 'ID' to distinguish abbrev tables and use 'AbbrevTableID' to explicitly assign an abbrev table to compilation units. The syntax is: ``` debug_abbrev: - ID: 0 Table: ... - ID: 1 Table: ... debug_info: - ... AbbrevTableID: 1 ## Reference the second abbrev table. - ... AbbrevTableID: 0 ## Reference the first abbrev table. ``` Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D83116	2020-08-21 19:02:10 +08:00
Roman Lebedev	2893862e92	[NFC] Port InstCount pass to new pass manager	2020-08-21 12:39:42 +03:00
Yevgeny Rouban	1bdf10a116	[NewPM][PassInstrumentation] Add PreservedAnalyses parameter to AfterPass* callbacks Both AfterPass and AfterPassInvalidated pass instrumentation callbacks get additional parameter of type PreservedAnalyses. This patch was created by @fedor.sergeev. I have just slightly changed it. Reviewers: fedor.sergeev Differential Revision: https://reviews.llvm.org/D81555	2020-08-21 16:10:42 +07:00
David Green	53fac1f9ad	[ARM][LV] Add a preferPredicatedReductionSelect target hook As part of D84741, this adds a target hook for the preferPredicatedReductionSelect option and makes use of it under MVE, allowing us to tail predicate most reduction loops. Differential Revision: https://reviews.llvm.org/D85980	2020-08-21 08:48:12 +01:00
Qiu Chaofan	f862a86f7a	[PowerPC] Add readflm/setflm intrinsics to Clang Commit dbcfbffc adds ppc.readflm and ppc.setflm intrinsics to read or write FPSCR register. This patch adds them to Clang. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D85874	2020-08-21 15:12:19 +08:00
Xing GUO	7720abc372	Revert "[DWARFYAML] Add support for referencing different abbrev tables." This reverts commit f7ff0ace96db9164dcde232c36cab6519ea4fce8. This change is causing build failure. http://lab.llvm.org:8011/builders/clang-cmake-armv7-global-isel/builds/10400	2020-08-21 12:15:54 +08:00
Yevgeny Rouban	ae25ecf12b	[ADT] Allow IsSizeLessThanThresholdT for incomplete types. NFC If the type T is incomplete then sizeof(T) results in a C++ compilation error at the line: static constexpr bool value = sizeof(T) <= (2 * sizeof(void *)); This patch allows incomplete types in parameters of function. Example: using SomeFuncType = void(SomeIncompleteType &); llvm::unique_function<SomeFuncType> SomeFunc; Reviewers: DaniilSuchkov, vvereschaka Differential Revision: https://reviews.llvm.org/D81554	2020-08-21 11:01:57 +07:00
Xing GUO	de84ba17c9	[DWARFYAML] Add support for referencing different abbrev tables. This patch adds support for referencing different abbrev tables. We use 'ID' to distinguish abbrev tables and use 'AbbrevTableID' to explicitly assign an abbrev table to compilation units. The syntax is: ``` debug_abbrev: - ID: 0 Table: ... - ID: 1 Table: ... debug_info: - ... AbbrevTableID: 1 ## Reference the second abbrev table. - ... AbbrevTableID: 0 ## Reference the first abbrev table. ``` Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D83116	2020-08-21 11:44:25 +08:00
Xing GUO	6752521113	[DWARFYAML] Add support for emitting multiple abbrev tables. This patch adds support for emitting multiple abbrev tables. Currently, compilation units will always reference the first abbrev table. Reviewed By: jhenderson, labath Differential Revision: https://reviews.llvm.org/D86194	2020-08-21 10:12:08 +08:00
Michael Liao	1cf2d56956	[amdgpu] Add codegen support for HIP dynamic shared memory. Summary: - HIP uses an unsized extern array `extern __shared__ T s[]` to declare the dynamic shared memory, whose size is not known at compile time. Reviewers: arsenm, yaxunl, kpyzhov, b-sumner Subscribers: kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82496	2020-08-20 21:29:18 -04:00
Jon Roelofs	3f612e225b	Fix a couple of typos. NFC	2020-08-20 14:56:57 -06:00
Kamau Bridgeman	7be92ab238	[PowerPC][PCRelative] Thread Local Storage Support for General Dynamic This patch is the initial support for the General Dynamic Thread Local Storage model to produce code sequence and relocations correct to the ABI for the model when using PC relative memory operations. Patch by: NeHuang Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D82315	2020-08-20 15:08:13 -05:00
Simon Pilgrim	3ca52d2ee4	Fix Wdocumentation unknown parameter warning. NFC.	2020-08-20 12:41:34 +01:00
Vitaly Buka	8be4d9ede0	[APInt] Allow self-assignment with libstdc++ http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/8256/steps/test-check-all/logs/FAIL%3A%20LLVM%3A%3Athinlto-function-summary-paramaccess.ll Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D86053	2020-08-20 04:14:40 -07:00
Sebastian Neubauer	b1fe63844a	[AMDGPU] Add A16/G16 to InstCombine When sampling from images with coordinates that only have 16 bit accuracy, convert the image intrinsic call to use a16 or g16. This does only happen if the target hardware supports it. An alternative would be to always apply this combination, independent of the target hardware and extend 16 bit arguments to 32 bit arguments during legalization. To me, this sounds like an unnecessary roundtrip that could prevent some further InstCombine optimizations. Differential Revision: https://reviews.llvm.org/D85887	2020-08-20 10:51:49 +02:00
Georgii Rymar	ab00fa2ce1	[yaml2obj] - Make the 'Machine' key optional. Currently we have to set 'Machine' to something in our YAML descriptions. Usually we use 'EM_X86_64' for 64-bit targets and 'EM_386' for 32-bit targets. At the same time, in most cases our tests do not need a machine type and we can use 'EM_NONE'. This is cleaner because it avoids the need to use a particular machine. In this patch I've made the 'Machine' key optional (the default value, when it is not specified, is `EM_NONE`) and removed it (where possible) from yaml2obj, obj2yaml and llvm-readobj tests. There are a few tests left where I decided not to remove it, because I didn't want to touch CHECK lines or do anything more complex than removing a "Machine: *" line and formatting the lines around it. Differential revision: https://reviews.llvm.org/D86202	2020-08-20 11:40:51 +03:00
Bevin Hansson	0867948ac1	[IR] Add FixedPointBuilder. This patch adds a convenience class for using FixedPointSemantics to build fixed-point operations in IR. RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-August/144025.html Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D85314	2020-08-20 10:29:57 +02:00
Bevin Hansson	9531c6209d	[ADT] Move FixedPoint.h from Clang to LLVM. This patch moves FixedPointSemantics and APFixedPoint from Clang to LLVM ADT. This will make it easier to use the fixed-point classes in LLVM for constructing an IR builder for fixed-point and for reusing the APFixedPoint class for constant evaluation purposes. RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-August/144025.html Reviewed By: leonardchan, rjmccall Differential Revision: https://reviews.llvm.org/D85312	2020-08-20 10:29:45 +02:00
Johannes Doerfert	864a7559d8	Revert "[IR] Intrinsics default attributes and opt-out flag" This commit introduced a non-trivial compile time regression that needs to be addressed: https://reviews.llvm.org/D70365#2227627 Given that it is unclear how long that will take, I'll revert it for now. This reverts commit eedf18fc1f5fc71bb896204abf41fc5a2dbf25f7.	2020-08-20 00:25:32 -05:00
Matt Arsenault	734b071bb5	GlobalISel: Implement fewerElementsVector for G_CONCAT_VECTORS sources This fixes <6 x s16> = G_CONCAT_VECTORS from <3 x s16> handling.	2020-08-19 18:53:24 -04:00
Francesco Petrogalli	5fbee91974	[llvm] Add default constructor of `llvm::ElementCount`. This patch prevents failures like those reported in http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/34173. We have enabled the default constructor for `llvm::ElementCount` to make sure the code compiles on Windows. Reviewed By: ormris Differential Revision: https://reviews.llvm.org/D86240	2020-08-19 21:39:24 +00:00
Sanjay Patel	042574c236	[ValueTracking] define/use max recursion depth in header There's a potential motivating case to increase this limit in PR47191: http://bugs.llvm.org/PR47191 But first we should make it less hacky. The limit in InstCombine is directly tied to this value because an increase there can cause asserts in the underlying value tracking calls if not changed together. The usage in VectorUtils is independent, but the comment suggests that we should use the same value unless there's a known reason to diverge. There are similar limits in codegen analysis, but I think we should leave those independent in case we intentionally want the optimization power/cost to be different there. Differential Revision: https://reviews.llvm.org/D86113	2020-08-19 16:56:59 -04:00
Matt Arsenault	1318b038fa	GlobalISel: Add TargetLowering member to LegalizerHelper	2020-08-19 14:50:35 -04:00
Matt Arsenault	f48b6ba0a8	GlobalISel: Use Register	2020-08-19 13:45:31 -04:00
Mehdi Amini	db235b2187	Revert "Revert "[NFC][llvm] Make the contructors of `ElementCount` private."" Was reverted because MLIR/Flang builds were broken, these APIs have been fixed in the meantime.	2020-08-19 17:26:36 +00:00
Mehdi Amini	4386b1823a	Revert "[NFC][llvm] Make the contructors of `ElementCount` private." This reverts commit 264afb9e6aebc98c353644dd0700bec808501cab. (and dependent 6b742cc48 and fc53bd610f) MLIR/Flang are broken.	2020-08-19 17:21:37 +00:00
Jessica Paquette	153c17604a	[GlobalISel] Add combine for (x & mask) -> x when (x & mask) == x If we have a mask, and a value x, where (x & mask) == x, we can drop the AND and just use x. This is about a 0.4% geomean code size improvement on CTMark at -O3 for AArch64. In AArch64, this is most useful post-legalization. Patterns like this often show up when legalizing s1s, which must be extended to larger types. e.g. ``` %cmp:_(s32) = G_ICMP ... %and:_(s32) = G_AND %cmp, 1 ``` Since G_ICMP only produces a single bit, there's no reason to mask it with the G_AND. Differential Revision: https://reviews.llvm.org/D85463	2020-08-19 10:20:57 -07:00
Francesco Petrogalli	d75808bc7f	[NFC][llvm] Make the contructors of `ElementCount` private. Differential Revision: https://reviews.llvm.org/D86120	2020-08-19 16:26:44 +00:00
Bjorn Pettersson	1813b6efab	[GlobalISel] Untabify InstructionSelectorImpl.h. NFC	2020-08-19 12:00:00 +02:00
sstefan1	de2379255b	[IR] Intrinsics default attributes and opt-out flag Intrinsic properties can now be set to default and applied to all intrinsics. If the attributes are not needed, the user can opt-out by setting the DisableDefaultAttributes flag to true. Differential Revision: https://reviews.llvm.org/D70365	2020-08-19 10:50:46 +02:00
Ronak Chauhan	4697f34ed6	Revert "[AMDGPU] Support disassembly for AMDGPU kernel descriptors" This reverts commit cacfb02d28a3cabd4e45d2535cb0686cef48a2c9. Reverting due to buildbot failures.	2020-08-19 13:12:29 +05:30
David Sherwood	f7a1832d69	[SVE][CodeGen] Fix scalable vector issues in DAGTypeLegalizer::GenWidenVectorLoads In DAGTypeLegalizer::GenWidenVectorLoads the algorithm assumes it only ever deals with fixed width types, hence the offsets for each individual store never take 'vscale' into account. I've changed the code in that function to use TypeSize instead of unsigned for tracking the remaining load amount. In addition, I've changed the load loop to use the new IncrementPointer helper function for updating the addresses in each iteration, since this handles scalable vector types. Also, I've added report_fatal_errors in GenWidenVectorExtLoads, TargetLowering::scalarizeVectorLoad and TargetLowering::scalarizeVectorStores, since these functions currently use a sequence of element-by-element scalar loads/stores. In a similar vein, I've also added a fatal error report in FindMemType for the case when we decide to return the element type for a scalable vector type. I've added new tests in CodeGen/AArch64/sve-split-load.ll CodeGen/AArch64/sve-ld-addressing-mode-reg-imm.ll for the changes in GenWidenVectorLoads. Differential Revision: https://reviews.llvm.org/D85909	2020-08-19 07:54:32 +01:00
Yaxun (Sam) Liu	6660be7005	[HIP] Support target id by --offload-arch This patch introduces support of target id by -offload-arch. Differential Revision: https://reviews.llvm.org/D60620	2020-08-18 23:43:53 -04:00
Ronak Chauhan	142f4dd209	[AMDGPU] Support disassembly for AMDGPU kernel descriptors Decode AMDGPU Kernel descriptors as assembler directives. Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D80713	2020-08-19 08:49:07 +05:30
Elliott Hughes	8e3a33cacc	ld128 demangle: allow space for 'L' suffix. Summary: Caught by HWASAN on arm64 Android (which uses ld128 for long double). This was running the existing fuzzer. The specific minimized fuzz input to reproduce this is: __cxa_demangle("1\006ILeeeEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE", 0, 0, 0); Reviewers: eugenis, srhines, #libc_abi! Subscribers: kristof.beyls, danielkiss, libcxx-commits Tags: #libc_abi Differential Revision: https://reviews.llvm.org/D77924	2020-08-18 16:14:05 -07:00
Jessica Paquette	67ae683e5b	[GlobalISel][CallLowering] NFC: Unify flag-setting from CallBase + AttributeList It's annoying to have to maintain multiple, nearly identical chains of if statements which all set the same attributes. Add a helper function, `addFlagsUsingAttrFn` which performs the attribute setting. Then, use wrappers for that function in `lowerCall` and `setArgFlags`. (Note that the flag-setting code in `setArgFlags` was missing the returned attribute. There's no selection for this yet, so no test. It's an example of the kind of thing this lets us avoid, though.) Differential Revision: https://reviews.llvm.org/D86159	2020-08-18 11:07:33 -07:00
Matt Arsenault	418515b7d0	GlobalISel: Implement fewerElementsVector for G_INSERT_VECTOR_ELT Add unit tests since AMDGPU will only trigger this for gigantic vectors, and won't use the annoying odd sized breakdown case.	2020-08-18 13:51:19 -04:00
David Blaikie	01ab206194	[WIP][DebugInfo] Lazily parse debug_loclist offsets Parsing DWARFv5 debug_loclist offsets when a CU is parsed is weighing down memory usage of symbolizers that don't need to parse this data at all. There's not much benefit to caching these anyway - since they are O(1) lookup and reading once you know where the offset list starts (and can do bounds checking with the offset list size too). In general, I think it might be time to start paying down some of the technical debt of loc/loclist/range/rnglist parsing to try to unify it a bit more. eg: * Currently DWARFUnit has: RangeSection, RangeSectionBase, LocSection, LocSectionBase, LocTable, RngListTable, LoclistTableHeader (it would be nice if these were all wrapped up in two variables - one for loclists, one for rnglists) * rnglists and loclists are handled differently (see: LoclistTableHeader, but no RnglistTableHeader) * maybe all these types could be less stateful - lazily parse what they need to, even reparsing rather than caching because it doesn't seem too expensive, for instance. (though admittedly so long as it's constant cost/overhead per compilation that's probably adequate) * Maybe implementing and using a DWARFDataExtractor that can be sub-ranged (so we could slice it up to just the single contribution) - though maybe that's not so useful because loc/ranges need to refer to it by absolute, not contribution-relative mechanisms Differential Revision: https://reviews.llvm.org/D86110	2020-08-18 10:49:39 -07:00
Amara Emerson	f6bce1ffcd	[GlobalISel] Add a combine for sext_inreg(load x), c --> sextload x This is restricted to single-use loads; if we fold these to sextloads, we can find better addressing modes on AArch64. This also fixes an overload of the MachineFunction::getMachineMemOperand() method which was incorrectly using the MF alignment instead of the MMO alignment. Differential Revision: https://reviews.llvm.org/D85966	2020-08-18 10:42:15 -07:00
Amara Emerson	d1d273ff1c	[GlobalISel] Add a combine for ashr(shl x, c), c --> sext_inreg x, c' By detecting this sign extend pattern early, we can uncover opportunities for more optimizations. Differential Revision: https://reviews.llvm.org/D85965	2020-08-18 10:42:15 -07:00
Jessica Paquette	7e08e6c7a3	[GlobalISel][CallLowering] Look through call parameters for flags We weren't looking through the parameters on calls at all. E.g., say you had ``` declare i32 @zext(i32 zeroext %x) ... %y = call i32 @zext(i32 %something) ... ``` At the point of the call, we wouldn't know that the %something should have the zeroext attribute. This sets flags in about the same way as TargetLoweringBase::ArgListEntry::setAttributes. Differential Revision: https://reviews.llvm.org/D86125	2020-08-18 08:48:56 -07:00
Ronak Chauhan	6e3663ae70	[ELF] Hide target specific methods as private Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D86136	2020-08-18 18:26:08 +05:30
Ronak Chauhan	0f95014e38	[llvm-objdump][AMDGPU] Detect CPU string AMDGPU ISA isn't backwards compatible and hence -mcpu must always be specified during disassembly. However, the AMDGPU target CPU is stored in e_flags in the ELF object. This patch allows targets to implement CPU string detection, and also implements it for AMDGPU by looking at e_flags. Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D84519	2020-08-18 17:43:16 +05:30
Shinji Okumura	8816b1755f	[Attributor] Deduce noundef attribute This patch introduces a new abstract attribute `AANoUndef` which corresponds to `noundef` IR attribute and deduce them. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D85184	2020-08-18 18:05:54 +09:00
Johannes Doerfert	117fb9d08e	[Attributor][NFC] Directly return proper type to avoid casts	2020-08-17 23:36:36 -05:00
Harmen Stoppels	cca791c2df	Use find_library for ncurses Currently it is hard to avoid having LLVM link to the system install of ncurses, since it uses check_library_exists to find e.g. libtinfo and not find_library or find_package. With this change the ncurses lib is found with find_library, which also considers CMAKE_PREFIX_PATH. This solves an issue for the spack package manager, where we want to use the zlib installed by spack, and spack provides the CMAKE_PREFIX_PATH for it. This is a similar change as https://reviews.llvm.org/D79219, which just landed in master. Differential revision: https://reviews.llvm.org/D85820	2020-08-17 19:52:52 -07:00
Amy Kwan	c630ca1f5c	[PowerPC] Implement Vector Extract Mask builtins in LLVM/Clang This patch implements the vec_extractm function prototypes in altivec.h in order to utilize the vector extract with mask instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82675	2020-08-17 21:14:17 -05:00
Hamilton Tobon Mosquera	59a4de0434	[OpenMPOpt][HideMemTransfersLatency] Split __tgt_target_data_begin_mapper into its "issue" and "wait" counterparts. WIP that tries to hide the latency of runtime calls that involve host to device memory transfers by splitting them into their "issue" and "wait" versions. The "issue" is moved upwards as much as possible. The "wait" is moved downwards as much as possible. The "issue" issues the memory transfer asynchronously, returning a handle. The "wait" waits on the returned handle for the memory transfer to finish. The movement itself is not implemented yet.	2020-08-17 20:56:10 -05:00
Hongtao Yu	43bf988191	[llvm-objdump] Symbolize binary addresses for low-noisy asm diff. When diffing disassembly dump of two binaries, I see lots of noises from mismatched jump target addresses and global data references, which unnecessarily causes diffs on every function, making it impractical. I'm trying to symbolize the raw binary addresses to minimize the diff noise. In this change, a local branch target is modeled as a label and the branch target operand will simply be printed as a label. Local labels are collected by a separate pre-decoding pass beforehand. A global data memory operand will be printed as a global symbol instead of the raw data address. Unfortunately, due to the way the disassembler is set up and to be less intrusive, a global symbol is always printed as the last operand of a memory access instruction. This is less than ideal but is probably acceptable from checking code quality point of view since on most targets an instruction can have at most one memory operand. So far only the X86 disassemblers are supported. Test Plan: llvm-objdump -d --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr : ``` Disassembly of section .text: <_start>: push rax mov dword ptr [rsp + 4], 0 mov dword ptr [rsp], 0 mov eax, dword ptr [rsp] cmp eax, dword ptr [rip + 4112] # 202182 <g> jge 0x20117e <_start+0x25> call 0x201158 <foo> inc dword ptr [rsp] jmp 0x201169 <_start+0x10> xor eax, eax pop rcx ret ``` llvm-objdump -d --symbolize-operands --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr : ``` Disassembly of section .text: <_start>: push rax mov dword ptr [rsp + 4], 0 mov dword ptr [rsp], 0 <L1>: mov eax, dword ptr [rsp] cmp eax, dword ptr <g> jge <L0> call <foo> inc dword ptr [rsp] jmp <L1> <L0>: xor eax, eax pop rcx ret ``` Note that the jump instructions like `jge 0x20117e <_start+0x25>` without this work is printed as a real target address and an offset from the leading symbol. 
With a change in the optimizer that adds/deletes an instruction, the address and offset may shift for targets placed after the instruction. This is a problem when diffing the disassembly from two optimizers, because such branch target address changes produce unnecessary false positives. With `--symbolize-operands`, a label is printed for a branch target instead to reduce the false positives. Similarly, the disassembly of PC-relative global variable references is also prone to instruction insertion/deletion. Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D84191	2020-08-17 16:55:12 -07:00
Matt Arsenault	4a4ddffaf1	GlobalISel: Make type for lower action more consistently optional Some of the lower implementations were relying on this; however, whether the type was set depended on which form of the .lower* helper you were using. For instance, if you used an unconditional lower(), the type was never set. Most of the lower actions do not benefit from a type parameter, and just expand in terms of the original operation's types. However, some lowerings could benefit from an additional type hint to combine a promotion and an expansion. An example of this is for add/sub sat. The DAG integer legalization tries to use smarter expansions directly when promoting the integer type, and doesn't always produce the same instruction with a wider type. Treat this as an optional hint argument, that only means something for specific lower actions. It may be useful to generalize this mechanism to pass a full list of type indexes and desired types, but I haven't run into a case like that yet.	2020-08-17 16:24:55 -04:00
diggerlin	78cee19819	[AIX][XCOFF][Patch1] Provide decoding trace back table information API for xcoff object file for llvm-objdump -d SUMMARY: 1. This patch provides an API for decoding the traceback table info, and unit tests for this API. 2. Further patches will do the following: 2.1 add a new option --traceback-table to decode the traceback table information for an xcoff object file when using llvm-objdump to disassemble it. 2.2 print out the traceback table information for llvm-objdump. Reviewers: Jason liu, Hubert Tong, James Henderson Differential Revision: https://reviews.llvm.org/D81585	2020-08-17 16:23:47 -04:00
Dávid Bolvanský	26599cbe3f	Revert "[BPI] Improve static heuristics for integer comparisons" This reverts commit 50c743fa713002fe4e0c76d23043e6c1f9e9fe6f. Patch will be split to smaller ones.	2020-08-17 20:44:33 +02:00
Valentin Clement	a11d8d3dcc	[flang][directives] Use TableGen to generate clause unparsing Use the TableGen directive back-end to generate code for the clauses unparsing. Reviewed By: sscalpone, kiranchandramohan Differential Revision: https://reviews.llvm.org/D85851	2020-08-17 14:22:25 -04:00
Matt Arsenault	116b71f4c8	GlobalISel: Fix parameter name in doxygen comment	2020-08-17 13:57:10 -04:00
Matt Arsenault	a71dba11e9	GlobalISel: Revisit users of other merge opcodes in artifact combiner The artifact combiner searches for the uses of G_MERGE_VALUES for unmerge/trunc that need further combining. This also needs to handle the vector merge opcodes the same way. This fixes leaving behind some pairs I expected to be removed, that were if the legalizer is run a second time.	2020-08-17 13:56:53 -04:00
Matt Arsenault	5bcccd9fb4	GlobalISel: Remove unnecessary check for copy type COPY isn't allowed to change the type, but can mix no type with type.	2020-08-17 09:19:25 -04:00
Alex Zinenko	544267f834	[llvm] support graceful failure of DataLayout parsing Existing implementation always aborts on syntax errors in a DataLayout description. While this is meaningful for consuming textual IR modules, it is inconvenient for users that may need fine-grained control over the layout from, e.g., command-line options. Propagate errors through the parsing functions and only abort in the top-level parsing function instead. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D85650	2020-08-17 15:10:37 +02:00
Simon Pilgrim	8366289c89	[DemandedBits] Improve accuracy of Add propagator The current demand propagator for addition will mark all input bits at and right of the alive output bit as alive. But carry won't propagate beyond a bit for which both operands are zero (or one/zero in the case of subtraction) so a more accurate answer is possible given known bits. I derived a propagator by working through truth tables and using a bit-reversed addition to make demand ripple to the right, but I'm not sure how to make a convincing argument for its correctness in the comments yet. Nevertheless, here's a minimal implementation and test to get feedback. This would help in a situation where, for example, four bytes (<128) packed into an int are added with four others SIMD-style but only one of the four results is actually read. Known A: 0_______0_______0_______0_______ Known B: 0_______0_______0_______0_______ AOut: 00000000001000000000000000000000 AB, current: 00000000001111111111111111111111 AB, patch: 00000000001111111000000000000000 Committed on behalf of: @rrika (Erika) Differential Revision: https://reviews.llvm.org/D72423	2020-08-17 12:54:09 +01:00
Vitaly Buka	6f71d99b21	[StackSafety] Skip ambiguous lifetime analysis If we can't identify the alloca used in a lifetime marker, we need to assume the worst-case scenario. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D84630	2020-08-16 18:05:52 -07:00
Fady Ghanim	a11b92e493	[OpenMP][OMPBuilder] Adding support for `omp single` This adds support for generating `omp single`, and necessary calls for `copyprivate` clause. Differential Revision: https://reviews.llvm.org/D85617	2020-08-16 01:15:16 -04:00
Wenlei He	8c3d7a1d09	[InlineAdvisor] New inliner advisor to replay inlining from optimization remarks This change adds a new inline advisor that takes optimization remarks from previous inlining as input, and provides the decision as advice so current inlining can replay inline decisions of a different compilation. The Dwarf inline stack with line and discriminator is used as the anchor for call sites, including call context. The change can be useful for Inliner tuning as it provides a channel to allow external input for tweaking inline decisions. Existing alternatives like the alwaysinline attribute are per-function, not per-callsite. A per-callsite inline intrinsic could be another solution (not yet existing), but it's intrusive to implement and also does not differentiate call context. A switch -sample-profile-inline-replay=<inline_remarks_file> is added to hook up the new inline advisor with SampleProfileLoader's inline decision for replay. Since SampleProfileLoader does top-down inlining, the inline decision can be specialized for each call context, hence we should be able to replay inlining accurately. However, with a bottom-up inliner like CGSCC inlining, the replay can be limited due to lack of specialization for different call contexts. Apart from that limitation, the new inline advisor can still be used by the regular CGSCC inliner later if needed for tuning purposes. This is a resubmit of https://reviews.llvm.org/D83743	2020-08-15 20:17:21 -07:00
Aditya Kumar	d33455c31b	[NFC] Fix typo and variable names	2020-08-15 09:06:22 -07:00


42450 Commits