llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00

Author	SHA1	Message	Date
Alexander Kornienko	66580103e2	Replace size method call of containers to empty method where appropriate This patch was generated by a clang tidy checker that is being open sourced. The documentation of that checker is the following: /// The emptiness of a container should be checked using the empty method /// instead of the size method. It is not guaranteed that size is a /// constant-time function, and it is generally more efficient and also shows /// clearer intent to use empty. Furthermore some containers may implement the /// empty method but not implement the size method. Using empty whenever /// possible makes it easier to switch to another container in the future. Patch by Gábor Horváth! llvm-svn: 226161	2015-01-15 11:41:30 +00:00
Chandler Carruth	88fd126216	[PM] Separate the TargetLibraryInfo object from the immutable pass. The pass is really just a means of accessing a cached instance of the TargetLibraryInfo object, and this way we can re-use that object for the new pass manager as its result. Lots of delta, but nothing interesting happening here. This is the common pattern that is developing to allow analyses to live in both the old and new pass manager -- a wrapper pass in the old pass manager emulates the separation intrinsic to the new pass manager between the result and pass for analyses. llvm-svn: 226157	2015-01-15 10:41:28 +00:00
NAKAMURA Takumi	0068e08baa	Update libdeps since TLI was moved from Target to Analysis in r226078. llvm-svn: 226126	2015-01-15 05:21:00 +00:00
Erik Eckstein	0b9c664d3e	reapply: SLPVectorizer: Cache results from memory alias checking. This speeds up the dependency calculations for blocks with many load/store/call instructions. Beside the improved runtime, there is no functional change. Compared to the original commit, this re-applied commit contains a bug fix which ensures that there are no incorrect collisions in the alias cache. llvm-svn: 225977	2015-01-14 11:24:47 +00:00
Hao Liu	b1d2970451	Fix a wrong comment in LoopVectorize. I.E. more than two -> exactly two Fix a typo function name in LoopVectorize. I.E. collectStrideAcccess() -> collectStrideAccess() llvm-svn: 225935	2015-01-14 03:02:16 +00:00
Julien Lerouge	4c884b9211	Fix non-determinism issue in SLP The issue was introduced in r214638: + for (auto &BSIter : BlocksSchedules) { + scheduleBlock(BSIter.second.get()); + } Because BlocksSchedules is a DenseMap with BasicBlock* keys, blocks are scheduled in non-deterministic order, resulting in unpredictable IR. Patch by Daniel Reynaud! llvm-svn: 225821	2015-01-13 19:45:52 +00:00
Erik Eckstein	1693292059	Revert "SLPVectorizer: Cache results from memory alias checking." The alias cache has a problem of incorrect collisions in case a new instruction is allocated at the same address as a previously deleted instruction. llvm-svn: 225790	2015-01-13 14:36:46 +00:00
Erik Eckstein	cc1597341f	SLPVectorizer: Cache results from memory alias checking. This speeds up the dependency calculations for blocks with many load/store/call instructions. Beside the improved runtime, there is no functional change. llvm-svn: 225786	2015-01-13 11:37:51 +00:00
Michael Zolotukhin	cffb7c60eb	Update comment. llvm-svn: 225553	2015-01-09 22:15:06 +00:00
Michael Zolotukhin	21e05ef27d	Remove duplicating code. NFC. The removed condition is checked in the previous loop. llvm-svn: 225542	2015-01-09 20:36:19 +00:00
Suyog Sarda	c9ba894c1b	Assumption that "VectorizedValue" will always be an Instruction is not correct. It can be a constant or a vector argument. ex : define i32 @hadd(<4 x i32> %a) #0 { entry: %vecext = extractelement <4 x i32> %a, i32 0 %vecext1 = extractelement <4 x i32> %a, i32 1 %add = add i32 %vecext, %vecext1 %vecext2 = extractelement <4 x i32> %a, i32 2 %add3 = add i32 %add, %vecext2 %vecext4 = extractelement <4 x i32> %a, i32 3 %add5 = add i32 %add3, %vecext4 ret i32 %add5 } llvm-svn: 225517	2015-01-09 10:23:48 +00:00
Jiangning Liu	3fc6a7e69d	Fixed a bug in memory dependence checking module of loop vectorization. The following loop should not be vectorized with current algorithm. {code} // loop body ... = a[i] (1) ... = a[i+1] (2) ....... a[i+1] = .... (3) a[i] = ... (4) {code} The algorithm tries to collect memory access candidates from AliasSetTracker, and then check memory dependences one another. The memory accesses are unique in AliasSetTracker, and a single memory access in AliasSetTracker may map to multiple entries in AccessAnalysis, which could cover both 'read' and 'write'. Originally the algorithm only checked 'write' entry in Accesses if only 'write' exists. This is incorrect and the consequence is it ignored all read access, and finally some RAW and WAR dependence are missed. For the case given above, if we ignore two reads, the dependence between (1) and (3) would not be able to be captured, and finally this loop will be incorrectly vectorized. The fix simply inserts a new loop to find all entries in Accesses. Since it will skip most of all other memory accesses by checking the Value pointer at the very beginning of the loop, it should not increase compile-time visibly. llvm-svn: 225159	2015-01-05 10:08:58 +00:00
Chandler Carruth	c140bae640	[PM] Split the AssumptionTracker immutable pass into two separate APIs: a cache of assumptions for a single function, and an immutable pass that manages those caches. The motivation for this change is two fold. Immutable analyses are really hacks around the current pass manager design and don't exist in the new design. This is usually OK, but it requires that the core logic of an immutable pass be reasonably partitioned off from the pass logic. This change does precisely that. As a consequence it also paves the way for the many utility functions that deal in the assumptions to live in both pass manager worlds by creating an separate non-pass object with its own independent API that they all rely on. Now, the only bits of the system that deal with the actual pass mechanics are those that actually need to deal with the pass mechanics. Once this separation is made, several simplifications become pretty obvious in the assumption cache itself. Rather than using a set and callback value handles, it can just be a vector of weak value handles. The callers can easily skip the handles that are null, and eventually we can wrap all of this up behind a filter iterator. For now, this adds boiler plate to the various passes, but this kind of boiler plate will end up making it possible to port these passes to the new pass manager, and so it will end up factored away pretty reasonably. llvm-svn: 225131	2015-01-04 12:03:27 +00:00
Elena Demikhovsky	d6e3f2ad88	Some code improvements in Masked Load/Store. No functional changes. llvm-svn: 224986	2014-12-30 14:28:14 +00:00
Elena Demikhovsky	4a153fb55a	Masked Load/Store - Changed the order of parameters in intrinsics. No functional changes. The documentation is coming. llvm-svn: 224829	2014-12-25 07:49:20 +00:00
Tilmann Scheller	6ecc887cd6	[BBVectorize] Remove two more redundant assignments. Found by the Clang static analyzer. llvm-svn: 224590	2014-12-19 17:21:38 +00:00
Tilmann Scheller	a5daa85a3d	[BBVectorize] Remove redundant assignment. Found by the Clang static analyzer. llvm-svn: 224589	2014-12-19 17:13:12 +00:00
Tilmann Scheller	8e09191da7	[LoopVectorize] Remove redundant assignment. Found by the Clang static analyzer. llvm-svn: 224587	2014-12-19 17:02:31 +00:00
Suyog Sarda	ea83428380	Revert 224119 "This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders them for bundling into vector of loads, and vectorizes it." This was re-ordering floating point data types resulting in mismatch in output. llvm-svn: 224424	2014-12-17 10:34:27 +00:00
Elena Demikhovsky	fe73fcc29b	Masked Load and Store Intrinsics in loop vectorizer. The loop vectorizer optimizes loops containing conditional memory accesses by generating masked load and store intrinsics. This decision is target dependent. http://reviews.llvm.org/D6527 llvm-svn: 224334	2014-12-16 11:50:42 +00:00
Elena Demikhovsky	b5f1976682	Loop Vectorizer minor changes in the code - some comments, function names, identation. Reviewed here: http://reviews.llvm.org/D6527 llvm-svn: 224218	2014-12-14 09:43:50 +00:00
Suyog Sarda	d1c6149e4d	This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders them for bundling into vector of loads, and vectorizes it. Test case : float hadd(float* a) { return (a[0] + a[1]) + (a[2] + a[3]); } AArch64 assembly before patch : ldp s0, s1, [x0] ldp s2, s3, [x0, #8] fadd s0, s0, s1 fadd s1, s2, s3 fadd s0, s0, s1 ret AArch64 assembly after patch : ldp d0, d1, [x0] fadd v0.2s, v0.2s, v1.2s faddp s0, v0.2s ret Reviewed Link : http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141208/248531.html llvm-svn: 224119	2014-12-12 12:53:44 +00:00
Michael Zolotukhin	54137f16c1	Remove redundant variable. Tested by adding assert(LoopVectorPreHeader == VecPreheader) on LLVM test suite and SPECs. llvm-svn: 223847	2014-12-09 22:45:07 +00:00
Duncan P. N. Exon Smith	3d57886267	IR: Split Metadata from Value Split `Metadata` away from the `Value` class hierarchy, as part of PR21532. Assembly and bitcode changes are in the wings, but this is the bulk of the change for the IR C++ API. I have a follow-up patch prepared for `clang`. If this breaks other sub-projects, I apologize in advance :(. Help me compile it on Darwin I'll try to fix it. FWIW, the errors should be easy to fix, so it may be simpler to just fix it yourself. This breaks the build for all metadata-related code that's out-of-tree. Rest assured the transition is mechanical and the compiler should catch almost all of the problems. Here's a quick guide for updating your code: - `Metadata` is the root of a class hierarchy with three main classes: `MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from the `Value` class hierarchy. It is typeless -- i.e., instances do not have a `Type`. - `MDNode`'s operands are all `Metadata ` (instead of `Value `). - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively. If you're referring solely to resolved `MDNode`s -- post graph construction -- just use `MDNode`. - `MDNode` (and the rest of `Metadata`) have only limited support for `replaceAllUsesWith()`. As long as an `MDNode` is pointing at a forward declaration -- the result of `MDNode::getTemporary()` -- it maintains a side map of its uses and can RAUW itself. Once the forward declarations are fully resolved RAUW support is dropped on the ground. This means that uniquing collisions on changing operands cause nodes to become "distinct". (This already happened fairly commonly, whenever an operand went to null.) If you're constructing complex (non self-reference) `MDNode` cycles, you need to call `MDNode::resolveCycles()` on each node (or on a top-level node that somehow references all of the nodes). Also, don't do that. Metadata cycles (and the RAUW machinery needed to construct them) are expensive. - An `MDNode` can only refer to a `Constant` through a bridge called `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`). As a side effect, accessing an operand of an `MDNode` that is known to be, e.g., `ConstantInt`, takes three steps: first, cast from `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`; third, cast down to `ConstantInt`. The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have metadata schema owners transition away from using `Constant`s when the type isn't important (and they don't care about referring to `GlobalValue`s). In the meantime, I've added transitional API to the `mdconst` namespace that matches semantics with the old code, in order to avoid adding the error-prone three-step equivalent to every call site. If your old code was: MDNode N = foo(); bar(isa <ConstantInt>(N->getOperand(0))); baz(cast <ConstantInt>(N->getOperand(1))); bak(cast_or_null <ConstantInt>(N->getOperand(2))); bat(dyn_cast <ConstantInt>(N->getOperand(3))); bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4))); you can trivially match its semantics with: MDNode N = foo(); bar(mdconst::hasa <ConstantInt>(N->getOperand(0))); baz(mdconst::extract <ConstantInt>(N->getOperand(1))); bak(mdconst::extract_or_null <ConstantInt>(N->getOperand(2))); bat(mdconst::dyn_extract <ConstantInt>(N->getOperand(3))); bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4))); and when you transition your metadata schema to `MDInt`: MDNode N = foo(); bar(isa <MDInt>(N->getOperand(0))); baz(cast <MDInt>(N->getOperand(1))); bak(cast_or_null <MDInt>(N->getOperand(2))); bat(dyn_cast <MDInt>(N->getOperand(3))); bay(dyn_cast_or_null<MDInt>(N->getOperand(4))); - A `CallInst` -- specifically, intrinsic instructions -- can refer to metadata through a bridge called `MetadataAsValue`. This is a subclass of `Value` where `getType()->isMetadataTy()`. `MetadataAsValue` is the only class that can legally refer to a `LocalAsMetadata`, which is a bridged form of non-`Constant` values like `Argument` and `Instruction`. It can also refer to any other `Metadata` subclass. (I'll break all your testcases in a follow-up commit, when I propagate this change to assembly.) llvm-svn: 223802	2014-12-09 18:38:53 +00:00
Duncan P. N. Exon Smith	cc86eec0f7	LoopVectorize: Remove unnecessary RAUW Remove an unnecessary `MDNode::replaceAllUsesWith()`. In the preceding line, `TheLoop->setLoopID()` visits all backedges and sets the new loop ID. This sufficiently updates the loop metadata. Metadata RAUW is going away as part of PR21532. llvm-svn: 223210	2014-12-03 05:41:20 +00:00
Michael Zolotukhin	37717c3cf1	PR21302. Vectorize only bottom-tested loops. rdar://problem/18886083 llvm-svn: 223171	2014-12-02 22:59:06 +00:00
Duncan P. N. Exon Smith	73ce6dbb2b	Revert "Masked Vector Load and Store Intrinsics." This reverts commit r222632 (and follow-up r222636), which caused a host of LNT failures on an internal bot. I'll respond to the commit on the list with a reproduction of one of the failures. Conflicts: lib/Target/X86/X86TargetTransformInfo.cpp llvm-svn: 222936	2014-11-28 21:29:14 +00:00
Elena Demikhovsky	36a2243ab7	Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator. Examples: <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align /, <16 x i1> %mask) declare void @llvm.masked.store.v8f64(i8 %addr, <8 x double> %value, i32 4, <8 x i1> %mask) Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch. http://reviews.llvm.org/D6191 llvm-svn: 222632	2014-11-23 08:07:43 +00:00
Suyog Sarda	e6a1f30c00	Vectorize a reduction chain feeding into a 'return' statement. e.x return (a[0]+b[0]) + (a[1]+b[1]) Differential Revision: http://reviews.llvm.org/D6227 llvm-svn: 222364	2014-11-19 16:07:38 +00:00
David Blaikie	60e6c80905	Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This lead to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions... llvm-svn: 222334	2014-11-19 07:49:26 +00:00
Duncan P. N. Exon Smith	64f455849e	IR: Make MDString::getName() private Hide the fact that `MDString`'s string is stored in `Value::Name` -- that's going to change soon. Update the only in-tree client that was using it instead of `Value::getString()`. Part of PR21532. llvm-svn: 221951	2014-11-13 23:59:16 +00:00
Duncan P. N. Exon Smith	8770505e4e	Revert "IR: MDNode => Value" Instead, we're going to separate metadata from the Value hierarchy. See PR21532. This reverts commit r221375. This reverts commit r221373. This reverts commit r221359. This reverts commit r221167. This reverts commit r221027. This reverts commit r221024. This reverts commit r221023. This reverts commit r220995. This reverts commit r220994. llvm-svn: 221711	2014-11-11 21:30:22 +00:00
David Majnemer	15ad95a47c	LoopVectorize: Don't assume pointees are sized A pointer's pointee might not be sized: the pointee could be a function. Report this as IK_NoInduction when calculating isInductionVariable. This fixes PR21508. llvm-svn: 221501	2014-11-07 00:31:14 +00:00
Duncan P. N. Exon Smith	8f49c8202f	IR: MDNode => Value: Instruction::getAllMetadataOtherThanDebugLoc() Change `Instruction::getAllMetadataOtherThanDebugLoc()` from a vector of `MDNode` to one of `Value`. Part of PR21433. llvm-svn: 221167	2014-11-03 18:13:57 +00:00
Duncan P. N. Exon Smith	7004fd9aac	IR: MDNode => Value: Instruction::getMetadata() Change `Instruction::getMetadata()` to return `Value` as part of PR21433. Update most callers to use `Instruction::getMDNode()`, which wraps the result in a `cast_or_null<MDNode>`. llvm-svn: 221024	2014-11-01 00:10:31 +00:00
Michael Zolotukhin	2a32061dc5	Correctly update dom-tree after loop vectorizer. llvm-svn: 221009	2014-10-31 22:28:03 +00:00
NAKAMURA Takumi	e24697a037	Reformat partially, where I touched for whitespace changes. llvm-svn: 220773	2014-10-28 11:54:52 +00:00
NAKAMURA Takumi	f2d570a79b	Untabify and whitespace cleanups. llvm-svn: 220771	2014-10-28 11:53:30 +00:00
Benjamin Kramer	7eb4c6d7ca	LoopVectorize: Simplify code. No functionality change. llvm-svn: 220405	2014-10-22 19:13:54 +00:00
Matt Arsenault	74dd906076	Add minnum / maxnum intrinsics These are named following the IEEE-754 names for these functions, rather than the libm fmin / fmax to avoid possible ambiguities. Some languages may implement something resembling fmin / fmax which return NaN if either operand is to propagate errors. These implement the IEEE-754 semantics of returning the other operand if either is a NaN representing missing data. llvm-svn: 220341	2014-10-21 23:00:20 +00:00
Hal Finkel	9eb64770d5	[SLPVectorize] Basic ephemeral-value awareness The SLP vectorizer should not vectorize ephemeral values. These are used to express information to the optimizer, and vectorizing them does not lead to faster code (because the ephemeral values are dropped prior to code generation, vectorized or not), and obscures the information the instructions are attempting to communicate (the logic that interprets the arguments to @llvm.assume generically does not understand vectorized conditions). Also, uses by ephemeral values are free (because they, and the necessary extractelement instructions, will be dropped prior to code generation). llvm-svn: 219816	2014-10-15 17:35:01 +00:00
Eric Christopher	58efc65f68	No need to cache this unused variable. Patch by Ehsan Akhgari. llvm-svn: 219749	2014-10-14 23:58:51 +00:00
Hal Finkel	3a23f40590	[LoopVectorize] Ignore @llvm.assume for cost estimates and legality A few minor changes to prevent @llvm.assume from interfering with loop vectorization. First, treat @llvm.assume like the lifetime intrinsics, which are scalarized (but don't otherwise interfere with the legality checking). Second, ignore the cost of ephemeral instructions in the loop (these will go away anyway during CodeGen). Alignment assumptions and other uses of @llvm.assume can often end up inside of loops that should be vectorized (this is not uncommon for assumptions generated by __attribute__((align_value(n))), for example). llvm-svn: 219741	2014-10-14 22:59:49 +00:00
Chandler Carruth	75b9591ece	[SCEV] Fix one more caller blindly passing the latch to SCEV's getSmallConstantTripCount even when it isn't the exiting block. I missed this in my first audit, very sorry. This was found in LNT and elsewhere. I don't have a test case, but it was completely obvious from inspection that this was the problem. I'll see if I can reduce a test case, but I'm not really hopeful, and the value seems quite low. llvm-svn: 219562	2014-10-11 05:28:30 +00:00
Chandler Carruth	9e809b8dad	[SCEV] Add some asserts to the recently improved trip count computation routines and fix all of the bugs they expose. I hit a test case that crashed even without these asserts due to passing a non-exiting latch to the ExitingBlock parameter of the trip count computation machinery. However, when I add the nice asserts, it turns out we have plenty of coverage of these bugs, they just didn't manifest in crashers. The core problem seems to stem from an assumption that the latch is the exiting block. While this is often true, and somewhat the "normal" way to think about loops, it isn't necessarily true. The correct way to call the trip count routines in a generic fashion (that is, without a particular exit in mind) is to just use the loop's single exiting block if it has one. The trip count can't be computed generically unless it does. This works great for the loop vectorizer. The loop unroller actually wants to select the latch when it has to chose between multiple exits because for unrolling it is the latch trips that matter. But if this is the desire, it needs to explicitly guard for non-exiting latches and check for the generic trip count in that case. I've added the asserts, and added convenience APIs for querying the trip count generically that check for a single exit block. I've kept the APIs consistent between computing trip count and trip multiples. Thansk to Mark for the help debugging and tracking down the right fix here! llvm-svn: 219550	2014-10-11 00:12:11 +00:00
Sanjay Patel	8030ed3639	Rename getMaximumUnrollFactor -> getMaxInterleaveFactor; also rename option names controlling this variable. "Unroll" is not the appropriate name for this variable. Clang already uses the term "interleave" in pragmas and metadata for this. Differential Revision: http://reviews.llvm.org/D5066 llvm-svn: 217528	2014-09-10 17:58:16 +00:00
Sanjay Patel	fbe1366788	Preserve IR flags (nsw, nuw, exact, fast-math) in SLP vectorizer (PR20802). The SLP vectorizer should propagate IR-level optimization hints/flags (nsw, nuw, exact, fast-math) when converting scalar instructions into vectors. But this isn't a simple copy - we need to take the intersection (the logical 'and') of the sets of flags on the scalars. The solution is further complicated because we can have non-uniform (non-SIMD) vector ops after: http://reviews.llvm.org/D4015 http://llvm.org/viewvc/llvm-project?view=revision&revision=211339 The vast majority of changed files are existing tests that were not propagating IR flags, but I've also added a new test file for focused testing of IR flag possibilities. Differential Revision: http://reviews.llvm.org/D5172 llvm-svn: 217051	2014-09-03 17:40:30 +00:00
Sanjay Patel	0f9db96fbc	Change name of copyFlags() to copyIRFlags(). Add convenience method for logical 'and' of all flags. NFC. Adding 'IR' to the names in an attempt to be less ambiguous about the flags we're dealing with here. The 'and' method is needed by the SLPVectorizer (PR20802) and possibly other passes. llvm-svn: 217004	2014-09-03 01:06:50 +00:00
Yi Jiang	94e9de2c27	Generate extract for in-tree uses if the use is scalar operand in vectorized instruction. radar://18144665 llvm-svn: 216946	2014-09-02 21:00:39 +00:00
Sanjay Patel	24d425f290	Add a convenience method to copy wrapping, exact, and fast-math flags (NFC). The loop vectorizer preserves wrapping, exact, and fast-math properties of scalar instructions. This patch adds a convenience method to make that operation easier because we need to do this in the loop vectorizer, SLP vectorizer, and possibly other places. Although this is a 'no functional change' patch, I've added a testcase to verify that the exact flag is preserved by the loop vectorizer. The wrapping and fast-math flags are already checked in existing testcases. Differential Revision: http://reviews.llvm.org/D5138 llvm-svn: 216886	2014-09-01 18:44:57 +00:00

1 2 3 4 5 ...

684 Commits