llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-01 16:33:37 +01:00

Author	SHA1	Message	Date
Nadav Rotem	adfe58a7ad	flip the scev minus direction to simplify the code. llvm-svn: 186466	2013-07-16 22:57:06 +00:00
Nadav Rotem	633bd23118	SLPVectorizer: Improve the compile time of isConsecutive by adding a simple constant-gep check before using SCEV. This check does not always work because not all of the GEPs use a constant offset, but it happens often enough to reduce the number of times we use SCEV. llvm-svn: 186465	2013-07-16 22:51:07 +00:00
Nadav Rotem	ebe95f88ed	SLPVectorizer: Reduce the compile time of the consecutive store lookup. Process groups of stores in chunks of 16. llvm-svn: 186420	2013-07-16 15:25:17 +00:00
Nadav Rotem	f1e0aebca1	PR16628: Fix a bug in the code that merges compares. Compares return i1 but they compare different types. llvm-svn: 186359	2013-07-15 22:52:48 +00:00
Nadav Rotem	37c18f5cda	SLPVectorizer: change the order in which we search for vectorization candidates. Do stores first and PHIs second. llvm-svn: 186277	2013-07-14 06:15:46 +00:00
Craig Topper	58fa7a9b4a	Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size. llvm-svn: 186274	2013-07-14 04:42:23 +00:00
Arnold Schwaighofer	970f54281c	LoopVectorizer: Disallow reductions whose header phi is used outside the loop. If an outside-loop user of the reduction value uses the header phi node, we cannot just reduce the vectorized phi value in the vector code epilog because we would lose VF-1 reductions. lp: p = phi (0, lv); lv = lv + 1; ...; brcond , lp, outside; outside: usr = add 0, p (Say the loop iterates two times, the value of p coming out of the loop is one). We cannot just transform this to: vlp: p = phi (<0,0>, lv); lv = lv + <1,1>; ...; brcond , lp, outside; outside: p_reduced = p[0] + p[1]; usr = add 0, p_reduced (Because the original loop iterated two times the vectorized loop would iterate one time, but p_reduced ends up being zero instead of one). We would have to execute VF-1 iterations in the scalar remainder loop in such cases. For now, just disable vectorization. PR16522 llvm-svn: 186256	2013-07-13 19:09:29 +00:00
Andrew Trick	651c624842	LoopVectorize fix: LoopInfo must be valid when invoking utils like SCEVExpander. In general, one should always complete CFG modifications first, update CFG-based analyses, like Dominators and LoopInfo, then generate instruction sequences. LoopVectorizer was creating a new loop, calling SCEVExpander to generate checks, then updating LoopInfo. I just changed the order. llvm-svn: 186241	2013-07-13 06:20:06 +00:00
Arnold Schwaighofer	b9c37551bc	TargetTransformInfo: address calculation parameter for gather/scatter. Address calculation for gather/scatter in vectorized code can incur a significant cost, making vectorization unbeneficial. Add infrastructure to add cost. Tests and cost model for targets will be in follow-up commits. radar://14351991 llvm-svn: 186187	2013-07-12 19:16:02 +00:00
Nadav Rotem	ee62470368	SLPVectorizer: Sink and enable CSE for ExtractElements. llvm-svn: 186145	2013-07-12 06:09:24 +00:00
Nadav Rotem	1e6246b38c	SLPVectorize: Replace the code that checks for vectorization candidates in successor blocks with code that scans PHINodes. Before we could vectorize PHINodes, scanning successors was a good way of finding candidates. Now we can vectorize the PHINodes, which is simpler. llvm-svn: 186139	2013-07-12 00:04:18 +00:00
Nadav Rotem	cd9c4e430f	Remove an argument that we don't use anymore. llvm-svn: 186116	2013-07-11 20:56:13 +00:00
Arnold Schwaighofer	a8667081e1	LoopVectorize: Vectorize all accesses in address space zero with unit stride. We can vectorize them because, in the case where we wrap in the address space, the unvectorized code would have had to access a pointer value of zero, which is undefined behavior in address space zero according to the LLVM IR semantics. (Thank you, Duncan, for pointing this out to me). Fixes PR16592. llvm-svn: 186088	2013-07-11 15:21:55 +00:00
Nadav Rotem	965af9cb85	Fix a warning. llvm-svn: 186064	2013-07-11 05:39:02 +00:00
Nadav Rotem	8e1d89128f	SLPVectorizer: refactor the code that places extracts. Place the code that decides where to put extracts in the build-tree phase. This allows us to take the cost of the extracts into account. llvm-svn: 186058	2013-07-11 04:54:05 +00:00
Nadav Rotem	417c1a3150	Fix PR16571, which is a bug in the code that checks that all of the types in the bundle are uniform. llvm-svn: 185970	2013-07-09 21:38:08 +00:00
Nadav Rotem	07b212b07b	Set the default insert point to the first instruction, and not to end() llvm-svn: 185953	2013-07-09 17:55:36 +00:00
Nadav Rotem	c43699ed15	This patch changes the saved IRBuilder insert point from BasicBlock::iterator to AssertingVH. Commit 185883 fixes a bug in the IRBuilder that should fix the ASan bot. AssertingVH can help in exposing some RAUW problems. Thanks Ben and Alexey! llvm-svn: 185886	2013-07-08 23:31:13 +00:00
Nadav Rotem	709b733114	Clear the builder insert point between tree-vectorization phases. llvm-svn: 185777	2013-07-07 14:57:18 +00:00
Nadav Rotem	883fb8ad80	SLPVectorizer: Implement DCE as part of vectorization. This is a complete rewrite of the bottom-up vectorization class. Before this commit we scanned the instruction tree 3 times: first in search of merge points for the trees, second for estimating the cost, and finally for vectorization. There was a lot of code duplication and adding the DCE exposed bugs. The new design is simpler and DCE was a part of the design. In this implementation we build the tree once. After that we estimate the cost by scanning the different entries in the constructed tree (in any order). The vectorization phase also works on the built tree. llvm-svn: 185774	2013-07-07 06:57:07 +00:00
Craig Topper	783617eba7	Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. llvm-svn: 185606	2013-07-04 01:31:24 +00:00
Arnold Schwaighofer	26461b04c7	LoopVectorize: Math functions only read rounding mode. Math functions are marked as readonly because they read the floating point rounding mode. Because we don't vectorize loops that contain function calls that set the rounding mode, it is safe to ignore this memory read. llvm-svn: 185299	2013-07-01 00:54:44 +00:00
Benjamin Kramer	78189f47c1	LoopVectorizer: Pack MemAccessInfo pairs. llvm-svn: 185263	2013-06-29 17:52:08 +00:00
Benjamin Kramer	55f216aff4	Move helper classes into anonymous namespaces. llvm-svn: 185262	2013-06-29 17:02:06 +00:00
Nadav Rotem	201beeb585	We preserve the CFG and some of the analysis passes. llvm-svn: 185251	2013-06-29 05:38:15 +00:00
Nadav Rotem	aed4323517	Update docs. llvm-svn: 185250	2013-06-29 05:37:19 +00:00
Nadav Rotem	12c3b510fa	SLP Vectorizer: Add support for trees with external users. To support this we have to insert 'extractelement' instructions to pick the right lane. We had this functionality before but I removed it when we moved to the multi-block design because it was too complicated. llvm-svn: 185230	2013-06-28 22:07:09 +00:00
Nadav Rotem	803f7c3932	LoopVectorizer: Refactor the code that checks if it is safe to predicate blocks. In this code we keep track of pointers that we are allowed to read from, if they are accessed by non-predicated blocks. We use this list to allow vectorization of conditional loads in predicated blocks because we know that these addresses don't segfault. llvm-svn: 185214	2013-06-28 20:46:27 +00:00
Arnold Schwaighofer	e6284189a7	LoopVectorize: Pull dyn_cast into setDebugLocFromInst llvm-svn: 185168	2013-06-28 17:14:48 +00:00
Arnold Schwaighofer	09766b6b9f	LoopVectorize: Use static function instead of DebugLocSetter class. I used the class to safely reset the state of the builder's debug location. I think I have caught all places where we need to set the debug location to a new one. Therefore, we can replace the class by a function that just sets the debug location. llvm-svn: 185165	2013-06-28 16:26:54 +00:00
Arnold Schwaighofer	d6aee045b3	LoopVectorize: Preserve debug location info radar://14169017 llvm-svn: 185122	2013-06-28 00:38:54 +00:00
Arnold Schwaighofer	c0e3a07c99	LoopVectorize: Cache edge masks created during if-conversion. Otherwise, we end up with an exponential IR blowup. Fixes PR16472. llvm-svn: 185097	2013-06-27 20:31:06 +00:00
Arnold Schwaighofer	ccd78deec7	LoopVectorize: Use vectorized loop invariant gep index anchored in loop. Use the vectorized instruction instead of the original instruction anchored in the original loop. Fixes PR16452 and t2075.c of PR16455. llvm-svn: 185081	2013-06-27 15:11:55 +00:00
Arnold Schwaighofer	18efca433e	LoopVectorize: Don't store a reversed value in the vectorized value map. When we store values for reversed induction stores we must not store the reversed value in the vectorized value map. Another instruction might use this value. This fixes 3 test cases of PR16455. llvm-svn: 185051	2013-06-27 00:45:41 +00:00
Nadav Rotem	195bbbe54b	No need to use a Set when a vector would do. llvm-svn: 185047	2013-06-27 00:14:13 +00:00
Nadav Rotem	897ca82595	SLP: When searching for vectorization opportunities scan the blocks in post-order because we grow chains upwards. llvm-svn: 185041	2013-06-26 23:44:45 +00:00
Nadav Rotem	962b32446e	SLP: Don't erase instructions during vectorization because it prevents the outer loops from iterating over the instructions. llvm-svn: 185040	2013-06-26 23:43:23 +00:00
Nadav Rotem	860aebf69a	Erase all of the instructions that we RAUWed llvm-svn: 184969	2013-06-26 17:16:09 +00:00
Nadav Rotem	e0a5a586b8	Do not add CSE-ed instructions into the visited map because we don't want to consider them as candidates for replacement of instructions to be visited. llvm-svn: 184966	2013-06-26 16:54:53 +00:00
Nadav Rotem	a8fba65221	SLPVectorizer: support slp-vectorization of PHINodes between basic blocks llvm-svn: 184888	2013-06-25 23:04:09 +00:00
Nadav Rotem	8fcb707c24	Fix a typo in the code that collected the costs recursively. llvm-svn: 184827	2013-06-25 05:30:56 +00:00
Nadav Rotem	eff545235c	Rename the variable to fix a warning. Thanks Andy Gibbs. llvm-svn: 184749	2013-06-24 15:59:47 +00:00
Arnold Schwaighofer	0a98597e80	Reapply 184685 after the SetVector iteration order fix. This should hopefully have fixed the stage2/stage3 miscompare on the dragonegg testers. "LoopVectorize: Use the dependence test utility class. We no longer need alias analysis - the cases that alias analysis would handle are now handled as accesses with a large dependence distance. We can now vectorize loops with simple constant dependence distances. for (i = 8; i < 256; ++i) { a[i] = a[i+4] * a[i+8]; } for (i = 8; i < 256; ++i) { a[i] = a[i-4] * a[i-8]; } We would be able to vectorize about 200 more loops (in many cases the cost model instructs us not to) in the test suite now. Results on x86-64 are a wash. I have seen one degradation in ammp. Interestingly, the function in which we now vectorize a loop is never executed so we probably see some instruction cache effects. There is a 2% improvement in h264ref. Some TSVC loop kernels speed up. radar://13681598" llvm-svn: 184724	2013-06-24 12:09:15 +00:00
Arnold Schwaighofer	75b76bf92f	LoopVectorize: Use SetVector for the access set. We are creating the runtime checks using this set so we need a deterministic iteration order. llvm-svn: 184723	2013-06-24 12:09:12 +00:00
Arnold Schwaighofer	f022b11b08	Revert "LoopVectorize: Use the dependence test utility class" This reverts commit cbfa1ca993363ca5c4dbf6c913abc957c584cbac. We are seeing a stage2 and stage3 miscompare on some dragonegg bots. llvm-svn: 184690	2013-06-24 06:10:41 +00:00
Arnold Schwaighofer	c49cd1a668	LoopVectorize: Use the dependence test utility class. We no longer need alias analysis - the cases that alias analysis would handle are now handled as accesses with a large dependence distance. We can now vectorize loops with simple constant dependence distances. for (i = 8; i < 256; ++i) { a[i] = a[i+4] * a[i+8]; } for (i = 8; i < 256; ++i) { a[i] = a[i-4] * a[i-8]; } We would be able to vectorize about 200 more loops (in many cases the cost model instructs us not to) in the test suite now. Results on x86-64 are a wash. I have seen one degradation in ammp. Interestingly, the function in which we now vectorize a loop is never executed so we probably see some instruction cache effects. There is a 2% improvement in h264ref. Some TSVC loop kernels speed up. radar://13681598 llvm-svn: 184685	2013-06-24 03:55:48 +00:00
Arnold Schwaighofer	f9828b092b	LoopVectorize: Add utility class for checking dependency among accesses. This class checks dependences by subtracting two Scalar Evolution access functions, allowing us to catch very simple linear dependences. The checker assumes source order in determining whether vectorization is safe. We currently don't reorder accesses. Positive true dependencies need to be a multiple of VF, otherwise we impede store-load forwarding. llvm-svn: 184684	2013-06-24 03:55:45 +00:00
Arnold Schwaighofer	67714fedcd	LoopVectorize: Add utility class for building sets of dependent accesses. Sets of dependent accesses are built by unioning sets based on underlying objects. This class will be used by the upcoming dependence checker. llvm-svn: 184683	2013-06-24 03:55:44 +00:00
Nadav Rotem	6c2ae14dc5	SLP Vectorizer: Add support for vectorizing parts of the tree. Until now we detected the vectorizable tree and evaluated the cost of the entire tree. With this patch we can decide to trim out branches of the tree that are not profitable to vectorize. Also, increase the max depth from 6 to 12. In the worst possible case, where all of the code is made of diamond-shaped graphs, this can bring the cost to 2**10, but diamonds are not very common. llvm-svn: 184681	2013-06-24 02:52:43 +00:00
Nadav Rotem	5f8e32a66f	SLP Vectorizer: Fix a bug in the code that does CSE on the generated gather sequences. Make sure that we don't replace and RAUW two sequences if one does not dominate the other. llvm-svn: 184674	2013-06-23 21:57:27 +00:00
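
The second loop quoted in commit c49cd1a668, `a[i] = a[i-4] * a[i-8]`, reads values written 4 and 8 iterations earlier. The sketch below is not the LoopVectorizer's dependence checker (which, per f9828b092b, subtracts Scalar Evolution access functions). It only emulates a vector loop in plain C++ to show why such a constant dependence distance bounds the usable vector width. The names `makeInput`, `runScalar`, and `runWide`, and the array length of 24, are hypothetical choices for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: input array with a[i] = i + 2.
std::vector<std::uint64_t> makeInput(std::size_t n) {
  std::vector<std::uint64_t> a(n);
  for (std::size_t i = 0; i < n; ++i)
    a[i] = i + 2;
  return a;
}

// Scalar reference: a[i] = a[i - 4] * a[i - 8] for i in [8, n).
std::vector<std::uint64_t> runScalar(std::vector<std::uint64_t> a) {
  for (std::size_t i = 8; i < a.size(); ++i)
    a[i] = a[i - 4] * a[i - 8];
  return a;
}

// Emulated vector loop of width vf: every lane loads its operands first,
// then every lane stores. Assumes (a.size() - 8) is a multiple of vf, so
// no scalar remainder loop is needed.
std::vector<std::uint64_t> runWide(std::vector<std::uint64_t> a,
                                   std::size_t vf) {
  std::vector<std::uint64_t> lanes(vf);
  for (std::size_t i = 8; i + vf <= a.size(); i += vf) {
    for (std::size_t l = 0; l < vf; ++l)
      lanes[l] = a[i + l - 4] * a[i + l - 8];
    for (std::size_t l = 0; l < vf; ++l)
      a[i + l] = lanes[l];
  }
  return a;
}
```

With a 24-element input, `runWide(makeInput(24), 4)` matches `runScalar(makeInput(24))`, because no lane reads an element that an earlier lane of the same vector iteration writes. At width 8 the results differ: lanes 4 to 7 read elements 8 to 11 before lanes 0 to 3 of the same iteration have stored them.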

387 Commits
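
The reduction problem described in commit 970f54281c can be replayed in plain C++. This is an emulation of the IR in that commit message, not LLVM code. It reads the update `lv = lv + 1` as `lv = p + 1`, which is an assumption made so that `p` leaves a two-iteration loop with the value one, as the message states. The names `ExitValues`, `runScalarLoop`, and `runVectorLoopVF2` are hypothetical.

```cpp
#include <array>
#include <cassert>

// Values visible to users outside the loop.
struct ExitValues {
  unsigned p;  // header phi
  unsigned lv; // loop-carried reduction value
};

// Scalar loop: p = phi(0, lv); lv = p + 1. Runs `trip` >= 1 times.
ExitValues runScalarLoop(unsigned trip) {
  unsigned p = 0, lv = 0; // the phi's incoming value from the preheader is 0
  for (unsigned it = 0; it < trip; ++it) {
    p = lv;
    lv = p + 1;
  }
  return {p, lv};
}

// Emulated VF = 2 loop: p = phi(<0,0>, lv); lv = p + <1,1>.
// Assumes `trip` is even, so the vector loop runs trip / 2 times.
// Both values are reduced by summing the lanes after the loop.
ExitValues runVectorLoopVF2(unsigned trip) {
  std::array<unsigned, 2> p{0, 0}, lv{0, 0};
  for (unsigned it = 0; it < trip / 2; ++it) {
    p = lv;
    lv = {p[0] + 1, p[1] + 1};
  }
  return {p[0] + p[1], lv[0] + lv[1]};
}
```

For a trip count of two, both versions agree on the reduced `lv` (2), but the scalar loop leaves `p` at 1 while the lane sum of the vectorized `p` is 0. That mismatch is the case the commit handles by disabling vectorization.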