llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

Author	SHA1	Message	Date
Arnold Schwaighofer	2c67b7dc58	LoopVectorizer: A reduction that has multiple uses of the reduction value is not a reduction. Really. Under certain circumstances (the use list of an instruction has to be set up right - hence the extra pass in the test case) we would not recognize when a value in a potential reduction cycle was used multiple times by the reduction cycle. Fixes PR18526. radar://15851149 llvm-svn: 199570	2014-01-19 03:18:31 +00:00
Arnold Schwaighofer	9fb94754bd	LoopVectorize: Only strip casts from integer types when replacing symbolic strides Fixes PR18480. llvm-svn: 199291	2014-01-15 03:35:46 +00:00
Chandler Carruth	98adff6224	[PM] Split DominatorTree into a concrete analysis result object which can be used by both the new pass manager and the old. This removes it from any of the virtual mess of the pass interfaces and lets it derive cleanly from the DominatorTreeBase<> template. In turn, tons of boilerplate interface can be nuked and it turns into a very straightforward extension of the base DominatorTree interface. The old analysis pass is now a simple wrapper. The names and style of this split should match the split between CallGraph and CallGraphWrapperPass. All of the users of DominatorTree have been updated to match using many of the same tricks as with CallGraph. The goal is that the common type remains the resulting DominatorTree rather than the pass. This will make subsequent work toward the new pass manager significantly easier. Also in numerous places things became cleaner because I switched from re-running the pass (!!! mid way through some other passes run!!!) to directly recomputing the domtree. llvm-svn: 199104	2014-01-13 13:07:17 +00:00
Chandler Carruth	ee051af6e2	[cleanup] Move the Dominators.h and Verifier.h headers into the IR directory. These passes are already defined in the IR library, and it doesn't make any sense to have the headers in Analysis. Long term, I think there is going to be a much better way to divide these matters. The dominators code should be fully separated into the abstract graph algorithm and have that put in Support where it becomes obvious that even Clang's CFGBlock's can use it. Then the verifier can manually construct dominance information from the Support-driven interface while the Analysis library can provide a pass which both caches, reconstructs, and supports a nice update API. But those are very long term, and so I don't want to leave the really confusing structure until that day arrives. llvm-svn: 199082	2014-01-13 09:26:24 +00:00
Arnold Schwaighofer	15e9d90974	LoopVectorizer: Enable strided memory accesses versioning per default I saw no compile or execution time regressions on x86_64 -mavx -O3. radar://13075509 llvm-svn: 199015	2014-01-11 20:40:34 +00:00
NAKAMURA Takumi	fbff75f61d	LoopVectorize.cpp: Appease MSC16. Excuse me, I hope msc16 builders would be fine till its end day. Introduce nullptr then. ;) llvm-svn: 199001	2014-01-11 09:59:27 +00:00
Arnold Schwaighofer	702d83d3d8	LoopVectorizer: Handle strided memory accesses by versioning for (i = 0; i < N; ++i) A[i * Stride1] += B[i * Stride2]; We take loops like this and check that the symbolic strides 'Stride1/2' are one and drop to the scalar loop if they are not. This is currently disabled by default and hidden behind the flag 'enable-mem-access-versioning'. radar://13075509 llvm-svn: 198950	2014-01-10 18:20:32 +00:00
Chandler Carruth	87f14b4eec	Re-sort all of the includes with ./utils/sort_includes.py so that subsequent changes are easier to review. About to fix some layering issues, and wanted to separate out the necessary churn. Also comment and sink the include of "Windows.h" in three .inc files to match the usage in Memory.inc. llvm-svn: 198685	2014-01-07 11:48:04 +00:00
Arnold Schwaighofer	e4d65aae7d	LoopVectorizer: Don't if-convert constant expressions that can trap A phi node operand or an instruction operand could be a constant expression that can trap (division). Check that we don't vectorize such cases. PR16729 radar://15653590 llvm-svn: 197449	2013-12-17 01:11:01 +00:00
NAKAMURA Takumi	54fa39136d	Prune redundant dependencies in LLVMBuild.txt. llvm-svn: 196988	2013-12-11 00:30:57 +00:00
NAKAMURA Takumi	b2c60b7ca7	Whitespaces. llvm-svn: 196880	2013-12-10 05:39:12 +00:00
Jakub Staszak	11e1c882f7	Don't #include heavy Dominators.h file in LoopInfo.h. This change reduces overall time of LLVM compilation by ~1%. llvm-svn: 196667	2013-12-07 21:20:17 +00:00
Renato Golin	a4d4a4c44f	Add #pragma vectorize enable/disable to LLVM The intended behaviour is to force vectorization on the presence of the flag (either turn on or off), and to continue the behaviour as expected in its absence. Tests were added to make sure the all cases are covered in opt. No tests were added in other tools with the assumption that they should use the PassManagerBuilder in the same way. This patch also removes the outdated -late-vectorize flag, which was on by default and not helping much. The pragma metadata is being attached to the same place as other loop metadata, but nothing forbids one from attaching it to a function (to enable #pragma optimize) or basic blocks (to hint the basic-block vectorizers), etc. The logic should be the same all around. Patches to Clang to produce the metadata will be produced after the initial implementation is agreed upon and committed. Patches to other vectorizers (such as SLP and BB) will be added once we're happy with the pass manager changes. llvm-svn: 196537	2013-12-05 21:20:02 +00:00
Rafael Espindola	2ad993fb14	Fix non-deterministic behavior. We use CSEBlocks to initialize a worklist: SmallVector<BasicBlock *, 8> CSEWorkList(CSEBlocks.begin(), CSEBlocks.end()); so it must have a deterministic order. llvm-svn: 196520	2013-12-05 18:28:01 +00:00
Arnold Schwaighofer	120880c780	SLPVectorizer: An in-tree vectorized entry cannot also be a scalar external use We were creating external uses for scalar values in MustGather entries that also had a ScalarToTreeEntry (they also are present in a vectorized tuple). This meant we would keep a value 'alive' as a scalar and vectorized causing havoc. This is not necessary because when we create a MustGather vector we explicitly create external uses entries for the insertelement instructions of the MustGather vector elements. Fixes PR18129. radar://15582184 llvm-svn: 196508	2013-12-05 15:14:40 +00:00
Alp Toker	e845f8af67	Correct word hyphenations This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. llvm-svn: 196471	2013-12-05 05:44:44 +00:00
Nadav Rotem	dc01e91cf5	PR1860 - We can't save a list of ExtractElement instructions to CSE because some of these instructions may be removed and optimized in future iterations. Instead we save a list of basic blocks that we need to CSE. llvm-svn: 195791	2013-11-26 22:24:25 +00:00
Arnold Schwaighofer	d0c05d2c84	LoopVectorizer: Truncate i64 trip counts of i32 phis if necessary In signed arithmetic we could end up with an i64 trip count for an i32 phi. Because it is signed arithmetic we know that this is only defined if the i32 does not wrap. It is therefore safe to truncate the i64 trip count to a i32 value. Fixes PR18049. llvm-svn: 195787	2013-11-26 22:11:23 +00:00
Nadav Rotem	643eb4c26e	PR18060 - When we RAUW values with ExtractElement instructions in some cases we generate PHI nodes with multiple entries from the same basic block but with different values. Enabling CSE on ExtractElement instructions make sure that all of the RAUWed instructions are the same. llvm-svn: 195773	2013-11-26 17:29:19 +00:00
Chandler Carruth	a1094eb135	Migrate metadata information from scalar to vector instructions during SLP vectorization. Based on the code in BBVectorizer. Fixes PR17741. Patch by Raul Silvera, reviewed by Hal and Nadav. Reformatted by my driving of clang-format. =] llvm-svn: 195528	2013-11-23 00:48:34 +00:00
Arnold Schwaighofer	3fa9376236	SLPVectorizer: Fix whitespace errors. llvm-svn: 195468	2013-11-22 15:47:17 +00:00
Yi Jiang	74286d427a	SLP Vectorizer: Extract cost will only be added once even if the scalar has multiple external uses. llvm-svn: 195406	2013-11-22 01:57:02 +00:00
Arnold Schwaighofer	242935ec8c	SLPVectorizer: Fix stale for Value pointer array We are slicing an array of Value pointers and process those slices in a loop. The problem is that we might invalidate a later slice by vectorizing a former slice. Use a WeakVH to track the pointer. If the pointer is deleted or RAUW'ed we can tell. The test case will only fail when running with libgmalloc. radar://15498655 llvm-svn: 195162	2013-11-19 22:20:20 +00:00
Arnold Schwaighofer	3149313505	SLPVectorizer: Fix whitespace errors llvm-svn: 195161	2013-11-19 22:20:18 +00:00
Arnold Schwaighofer	e4280ec4dd	LoopVectorizer: Extend the induction variable to a larger type In some case the loop exit count computation can overflow. Extend the type to prevent most of those cases. The problem is loops like: int main () { int a = 1; char b = 0; lbl: a &= 4; b--; if (b) goto lbl; return a; } The backedge count is 255. The induction variable type is i8. If we add one to 255 to get the exit count we overflow to zero. To work around this issue we extend the type of the induction variable to i32 in the case of i8 and i16. PR17532 llvm-svn: 195008	2013-11-18 13:14:32 +00:00
Arnold Schwaighofer	01b6f1cc9a	LoopVectorizer: Use abi alignment for accesses with no alignment When we vectorize a scalar access with no alignment specified, we have to set the target's abi alignment of the scalar access on the vectorized access. Using the same alignment of zero would be wrong because most targets will have a bigger abi alignment for vector types. This probably fixes PR17878. llvm-svn: 194876	2013-11-15 23:09:33 +00:00
Renato Golin	ed3b88828a	Move debug message in vectorizer No functional change, just better reporting. llvm-svn: 194388	2013-11-11 16:27:35 +00:00
Benjamin Kramer	9eaaead296	SLPVectorizer: Use properlyDominates to satisfy the irreflexivity of a strict weak ordering. STL debug mode checks this. llvm-svn: 194015	2013-11-04 21:34:55 +00:00
Benjamin Kramer	15ebc47438	SLPVectorizer: Add a missing pair of parens. No functionality change. llvm-svn: 193958	2013-11-03 12:54:32 +00:00
Benjamin Kramer	f45bcf5480	SLPVectorizer: When CSEing generated gathers only scan blocks containing them. Instead of doing a RPO traversal of the whole function remember the blocks containing gathers (typically <= 2) and scan them in dominator-first order. The actual CSE is still quadratic, but I'm not confident that adding a scoped hash table here is worth it as we're only looking at the generated instructions and not arbitrary code. llvm-svn: 193956	2013-11-03 12:27:52 +00:00
Benjamin Kramer	ae919396b6	SLPVectorizer: Remove duplicated function. llvm-svn: 193927	2013-11-02 14:46:27 +00:00
Benjamin Kramer	abc7baa1dc	LoopVectorize: Remove quadratic behavior the local CSE. Doing this with a hash map doesn't change behavior and avoids calling isIdenticalTo O(n^2) times. This should probably eventually move into a utility class shared with EarlyCSE and the limited CSE in the SLPVectorizer. llvm-svn: 193926	2013-11-02 13:39:00 +00:00
Arnold Schwaighofer	fed0c4f8e8	LoopVectorizer: Move cse code into its own function llvm-svn: 193895	2013-11-01 23:28:54 +00:00
Arnold Schwaighofer	fba1c74b67	LoopVectorizer: Perform redundancy elimination on induction variables When the loop vectorizer was part of the SCC inliner pass manager gvn would run after the loop vectorizer followed by instcombine. This way redundancy (multiple uses) were removed and instcombine could perform scalarization on the induction variables. Having moved the loop vectorizer to later we no longer run any form of redundancy elimination before we perform instcombine. This caused vectorized induction variables to survive that did not before. On a recent iMac this helps linpack back from 6000Mflops to 7000Mflops. This should also help lpbench and paq8p. I ran a Release (without Asserts) build over the test-suite and did not see any negative impact on compile time. radar://15339680 llvm-svn: 193891	2013-11-01 22:18:19 +00:00
Benjamin Kramer	3045156cee	LoopVectorize: Look for consecutive access in GEPs with trailing zero indices If we have a pointer to a single-element struct we can still build wide loads and stores to it (if there is no padding). llvm-svn: 193860	2013-11-01 14:09:50 +00:00
Arnold Schwaighofer	5d7be45165	LoopVectorizer: If dependency checks fail try runtime checks When a dependence check fails we can still try to vectorize loops with runtime array bounds checks. This helps linpack to vectorize a loop in dgefa. And we are back to 2x of the scalar performance on a corei7-avx. radar://15339680 llvm-svn: 193853	2013-11-01 03:05:07 +00:00
Arnold Schwaighofer	fe8e481ef6	LoopVectorizer: Clear all member data structures in RuntimeCheck.reset() Clear all data structures when resetting the RuntimeCheck data structure. No test case. This was exposed by an upcoming change. llvm-svn: 193852	2013-11-01 03:05:04 +00:00
Arnold Schwaighofer	fe80e563da	ARM cost model: Account for zero cost scalar SROA instructions By vectorizing a series of srl, or, ... instructions we have obfuscated the intention so much that the backend does not know how to fold this code away. radar://15336950 llvm-svn: 193573	2013-10-29 01:33:53 +00:00
Arnold Schwaighofer	6f22639253	SLPVectorizer: Use vector type for vectorized memory operations No test case, because with the current cost model we don't see a difference. An upcoming ARM memory cost model change will expose and test this bug. radar://15332579 llvm-svn: 193572	2013-10-29 01:33:50 +00:00
Wan Xiaofei	f3100f24fa	Quick look-up for block in loop. This patch implements quick look-up for block in loop by maintaining a hash set for blocks. It improves the efficiency of loop analysis a lot, the biggest improvement could be 5-6%(458.sjeng). Below are the compilation time for our benchmark in llc before & after the patch. Benchmark llc - trunk llc - patched 401.bzip2 0.339081 100.00% 0.329657 102.86% 403.gcc 19.853966 100.00% 19.605466 101.27% 429.mcf 0.049823 100.00% 0.048451 102.83% 433.milc 0.514898 100.00% 0.510217 100.92% 444.namd 1.109328 100.00% 1.103481 100.53% 445.gobmk 4.988028 100.00% 4.929114 101.20% 456.hmmer 0.843871 100.00% 0.825865 102.18% 458.sjeng 0.754238 100.00% 0.714095 105.62% 464.h264ref 2.9668 100.00% 2.90612 102.09% 471.omnetpp 4.556533 100.00% 4.511886 100.99% bitmnp01 0.038168 100.00% 0.0357 106.91% idctrn01 0.037745 100.00% 0.037332 101.11% libquake2 3.78689 100.00% 3.76209 100.66% libquake_ 2.251525 100.00% 2.234104 100.78% linpack 0.033159 100.00% 0.032788 101.13% matrix01 0.045319 100.00% 0.043497 104.19% nbench 0.333161 100.00% 0.329799 101.02% tblook01 0.017863 100.00% 0.017666 101.12% ttsprk01 0.054337 100.00% 0.053057 102.41% Reviewer : Andrew Trick <atrick@apple.com>, Hal Finkel <hfinkel@anl.gov> Approver : Andrew Trick <atrick@apple.com> Test : Pass make check-all & llvm test-suite llvm-svn: 193460	2013-10-26 03:08:02 +00:00
Hal Finkel	d554c99b37	LoopVectorizer: Don't attempt to vectorize extractelement instructions The loop vectorizer does not currently understand how to vectorize extractelement instructions. The existing check, which excluded all vector-valued instructions, did not catch extractelement instructions because it checked only the return value. As a result, vectorization would proceed, producing illegal instructions like this: %58 = extractelement <2 x i32> %15, i32 0 %59 = extractelement i32 %58, i32 0 where the second extractelement is illegal because its first operand is not a vector. llvm-svn: 193434	2013-10-25 20:40:15 +00:00
Renato Golin	ae79e04f36	Mark vector loops as already vectorized Make sure we mark all loops (scalar and vector) when vectorizing, so that we don't try to vectorize them anymore. Also, set unroll to 1, since this is what we check for on early exit. llvm-svn: 193349	2013-10-24 14:50:51 +00:00
Matt Arsenault	fcca6dd732	Use more type helper functions llvm-svn: 193109	2013-10-21 19:43:56 +00:00
Arnold Schwaighofer	789187ee86	SLPVectorizer: Don't vectorize volatile memory operations radar://15231682 Reapply r192799, http://lab.llvm.org:8011/builders/lldb-x86_64-debian-clang/builds/8226 showed that the bot is still broken even with this out. llvm-svn: 192820	2013-10-16 17:52:40 +00:00
Arnold Schwaighofer	7097263371	Revert "SLPVectorizer: Don't vectorize volatile memory operations" This speculatively reverts commit 192799. It might have broken a linux buildbot. llvm-svn: 192816	2013-10-16 17:19:40 +00:00
Arnold Schwaighofer	eebda9d6cf	SLPVectorizer: Don't vectorize volatile memory operations radar://15231682 llvm-svn: 192799	2013-10-16 16:09:00 +00:00
Benjamin Kramer	a00487e169	LoopVectorize: Properly reflect PODness in comments. llvm-svn: 192717	2013-10-15 16:19:54 +00:00
Arnold Schwaighofer	38ec37faba	SLPVectorizer: Sort PHINodes based on their opcode Before this patch we relied on the order of phi nodes when we looked for phi nodes of the same type. This could prevent vectorization of cases where there was a phi node of a second type in between phi nodes of some type. This is important for vectorization of an internal graphics kernel. On the test suite + external on x86_64 (and on a run on armv7s) it showed no impact on either performance or compile time. radar://15024459 llvm-svn: 192537	2013-10-12 18:56:27 +00:00
Tobias Grosser	bc154d94d0	LoopVectorize: Add missing INITIALIZE_PASS_DEPENDENCY macros Contributed-by: Peter Zotov <whitequark@whitequark.org> llvm-svn: 192536	2013-10-12 18:29:15 +00:00
Renato Golin	ec7fe56cfa	Better info when debugging vectorizer llvm-svn: 192460	2013-10-11 16:14:39 +00:00
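
The overflow described in commit e4280ec4dd (backedge count 255, exit count computed in i8) can be illustrated with a small sketch. This is not code from LLVM: `countBackedges`, `exitCountIn`, and `checkExample` are hypothetical names, and `std::uint8_t` is assumed in place of the commit's `char` so that the wraparound is well defined in C++.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: runs the loop shape from the commit message with an
// 8-bit counter and returns how often the backedge (jump to the top) is taken.
inline unsigned countBackedges() {
  std::uint8_t b = 0;
  unsigned backedges = 0;
  for (;;) {
    b = static_cast<std::uint8_t>(b - 1); // first pass wraps 0 to 255
    if (b == 0)
      break;       // loop exits once the counter is back at zero
    ++backedges;   // otherwise jump back to the top
  }
  return backedges;
}

// Hypothetical helper: exit count = backedge count + 1, truncated to type T.
template <typename T>
constexpr T exitCountIn(T backedgeCount) {
  return static_cast<T>(backedgeCount + 1);
}

// Checks the arithmetic the commit message describes.
inline void checkExample() {
  assert(countBackedges() == 255u);
  // In 8 bits, 255 + 1 wraps to zero, so the exit count is lost.
  assert(exitCountIn<std::uint8_t>(255) == 0);
  // In 32 bits the same computation yields the intended 256.
  assert(exitCountIn<std::uint32_t>(255) == 256u);
}
```

Calling `checkExample()` exercises both widths; the 32-bit case corresponds to the widening of i8 and i16 induction variables that the commit message describes as its workaround.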


509 Commits