llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 22:12:57 +02:00

Author	SHA1	Message	Date
Nadav Rotem	52f5279653	Add support for reverse pointer induction variables. These are loops that contain pointers that count backwards. For example, this is the hot loop in BZIP: do { m = --p; p = ( ... ); } while (--n); llvm-svn: 173219	2013-01-23 01:35:00 +00:00
Nadav Rotem	9ec02f071a	LoopVectorizer: Implement a new heuristic for selecting the unroll factor. We ignore the cpu frontend and focus on pipeline utilization. We do this because we don't have a good way to estimate the loop body size at the IR level. llvm-svn: 172964	2013-01-20 05:24:29 +00:00
Nadav Rotem	adb4fb5903	Change the cpu type in the test. llvm-svn: 172963	2013-01-20 05:20:56 +00:00
Benjamin Kramer	28c812f680	LoopVectorizer: Emit memory checks into their own basic block. This separates the check for "too few elements to run the vector loop" from the "memory overlap" check, giving a lot nicer code and allowing to skip the memory checks when we're not going to execute the vector code anyways. We still leave the decision of whether to emit the memory checks as branches or setccs, but it seems to be doing a good job. If ugly code pops up we may want to emit them as separate blocks too. Small speedup on MultiSource/Benchmarks/MallocBench/espresso. Most of this is legwork to allow multiple bypass blocks while updating PHIs, dominators and loop info. llvm-svn: 172902	2013-01-19 13:57:58 +00:00
Benjamin Kramer	efb9c9f20b	Move test that depends on the x86 target into a target-specific directory. Should fix the arm buildbot (which only builds the arm target). llvm-svn: 172611	2013-01-16 13:25:56 +00:00
Nadav Rotem	c6cce40085	Fix PR14547. Handle induction variables smaller than i32 (i8 and i16). llvm-svn: 172348	2013-01-13 07:56:29 +00:00
Nadav Rotem	c79f1aa3f4	ARM Cost Model: Modify the target independent cost model to ask the target if it supports the different CAST types. We didn't do this on X86 because of the different register sizes and types, but on ARM this makes sense. llvm-svn: 172245	2013-01-11 19:54:13 +00:00
Nadav Rotem	008741a0e0	ARM Cost Model: We need to detect the max bitwidth of types in the loop in order to select the max vectorization factor. We don't have a detailed analysis on which values are vectorized and which stay scalars in the vectorized loop so we use another method. We look at reduction variables, loads and stores, which are the only ways to get information in and out of loop iterations. If the data types are extended and truncated then the cost model will catch the cost of the vector zext/sext/trunc operations. llvm-svn: 172178	2013-01-11 07:11:59 +00:00
Nadav Rotem	d5f59a81d9	LoopVectorizer: Fix a bug in the vectorization of BinaryOperators. The BinaryOperator can be folded to an Undef, and we don't want to set NSW flags to undef vals. PR14878 llvm-svn: 172079	2013-01-10 17:34:39 +00:00
Nadav Rotem	436dc952aa	ARM Cost model: Use the size of vector registers and widest vectorizable instruction to determine the max vectorization factor. llvm-svn: 172010	2013-01-09 22:29:00 +00:00
Nadav Rotem	4d391c52b0	ARM Cost Model: Add a basic vectorization unrolling test. llvm-svn: 171931	2013-01-09 01:29:07 +00:00
Nadav Rotem	c14d0d93d9	Remove the -licm pass from the loop vectorizer test because the loop vectorizer does it now. llvm-svn: 171930	2013-01-09 01:20:59 +00:00
Nadav Rotem	9c27f36e59	Cost Model: Move the 'max unroll factor' variable to the TTI and add initial Cost Model support on ARM. llvm-svn: 171928	2013-01-09 01:15:42 +00:00
Nadav Rotem	4aa065a2a3	LoopVectorizer: Add support for floating point reductions llvm-svn: 171812	2013-01-07 23:13:00 +00:00
Nadav Rotem	5906222eab	LoopVectorizer: When we vectorize and widen loops we process many elements at once. This is a good thing, except for small loops. On small loops, the post-loop that handles scalars (and runs slower) can take more time to execute than the rest of the loop. This patch disables widening of loops with a small static trip count. llvm-svn: 171798	2013-01-07 21:54:51 +00:00
Nadav Rotem	9eefe3aaf8	Fix a typo. Remove the duplicated test. llvm-svn: 171584	2013-01-05 01:17:46 +00:00
Nadav Rotem	836b9a9fda	LoopVectorize: Non-commutative operators can be used as reduction variables as long as the reduction chain is used in the LHS. PR14803. llvm-svn: 171583	2013-01-05 01:15:47 +00:00
Nadav Rotem	ef10b99294	Force a fixed unroll count on the target independent tests. This should fix clang-native-arm-cortex-a9. Thanks Renato. llvm-svn: 171582	2013-01-05 00:58:48 +00:00
Paul Redmond	6ce33a6ae9	Do not vectorize loops with subtraction reductions. Since subtraction does not commute, the loop vectorizer incorrectly vectorizes reductions such as x = A[i] - x. Disabling for now. llvm-svn: 171537	2013-01-04 22:10:16 +00:00
Nadav Rotem	cb3562a88e	LoopVectorizer: 1. Add code to estimate register pressure. 2. Add code to select the unroll factor based on register pressure. 3. Add bits to TargetTransformInfo to provide the number of registers. llvm-svn: 171469	2013-01-04 17:48:25 +00:00
Nadav Rotem	ea6706d777	LoopVectorizer: Test the unrolling flag. llvm-svn: 171446	2013-01-03 01:47:31 +00:00
Nadav Rotem	a7cac72b7d	Avoid vectorization when the function has the "noimplicitfloat" attribute. llvm-svn: 171429	2013-01-02 23:54:43 +00:00
Nadav Rotem	0e391907a5	LoopVectorizer: Fix a bug in the code that updates the loop exiting block. LCSSA PHIs may have undef values. The vectorizer updates values that are used by outside users such as PHIs. The bug happened because undefs are not loop values. This patch handles these PHIs. PR14725 llvm-svn: 171251	2012-12-30 07:47:00 +00:00
Nadav Rotem	4cad811734	If all of the write objects are identified then we can vectorize the loop even if the read objects are unidentified. PR14719. llvm-svn: 171124	2012-12-26 23:30:53 +00:00
Nadav Rotem	90712b89cc	LoopVectorizer: Optimize the vectorization of consecutive memory access when the iteration step is -1 llvm-svn: 171114	2012-12-26 19:08:17 +00:00
Hal Finkel	f9b3cb9121	LoopVectorize: Enable vectorization of the fmuladd intrinsic llvm-svn: 171076	2012-12-25 23:21:29 +00:00
Nick Lewycky	56ef0e9560	Fix typo "Makre" -> "Make". llvm-svn: 171043	2012-12-24 19:55:47 +00:00
Nadav Rotem	ace51e510e	LoopVectorizer: When checking for vectorizable types, also check the StoreInst operands. PR14705. llvm-svn: 171023	2012-12-24 09:14:18 +00:00
Nadav Rotem	309d628c4f	LoopVectorizer: Fix an endless loop in the code that looks for reductions. The bug was in the code that detects PHIs in if-then-else block sequence. PR14701. llvm-svn: 171008	2012-12-24 01:22:06 +00:00
Nadav Rotem	fb56b5fe2e	CostModel: Change the default target-independent implementation for finding the cost of arithmetic functions. We now assume that the cost of arithmetic operations that are marked as Legal or Promote is low, but ops that are marked as custom are higher. llvm-svn: 171002	2012-12-23 17:31:23 +00:00
Nadav Rotem	e237376e62	Loop Vectorizer: Update the cost model of scatter/gather operations and make them more expensive. llvm-svn: 170995	2012-12-23 07:23:55 +00:00
Nadav Rotem	80fefbe978	Fix a bug in the code that checks if we can vectorize loops while using dynamic memory bound checks. Before the fix we were able to vectorize this loop from the Livermore Loops benchmark: for ( k=1 ; k<n ; k++ ) x[k] = x[k-1] + y[k]; llvm-svn: 170811	2012-12-21 00:07:35 +00:00
Nadav Rotem	ccffd4527d	LoopVectorize: Fix a bug in the scalarization of instructions. Before if-conversion we could check if a value is loop invariant if it was declared inside the basic block. Now that loops have multiple blocks this check is incorrect. This fixes External/SPEC/CINT95/099_go/099_go llvm-svn: 170756	2012-12-20 20:24:40 +00:00
Benjamin Kramer	27ce655c41	Make TargetLowering::getTypeConversion more resilient against odd illegal MVTs. - An MVT can become an EVT when being split (e.g. v2i8 -> v1i8, the latter doesn't exist) - Return the scalar value when an MVT is scalarized (v1i64 -> i64) Fixes PR14639ff. llvm-svn: 170546	2012-12-19 14:34:28 +00:00
Benjamin Kramer	820b613d80	LoopVectorize: Emit reductions as log2(vectorsize) shuffles + vector ops instead of scalar operations. For example on x86 with SSE4.2 a <8 x i8> add reduction becomes movdqa %xmm0, %xmm1 movhlps %xmm1, %xmm1 ## xmm1 = xmm1[1,1] paddw %xmm0, %xmm1 pshufd $1, %xmm1, %xmm0 ## xmm0 = xmm1[1,0,0,0] paddw %xmm1, %xmm0 phaddw %xmm0, %xmm0 pextrb $0, %xmm0, %edx instead of pextrb $2, %xmm0, %esi pextrb $0, %xmm0, %edx addb %sil, %dl pextrb $4, %xmm0, %esi addb %dl, %sil pextrb $6, %xmm0, %edx addb %sil, %dl pextrb $8, %xmm0, %esi addb %dl, %sil pextrb $10, %xmm0, %edi pextrb $14, %xmm0, %edx addb %sil, %dil pextrb $12, %xmm0, %esi addb %dil, %sil addb %sil, %dl llvm-svn: 170439	2012-12-18 18:40:20 +00:00
Nadav Rotem	ca05f9e72b	Teach the cost model about the optimization in r169904: Truncation of induction variables costs the same as scalar trunc. llvm-svn: 170051	2012-12-13 00:21:03 +00:00
Nadav Rotem	2c25a05088	LoopVectorizer: Use the "optsize" attribute to decide if we are allowed to increase the function size. llvm-svn: 170004	2012-12-12 19:29:45 +00:00
Nadav Rotem	054379720d	PR14574. Fix a bug in the code that calculates the mask for the converted PHIs in if-conversion. llvm-svn: 169916	2012-12-11 21:30:14 +00:00
Nadav Rotem	fb45c4d6b4	Loop Vectorize: optimize the vectorization of trunc(induction_var). The truncation is now done on scalars. llvm-svn: 169904	2012-12-11 18:58:10 +00:00
Nadav Rotem	0715a221d8	Fix PR14565. Don't if-convert loops that have switch statements in them. llvm-svn: 169813	2012-12-11 04:55:10 +00:00
Nadav Rotem	196fc7cc8c	Add support for reverse induction variables. For example: while (i--) sum+=A[i]; llvm-svn: 169752	2012-12-10 19:25:06 +00:00
Paul Redmond	e43761293d	LoopVectorize: support vectorizing intrinsic calls - added function to VectorTargetTransformInfo to query cost of intrinsics - vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc. Reviewed by: Nadav llvm-svn: 169711	2012-12-09 20:42:17 +00:00
Nadav Rotem	452993ad1a	Fix a bug in vectorization of if-converted reduction variables. If the reduction variable is not used outside the loop then we ran into an endless loop. This change checks if we found the original PHI. llvm-svn: 169324	2012-12-04 22:40:22 +00:00
Nadav Rotem	4f22c83996	Add support for reduction variables when IF-conversion is enabled. llvm-svn: 169288	2012-12-04 18:17:33 +00:00
Nadav Rotem	43d200ded1	Add the last part that is needed for vectorization of if-converted code. Added the code that actually performs the if-conversion during vectorization. We can now vectorize this code: for (int i=0; i<n; ++i) { unsigned k = 0; if (a[i] > b[i]) <------ IF inside the loop. k = k * 5 + 3; a[i] = k; <---- K is a phi node that becomes vector-select. } llvm-svn: 169217	2012-12-04 06:15:11 +00:00
Nadav Rotem	c973546f75	Add support for pointer induction variables even when there is no integer induction variable. llvm-svn: 168558	2012-11-25 08:41:35 +00:00
Nadav Rotem	6ff38dc8d2	LoopVectorizer: Add initial support for pointer induction variables (for example: dst++ = src++). At the moment we still require to have an integer induction variable (for example: i++). llvm-svn: 168231	2012-11-17 00:27:03 +00:00
Duncan Sands	8c43343240	Relax the restrictions on vector of pointer types, and vector getelementptr. Previously in a vector of pointers, the pointer couldn't be any pointer type, it had to be a pointer to an integer or floating point type. This is a hassle for dragonegg because the GCC vectorizer happily produces vectors of pointers where the pointer is a pointer to a struct or whatever. Vector getelementptr was restricted to just one index, but now that vectors of pointers can have any pointer type it is more natural to allow arbitrary vector getelementptrs. There is however the issue of struct GEPs, where if each lane chose different struct fields then from that point on each lane will be working down into unrelated types. This seems like too much pain for too little gain, so when you have a vector struct index all the elements are required to be the same. llvm-svn: 167828	2012-11-13 12:59:33 +00:00
Nadav Rotem	ee232d62d1	Add support for memory runtime check. When we can, we calculate array bounds. If the arrays are found to be disjoint then we run the vectorized version of the loop. If they are not, we run the scalar code. llvm-svn: 167608	2012-11-09 07:09:44 +00:00
Nadav Rotem	2fb5dc3a15	Cost Model: add tables for some avx type-conversion hacks. llvm-svn: 167480	2012-11-06 19:33:53 +00:00
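
Several of the commits above (r167608, r170811, r172902) concern the runtime checks that guard the vector loop: one for "too few elements" and one for "memory overlap", with a fall back to scalar code when either fails. The following is a minimal hand-written C sketch of that idea, not the code the vectorizer emits. The names ranges_disjoint and add_one and the width of 4 are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when [a, a+n) and [b, b+n) do not overlap.
   Comparing addresses through uintptr_t is an assumption that holds on
   common flat-memory targets. */
static int ranges_disjoint(const float *a, const float *b, size_t n) {
    uintptr_t a0 = (uintptr_t)a, a1 = (uintptr_t)(a + n);
    uintptr_t b0 = (uintptr_t)b, b1 = (uintptr_t)(b + n);
    return a1 <= b0 || b1 <= a0;
}

/* Computes dst[i] = src[i] + 1.0f with the semantics of the scalar loop. */
static void add_one(float *dst, const float *src, size_t n) {
    size_t i = 0;
    /* Check 1: enough elements. Check 2: the arrays are disjoint. */
    if (n >= 4 && ranges_disjoint(dst, src, n)) {
        for (; i + 4 <= n; i += 4) {
            /* Load four lanes before storing any, as a vector op would. */
            float t0 = src[i], t1 = src[i + 1];
            float t2 = src[i + 2], t3 = src[i + 3];
            dst[i] = t0 + 1.0f;
            dst[i + 1] = t1 + 1.0f;
            dst[i + 2] = t2 + 1.0f;
            dst[i + 3] = t3 + 1.0f;
        }
    }
    /* Scalar loop: handles the remainder, or everything if a check failed. */
    for (; i < n; ++i)
        dst[i] = src[i] + 1.0f;
}
```

Calling add_one(x + 1, x, n) reproduces the shape of the Livermore loop x[k] = x[k-1] + y[k] from r170811: the ranges overlap, so the sketch takes the scalar path and the loop-carried dependence is preserved.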

78 Commits