mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 21:42:54 +02:00
Commit Graph

714 Commits

Author SHA1 Message Date
Shuxin Yang
eb29e658a0 Revert r193251 : Use address-taken to disambiguate global variable and indirect memops.
llvm-svn: 193489
2013-10-27 03:08:44 +00:00
Benjamin Kramer
701e41bb58 X86: Custom lower sext v16i8 to v16i16, and the corresponding truncate.
Also update the cost model.

llvm-svn: 193270
2013-10-23 21:06:07 +00:00
Shuxin Yang
45a453cafe Use address-taken to disambiguate global variable and indirect memops.
Major steps include:
 1). introduce a not-addr-taken bit-field in GlobalVariable
 2). GlobalOpt pass sets "not-address-taken" if it proves a global variable
    doesn't have its address taken.
 3). AA uses this info for disambiguation.
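
For illustration, a minimal IR sketch (hypothetical names, current load syntax)
of what the new bit enables:

  @g = internal global i32 0

  define i32 @f(i32* %p) {
    %before = load i32, i32* @g
    store i32 42, i32* %p          ; indirect store through an unknown pointer
    %after = load i32, i32* @g     ; once @g is known not-address-taken, AA can
    ret i32 %after                 ; disambiguate this load from the store above
  }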

llvm-svn: 193251
2013-10-23 17:28:19 +00:00
Manman Ren
d9be0d7e44 Simplify testing case (Thanks Rafael for the testing case).
llvm-svn: 193177
2013-10-22 18:15:50 +00:00
Manman Ren
87bfd7a670 TBAA: fix PR17620.
We can have a struct type with a single field where the field does not start
at offset 0. In that case, we should correctly update the offset.

llvm-svn: 193137
2013-10-22 01:40:25 +00:00
Matt Arsenault
3089ca20bd Fix creating bitcasts between address spaces in SCEV.
The test before wasn't successfully testing this
since it was missing the datalayout piece to change
the size of the second address space.

llvm-svn: 193102
2013-10-21 18:41:10 +00:00
Andrew Trick
027f71d443 SCEV should use NSW to get trip count for positive nonunit stride loops.
SCEV currently fails to compute loop counts for nonunit stride
loops. This comes up frequently. It prevents loop optimization and
forces vectorization to insert extra loop checks.

For example:
void foo(int n, int *x) {
 for (int i = 0; i < n; i += 3) {
   x[i] = i;
   x[i+1] = i+1;
   x[i+2] = i+2;
 }
}

We need to properly handle the case in which limit > INT_MAX-stride. In
the above case: n > INT_MAX-3. In this case the loop counter will step
beyond the limit and overflow at the same time. However, knowing that
signed integer overflow is undefined, we can assume the loop test
behavior is arbitrary after overflow. This obeys both C undefined
behavior rules, and the more strict LLVM poison value rules.

I'm finally fixing this in response to Hal Finkel's persistence.
The most probable reason that we never optimized this before is that
we were being careful to handle case where the developer expected a
side-effect free infinite loop relying on overflow:

for (int i = 0; i < n; i += s) {
  ++j;
}
return j;

If INT_MAX+1 is a multiple of s and n > INT_MAX-s, then we might
expect an infinite loop. However there are plenty of ways to achieve
this effect without relying on undefined behavior of signed overflow.

llvm-svn: 193015
2013-10-18 23:43:53 +00:00
Chandler Carruth
ee12d58370 Remove the very substantial, largely unmaintained legacy PGO
infrastructure.

This was essentially work toward PGO based on a design that had several
flaws, partially dating from a time when LLVM had a different
architecture, and with an effort to modernize it abandoned without being
completed. Since then, it has bitrotted for several years further. The
result is nearly unusable, and isn't helping any of the modern PGO
efforts. Instead, it is getting in the way, adding confusion about PGO
in LLVM and distracting everyone with maintenance on essentially dead
code. Removing it paves the way for modern efforts around PGO.

Among other effects, this removes the last of the runtime libraries from
LLVM. Those are being developed in the separate 'compiler-rt' project
now, with somewhat different licensing specifically more appropriate for
runtimes.

llvm-svn: 191835
2013-10-02 15:42:23 +00:00
Matt Arsenault
19f03c0f9c Use CHECK-LABEL
llvm-svn: 191713
2013-09-30 23:31:55 +00:00
Manman Ren
2ef9ca7627 TBAA: handle scalar TBAA format and struct-path aware TBAA format.
Remove the command line argument "struct-path-tbaa" since we should not depend
on a command line argument to decide which format the IR file is using. Instead,
we check the first operand of the tbaa tag node: if it is an MDNode, we treat
it as the struct-path aware TBAA format; otherwise, we treat it as the scalar
TBAA format.

Once clang uses the struct-path aware TBAA format unconditionally and we can
auto-upgrade existing bc files, support for the scalar TBAA format can be
dropped.

Existing testing cases are updated to use the struct-path aware TBAA format.
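
Roughly, the two tag shapes look like this (illustrative nodes, written in
current metadata syntax rather than the 2013 "metadata !" spelling):

  ; scalar format: the tag's first operand is a string (the type name)
  %x = load i32, i32* %p, !tbaa !1
  !0 = !{!"Simple C/C++ TBAA"}
  !1 = !{!"int", !0}

  ; struct-path format: the tag's first operand is another MDNode (the base type)
  %y = load i32, i32* %q, !tbaa !3
  !2 = !{!"S", !1, i64 0, !1, i64 4}   ; struct type node: name, (field type, offset)...
  !3 = !{!2, !1, i64 4}                ; access tag: (base type, access type, offset)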

llvm-svn: 191538
2013-09-27 18:34:27 +00:00
Yi Jiang
ed81b17719 X86 horizontal vector reduction cost model
llvm-svn: 191021
2013-09-19 17:48:48 +00:00
Arnold Schwaighofer
eabde1ffce Costmodel: Add support for horizontal vector reductions
Upcoming SLP vectorization improvements will want to be able to estimate costs
of horizontal reductions. Add infrastructure to support this.

We model reductions as a series of (shufflevector,add) tuples ultimately
followed by an extractelement. For example, for an add-reduction of <4 x float>
we could generate the following sequence:

 (v0, v1, v2, v3)
   \   \  /  /
     \  \  /
       +  +

 (v0+v2, v1+v3, undef, undef)
    \      /
 ((v0+v2) + (v1+v3), undef, undef)

 %rdx.shuf = shufflevector <4 x float> %rdx, <4 x float> undef,
                           <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
 %bin.rdx = fadd <4 x float> %rdx, %rdx.shuf
 %rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef,
                          <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
 %bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7
 %r = extractelement <4 x float> %bin.rdx8, i32 0

This commit adds a cost model interface "getReductionCost(Opcode, Ty, Pairwise)"
that will allow clients to ask for the cost of such a reduction (as backends
might generate more efficient code than the cost of the individual instructions
summed up). This interface is exercised by the CostModel analysis pass which
looks for reduction patterns like the one above - starting at extractelements -
and if it sees a matching sequence will call the cost model interface.

We will also support a second form of pairwise reduction that is well supported
on common architectures (haddps, vpadd, faddp).

 (v0, v1, v2, v3)
  \   /    \  /
 (v0+v1, v2+v3, undef, undef)
    \     /
 ((v0+v1)+(v2+v3), undef, undef, undef)

  %rdx.shuf.0.0 = shufflevector <4 x float> %rdx, <4 x float> undef,
        <4 x i32> <i32 0, i32 2 , i32 undef, i32 undef>
  %rdx.shuf.0.1 = shufflevector <4 x float> %rdx, <4 x float> undef,
        <4 x i32> <i32 1, i32 3, i32 undef, i32 undef>
  %bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1
  %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
        <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
  %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
        <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
  %bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1
  %r = extractelement <4 x float> %bin.rdx.1, i32 0

llvm-svn: 190876
2013-09-17 18:06:50 +00:00
Matt Arsenault
a15663a1b4 Teach ScalarEvolution about pointer address spaces
llvm-svn: 190425
2013-09-10 19:55:24 +00:00
Matt Arsenault
0c955e9568 Fix lint assert on integer vector division
llvm-svn: 189290
2013-08-26 23:29:33 +00:00
Bill Wendling
df27f44bd7 FileCheck-ize tests.
llvm-svn: 188971
2013-08-22 00:51:19 +00:00
Daniel Dunbar
a496d61c01 [tests] Cleanup initialization of test suffixes.
- Instead of setting the suffixes in a bunch of places, just set one master
   list in the top-level config. We now only modify the suffix list in a few
   suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py).

 - Aside from removing the need for a bunch of lit.local.cfg files, this enables
   4 tests that were inadvertently being skipped (one in
   Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and
   CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been
   XFAILED).

 - This commit also fixes a bunch of config files to use config.root instead of
   older copy-pasted code.

llvm-svn: 188513
2013-08-16 00:37:11 +00:00
Bill Wendling
21cb95c7bf FileCheckize some of the testcases.
llvm-svn: 187756
2013-08-05 23:43:18 +00:00
Renato Golin
dbd9622c0b Fixes ARM LNT bot from SLP change in O3
This patch fixes the multiple breakages in the ARM test-suite after the SLP
vectorizer was enabled by default at O3. The problem was an illegal vector
type, <3 x i1>, reaching ARMTTI::getCmpSelInstrCost(), which is not a simple
type.

The guard protects this code from breaking (the cause of the problems) but
doesn't fix the issue that is generating the odd vector in the first
place, which also needs to be investigated.

llvm-svn: 187658
2013-08-02 17:10:04 +00:00
Stephen Lin
74b6dc3cef Add newlines at end of test files, no functionality change
llvm-svn: 186263
2013-07-13 22:00:58 +00:00
Hal Finkel
2464b6ac5c Add the nearbyint -> FNEARBYINT mapping to BasicTargetTransformInfo
This fixes an oversight that Intrinsic::nearbyint was not being mapped to
ISD::FNEARBYINT (thus fixing the over-optimistic cost we were assigning to
nearbyint calls for some targets).
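
For example, a call of the kind whose cost estimate this corrects (illustrative
function name):

  declare <4 x float> @llvm.nearbyint.v4f32(<4 x float>)

  define <4 x float> @round_nearby(<4 x float> %v) {
    %r = call <4 x float> @llvm.nearbyint.v4f32(<4 x float> %v)
    ret <4 x float> %r
  }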

llvm-svn: 185783
2013-07-08 03:24:07 +00:00
Nick Lewycky
7b093a1c2f Extend 'readonly' and 'readnone' to work on function arguments as well as
functions. Make the function attributes pass add it to known library functions
and when it can deduce it.
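
A sketch of what this looks like in IR (hypothetical functions; strlen shown as
a plausible example of a known library function):

  ; deduced: @peek only reads through %p
  define i32 @peek(i32* readonly %p) {
    %v = load i32, i32* %p
    ret i32 %v
  }

  ; a known library function annotated by the pass
  declare i64 @strlen(i8* nocapture readonly)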

llvm-svn: 185735
2013-07-06 00:29:58 +00:00
Jakob Stoklund Olesen
1c034ce174 Minimize precision loss when computing cyclic probabilities.
Allow block frequencies to exceed 32 bits by using the new
BlockFrequency division function.

llvm-svn: 185236
2013-06-28 22:40:43 +00:00
Preston Briggs
9a4e6f6c73 (no commit message)
llvm-svn: 185187
2013-06-28 18:44:48 +00:00
Nadav Rotem
311bda941c CostModel: improve the cost model for load/store of non power-of-two types such as <3 x float>, which are popular in graphics.
llvm-svn: 185085
2013-06-27 17:52:04 +00:00
Jakob Stoklund Olesen
a4ca837638 Print block frequencies in decimal form.
This is easier to read than the internal fixed-point representation.

If anybody knows the correct algorithm for converting fixed-point
numbers to base 10, feel free to fix it.

llvm-svn: 184881
2013-06-25 21:57:38 +00:00
Arnold Schwaighofer
730386bc34 X86 cost model: Vectorizing integer division is a bad idea
radar://14057959

llvm-svn: 184872
2013-06-25 19:14:09 +00:00
Benjamin Kramer
3b56c8dd50 BlockFrequency: Bump up the entry frequency a bit.
This is a band-aid to fix the most severe regressions we're seeing from basing
spill decisions on block frequencies, until we have a better solution.

llvm-svn: 184835
2013-06-25 13:34:40 +00:00
Benjamin Kramer
30c35d5305 Revert "BlockFrequency: Saturate at 1 instead of 0 when multiplying a frequency with a branch probability."
This reverts commit r184584. Breaks PPC selfhost.

llvm-svn: 184590
2013-06-21 20:20:27 +00:00
Benjamin Kramer
3315e168ee BlockFrequency: Saturate at 1 instead of 0 when multiplying a frequency with a branch probability.
Zero is used by BlockFrequencyInfo as a special "don't know" value. It also
causes a sink for frequencies as you can't ever get off a zero frequency with
more multiplies.

This recovers a 10% regression on MultiSource/Benchmarks/7zip. A zero frequency
was propagated into an inner loop causing excessive spilling.

PR16402.

llvm-svn: 184584
2013-06-21 19:30:05 +00:00
Andrew Trick
c470f27448 Unit test for SCEV fix r182989, PR16130.
llvm-svn: 183017
2013-05-31 16:42:41 +00:00
Michael Kuperstein
ad5bd9ce5a Make BasicAliasAnalysis recognize the fact a noalias argument cannot alias another argument, even if the other argument is not itself marked noalias.
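
For example (illustrative IR):

  define void @f(i32* noalias %a, i32* %b) {
    store i32 1, i32* %a
    store i32 2, i32* %b     ; BasicAA can now report NoAlias for %a and %b,
    ret void                 ; even though %b itself is not marked noalias
  }
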
llvm-svn: 182755
2013-05-28 08:17:48 +00:00
Diego Novillo
d1f091f169 Add a new function attribute 'cold' to functions.
Other than recognizing the attribute, the patch does little else.
It changes the branch probability analyzer so that edges into
blocks postdominated by a cold function are given low weight.

Added analysis and code generation tests.  Added documentation for the
new attribute.
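
A sketch of the effect (hypothetical function names):

  declare void @fatal_error(i8*) cold noreturn

  define void @check(i1 %ok, i8* %msg) {
    br i1 %ok, label %cont, label %err
  err:                                  ; postdominated by a call to a cold
    call void @fatal_error(i8* %msg)    ; function, so the edge into %err is
    unreachable                         ; given a low weight
  cont:
    ret void
  }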

llvm-svn: 182638
2013-05-24 12:26:52 +00:00
Tim Northover
eb518c7918 AArch64: use MCJIT by default and enable related tests.
This just enables some testing I'd missed after implementing MCJIT
support.

llvm-svn: 181215
2013-05-06 16:51:08 +00:00
Matt Arsenault
34e1805cd0 Fix unchecked uses of DominatorTree in MemoryDependenceAnalysis.
Use unknown results for places where it would be needed

llvm-svn: 181176
2013-05-06 02:07:24 +00:00
Tobias Grosser
fb2da25967 RegionInfo: Do not crash if unreachable block is found
llvm-svn: 181025
2013-05-03 15:48:34 +00:00
Manman Ren
13b2364d24 TBAA: remove !tbaa from testing cases if not used.
This will make it easier to turn on struct-path aware TBAA since the metadata
format will change.

llvm-svn: 180743
2013-04-29 22:42:01 +00:00
Manman Ren
c576d690b0 Struct-path aware TBAA: change the format of TBAAStructType node.
We switch the order of offset and field type to make TBAAStructType node
(name, parent node, offset) similar to scalar TBAA node (name, parent node).
TypeIsImmutable is added to TBAAStructTag node.

llvm-svn: 180654
2013-04-27 00:26:11 +00:00
Arnold Schwaighofer
b1fc314b5f ARM cost model: Integer div and rem are lowered to a function call
Reflect this in the cost model. I observed this in MiBench/consumer-lame.

radar://13354716

llvm-svn: 180576
2013-04-25 21:16:18 +00:00
Jim Grosbach
3104dcf2ca Legalize vector truncates by parts rather than just splitting.
Rather than just splitting the input type and hoping for the best, apply
a bit more cleverness. Just splitting the types until the source is
legal often leads to an illegal result type, which is then widened and a
scalarization step is introduced which leads to truly horrible code
generation. With the loop vectorizer, these sorts of operations are much
more common, and so it's worth extra effort to do them well.

Add a legalization hook for the operands of a TRUNCATE node, which will
be encountered after the result type has been legalized, but if the
operand type is still illegal. If simple splitting of both types
ends up with the result type of each half still being legal, just
do that (v16i16 -> v16i8 on ARM, for example). If, however, that would
result in an illegal result type (v8i32 -> v8i8 on ARM, for example),
we can get more clever with power-of-two vectors. Specifically,
split the input type, but also widen the result element size, then
concatenate the halves and truncate again.  For example on ARM,
to perform a "%res = v8i8 trunc v8i32 %in" we transform to:
  %inlo = v4i32 extract_subvector %in, 0
  %inhi = v4i32 extract_subvector %in, 4
  %lo16 = v4i16 trunc v4i32 %inlo
  %hi16 = v4i16 trunc v4i32 %inhi
  %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16
  %res = v8i8 trunc v8i16 %in16

This allows instruction selection to generate three VMOVN instructions
instead of a sequences of moves, stores and loads.

Update the ARMTargetTransformInfo to take this improved legalization
into account.

Consider the simplified IR:

define <16 x i8> @test1(<16 x i32>* %ap) {
  %a = load <16 x i32>* %ap
  %tmp = trunc <16 x i32> %a to <16 x i8>
  ret <16 x i8> %tmp
}

define <8 x i8> @test2(<8 x i32>* %ap) {
  %a = load <8 x i32>* %ap
  %tmp = trunc <8 x i32> %a to <8 x i8>
  ret <8 x i8> %tmp
}

Previously, we would generate the truly hideous:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #20
	bic	sp, sp, #7
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d24, d25}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	vld1.64	{d18, d19}, [r2:128]
	add	r1, r0, #16
	vmovn.i32	d22, q8
	vld1.64	{d16, d17}, [r1:128]
	vmovn.i32	d20, q9
	vmovn.i32	d18, q12
	vmov.u16	r0, d22[3]
	strb	r0, [sp, #15]
	vmov.u16	r0, d22[2]
	strb	r0, [sp, #14]
	vmov.u16	r0, d22[1]
	strb	r0, [sp, #13]
	vmov.u16	r0, d22[0]
	vmovn.i32	d16, q8
	strb	r0, [sp, #12]
	vmov.u16	r0, d20[3]
	strb	r0, [sp, #11]
	vmov.u16	r0, d20[2]
	strb	r0, [sp, #10]
	vmov.u16	r0, d20[1]
	strb	r0, [sp, #9]
	vmov.u16	r0, d20[0]
	strb	r0, [sp, #8]
	vmov.u16	r0, d18[3]
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	vldmia	sp, {d16, d17}
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	mov	sp, r7
	pop	{r7}
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #12
	bic	sp, sp, #7
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d20, d21}, [r0:128]
	vmovn.i32	d18, q8
	vmov.u16	r0, d18[3]
	vmovn.i32	d16, q10
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	ldm	sp, {r0, r1}
	mov	sp, r7
	pop	{r7}
	bx	lr

Now, however, we generate the much more straightforward:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d20, d21}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	add	r1, r0, #16
	vld1.64	{d18, d19}, [r2:128]
	vld1.64	{d22, d23}, [r1:128]
	vmovn.i32	d17, q8
	vmovn.i32	d16, q9
	vmovn.i32	d18, q10
	vmovn.i32	d19, q11
	vmovn.i16	d17, q8
	vmovn.i16	d16, q9
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d18, d19}, [r0:128]
	vmovn.i32	d16, q8
	vmovn.i32	d17, q9
	vmovn.i16	d16, q8
	vmov	r0, r1, d16
	bx	lr

llvm-svn: 179989
2013-04-21 23:47:41 +00:00
Arnold Schwaighofer
e1dc8ae8c8 X86 cost model: Exit before calling getSimpleVT on non-simple VTs
getSimpleVT can only handle simple value types.

radar://13676022

llvm-svn: 179714
2013-04-17 20:04:53 +00:00
Nadav Rotem
06ab05f47a CostModel: increase the default cost of supported floating point operations from one to two. Fixed a few tests that changed because now the cost of one insert + a vector operation on two doubles is lower than two scalar operations on doubles.
llvm-svn: 179413
2013-04-12 21:15:03 +00:00
Manman Ren
6a0ccf041c Aliasing rules for struct-path aware TBAA.
Added PathAliases to check if two struct-path tags can alias.
Added command line option -struct-path-tbaa.

llvm-svn: 179337
2013-04-11 23:24:18 +00:00
Arnold Schwaighofer
3218da2403 X86 cost model: Model cost for uitofp and sitofp on SSE2
The costs are overfitted so that I can still use the legalization factor.

For example, the following kernel has about half the throughput when vectorized
compared to unvectorized when compiled with SSE2. Before this patch we would
vectorize it.

unsigned short A[1024];
double B[1024];
void f() {
  int i;
  for (i = 0; i < 1024; ++i) {
    B[i] = (double) A[i];
  }
}

radar://13599001

llvm-svn: 179033
2013-04-08 18:05:48 +00:00
Arnold Schwaighofer
16848bcf4a TargetLowering: Fix getTypeConversion handling of extended vector types
The code in getTypeConversion attempts to promote the vector element type
before it tries to split or widen the vector.
If it failed to find a legal vector type by promoting, it would continue using
the promoted vector element type, thereby missing legal split vector types.
For example the type v32i32 that has a legal split of 4 x v3i32 on x86/sse2
would be transformed to: v32i256 and from there on successively split to:
v16i256, v8i256, v1i256 and then finally ends up as an i64 type.
By resetting the vector element type to the original vector element type that
existed before the promotion the code will attempt to split the vector type to
smaller vector widths of the same type.

llvm-svn: 178999
2013-04-07 20:22:56 +00:00
Arnold Schwaighofer
52871434dd X86 cost model: Differentiate cost for vector shifts of constants
SSE2 has efficient support for shifts by a scalar. My previous change of making
shifts expensive did not take this into account and marked all shifts as
expensive. This would prevent vectorization from happening where it is actually
beneficial.

With this change we differentiate between shifts of constants and other shifts.

radar://13576547

llvm-svn: 178808
2013-04-04 23:26:24 +00:00
Arnold Schwaighofer
329430aeac X86 cost model: Vector shifts are expensive in most cases
The default logic does not correctly identify costs of casts because they are
marked as custom on x86.

For some cases, where the shift amount is a scalar we would be able to generate
better code. Unfortunately, when this is the case the value (the splat) will get
hoisted out of the loop, thereby making it invisible to ISel.

radar://13130673
radar://13537826

llvm-svn: 178703
2013-04-03 21:46:05 +00:00
Benjamin Kramer
7634eefc37 X86TTI: Add accurate costs for itofp operations, based on the actual instruction counts.
llvm-svn: 178459
2013-04-01 10:23:49 +00:00
Andrew Trick
57ddfcf201 Fix SCEV forgetMemoizedResults should search and destroy backedge exprs.
Fixes PR15570: SEGV: SCEV back-edge info invalid after dead code removal.

Indvars creates a SCEV expression for the loop's back edge taken
count, then determines that the comparison is always true and
removes it.

When loop-unroll asks for the expression, it contains a NULL
SCEVUnknown (as a CallbackVH).

forgetMemoizedResults should invalidate the loop back edges expression.

llvm-svn: 177986
2013-03-26 03:14:53 +00:00
Jyotsna Verma
5d3a002f82 Disable profiling tests for Hexagon since it doesn't support JIT.
llvm-svn: 177917
2013-03-25 21:15:11 +00:00
Manman Ren
6e08f09d69 Support in AAEvaluator to print alias queries of loads/stores with TBAA tags.
Add "evaluate-tbaa" to print alias queries of loads/stores. Alias queries
between pointers do not include TBAA tags.

Add testing case for "placement new". TBAA currently says NoAlias.

llvm-svn: 177772
2013-03-22 22:34:41 +00:00
Michael Liao
fe785c9579 Correct cost model for vector shift on AVX2
- After moving the logic recognizing vector shifts with a scalar amount from
  DAG combining into DAG lowering, we declare all vector shifts as custom,
  even though vector shifts on AVX are legal. As a result, the cost model
  needs special tuning to identify these legal cases.

llvm-svn: 177586
2013-03-20 22:01:10 +00:00
Nadav Rotem
317ff20b46 Optimize sext <4 x i8> and <4 x i16> to <4 x i64>.
Patch by Ahmad, Muhammad T <muhammad.t.ahmad@intel.com>

llvm-svn: 177421
2013-03-19 18:38:27 +00:00
Renato Golin
6d0295565e Improve long vector sext/zext lowering on ARM
The ARM backend currently has poor codegen for long sext/zext
operations, such as v8i8 -> v8i32. This patch addresses this
by performing a custom expansion in ARMISelLowering. It also
adds/changes the cost of such lowering in ARMTTI.

This partially addresses PR14867.

Patch by Pete Couperus

llvm-svn: 177380
2013-03-19 08:15:38 +00:00
Arnold Schwaighofer
0b9d14a046 ARM cost model: Make some vector integer to float casts cheaper
The default logic marks them as too expensive.

For example, before this patch we estimated:
  cost of 16 for instruction:   %r = uitofp <4 x i16> %v0 to <4 x float>

While this translates to:
  vmovl.u16 q8, d16
  vcvt.f32.u32  q8, q8

All other costs are left to the values assigned by the fallback logic. These
costs are mostly reasonable in the sense that they get progressively more
expensive as the instruction sequences emitted get longer.

radar://13445992

llvm-svn: 177334
2013-03-18 22:47:09 +00:00
Arnold Schwaighofer
e628d03dcc ARM cost model: Correct cost for some cheap float to integer conversions
Fix cost of some "cheap" cast instructions. Before this patch we used to
estimate for example:
  cost of 16 for instruction:   %r = fptoui <4 x float> %v0 to <4 x i16>

While we would emit:
  vcvt.s32.f32  q8, q8
  vmovn.i32 d16, q8
  vuzp.8  d16, d17

All other costs are left to the values assigned by the fallback logic. These
costs are mostly reasonable in the sense that they get progressively more
expensive as the instruction sequences emitted get longer.

radar://13434072

llvm-svn: 177333
2013-03-18 22:47:06 +00:00
Arnold Schwaighofer
c83f5b493e ARM cost model: Fix costs for some vector selects
I was too pessimistic in r177105. Vector selects that fit into a legal register
type lower just fine. I was misled by the code fragment that I was using. The
stores/loads that I saw in those cases came from lowering the conditional off
an address.

Changing the code fragment to:

%T0_3 = type <8 x i18>
%T1_3 = type <8 x i1>

define void @func_blend3(%T0_3* %loadaddr, %T0_3* %loadaddr2,
                         %T1_3* %blend, %T0_3* %storeaddr) {
  %v0 = load %T0_3* %loadaddr
  %v1 = load %T0_3* %loadaddr2
==> FROM:
  ;%c = load %T1_3* %blend
==> TO:
  %c = icmp slt %T0_3 %v0, %v1
==> USE:
  %r = select %T1_3 %c, %T0_3 %v0, %T0_3 %v1

  store %T0_3 %r, %T0_3* %storeaddr
  ret void
}

revealed this mistake.

radar://13403975

llvm-svn: 177170
2013-03-15 18:31:01 +00:00
Arnold Schwaighofer
77e4a47e9b ARM cost model: Fix cost of fptrunc and fpext instructions
A vector fptrunc and fpext simply gets split into scalar instructions.

radar://13192358

llvm-svn: 177159
2013-03-15 15:10:47 +00:00
Arnold Schwaighofer
63a59d3be8 ARM cost model: Increase cost of some vector selects we do terrible on
By terrible I mean we store/load from the stack.

This matters on PAQp8 in _Z5trainPsS_ii (which is inlined into Mixer::update)
where we decide to vectorize a loop with a VF of 8 resulting in a 25%
degradation on a cortex-a8.

LV: Found an estimated cost of 2 for VF 8 For instruction:   icmp slt i32
LV: Found an estimated cost of 2 for VF 8 For instruction:   select i1, i32, i32

The bug that tracks the CodeGen part is PR14868.

radar://13403975

llvm-svn: 177105
2013-03-14 19:17:02 +00:00
Arnold Schwaighofer
416d47b476 ARM cost model: Increase the cost for vector casts that use the stack
Increase the cost of v8/v16-i8 to v8/v16-i32 casts and truncates as the backend
currently lowers those using stack accesses.

This was responsible for a significant degradation on
MultiSource/Benchmarks/Trimaran/enc-pc1/enc-pc1
where we vectorize one loop to a vector factor of 16. After this patch we select
a vector factor of 4 which will generate reasonable code.

unsigned char cle[32];

void test(short c) {
  unsigned short compte;
  for (compte = 0; compte <= 31; compte++) {
    cle[compte] = cle[compte] ^ c;
  }
}

radar://13220512

llvm-svn: 176898
2013-03-12 21:19:22 +00:00
Jan Wen Voung
74d9647d18 Revert the test moves from 176733. Use "REQUIRES: asserts" instead.
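
That is, each affected test carries a lit directive such as:

  ; REQUIRES: asserts
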
llvm-svn: 176873
2013-03-12 16:27:52 +00:00
Jan Wen Voung
2346df4d41 Disable statistics on Release builds and move tests that depend on -stats.
Summary:
Statistics are still available in Release+Asserts (any +Asserts builds),
and stats can also be turned on with LLVM_ENABLE_STATS.

Move some of the FastISel stats that were moved under DEBUG()
back out of DEBUG(), since stats are disabled across the board now.

Many tests depend on grepping "-stats" output.  Move those into
an orig_dir/Stats/ directory so that they can be marked as unsupported
when building without statistics.

Differential Revision: http://llvm-reviews.chandlerc.com/D486

llvm-svn: 176733
2013-03-08 22:56:31 +00:00
Shuxin Yang
048b100cc5 Memory Dependence Analysis (not mem-dep test) take advantage of "invariant.load" metadata.
The "invariant.load" metadata indicates the memory unit being accessed is immutable.
A load annotated with this metadata can be moved across any store.
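
For example (current load syntax; the metadata operand is just an empty node):

  %v = load i32, i32* %p, !invariant.load !0   ; may be moved across any store
  !0 = !{}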

As I am not sure if it is legal to move such loads across a barrier/fence, this
change does not allow such a transformation.

rdar://11311484

Thank Arnold for code review.

llvm-svn: 176562
2013-03-06 17:48:48 +00:00
Arnold Schwaighofer
e60e6fc70f X86 cost model: Adjust cost for custom lowered vector multiplies
This matters for example in following matrix multiply:

int **mmult(int rows, int cols, int **m1, int **m2, int **m3) {
  int i, j, k, val;
  for (i=0; i<rows; i++) {
    for (j=0; j<cols; j++) {
      val = 0;
      for (k=0; k<cols; k++) {
        val += m1[i][k] * m2[k][j];
      }
      m3[i][j] = val;
    }
  }
  return(m3);
}

Taken from the test-suite benchmark Shootout.

We estimate the cost of the multiply to be 2 while we generate 9 instructions
for it and end up being quite a bit slower than the scalar version (48% on my
machine).

Also, properly differentiate between avx1 and avx2. On avx-1 we still split the
vector into two 128-bit halves and handle the subvector muls like above with 9
instructions.
Only on avx-2 will we have a cost of 9 for v4i64.

I changed the test case in test/Transforms/LoopVectorize/X86/avx1.ll to use an
add instead of a mul because with a mul we now no longer vectorize. I did
verify that the mul would be indeed more expensive when vectorized with 3
kernels:

for (i ...)
   r += a[i] * 3;
for (i ...)
  m1[i] = m1[i] * 3; // This matches the test case in avx1.ll
and a matrix multiply.

In each case the vectorized version was considerably slower.

radar://13304919

llvm-svn: 176403
2013-03-02 04:02:52 +00:00
Benjamin Kramer
df474e5dfa Cost model support for lowered math builtins.
We make the cost for calling libm functions extremely high as emitting the
calls is expensive and causes spills (on x86) so performance suffers. We still
vectorize important calls like ceilf and friends on SSE4.1, and fabs.

Differential Revision: http://llvm-reviews.chandlerc.com/D466

llvm-svn: 176287
2013-02-28 19:09:33 +00:00
Bill Wendling
db672f1bc8 Use references to attribute groups on the call/invoke instructions.
Listing all of the attributes for the callee of a call/invoke instruction is way
too much and makes the IR unreadable. Use references to attributes instead.

llvm-svn: 175877
2013-02-22 09:09:42 +00:00
Elena Demikhovsky
0886fb4d55 I optimized the following patterns:
sext <4 x i1> to <4 x i64>
 sext <4 x i8> to <4 x i64>
 sext <4 x i16> to <4 x i64>
 
I'm running Combine on SIGN_EXTEND_IN_REG and revert SEXT patterns:
 (sext_in_reg (v4i64 anyext (v4i32 x )), ExtraVT) -> (v4i64 sext (v4i32 sext_in_reg (v4i32 x , ExtraVT)))
 
 The sext_in_reg (v4i32 x) may be lowered to shl+sar operations.
 The "sar" does not exist as a 64-bit vector operation, so lowering sext_in_reg (v4i64 x) has no vector solution.

I also added a cost of this operations to the AVX costs table.

llvm-svn: 175619
2013-02-20 12:42:54 +00:00
Bill Wendling
74351693ea Modify the LLVM assembly output so that it uses references to represent function attributes.
This makes the LLVM assembly look better. E.g.:

     define void @foo() #0 { ret void }
     attributes #0 = { nounwind noinline ssp }

llvm-svn: 175605
2013-02-20 07:21:42 +00:00
Tim Northover
6d736b1227 AArch64: adjust tests which rely on a default JIT
Profiling tests *do* need a JIT. They'll pass if a cross-compiler targeting
AArch64 by default has been built, but fail if a native AArch64 compiler has
been built. Therefore XFAIL is inappropriate and we mark them unsupported.

ExecutionEngine tests are JIT by definition, they should also be unsupported.

Transforms/LICM only uses an execution engine to check the output is still sane
after optimisation. It can be switched to use the interpreter.

llvm-svn: 175433
2013-02-18 11:08:37 +00:00
Arnold Schwaighofer
b7dd0ff204 ARM cost model: Add vector reverse shuffle costs
A reverse shuffle is lowered to a vrev and possibly a vext instruction (quad
word).
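
The pattern in question is a shuffle with a reversed index mask, e.g.:

  %rev = shufflevector <4 x i32> %v, <4 x i32> undef,
                       <4 x i32> <i32 3, i32 2, i32 1, i32 0>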

radar://13171406

llvm-svn: 174933
2013-02-12 02:40:39 +00:00
Bill Schmidt
53ad58d77a Refine fix to bug 15041.
Thanks to help from Nadav and Hal, I have a more reasonable (and even
correct!) approach.  This specifically penalizes the insertelement
and extractelement operations for the performance hit that will occur
on PowerPC processors.

llvm-svn: 174725
2013-02-08 18:19:17 +00:00
Arnold Schwaighofer
381c4a3e54 ARM cost model: Address computation in vector mem ops not free
Adds a function to target transform info to query for the cost of address
computation. The cost model analysis pass now also queries this interface.
The code in LoopVectorize adds the cost of address computation as part of the
memory instruction cost calculation. Only there, we know whether the instruction
will be scalarized or not.
Increase the penalty for inserting into D registers on Swift. This becomes
necessary because we now always assume that address computation has a cost and
three is a closer value to the architecture.

radar://13097204

llvm-svn: 174713
2013-02-08 14:50:48 +00:00
Arnold Schwaighofer
72b584b5de ARM cost model: Add costs for vector selects
Vector selects are cheap on NEON. They get lowered to a vbsl instruction.

radar://13158753

llvm-svn: 174631
2013-02-07 16:10:15 +00:00
Arnold Schwaighofer
d1587de3eb ARM cost model: Cost for scalar integer casts and floating point conversions
Also adds some costs for vector integer float conversions.

llvm-svn: 174371
2013-02-05 14:05:55 +00:00
Arnold Schwaighofer
f3c24d2a6f ARM cost model: Penalize insertelement into D subregisters
Swift has a renaming dependency if we load into D subregisters. We don't have a
way of distinguishing between insertelement operations of values from loads and
other values. Therefore, we are pessimistic for now (The performance problem
showed up in example 14 of gcc-loops).

radar://13096933

llvm-svn: 174300
2013-02-04 02:52:05 +00:00
Hal Finkel
ab240a9015 Initial implementation of PPCTargetTransformInfo
This provides a place to add customized operation cost information and
control some other target-specific IR-level transformations.

The only non-trivial logic in this checkin assigns a higher cost to
unaligned loads and stores (covered by the included test case).

llvm-svn: 173520
2013-01-25 23:05:59 +00:00
Nadav Rotem
8952ef0071 Make opt grab the triple from the module and use it to initialize the target machine.
llvm-svn: 171341
2013-01-01 08:00:32 +00:00
Dmitri Gribenko
968fc2e59b Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID
This is done to avoid odd test failures, like the one fixed in r171243.
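
Roughly (hypothetical RUN lines; the pass flags are only placeholders):

  ; before: opt is handed a file name and emits a ModuleID comment derived from it
  ; RUN: opt -analyze -scalar-evolution %s | FileCheck %s
  ; after: opt reads from stdin, so no ModuleID leaks into the output
  ; RUN: opt -analyze -scalar-evolution < %s | FileCheck %s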

llvm-svn: 171250
2012-12-30 02:33:22 +00:00
Dmitri Gribenko
abbc99565a Add a check to the test Analysis/ScalarEvolution/2010-09-03-RequiredTransitive.ll
This test did not test anything at all (except for opt crashing, but that was
not the reason why it was added).

llvm-svn: 171248
2012-12-30 01:42:34 +00:00
Dmitri Gribenko
e3769d450b Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID
This is done to avoid odd test failures, like the one fixed in r171243.

llvm-svn: 171246
2012-12-30 01:28:40 +00:00
Nadav Rotem
479c7fc4de We are not ready to estimate the cost of integer expansions based on the number of parts. This test is too noisy.
llvm-svn: 170999
2012-12-23 09:11:07 +00:00
Nadav Rotem
3d4c6351cf Improve the X86 cost model for loads and stores.
llvm-svn: 170830
2012-12-21 01:33:59 +00:00
Jakub Staszak
837a7e5c18 Reverse order of checking SSE level when calculating compare cost, so we check
AVX2 before AVX.

llvm-svn: 170464
2012-12-18 22:57:56 +00:00
Arnold Schwaighofer
182d1ce4b7 Optimistically analyse Phi cycles
Analyse Phis under the starting assumption that they are NoAlias. Recursively
look at their inputs.
If they MayAlias/MustAlias there must be an input that makes them so.

Addresses bug 14351.

llvm-svn: 169788
2012-12-10 23:02:41 +00:00
Nadav Rotem
6dac3b0c66 Cost Model: change the default cost of control flow instructions (br / ret / ...) to zero.
llvm-svn: 169423
2012-12-05 21:21:26 +00:00
Preston Briggs
52d4891df2 Modified dump() to provide a little
more information for dependences between
instructions that don't share a common loop.

Updated the test results appropriately.

llvm-svn: 168965
2012-11-30 00:44:47 +00:00
Preston Briggs
f15c406c47 Modified depends() to recognize that when all levels are "=" and
there's no possible loop-independent dependence, then there's no
dependence.

Updated all test results appropriately.

llvm-svn: 168719
2012-11-27 19:12:26 +00:00
Preston Briggs
a709774344 Modify depends(Src, Dst, PossiblyLoopIndependent).
If the Src and Dst are the same instruction,
no loop-independent dependence is possible,
so we force the PossiblyLoopIndependent flag to false.

The test case results are updated appropriately.

llvm-svn: 168678
2012-11-27 06:41:46 +00:00
Preston Briggs
0889167a63 Corrects a problem where we rely exclusively on GEPs to drive
analysis.  It is better to look for cases with useful GEPs and use them
when possible.  When a pair of useful GEPs is not available, use the
raw SCEVs directly. This approach supports better analysis of pointer
dereferencing.

In parallel, all the test cases are updated appropriately.
Cases where we have a store to *B++ can now be analyzed!

llvm-svn: 168474
2012-11-21 23:50:04 +00:00
Hal Finkel
9dc292f3c5 Phi speculation improvement for BasicAA
This is a partial solution to PR14351. It removes some of the special
significance of the first incoming phi value in the phi aliasing checking logic
in BasicAA. In the context of a loop, the old logic assumes that the first
incoming value is the interesting one (meaning that it is the one that comes
from outside the loop), but this is often not the case.  With this change, we
now first test the incoming value that comes from a block other than the parent
of the phi being tested.

llvm-svn: 168245
2012-11-17 02:33:15 +00:00
Benjamin Kramer
60b650f8c4 DependenceAnalysis: Print all dependency pairs when dumping. Update all testcases.
Part of a patch by Preston Briggs.

llvm-svn: 167827
2012-11-13 12:12:02 +00:00
Nadav Rotem
dce9a7a599 CostModel: add another known vector trunc optimization.
llvm-svn: 167488
2012-11-06 21:17:17 +00:00
Nadav Rotem
2fb5dc3a15 Cost Model: add tables for some avx type-conversion hacks.
llvm-svn: 167480
2012-11-06 19:33:53 +00:00
Nadav Rotem
890d7c7f8e CostModel: Add tables for the common x86 compares.
llvm-svn: 167421
2012-11-05 23:48:20 +00:00
Nadav Rotem
8ddfd47801 Cost Model: Improve the accuracy of the zext/sext/trunc vector cost estimation.
llvm-svn: 167412
2012-11-05 22:20:53 +00:00
Nadav Rotem
04d64771f6 Cost Model: Normalize the insert/extract index when splitting types
llvm-svn: 167402
2012-11-05 21:12:13 +00:00
Nadav Rotem
a504aa057e Cost Model: teach the cost model about expanding integers.
llvm-svn: 167401
2012-11-05 21:11:10 +00:00
Nadav Rotem
4def3aace5 Implement the cost of abnormal x86 instruction lowering as a table.
llvm-svn: 167395
2012-11-05 19:32:46 +00:00
Richard Osborne
258e3e70bb Don't infer whether a value is captured in the current function from the
'nocapture' attribute.

The nocapture attribute only specifies that no copies are made that
outlive the function. This isn't the same as there being no copies at all.
This fixes PR14045.
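
A sketch of the distinction (illustrative IR): a transient copy of the pointer
is fine, it just must not outlive the function.

  define i32 @f(i32* nocapture %p) {
    %slot = alloca i32*
    store i32* %p, i32** %slot     ; a copy exists while @f runs...
    %q = load i32*, i32** %slot
    %v = load i32, i32* %q         ; ...but nothing outlives the call,
    ret i32 %v                     ; so nocapture is still satisfied
  }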

llvm-svn: 167381
2012-11-05 10:48:24 +00:00
Nadav Rotem
c9bbabd5e9 X86 CostModel: Add support for some of the common arithmetic instructions for SSE4, AVX and AVX2.
llvm-svn: 167347
2012-11-03 00:39:56 +00:00
Nadav Rotem
6f0c234b7f Add a stub for the x86 cost model impl. Implement a basic cost rule for inserting/extracting from XMM registers.
llvm-svn: 167333
2012-11-02 23:27:16 +00:00
Nadav Rotem
6edee82efa CostModel: add support for Vector Insert and Extract.
llvm-svn: 167329
2012-11-02 22:31:56 +00:00
Nadav Rotem
ce21a69b9d Add a cost model analysis that allows us to estimate the cost of IR-level instructions.
llvm-svn: 167324
2012-11-02 21:48:17 +00:00
Benjamin Kramer
0f18b5e49c Remove LoopDependenceAnalysis.
It was unmaintained and not much more than a stub. The new DependenceAnalysis
pass is both more general and complete.

llvm-svn: 166810
2012-10-26 20:25:01 +00:00
Sebastian Pop
952bb51433 dependence analysis
Patch from Preston Briggs <preston.briggs@gmail.com>.

This is an updated version of the dependence-analysis patch, including an MIV
test based on Banerjee's inequalities.

It's a fairly complete implementation of the paper

    Practical Dependence Testing
    Gina Goff, Ken Kennedy, and Chau-Wen Tseng
    PLDI 1991

It cannot yet propagate constraints between coupled RDIV subscripts (discussed
in Section 5.3.2 of the paper).

It's organized as a FunctionPass with a single entry point that supports testing
for dependence between two instructions in a function. If there's no dependence,
it returns null. If there's a dependence, it returns a pointer to a Dependence
which can be queried about details (what kind of dependence, is it loop
independent, direction and distance vector entries, etc). I haven't included
every imaginable feature, but there's a good selection that should be adequate
for supporting many loop transformations. Of course, it can be extended as
necessary.

Included in the patch file are many test cases, commented with C code showing
the loops and array references.

llvm-svn: 165708
2012-10-11 07:32:34 +00:00
James Molloy
e2d4fd89a3 Add default JIT LIT variable.
Patch by David Tweed!

llvm-svn: 164996
2012-10-02 10:57:08 +00:00
Duncan Sands
c10b8fc428 Now that invoke of an intrinsic is possible (for the llvm.donothing intrinsic)
teach the callgraph logic to not create callgraph edges to intrinsics for invoke
instructions; it already skips this for call instructions.  Fixes PR13903.

llvm-svn: 164707
2012-09-26 17:16:01 +00:00
Arnold Schwaighofer
e02da03f09 BasicAA: Recognize cyclic NoAlias phis
Enhances basic alias analysis to recognize phis whose first incoming values are
NoAlias and whose other incoming values are just the phi node itself through
some amount of recursion.

Example: With this change basicaa reports that ptr_phi and ptr_phi2 do not alias
each other.

bb:
 ptr = ptr2 + 1

loop:
  ptr_phi = phi [bb, ptr], [loop, ptr_plus_one]
  ptr2_phi = phi [bb, ptr2], [loop, ptr2_plus_one]
  ...
  ptr_plus_one = gep ptr_phi, 1
  ptr2_plus_one = gep ptr2_phi, 1

This enables the elimination of one load in code like the following:

extern int foo;

int test_noalias(int *ptr, int num, int* coeff) {
  int *ptr2 = ptr;
  int result = (*ptr++) * (*coeff--);
  while (num--) {
    *ptr2++ = *ptr;
    result +=  (*coeff--) * (*ptr++);
  }
  *ptr = foo;
  return result;
}

Part 2/2 of fix for PR13564.

llvm-svn: 163319
2012-09-06 14:41:53 +00:00
Arnold Schwaighofer
1e987368d7 BasicAA: GEPs of NoAlias'ing base ptr with equivalent indices are NoAlias
If we can show that the base pointers of two GEPs don't alias each other using
precise analysis and the indices and base offset are equal then the two GEPs
also don't alias each other.
This is primarily needed for the follow up patch that analyses NoAlias'ing PHI
nodes.
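
For instance (a sketch using current explicit-type GEP syntax):

  ; if %a and %b are NoAlias, then so are %p1 and %p2
  %p1 = getelementptr inbounds i32, i32* %a, i64 %i
  %p2 = getelementptr inbounds i32, i32* %b, i64 %i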

Part 1/2 of fix for PR13564.

llvm-svn: 163317
2012-09-06 14:31:51 +00:00
NAKAMURA Takumi
dff0b7dd23 llvm/test/Analysis/Profiling: Mark 3 of them as REQUIRES: loadable_module.
FIXME: profile_rt.dll could be built on win32.
llvm-svn: 162811
2012-08-29 00:37:46 +00:00
Manman Ren
478cc27601 Profile: set branch weight metadata with data generated from profiling.
This patch implements ProfileDataLoader which loads profile data generated by
-insert-edge-profiling and updates branch weight metadata accordingly.

Patch by Alastair Murray.

llvm-svn: 162799
2012-08-28 22:21:25 +00:00
Manman Ren
6342812033 BranchProb: modify the definition of an edge in BranchProbabilityInfo to handle
the case of multiple edges from one block to another.

A simple example is a switch statement with multiple values to the same
destination. The definition of an edge is modified from a pair of blocks to
a pair of PredBlock and an index into the successors.

Also set the weight correctly when building SelectionDAG from LLVM IR,
especially when converting a Switch.
IntegersSubsetMapping is updated to calculate the weight for each cluster.

llvm-svn: 162572
2012-08-24 18:14:27 +00:00
Benjamin Kramer
b42939c43b Fix broken check lines.
I really need to find a way to automate this, but I can't come up with a regex
that has no false positives while handling tricky cases like custom check
prefixes.

llvm-svn: 162097
2012-08-17 12:28:26 +00:00
Nadav Rotem
685bd842e2 MemoryDependenceAnalysis attempts to find the first memory dependency for function calls.
Currently, if GetLocation reports that it did not find a valid pointer (this is the case for volatile load/stores),
we ignore the result. This patch adds code to handle the cases where we did not obtain a valid pointer.

rdar://11872864  PR12899

llvm-svn: 161802
2012-08-13 23:03:43 +00:00
Nick Lewycky
e84df7229f Stay rational; don't assert trying to take the square root of a negative value.
If it's negative, the loop is already proven to be infinite. Fixes PR13489!

llvm-svn: 161107
2012-08-01 09:14:36 +00:00
Chandler Carruth
5d3a0ce4e5 Fix the remaining TCL-style quotes found in the testsuite. This is
another mechanical change accomplished though the power of terrible Perl
scripts.

I have manually switched some "s to 's to make escaping simpler.

While I started this to fix tests that aren't run in all configurations,
the massive number of tests is due to a really frustrating fragility of
our testing infrastructure: things like 'grep -v', 'not grep', and
'expected failures' can mask broken tests all too easily.

Essentially, I'm deeply disturbed that I can change the testsuite so
radically without causing any change in results for most platforms. =/

llvm-svn: 159547
2012-07-02 19:09:46 +00:00
Chandler Carruth
d200829a4f Convert the uses of '|&' to use '2>&1 |' instead, which works on old
versions of Bash. In addition, I can back out the change to the lit
built-in shell test runner to support this.

This should fix the majority of fallout on Darwin, but I suspect there
will be a few straggling issues.
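
That is, RUN lines of this shape (illustrative):

  ; RUN: not llvm-as %s |& FileCheck %s         (old, needs a recent bash)
  ; RUN: not llvm-as %s 2>&1 | FileCheck %s     (portable)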

llvm-svn: 159544
2012-07-02 18:37:59 +00:00
Chandler Carruth
8a358b3669 Convert all tests using TCL-style quoting to use shell-style quoting.
This was done through the aid of a terrible Perl creation. I will not
paste any of the horrors here. Suffice to say, it require multiple
staged rounds of replacements, state carried between, and a few
nested-construct-parsing hacks that I'm not proud of. It happens, by
luck, to be able to deal with all the TCL-quoting patterns in evidence
in the LLVM test suite.

If anyone is maintaining large out-of-tree test trees, feel free to poke
me and I'll send you the steps I used to convert things, as well as
answer any painful questions etc. IRC works best for this type of thing
I find.

Once converted, switch the LLVM lit config to use ShTests the same as
Clang. In addition to being able to delete large amounts of Python code
from 'lit', this will also simplify the entire test suite and some of
lit's architecture.

Finally, the test suite runs 33% faster on Linux now. ;]
For my 16-hardware-thread (2x 4-core xeon e5520): 36s -> 24s

llvm-svn: 159525
2012-07-02 12:47:22 +00:00
Nick Lewycky
6b7498ebd5 If the step value is a constant zero, the loop isn't going to terminate. Fixes
the assert reported in PR13228!

llvm-svn: 159393
2012-06-28 23:44:57 +00:00
Andrew Trick
39e8cd5b2f SCEV: Handle a corner case reducing AddRecExpr * AddRecExpr
If integer overflow causes one of the terms to reach zero, that can
force the entire expression to zero.

Fixes PR12929: cast<Ty>() argument of incompatible type

llvm-svn: 157673
2012-05-30 03:35:20 +00:00
Andrew Trick
c3765e6af0 SCEV: Add MarkPendingLoopPredicates to avoid recursive isImpliedCond.
getUDivExpr attempts to simplify by checking for overflow.
isLoopEntryGuardedByCond then evaluates the loop predicate which
may lead to the same getUDivExpr causing endless recursion.

Fixes PR12868: clang 3.2 segmentation fault.

llvm-svn: 157092
2012-05-19 00:48:25 +00:00
Bill Wendling
9f736e7c65 FileCheck-ize tests.
llvm-svn: 155434
2012-04-24 10:45:44 +00:00
Bill Wendling
6824095c62 FileCheck-ize these tests.
llvm-svn: 155433
2012-04-24 10:36:42 +00:00
Bill Wendling
b394f59f17 FileCheck-ize these tests. Harden some of them.
llvm-svn: 155432
2012-04-24 09:15:38 +00:00
Benjamin Kramer
550faddc94 Revert "SCEV: When expanding a GEP the final addition to the base pointer has NUW but not NSW."
This isn't right either, reverting for now.

llvm-svn: 154910
2012-04-17 06:33:57 +00:00
Benjamin Kramer
a690750db9 SCEV: When expanding a GEP the final addition to the base pointer has NUW but not NSW.
Found by inspection.

llvm-svn: 154262
2012-04-07 17:19:26 +00:00
Rafael Espindola
1b3dfa1ce2 Handle intrinsics in GlobalsModRef. Fixes pr12351.
llvm-svn: 153604
2012-03-28 21:31:24 +00:00
Andrew Trick
88a52fd943 SCEV fix: Handle loop invariant loads.
Fixes PR11882: NULL dereference in ComputeLoadConstantCompareExitLimit.

llvm-svn: 153480
2012-03-26 22:33:59 +00:00
Andrew Trick
e0b7008fb8 Test scalar evolution directly instead of testing the result of
canonical indvars.

llvm-svn: 153256
2012-03-22 17:09:31 +00:00
Eli Friedman
1ff1d1f1bc Duncan pointed out that if the alignment isn't explicitly specified, it defaults to the ABI alignment. Given that, make this code a bit more aggressive in such cases.
llvm-svn: 151584
2012-02-27 23:16:46 +00:00
Eli Friedman
15f56db6c0 Teach BasicAA about the LLVM IR rules that allow reading past the end of an object given sufficient alignment. Fixes PR12098.
llvm-svn: 151553
2012-02-27 20:46:07 +00:00
Rafael Espindola
34b7c064cb Change the implementation of dominates(inst, inst) to one based on what the
verifier does. This correctly handles invoke.
Thanks to Duncan, Andrew and Chris for the comments.
Thanks to Joerg for the early testing.

llvm-svn: 151469
2012-02-26 02:19:19 +00:00
Eli Bendersky
4afdeeb682 Replace all instances of the dg.exp file with lit.local.cfg, since all tests are run with LIT now and not Dejagnu. dg.exp is no longer needed.
Patch reviewed by Daniel Dunbar. It will be followed by additional cleanup patches.

llvm-svn: 150664
2012-02-16 06:28:33 +00:00
Nick Lewycky
7425820374 Change CaptureTracking to pass a Use* instead of a Value* when a value is
captured. This allows the tracker to look at the specific use, which may be
especially interesting for function calls.

Use this to fix 'nocapture' deduction in FunctionAttrs. The existing one does
not iterate until a fixpoint and does not guarantee that it produces the same
result regardless of iteration order. The new implementation builds up a graph
of how arguments are passed from function to function, and uses a bottom-up walk
on the argument-SCCs to assign nocapture. This gets us nocapture more often, and
does so rather efficiently and independent of iteration order.

llvm-svn: 147327
2011-12-28 23:24:21 +00:00
Chandler Carruth
acdf352b77 Make the unreachable probability much much heavier. The previous
probability wouldn't be considered "hot" in some weird loop structures
or other compounding probability patterns. This makes it much harder to
confuse, but isn't really a principled fix. I'd actually like it if we
could model a zero probability, as it would make this much easier to
reason about. Suggestions for how to do this better are welcome.

llvm-svn: 147142
2011-12-22 09:26:37 +00:00
Chandler Carruth
2bedf185c9 Manually upgrade the test suite to specify the flag to cttz and ctlz.
I followed three heuristics for deciding whether to set 'true' or
'false':

- Everything target independent got 'true' as that is the expected
  common output of the GCC builtins.
- If the target arch only has one way of implementing this operation,
  set the flag in the way that exercises the most of codegen. For most
  architectures this is also the likely path from a GCC builtin, with
  'true' being set. It will (eventually) require lowering away that
  difference, and then lowering to the architecture's operation.
- Otherwise, set the flag differently depending on which target
  operation should be tested.

Let me know if anyone has any issue with this pattern or would like
specific tests of another form. This should allow the x86 codegen to
just iteratively improve as I teach the backend how to differentiate
between the two forms, and everything else should remain exactly the
same.
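
The flag in question is the boolean is-zero-undef operand, e.g.:

  declare i32 @llvm.ctlz.i32(i32, i1)

  ; second operand: true means a zero input produces an undefined result
  %n = call i32 @llvm.ctlz.i32(i32 %x, i1 true)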

llvm-svn: 146370
2011-12-12 11:59:10 +00:00
Andrew Trick
63f81b112e SCEV fix. In general, Add/Mul expressions should not inherit NSW/NUW.
This reverts r139450, fixes r139453, and adds much needed comments and a
unit test.

llvm-svn: 145367
2011-11-29 02:16:38 +00:00
Andrew Trick
a033165cf9 Filecheckize.
llvm-svn: 145363
2011-11-29 02:05:23 +00:00
Chris Lattner
9d1e8420ff Upgrade syntax of tests using volatile instructions to use 'load volatile' instead of 'volatile load', which is archaic.
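
That is (with the typed-pointer load syntax of the time):

  %old = volatile load i32* %p    ; archaic
  %new = load volatile i32* %p    ; current
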
llvm-svn: 145171
2011-11-27 06:54:59 +00:00
Nick Lewycky
c08fa4916a Don't forget to check FlagNW when determining whether an AddRecExpr will wrap
or not. Patch by Brendon Cahoon!

llvm-svn: 144173
2011-11-09 07:11:37 +00:00
Benjamin Kramer
3bb9d5377e 2>&1 doesn't work here, it just creates an empty file called "&1"
llvm-svn: 143117
2011-10-27 18:27:45 +00:00
Duncan Sands
be9c2e6e13 Restore commits 142790 and 142843 - they weren't breaking the build
bots.  Original commit messages:
- Reapply r142781 with fix. Original message:

  Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the
  loop header when computing the trip count.

  With this, we now constant evaluate:
    struct ListNode { const struct ListNode *next; int i; };
    static const struct ListNode node1 = {0, 1};
    static const struct ListNode node2 = {&node1, 2};
    static const struct ListNode node3 = {&node2, 3};
    int test() {
      int sum = 0;
      for (const struct ListNode *n = &node3; n != 0; n = n->next)
        sum += n->i;
      return sum;
    }

- Now that we look at all the header PHIs, we need to consider all the header PHIs
  when deciding that the loop has stopped evolving. Fixes miscompile in the gcc
  torture testsuite!

llvm-svn: 142919
2011-10-25 12:28:52 +00:00
Chandler Carruth
3cbbc35715 Fix the API usage in loop probability heuristics. It was incorrectly
classifying many edges as exiting which were in fact not. These mainly
formed edges into sub-loops. It was also not correctly classifying all
returning edges out of loops as leaving the loop. With this patch most
of the loop heuristics are more rational.

Several serious regressions on loop-intensive benchmarks like perlbench's
loop tests when built with -enable-block-placement are fixed by these
updated heuristics. Unfortunately they in turn uncover some other
regressions. There are still several improvements that should be made to
loop heuristics, including trip count and early back-edge management.

llvm-svn: 142917
2011-10-25 09:47:41 +00:00
Duncan Sands
da835efa2a Speculatively revert commits 142790 and 142843 to see if it fixes
the dragonegg and llvm-gcc self-host buildbots.  Original commit
messages:
- Reapply r142781 with fix. Original message:

  Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the
  loop header when computing the trip count.

  With this, we now constant evaluate:
    struct ListNode { const struct ListNode *next; int i; };
    static const struct ListNode node1 = {0, 1};
    static const struct ListNode node2 = {&node1, 2};
    static const struct ListNode node3 = {&node2, 3};
    int test() {
      int sum = 0;
      for (const struct ListNode *n = &node3; n != 0; n = n->next)
        sum += n->i;
      return sum;
    }

- Now that we look at all the header PHIs, we need to consider all the header PHIs
when deciding that the loop has stopped evolving. Fixes miscompile in the gcc
torture testsuite!

llvm-svn: 142916
2011-10-25 09:26:43 +00:00
Nick Lewycky
289c30130a Now that we look at all the header PHIs, we need to consider all the header PHIs
when deciding that the loop has stopped evolving. Fixes miscompile in the gcc
torture testsuite!

llvm-svn: 142843
2011-10-24 21:02:38 +00:00
Chandler Carruth
d04f838629 Remove return heuristics from the static branch probabilities, and
introduce no-return or unreachable heuristics.

The return heuristics from the Ball and Larus paper don't work well in
practice as they pessimize early return paths. The only good hitrate
return heuristics are those for:
 - NULL return
 - Constant return
 - negative integer return

Only the last of these three can possibly require significant code for
the returning block, and even the last is fairly rare and usually also
a constant. As a consequence, even for the cold return paths, there is
little code on that return path, and so little code density to be gained
by sinking it. The places where sinking these blocks is valuable (inner
loops) will already be weighted appropriately as the edge is a loop-exit
branch.

All of this aside, early returns are nearly as common as all three of
these return categories, and should actually be predicted as taken!
Rather than muddy the waters of the static predictions, just remain
silent on returns and let the CFG itself dictate any layout or other
issues.

However, the return heuristic was flagging one very important case:
unreachable. Unfortunately it still gave a 1/4 chance of the
branch-to-unreachable occurring. It also didn't do a rigorous job of
finding those blocks which post-dominate an unreachable block.

This patch builds a more powerful analysis that should flag all branches
to blocks known to then reach unreachable. It also has better worst-case
runtime complexity by not looping through successors for each block. The
previous code would perform an N^2 walk in the event of a single entry
block branching to N successors with a switch where each successor falls
through to the next and they finally fall through to a return.

Test case added for noreturn heuristics. Also doxygen comments improved
along the way.

llvm-svn: 142793
2011-10-24 12:01:08 +00:00
Nick Lewycky
64d4e26aec Reapply r142781 with fix. Original message:
  Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the
  loop header when computing the trip count.

  With this, we now constant evaluate:
    struct ListNode { const struct ListNode *next; int i; };
    static const struct ListNode node1 = {0, 1};
    static const struct ListNode node2 = {&node1, 2};
    static const struct ListNode node3 = {&node2, 3};
    int test() {
      int sum = 0;
      for (const struct ListNode *n = &node3; n != 0; n = n->next)
        sum += n->i;
      return sum;
    }

llvm-svn: 142790
2011-10-24 06:57:05 +00:00
Nick Lewycky
d72de74587 Speculatively revert r142781. Bots are showing
Assertion `i_nocapture < OperandTraits<PHINode>::operands(this) && "getOperand() out of range!"' failed.
coming out of indvars.

llvm-svn: 142786
2011-10-24 04:00:25 +00:00
Nick Lewycky
5ab7948d71 Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the
loop header when computing the trip count.

With this, we now constant evaluate:
  struct ListNode { const struct ListNode *next; int i; };
  static const struct ListNode node1 = {0, 1};
  static const struct ListNode node2 = {&node1, 2};
  static const struct ListNode node3 = {&node2, 3};
  int test() {
    int sum = 0;
    for (const struct ListNode *n = &node3; n != 0; n = n->next)
      sum += n->i;
    return sum;
  }

llvm-svn: 142781
2011-10-23 23:43:14 +00:00
Chandler Carruth
151d4fc273 Teach the BranchProbabilityInfo pass to print its results, and use that
to bring it under direct test instead of merely indirectly testing it in
the BlockFrequencyInfo pass.

The next step is to start adding tests for the various heuristics
employed, and to start fixing those heuristics once they're under test.

llvm-svn: 142778
2011-10-23 21:21:50 +00:00
Nick Lewycky
ce8bfeadff Make SCEV's brute force analysis stronger in two ways. First, we should be
able to constant fold load instructions where the argument is a constant.
Second, we should be able to watch multiple PHI nodes through the loop; this
patch only supports PHIs in loop headers, more can be done here.

With this patch, we now constant evaluate:
  static const int arr[] = {1, 2, 3, 4, 5};
  int test() {
    int sum = 0;
    for (int i = 0; i < 5; ++i) sum += arr[i];
    return sum;
  }

llvm-svn: 142731
2011-10-22 19:58:20 +00:00
Chandler Carruth
12a645d6f6 Generalize the reading of probability metadata to work for both branches
and switches, with arbitrary numbers of successors. Still optimized for
the common case of 2 successors for a conditional branch.

Add a test case for switch metadata showing up in the BlockFrequencyInfo pass.

llvm-svn: 142493
2011-10-19 10:32:19 +00:00
Chandler Carruth
18a382b4b6 Teach the BranchProbabilityInfo analysis pass to read any metadata
encoding of probabilities. In the absence of metadata, it continues to
fall back on static heuristics.

This allows __builtin_expect, after lowering through llvm.expect
into a branch instruction's metadata, to actually enter the branch
probability model. This is one component of resolving PR2577.

llvm-svn: 142492
2011-10-19 10:30:30 +00:00
Chandler Carruth
13b475d4f6 Add pass printing support to BlockFrequencyInfo pass. The implementation
layer already had support for printing the results of this analysis, but
the wiring was missing.

Now that printing the analysis works, actually bring some of this
analysis, and the BranchProbabilityInfo analysis that it wraps, under
test! I'm planning on fixing some bugs and doing other work here, so
having a nice place to add regression tests and a way to observe the
results is really useful.

llvm-svn: 142491
2011-10-19 10:12:41 +00:00
Andrew Trick
b4cabec37a Missing test case for r141164.
llvm-svn: 141166
2011-10-05 06:23:32 +00:00
Nick Lewycky
4898eef762 Reapply r140979 with fix! We never did get a testcase, but careful review of the
logic by David Meyer revealed this bug.

llvm-svn: 140992
2011-10-03 07:10:45 +00:00
Nick Lewycky
79fec8116f Revert r140979 due to reports of bootstrap failure.
llvm-svn: 140980
2011-10-03 05:14:59 +00:00
Nick Lewycky
a760a29395 Add one more case we compute a max trip count.
llvm-svn: 140979
2011-10-03 01:03:57 +00:00
Eli Friedman
f4f4a75d2b PR10628: Fix getModRefInfo so it queries the underlying alias() implementation correctly while checking nocapture calls.
llvm-svn: 140666
2011-09-28 00:34:27 +00:00
Eli Friedman
9c1a430966 Enhance alias analysis for atomic instructions a bit. Upgrade a couple alias-analysis tests to the new atomic instructions.
llvm-svn: 140557
2011-09-26 20:15:28 +00:00
Andrew Trick
fe292d43c0 This test only makes sense with -enable-iv-rewrite.
llvm-svn: 139576
2011-09-13 02:45:26 +00:00
Eli Friedman
6e9cab83b0 Fix the logic in BasicAliasAnalysis::aliasGEP for comparing GEP's with variable differences so that it actually does something sane. Fixes PR10881.
llvm-svn: 139276
2011-09-08 02:23:31 +00:00
Owen Anderson
483f94e8d1 Teach BasicAA about the aliasing properties of memset_pattern16.
Fixes PR10872 and <rdar://problem/10065079>.

llvm-svn: 139204
2011-09-06 23:33:25 +00:00
Nick Lewycky
8203bcfd03 This transform only handles two-operand AddRec's. Prevent it from trying to
handle anything more complex. Fixes PR10383 again!

llvm-svn: 139186
2011-09-06 21:42:18 +00:00
Nick Lewycky
3823432a57 The logic inside getMulExpr to simplify {a,+,b}*{c,+,d} was wrong, which was
visible given a=b=c=d=1, on iteration #1 (the second iteration). Replace it with
correct math. Fixes PR10383!

llvm-svn: 139133
2011-09-06 05:05:14 +00:00
Nick Lewycky
18c0b01a56 Revert r139126 due to selfhost failures reported by buildbots.
llvm-svn: 139130
2011-09-06 02:43:13 +00:00
Nick Lewycky
30dcc754df Teach SCEV to report a max backedge count in one interesting case in
HowFarToZero; the case for a canonical loop.

llvm-svn: 139126
2011-09-05 23:25:16 +00:00
Rafael Espindola
745797b9c4 Move the loads after the calls so that the fix for
PR10292 doesn't show that the loads don't alias
the allocas.

llvm-svn: 134852
2011-07-09 23:53:58 +00:00
Rafael Espindola
ec48ea17ca Use CHECK-NEXT.
llvm-svn: 134850
2011-07-09 22:56:50 +00:00
Chris Lattner
6aa403748e Remove support for parsing the "type i32" syntax for defining a numbered
top level type without a specified number.  This syntax isn't documented
and blocks forward progress.

llvm-svn: 133371
2011-06-19 00:03:46 +00:00
Chris Lattner
ad5400fa72 rip out a ton of intrinsic modernization logic from AutoUpgrade.cpp, which is
for pre-2.9 bitcode files.  We keep x86 unaligned loads, movnt, crc32, and the
target indep prefetch change.

As usual, updating the testsuite is a PITA.

llvm-svn: 133337
2011-06-18 06:05:24 +00:00
Chris Lattner
0899957b99 make the asmparser reject function and type redefinitions. 'Merging' hasn't been
needed since llvm-gcc 3.4 days.

llvm-svn: 133248
2011-06-17 07:06:44 +00:00
Chris Lattner
4eb6f76fa6 Remove support for using "foo" as symbols instead of %"foo". This is ancient
syntax and has been long obsolete.  As usual, updating the tests is the nasty
part of this.

llvm-svn: 133242
2011-06-17 06:36:20 +00:00
Chris Lattner
9ec82f54d4 manually upgrade a bunch of tests to modern syntax, and remove some that
are either unreduced or only test old syntax.

llvm-svn: 133228
2011-06-17 03:14:27 +00:00
John McCall
7e1ecf5edb Test case for r132797.
llvm-svn: 132962
2011-06-14 03:02:05 +00:00
Dan Gohman
aa7c0761db Reapply r131781, now that the GVN bug with partially-aliasing loads
is disabled.

llvm-svn: 132632
2011-06-04 06:50:18 +00:00
Dan Gohman
c7cda7f467 Remove this test, which should have been reverted along with r131781.
llvm-svn: 132628
2011-06-04 06:21:23 +00:00
Dan Gohman
8fd6804868 Revert r131781 again. Apparently there is more going on here.
llvm-svn: 132625
2011-06-04 05:11:22 +00:00
Dan Gohman
24ef4a0b7d Reapply r131781 (revert r131809), now that some BasicAA shortcomings
it exposed are fixed.

llvm-svn: 132611
2011-06-04 00:46:31 +00:00
Dan Gohman
edaf7c535a Fix BasicAA's recursion detection so that it doesn't pessimize
queries in the case of a DAG, where a query reaches a node
visited earlier, but it's not on a cycle. This avoids
MayAlias results in cases where BasicAA is expected to
return MustAlias or PartialAlias in order to protect TBAA.

llvm-svn: 132609
2011-06-04 00:31:50 +00:00
Dan Gohman
6d082aec26 When merging MustAlias and PartialAlias, choose PartialAlias instead
of conservatively choosing MayAlias.

llvm-svn: 132579
2011-06-03 20:17:36 +00:00
Dan Gohman
5b2ad67709 Make DecomposeGEPExpression check SimplifyInstruction only
after checking for a GEP, so that it matches what GetUnderlyingObject
does. This fixes an obscure bug turned up by bugpoint in the testcase
for PR9931.

llvm-svn: 131971
2011-05-24 18:24:08 +00:00
Chris Lattner
6830c25f35 I missed a check-in with my GVN change.
llvm-svn: 131851
2011-05-22 07:20:02 +00:00
Duncan Sands
c228e6bdea Revert commit 131781, to see if it fixes the x86-64 dragonegg buildbot.
Original log message:
When BasicAA can determine that two pointers have the same base but
differ by a dynamic offset, return PartialAlias instead of MayAlias.
See the comment in the code for details. This fixes PR9971.

llvm-svn: 131809
2011-05-21 20:54:46 +00:00
Dan Gohman
048c261e5d When BasicAA can determine that two pointers have the same base but
differ by a dynamic offset, return PartialAlias instead of MayAlias.
See the comment in the code for details. This fixes PR9971.

llvm-svn: 131781
2011-05-21 01:05:08 +00:00
Dan Gohman
4e15bcfe01 Teach BasicAA about arm.neon.vld1 and vst1.
llvm-svn: 130327
2011-04-27 20:44:28 +00:00
Dan Gohman
d96c818dd2 When analyzing functions known to only access argument pointees,
only check arguments with pointer types. Update the documentation
of IntrReadArgMem to reflect this.

While here, add support for TBAA tags on intrinsic calls.

llvm-svn: 130317
2011-04-27 18:39:03 +00:00
Andrew Trick
cef977b295 Test case and comment for PR9633.
llvm-svn: 130294
2011-04-27 05:42:17 +00:00
Benjamin Kramer
b2992c34b5 Make tests more useful.
lit needs a linter ...

llvm-svn: 130126
2011-04-25 10:12:01 +00:00
Eli Friedman
b0e846a68c PR9634: Don't unconditionally tell the AliasSetTracker that the PreheaderLoad
is equivalent to any other relevant value; it isn't true in general.
If it is equivalent, the LoopPromoter will tell the AST the equivalence.
Also, delete the PreheaderLoad if it is unused.

Chris, since you were the last one to make major changes here, can you check
that this is sane?

llvm-svn: 129049
2011-04-07 01:35:06 +00:00
Chris Lattner
a2345ee59d remove postdom frontiers, because it is dead. Forward dom frontiers are
still used by RegionInfo :(

llvm-svn: 128943
2011-04-05 21:57:17 +00:00
Anders Carlsson
8681fe2359 Revert r128140 for now.
llvm-svn: 128149
2011-03-23 15:51:12 +00:00
Anders Carlsson
556ad25dec A global variable with internal linkage where all uses are in one function and whose address is never taken is a non-escaping local object and can't alias anything else.
llvm-svn: 128140
2011-03-23 02:19:48 +00:00
Andrew Trick
5c8b815e5f Propagate SCEV no-wrap flags whenever possible.
This needs review.

llvm-svn: 127638
2011-03-15 00:37:00 +00:00
Andrew Trick
c4703f6ea1 When SCEV can determine the loop test is X < X, set ExactBECount=0.
When ExactBECount is a constant, use it for MaxBECount.
When MaxBECount cannot be computed, replace it with ExactBECount.
Fixes PR9424.

llvm-svn: 127342
2011-03-09 17:29:58 +00:00
Chris Lattner
2596ac19b9 teach SCEV that the scale and addition of an inbounds gep don't NSW.
This fixes a FIXME in scev-aa.ll (allowing a new no-alias result) and
generally makes things more precise.

llvm-svn: 125449
2011-02-13 03:14:49 +00:00
Chris Lattner
fec8b6bd6d Per discussion with Dan G, inbounds geps *certainly* can have
unsigned overflow (e.g. "gep P, -1"), and while they can have
signed wrap in theoretical situations, modelling an AddRec as
not having signed wrap is good enough for any case we can
think of today.  In the future if this isn't enough, we can
revisit this.  Modeling them as having NUW isn't causing any
known problems either FWIW.

llvm-svn: 125410
2011-02-11 21:43:33 +00:00
Dan Gohman
6f83adb763 Add another rdar number.
llvm-svn: 124125
2011-01-24 17:54:01 +00:00
Nick Lewycky
13a2b8281f Simplify some code with no functionality change. Make the test a lot more
robust against smarter optimizations, using the power of FileCheck.

llvm-svn: 124081
2011-01-23 20:06:05 +00:00
Nick Lewycky
2503c9f9c8 Use value ranges to fold ext(trunc) in SCEV when possible.
llvm-svn: 124062
2011-01-23 06:20:19 +00:00
Tobias Grosser
ea8985cc25 Implement requiredTransitive
The PassManager did not implement the transitivity of requiredTransitive. This
had gone unnoticed since 2006.

llvm-svn: 123942
2011-01-20 21:03:22 +00:00
Nick Lewycky
51c13384f5 Similarly, analyze truncate through multiply.
llvm-svn: 123842
2011-01-19 18:56:00 +00:00
Nick Lewycky
9867e58096 Add a missed SCEV fold that is required to continue analyzing the IR produced
by indvars through the scev expander.

trunc(add x, y) --> add(trunc x, trunc y). Currently SCEV largely folds the other way,
which is probably wrong, but preserved to minimize churn. Instcombine doesn't
do this fold either, demonstrating a missed optz'n opportunity on code doing
add+trunc+add.

llvm-svn: 123838
2011-01-19 16:59:46 +00:00
Nick Lewycky
5a538b62ca Add a missing SCEV simplification sext(zext x) --> zext x.
llvm-svn: 123832
2011-01-19 15:56:12 +00:00
Dan Gohman
df668227fb Teach BasicAA to return PartialAlias in cases where both pointers
are pointing to the same object, one pointer is accessing the entire
object, and the other access has a non-zero size. This prevents
TBAA from kicking in and saying NoAlias in such cases.

llvm-svn: 123775
2011-01-18 21:16:06 +00:00
Eric Christopher
c6db56a31e Revert the testcase from the previous reverted commit.
llvm-svn: 123227
2011-01-11 09:20:44 +00:00
Chris Lattner
749f1eff13 add a testcase I missed in previous commit.
llvm-svn: 123143
2011-01-09 23:52:31 +00:00
Chris Lattner
57e9b35653 teach SCEV analysis of PHI nodes that PHI recurrences formed
with GEP instructions are always NUW, because PHIs cannot wrap
around the end of the address space.

llvm-svn: 123105
2011-01-09 02:28:48 +00:00
Chris Lattner
fa37cac39c reduce indentation. Print <nuw> and <nsw> when dumping SCEV AddRec's
that have the bit set.

llvm-svn: 123104
2011-01-09 02:16:18 +00:00
Chris Lattner
a7735a573d fix rdar://8813415 - a miscompilation of 164.gzip that loop-idiom
exposed.  It turns out to be a latent bug in basicaa, scary.

llvm-svn: 122772
2011-01-03 21:03:33 +00:00
Chris Lattner
76b74870be filecheckize
llvm-svn: 122771
2011-01-03 21:01:26 +00:00
Dan Gohman
29a260015a -enable-tbaa is on by default now.
llvm-svn: 121945
2010-12-16 02:53:48 +00:00
Dan Gohman
e106936414 Make memcpyopt TBAA-aware.
llvm-svn: 121944
2010-12-16 02:51:19 +00:00
Duncan Sands
2699fb1072 Move Sub simplifications and additional Add simplifications out of
instcombine and into InstructionSimplify.

llvm-svn: 121861
2010-12-15 14:07:39 +00:00
Dan Gohman
b187cce266 Reapply r121520, PartialAlias implementation for BasicAA, now that
memdep is updated to handle it.

llvm-svn: 121725
2010-12-13 22:50:24 +00:00
Dan Gohman
18e2a55c07 Revert r121520, which may have introduced miscompilations.
llvm-svn: 121573
2010-12-10 21:48:28 +00:00
Dan Gohman
d1bf1d8013 Implement PartialAlias checking in BasicAA.
llvm-svn: 121520
2010-12-10 20:47:03 +00:00
Chris Lattner
191aa08db1 remove fixme comment too.
llvm-svn: 120493
2010-11-30 23:25:01 +00:00
Chris Lattner
eee2bb2ff0 check in *all* files. This is now handled by my previous DSE commit.
llvm-svn: 120492
2010-11-30 23:23:59 +00:00
NAKAMURA Takumi
fc880d67f7 test: Check the feature 'loadable_module' by loading modules from %llvmshlibdir.
%llvmshlibdir should be 'bin' on Cygming.

llvm-svn: 120282
2010-11-29 07:58:32 +00:00
Dan Gohman
139b090e0e Delete unneeded ssp attributes.
llvm-svn: 118836
2010-11-11 21:08:46 +00:00
Dan Gohman
7c6a63aea3 TBAA-enable ArgumentPromotion.
llvm-svn: 118804
2010-11-11 18:09:32 +00:00
Dan Gohman
01ed16d764 Make Sink tbaa-aware.
llvm-svn: 118788
2010-11-11 16:21:47 +00:00
Dan Gohman
2ffc9c8de3 Add a testcase which demonstrates alias analysis pass precedence.
llvm-svn: 118755
2010-11-11 01:03:30 +00:00
Dan Gohman
aef3e95364 Fully invalidate cached results when a prior query's size or
type is insufficient for, or incompatible with, the current query.

llvm-svn: 118721
2010-11-10 21:45:11 +00:00
Dan Gohman
a42f6c32a3 Teach FunctionAttrs about the VAArg instruction.
llvm-svn: 118627
2010-11-09 20:17:38 +00:00
Dan Gohman
cee4cb4b0a Add a testcase for a call which BasicAA says only accesses memory through
its arguments and which TBAA says doesn't write to memory.

llvm-svn: 118439
2010-11-08 20:20:11 +00:00
Dan Gohman
c04ed6e5da Make FunctionAttrs TBAA-aware.
llvm-svn: 118417
2010-11-08 17:12:04 +00:00
Dan Gohman
753c9ce807 Teach memdep to use pointsToConstantMemory to determine that loads
from constant memory don't alias any stores.

llvm-svn: 117636
2010-10-29 01:14:04 +00:00
Dan Gohman
a592a0f539 Add a basic testcase for TBAA-aware DSE.
llvm-svn: 117632
2010-10-29 00:54:02 +00:00
Dan Gohman
afaaf2f56b Add some comments.
llvm-svn: 116957
2010-10-20 22:04:02 +00:00
Dan Gohman
6efd04961b Don't pass the raw invalid pointer used to represent conflicting
TBAA information to AliasAnalysis.

llvm-svn: 116751
2010-10-18 21:28:00 +00:00
Dan Gohman
ed8fc1b23f Add a basic testcase for TBAA-aware LICM.
llvm-svn: 116745
2010-10-18 21:00:09 +00:00
Dan Gohman
2427af80d1 Run tbaa before basicaa, since that's how it's expected to be used.
llvm-svn: 116731
2010-10-18 18:45:59 +00:00
Dan Gohman
7820328076 Make TypeBasedAliasAnalysis default to doing nothing, with a command-line
option to enable it.

llvm-svn: 116722
2010-10-18 18:17:47 +00:00
Dan Gohman
6aff5b94ff Make BasicAliasAnalysis a normal AliasAnalysis implementation which
does normal initialization and normal chaining. Change the default
AliasAnalysis implementation to NoAlias.

Update StandardCompileOpts.h and friends to explicitly request
BasicAliasAnalysis.

Update tests to explicitly request -basicaa.

llvm-svn: 116720
2010-10-18 18:04:47 +00:00
Dan Gohman
8dc9781a91 Add a simple testcase for tbaa.
llvm-svn: 116272
2010-10-11 23:54:13 +00:00
Benjamin Kramer
caacff25e4 Remove PointerTracking tests.
llvm-svn: 115072
2010-09-29 19:20:35 +00:00
Eli Friedman
b5aea103fc PR7959: Handle negative scales in GEPs correctly in BasicAA for non-64-bit
targets.

llvm-svn: 114015
2010-09-15 20:08:03 +00:00
Chris Lattner
238f46d92e remove some noise from tests.
llvm-svn: 112889
2010-09-02 22:35:33 +00:00
Michael J. Spencer
2f463fc492 Fix constant-over-index.ll test on windows.
llvm-svn: 112483
2010-08-30 15:08:02 +00:00
Chris Lattner
7663b66c31 refix PR1143 by making basicaa analyze zexts of indices aggressively,
which I broke with a recent patch.

llvm-svn: 111452
2010-08-18 23:09:49 +00:00
Chris Lattner
b4602679d7 fix a buggy test
llvm-svn: 111354
2010-08-18 04:55:12 +00:00
Chris Lattner
49d0f29752 fix PR7589: In brief:
gep P, (zext x) != gep P, (sext x)

DecomposeGEPExpression was getting this wrong, confusing
basicaa.

llvm-svn: 111352
2010-08-18 04:28:19 +00:00
Chris Lattner
6ac971a27f filecheckize and detrivialize.
llvm-svn: 111350
2010-08-18 04:25:43 +00:00
Dan Gohman
603e66618f When analyzing loop exit conditions combined with 'and' and 'or', don't
make any assumptions about when the two conditions will agree on when
to permit the loop to exit. This fixes PR7845.

llvm-svn: 110758
2010-08-11 00:12:36 +00:00
Tobias Grosser
7b96737b7f RegionInfo: Do not assert if a BB is not part of the dominance tree.
llvm-svn: 110665
2010-08-10 09:54:35 +00:00
Dan Gohman
da3f592fb3 Implement a proper getModRefInfo for va_arg.
llvm-svn: 110458
2010-08-06 18:24:38 +00:00
Dan Gohman
9135b410fe Implement AccessesArguments checking in the two-callsite form
of BasicAA::getModRefInfo. This allows BasicAA to say that two
memset calls to non-aliasing memory locations don't interfere.

llvm-svn: 110393
2010-08-05 23:34:50 +00:00
Dan Gohman
7260387710 Fix memdep's code for reasoning about dependences between two calls. A Ref
response from getModRefInfo is not useful here. Instead, check for identical
calls only in the NoModRef case.

Reapply r110270, and strengthen it to compensate for the memdep changes.
When both calls are readonly, there is no dependence between them.

llvm-svn: 110382
2010-08-05 22:09:15 +00:00
Dan Gohman
c42ed0aa91 Revert r110270 for now. It appears to uncover a memdep bug.
llvm-svn: 110293
2010-08-05 00:43:10 +00:00