llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-01 08:23:21 +01:00

Author	SHA1	Message	Date
Chris Lattner	4d41e7d461	teach "convert from scalar" to handle loads of fca's. llvm-svn: 63659	2009-02-03 21:08:45 +00:00
Chris Lattner	eb3d568867	make scalar conversion handle stores of first class aggregate values. loads are not yet handled (coming soon to an sroa near you). llvm-svn: 63649	2009-02-03 19:30:11 +00:00
Chris Lattner	5f3116636b	Make SROA produce a vector only when the alloca is actually accessed at least once as a vector. This prevents it from compiling the example in not-a-vector into: define double @test(double %A, double %B) { %tmp4 = insertelement <7 x double> undef, double %A, i32 0 %tmp = insertelement <7 x double> %tmp4, double %B, i32 4 %tmp2 = extractelement <7 x double> %tmp, i32 4 ret double %tmp2 } instead, producing the integer code. Producing vectors when they aren't otherwise in the program is dangerous because a lot of other code treats them carefully and doesn't want to break them down. OTOH, many things want to break down tasty i448's. llvm-svn: 63638	2009-02-03 18:15:05 +00:00
Chris Lattner	028861e55b	this produces an undefined result, just check that the alloca is gone and that sroa doesn't crash. llvm-svn: 63637	2009-02-03 18:13:00 +00:00
Evan Cheng	b3da5fb3a4	APInt'fy SimplifyDemandedVectorElts so it can analyze vectors with more than 64 elements. llvm-svn: 63631	2009-02-03 10:05:09 +00:00
Chris Lattner	447b5517bc	add another case of undefined behavior without crashing, PR3466. llvm-svn: 63620	2009-02-03 07:08:57 +00:00
Nick Lewycky	a676cf98e3	Revert r63600. It didn't fix the bug, it just moved it a bit. llvm-svn: 63618	2009-02-03 06:30:37 +00:00
Nick Lewycky	cd8353b6fe	Update the callgraph when replacing InvokeInst with CallInst when inlining. llvm-svn: 63600	2009-02-03 04:34:40 +00:00
Chris Lattner	b47738daab	Teach ConvertUsesToScalar to handle memset, allowing it to handle crazy cases like: struct f { int A, B, C, D, E, F; }; short test4() { struct f A; A.A = 1; memset(&A.B, 2, 12); return A.C; } llvm-svn: 63596	2009-02-03 02:01:43 +00:00
Chris Lattner	2dae393299	rearrange how SRoA handles promotion of allocas to vectors. With the new world order, it can handle cases where the first store into the alloca is an element of the vector, instead of requiring the first analyzed store to have the vector type itself. This allows us to un-xfail test/CodeGen/X86/vec_ins_extract.ll. llvm-svn: 63590	2009-02-03 01:30:09 +00:00
Chris Lattner	7f52743cca	this test produces an undefined value, we don't care what it is, but we do want the alloca promoted. llvm-svn: 63587	2009-02-03 01:13:52 +00:00
Chris Lattner	5c43f87c53	update test llvm-svn: 63532	2009-02-02 18:12:58 +00:00
Chris Lattner	ce09ac0c3d	Fix a bug which caused us to miscompile a couple of Ada tests. Thanks for the beautiful reduced testcase Duncan! llvm-svn: 63529	2009-02-02 18:02:59 +00:00
Chris Lattner	1cb94d541b	reduce testcase. llvm-svn: 63499	2009-02-02 06:55:45 +00:00
Nick Lewycky	e25b96473e	Reinstate this optimization to fold icmp of xor when possible. Don't try to turn icmp eq a+x, b+x into icmp eq a, b if a+x or b+x has other uses. This may have been increasing register pressure leading to the bzip2 slowdown. llvm-svn: 63487	2009-01-31 21:30:05 +00:00
Chris Lattner	26698a600e	Fix PR3452 (an infinite loop bootstrapping) by disabling the recent improvements to the EvaluateInDifferentType code. This code works by just inserted a bunch of new code and then seeing if it is useful. Instcombine is not allowed to do this: it can only insert new code if it is useful, and only when it is converging to a more canonical fixed point. Now that we iterate when DCE makes progress, this causes an infinite loop when the code ends up not being used. llvm-svn: 63483	2009-01-31 19:05:27 +00:00
Chris Lattner	c4729610fc	now that all the pieces are in place, teach instcombine's simplifydemandedbits to simplify instructions with multiple uses in contexts where it can get away with it. This allows it to simplify the code in multi-use-or.ll into a single 'add double'. This change is particularly interesting because it will cover up for some common codegen bugs with large integers created due to the recent SROA patch. When working on fixing those bugs, this should be disabled. llvm-svn: 63481	2009-01-31 08:40:03 +00:00
Chris Lattner	abf34563ec	make sure to set Changed=true when instcombine hacks on the code, not doing so prevents it from properly iterating and prevents it from deleting the entire body of dce-iterate.ll llvm-svn: 63476	2009-01-31 07:04:22 +00:00
Chris Lattner	235913be77	Simplify and generalize the SROA "convert to scalar" transformation to be able to handle ANY alloca that is poked by loads and stores of bitcasts and GEPs with constant offsets. Before the code had a number of annoying limitations and caused it to miss cases such as storing into holes in structs and complex casts (as in bitfield-sroa) where we had unions of bitfields etc. This also handles a number of important cases that are exposed due to the ABI lowering stuff we do to pass stuff by value. One case that is pretty great is that we compile 2006-11-07-InvalidArrayPromote.ll into: define i32 @func(<4 x float> %v0, <4 x float> %v1) nounwind { %tmp10 = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %v1) %tmp105 = bitcast <4 x i32> %tmp10 to i128 %tmp1056 = zext i128 %tmp105 to i256 %tmp.upgrd.43 = lshr i256 %tmp1056, 96 %tmp.upgrd.44 = trunc i256 %tmp.upgrd.43 to i32 ret i32 %tmp.upgrd.44 } which turns into: _func: subl $28, %esp cvttps2dq %xmm1, %xmm0 movaps %xmm0, (%esp) movl 12(%esp), %eax addl $28, %esp ret Which is pretty good code all things considering :). One effect of this is that SROA will start generating arbitrary bitwidth integers that are a multiple of 8 bits. In the case above, we got a 256 bit integer, but the codegen guys assure me that it can handle the simple and/or/shift/zext stuff that we're doing on these operations. This addresses rdar://6532315 llvm-svn: 63469	2009-01-31 02:28:54 +00:00
Chris Lattner	f9dd07a3c3	Fix some issues with volatility, move "CanConvertToScalar" check after the others. llvm-svn: 63227	2009-01-28 20:16:43 +00:00
Chris Lattner	2712dbe282	strengthen this test. llvm-svn: 63222	2009-01-28 19:29:30 +00:00
Mon P Wang	80efbf07bd	Fixed optimization of combining two shuffles where the first shuffle inputs has a different number of elements than the output. llvm-svn: 62998	2009-01-26 04:39:00 +00:00
Chris Lattner	f93b292d9b	Handle single-entry phi nodes gracefully in condprop. llvm-svn: 62985	2009-01-26 02:18:20 +00:00
Chris Lattner	5549fb4e74	Fix PR3408 by making a non-obvious assumption very obvious, and handling the flaw inherent in that assumption. :) llvm-svn: 62984	2009-01-26 02:11:30 +00:00
Nick Lewycky	bd3b6a2b12	Actually run the test in this directory. llvm-svn: 62957	2009-01-25 08:05:07 +00:00
Nick Lewycky	459667b48d	The function that does nothing but call malloc is noalias return. llvm-svn: 62956	2009-01-25 07:59:57 +00:00
Torok Edwin	2a7e7066b3	testcase for PR3381. Also it was an empty struct, not a void after all. llvm-svn: 62920	2009-01-24 17:16:04 +00:00
Chris Lattner	d386e82ec9	Make InstCombineStoreToCast handle aggregates more aggressively, handling the case in Transforms/InstCombine/cast-store-gep.ll, which is a heavily reduced testcase from Clang on x86-64. llvm-svn: 62904	2009-01-24 01:00:13 +00:00
Chris Lattner	b36503c31b	fix two more cases where we could let the NLPDI cache get unsorted. With this, sqlite3 now passes. llvm-svn: 62839	2009-01-23 07:12:16 +00:00
Chris Lattner	d3b233ba51	fix a testcase. llvm-svn: 62758	2009-01-22 07:08:58 +00:00
Chris Lattner	ddc8e78d54	Fix PR3358, a really nasty bug where recursive phi translated analyses could be run without the caches properly sorted. This can fix all sorts of weirdness. Many thanks to Bill for coming up with the 'issorted' verification idea. llvm-svn: 62757	2009-01-22 07:04:01 +00:00
Dale Johannesen	a5699a1e8b	Do not use host floating point types when emitting ASCII IR; loading and storing these can change the bits of NaNs on some hosts. Remove or add warnings at a few other places using host floating point; this is a bad thing to do in general. llvm-svn: 62712	2009-01-21 20:32:55 +00:00
Dale Johannesen	ba0f5e174f	Disable on x86_64 until I figure out what's wrong. llvm-svn: 62660	2009-01-21 02:08:30 +00:00
Dale Johannesen	6854f86296	Make special cases (0 inf nan) work for frem. Besides APFloat, this involved removing code from two places that thought they knew the result of frem(0., x) but were wrong. llvm-svn: 62645	2009-01-21 00:35:19 +00:00
Dale Johannesen	1c12d1b665	Calls to fmod, it turns out, are constant-folded by invoking the host fmod, not by lowering to frem and constant-folding that. Fix this so it tests what I want to test. llvm-svn: 62622	2009-01-20 21:58:13 +00:00
Bill Wendling	5bd5863cdb	Temporarily XFAIL until this can be looked at. r62557 is what caused it to start failing. llvm-svn: 62578	2009-01-20 10:28:39 +00:00
Chris Lattner	6ade48fcaa	another fix for PR3354 llvm-svn: 62561	2009-01-20 01:15:41 +00:00
Chris Lattner	e8fa6f2468	Fix a problem exposed by PR3354: simplifycfg was making a potentially trapping instruction be executed unconditionally. llvm-svn: 62541	2009-01-19 23:03:13 +00:00
Dale Johannesen	5508ead868	Move & restructure test per review. llvm-svn: 62538	2009-01-19 22:33:12 +00:00
Chris Lattner	7b4c55fb34	convert this to an unfoldable potentially trapping constant expr. llvm-svn: 62536	2009-01-19 22:12:33 +00:00
Chris Lattner	b88febb5cd	Fix PR3353, infinitely jump threading an infinite loop make from switches. llvm-svn: 62529	2009-01-19 21:20:34 +00:00
Bill Wendling	bf83203ae6	Temporarily revert r62487. It's causing this error during a release bootstrap of llvm-gcc. Most likely, it's miscompiling one of the "gen*" programs: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./prev-gcc/xgcc -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./prev-gcc/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.6.0/bin/ -c -g -O2 -mdynamic-no-pic -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -mdynamic-no-pic -DHAVE_CONFIG_H -DGENERATOR_FILE -I. -Ibuild -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/build -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include -DENABLE_LLVM -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/../llvm.src/include -D_DEBUG -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -o build/gencondmd.o build/gencondmd.c ../../llvm-gcc.src/gcc/config/i386/mmx.md:926: error: expected '}' before ')' token ../../llvm-gcc.src/gcc/config/i386/mmx.md:926: warning: excess elements in struct initializer ../../llvm-gcc.src/gcc/config/i386/mmx.md:926: warning: (near initialization for 'insn_conditions[4]') ../../llvm-gcc.src/gcc/config/i386/mmx.md:926: error: expected '}' before ')' token ../../llvm-gcc.src/gcc/config/i386/mmx.md:926: error: expected ',' or ';' before ')' token ../../llvm-gcc.src/gcc/config/i386/mmx.md:927: error: expected identifier or '(' before ',' token ../../llvm-gcc.src/gcc/config/i386/sse.md:3458: error: expected identifier or '(' before ',' token ... llvm-svn: 62506	2009-01-19 08:46:20 +00:00
Chris Lattner	bb76cc9447	Fix PR3016, a bug which can occur do to an invalid assumption: we assumed a CFG structure that would be valid when all code in the function is reachable, but not all code is necessarily reachable. Do a simple, but horrible, CFG walk to check for this case. llvm-svn: 62487	2009-01-19 02:46:28 +00:00
Nick Lewycky	f4b028bf4c	Forgot this in the previous checkin: fopen now has nocapture, realloc is supposed to take two arguments. llvm-svn: 62457	2009-01-18 04:46:10 +00:00
Chris Lattner	5d1ed9ed1f	Fix PR3335 by not turning a store to one address space into a store to another. llvm-svn: 62351	2009-01-16 20:12:52 +00:00
Evan Cheng	e7c9310d1b	Clean up previous cast optimization a bit. Also make zext elimination a bit more aggressive: if it's not necessary to emit an AND (i.e. high bits are already zero), it's profitable to evaluate the operand at a different type. llvm-svn: 62297	2009-01-16 02:11:43 +00:00
Evan Cheng	d504f9fe27	- Teach CanEvaluateInDifferentType of this xform: sext (zext ty1), ty2 -> zext ty2 - Looking at the number of sign bits of the a sext instruction to determine whether new trunc + sext pair should be added when its source is being evaluated in a different type. llvm-svn: 62263	2009-01-15 17:01:23 +00:00
Chris Lattner	fa0c0e19f6	Fix PR3325, a miscompilation of invokes by IPSCCP. Patch by Jay Foad! llvm-svn: 62244	2009-01-14 21:01:16 +00:00
Dale Johannesen	816f9bc81d	Fix the time regression I introduced in 464.h264ref with my earlier patch to this file. The issue there was that all uses of an IV inside a loop are actually references to Base[IV2], and there was one use outside that was the same but LSR didn't see the base or the scaling because it didn't recurse into uses outside the loop; thus, it used base+IVscale mode inside the loop instead of pulling base out of the loop. This was extra bad because register pressure later forced both base and IV into memory. Doing that recursion, at least enough to figure out addressing modes, is a good idea in general; the change in AddUsersIfInteresting does this. However, there were side effects.... It is also possible for recursing outside the loop to introduce another IV where there was only 1 before (if the refs inside are not scaled and the ref outside is). I don't think this is a common case, but it's in the testsuite. It is right to be very aggressive about getting rid of such introduced IVs (CheckForIVReuse and the handling of nonzero RewriteFactor in StrengthReduceStridedIVUsers). In the testcase in question the new IV produced this way has both a nonconstant stride and a nonzero base, neither of which was handled before. And when inserting new code that feeds into a PHI, it's right to put such code at the original location rather than in the PHI's immediate predecessor(s) when the original location is outside the loop (a case that couldn't happen before) (RewriteInstructionToUseNewBase); better to avoid making multiple copies of it in this case. Also, the mechanism for keeping SCEV's corresponding to GEP's no longer works, as the GEP might change after its SCEV is remembered, invalidating the SCEV, and we might get a bad SCEV value when looking up the GEP again for a later loop. This also couldn't happen before, as we weren't recursing into GEP's outside the loop. Also, when we build an expression that involves a (possibly non-affine) IV from a different loop as well as an IV from the one we're interested in (containsAddRecFromDifferentLoop), don't recurse into that. We can't do much with it and will get in trouble if we try to create new non-affine IVs or something. More testcases are coming. llvm-svn: 62212	2009-01-14 02:35:31 +00:00
Chris Lattner	2461d79aa9	rewrite OptimizeAwayTrappingUsesOfLoads to 1) avoid a temporary vector and extraneous loop over it, 2) not delete globals used by phis/selects etc which could actually be useful. This fixes PR3321. Many thanks to Duncan for narrowing this down. llvm-svn: 62201	2009-01-14 00:12:58 +00:00
Dale Johannesen	e458c47a74	Fix testsuite regressions from recursive inlining. llvm-svn: 62189	2009-01-13 22:43:37 +00:00
Dan Gohman	958861e65e	Make instcombine ensure that all allocas are explicitly aligned at at least their preferred alignment. llvm-svn: 62176	2009-01-13 20:18:38 +00:00
Dale Johannesen	12bb54e183	Enable recursive inlining. Reduce inlining threshold back to 200; 400 seems to be too high, loses more than it gains. llvm-svn: 62107	2009-01-12 22:11:50 +00:00
Chris Lattner	1219b4e6bc	Fix PR3304 llvm-svn: 61995	2009-01-09 18:18:43 +00:00
Chris Lattner	660c094906	Implement rdar://6480391, extending of equality icmp's to avoid a truncation. I noticed this in the code compiled for a routine using std::map, which produced this code: %25 = tail call i32 @memcmp(i8* %24, i8* %23, i32 6) nounwind readonly %.lobit.i = lshr i32 %25, 31 ; <i32> [#uses=1] %tmp.i = trunc i32 %.lobit.i to i8 ; <i8> [#uses=1] %toBool = icmp eq i8 %tmp.i, 0 ; <i1> [#uses=1] br i1 %toBool, label %bb3, label %bb4 which compiled to: call L_memcmp$stub shrl $31, %eax testb %al, %al jne LBB1_11 ## with this change, we compile it to: call L_memcmp$stub testl %eax, %eax js LBB1_11 This triggers all the time in common code, with patters like this: %169 = and i32 %ply, 1 ; <i32> [#uses=1] %170 = trunc i32 %169 to i8 ; <i8> [#uses=1] %toBool = icmp ne i8 %170, 0 ; <i1> [#uses=1] %7 = lshr i32 %6, 24 ; <i32> [#uses=1] %9 = trunc i32 %7 to i8 ; <i8> [#uses=1] %10 = icmp ne i8 %9, 0 ; <i1> [#uses=1] etc llvm-svn: 61985	2009-01-09 07:47:06 +00:00
Chris Lattner	6140ea4f18	Fix PR3298, a crash in Jump Threading. Apparently even jump threading can have bugs, who knew? ;-) llvm-svn: 61983	2009-01-09 06:08:12 +00:00
Chris Lattner	5ce930d116	Fix part 3/2 of PR3290, making instcombine zap (gep(bitcast)) when possible. llvm-svn: 61980	2009-01-09 05:44:56 +00:00
Dale Johannesen	4c25cb12ea	Do not inline functions with (dynamic) alloca into functions that don't already have a (dynamic) alloca. Dynamic allocas cause inefficient codegen and we shouldn't propagate this (behavior follows gcc). Two existing tests assumed such inlining would be done; they are hacked by adding an alloca in the caller, preserving the point of the tests. llvm-svn: 61946	2009-01-08 21:45:23 +00:00
Chris Lattner	5a8a2b046d	ValueTracker can't assume that an alloca with no specified alignment will get its preferred alignment. It has to be careful and cautiously assume it will just get the ABI alignment. This prevents instcombine from rounding up the alignment of a load/store without adjusting the alignment of the alloca. llvm-svn: 61934	2009-01-08 19:28:38 +00:00
Chris Lattner	60a03a2f36	This implements the second half of the fix for PR3290, handling loads from allocas that cover the entire aggregate. This handles some memcpy/byval cases that are produced by llvm-gcc. This triggers a few times in kc++ (with std::pair<std::_Rb_tree_const_iterator <kc::impl_abstract_phylum*>,bool>) and once in 176.gcc (with %struct..0anon). llvm-svn: 61915	2009-01-08 05:42:05 +00:00
Duncan Sands	a254acd1d3	Remove alloca tracking from nocapture analysis. Not only was it not very helpful, it was also wrong! The problem is shown in the testcase: the alloca might be passed to a nocapture callee which dereferences it and returns the original pointer. But because it was a nocapture call we think we don't need to track its uses, but we do. llvm-svn: 61876	2009-01-07 19:39:06 +00:00
Chris Lattner	8adf14ea21	Implement the first half of PR3290: if there is a store of an integer to a (transitive) bitcast the alloca and if that integer has the full size of the alloca, then it clobbers the whole thing. Handle this by extracting pieces out of the stored integer and filing them away in the SROA'd elements. This triggers fairly frequently because the CFE uses integers to pass small structs by value and the inliner exposes these. For example, in kimwitu++, I see a bunch of these with i64 stores to "%struct.std::pair<std::_Rb_tree_const_iterator<kc::impl_abstract_phylum*>,bool>" In 176.gcc I see a few i32 stores to "%struct..0anon". In the testcase, this is a difference between compiling test1 to: _test1: subl $12, %esp movl 20(%esp), %eax movl %eax, 4(%esp) movl 16(%esp), %eax movl %eax, (%esp) movl (%esp), %eax addl 4(%esp), %eax addl $12, %esp ret vs: _test1: movl 8(%esp), %eax addl 4(%esp), %eax ret The second half of this will be to handle loads of the same form. llvm-svn: 61853	2009-01-07 08:11:13 +00:00
Chris Lattner	e10764369d	make m_ConstantInt(int64_t) safely match ConstantInt's that are larger than i64. This fixes an instcombine crash on PR3235. llvm-svn: 61775	2009-01-05 23:45:50 +00:00
Duncan Sands	130c00e4b2	Teach the internalize pass to also internalize global aliases. llvm-svn: 61754	2009-01-05 21:24:45 +00:00
Duncan Sands	3b98802e9a	Delete unused global aliases with internal linkage. In fact this also deletes those with linkonce linkage, however this is currently dead because for the moment aliases aren't allowed to have this linkage type. llvm-svn: 61742	2009-01-05 20:37:33 +00:00
Nick Lewycky	6685977938	Run a post-pass that marks known function declarations by name. llvm-svn: 61632	2009-01-04 20:27:34 +00:00
Bill Wendling	dd61282551	XFAIL this test. The xform was removed. llvm-svn: 61624	2009-01-04 06:32:28 +00:00
Duncan Sands	c087ba24aa	When calculating 'nocapture' argument attributes, allow the argument to be stored to an alloca by tracking uses of the alloca. This occurs 4 times (out of 7121, 0.05%) in MultiSource/Applications, so may not be worth it. On the other hand, it is easy to do and fairly cheap. The functions it helps are: W_addcom and W_addlit in spiff; process_args (argv) in d (make_dparser); ercPixConcealIMB in JM/ldecod. llvm-svn: 61570	2009-01-02 11:54:37 +00:00
Chris Lattner	f28c74870f	Reimplement the old and horrible bison parser for .ll files with a nice and clean recursive descent parser. This change has a couple of ramifications: 1. The parser code is about 400 lines shorter (in what we maintain, not including what is autogenerated). 2. The code should be significantly faster than the old code because we don't have to work around bison's poor handling of datatypes with ctors/dtors. This also makes the code much more resistant to memory leaks. 3. We now get caret diagnostics from the .ll parser, woo. 4. The actual diagnostics emited from the parser are completely different so a bunch of testcases had to be updated. 5. I now disallow "%ty = type opaque %ty = type i32". There was no good reason to support this, it was just an accident of the old implementation. I have no reason to think that anyone is actually using this. 6. The syntax for sticking a global variable has changed to make it unambiguous. I don't think anyone is depending on this since only clang supports this and it is not solid yet, so I'm not worried about anything breaking. 7. This gets rid of the last use of bison, and along with it the .cvs files. I'll prune this from the makefiles as a subsequent commit. There are a few minor cleanups that can be done after this commit (suggestions welcome!) but this passes dejagnu testing and is ready for its time in the limelight. llvm-svn: 61558	2009-01-02 07:01:27 +00:00
Nick Lewycky	0993a85522	Remove the cyclic part of this test, it was passing for the wrong reason. Two functions which mutually require each other to be nocapture are not currently supported. llvm-svn: 61553	2009-01-02 03:52:27 +00:00
Nick Lewycky	6c53fbb21d	Make adding nocapture a bit stronger. FreeInst is nocapture. Also, functions that don't write can't leak a pointer except through the return value, so a void readonly function is implicitly nocapture. Test these, and add a test that verifies that f1 calling f2 with an otherwise dead pointer gets both of them marked nocapture. llvm-svn: 61552	2009-01-02 03:46:56 +00:00
Duncan Sands	253f6a5dce	Add tests for two types of traps that escape analysis might one day fall into. llvm-svn: 61549	2009-01-02 00:55:51 +00:00
Bill Wendling	efbe8b808c	Add transformation: xor (or (icmp, icmp), true) -> and(icmp, icmp) This is possible because of De Morgan's law. llvm-svn: 61537	2009-01-01 01:18:23 +00:00
Duncan Sands	e112cf52cb	Look through phi nodes and select instructions when calculating nocapture attributes. llvm-svn: 61535	2008-12-31 20:21:34 +00:00
Duncan Sands	36db5853cb	Rename AddReadAttrs to FunctionAttrs, and teach it how to work out (in a very simplistic way) which function arguments (pointer arguments only) are only dereferenced and so do not escape. Mark such arguments 'nocapture'. llvm-svn: 61525	2008-12-31 16:14:43 +00:00
Duncan Sands	bd0cbff28e	Allow readnone functions to read (and write!) global constants, since doing so is irrelevant for aliasing purposes. While this doesn't increase the total number of functions marked readonly or readnone in MultiSource/ Applications (3089), it does result in 12 functions being marked readnone rather than readonly. Before: readnone: 820 readonly: 2269 After: readnone: 832 readonly: 2257 llvm-svn: 61469	2008-12-29 11:34:09 +00:00
Nick Lewycky	8fd2389593	Turn strcmp into memcmp, such as strcmp(P, "x") --> memcmp(P, "x", 2). llvm-svn: 61297	2008-12-21 00:19:21 +00:00
Nick Lewycky	ab50d88e6a	Make all the vector elements positive in an srem of constant vector. llvm-svn: 61195	2008-12-18 06:31:11 +00:00
Chris Lattner	196c166a06	Enhance heap sra to be substantially more aggressive w.r.t PHI nodes. This allows it to do fairly general phi insertion if a load from a pointer global wants to be SRAd but the load is used by (recursive) phi nodes. This fixes a pessimization on ppc introduced by Load PRE. llvm-svn: 61123	2008-12-17 05:28:49 +00:00
Chris Lattner	c4cc4a328f	Fix another crash found by inspection. If we have a PHI node merging the load multiple times, make sure the check the uses of the PHI to ensure they are transformable. llvm-svn: 61102	2008-12-16 21:24:51 +00:00
Chris Lattner	8b1f2f76d7	fix a crash found by inspection. llvm-svn: 61101	2008-12-16 21:04:51 +00:00
Eli Friedman	de614f9842	Add a helper to remove a branch and DCE the condition, and use it consistently for deleting branches. In addition to being slightly more readable, this makes SimplifyCFG a bit better about cleaning up after itself when it makes conditions unused. llvm-svn: 61100	2008-12-16 20:54:32 +00:00
Chris Lattner	b3becc5776	fix PR3217: fully cached queries need to be verified against the visited set before they are used. If used, their blocks need to be added to the visited set so that subsequent queries don't use conflicting pointer values in the cache result blocks. llvm-svn: 61080	2008-12-16 07:10:09 +00:00
Chris Lattner	3ac8ed076a	add testcase for r61051 llvm-svn: 61052	2008-12-15 21:46:23 +00:00
Chris Lattner	dd4c8f09fa	add a basic test for heap-sra llvm-svn: 61041	2008-12-15 19:42:05 +00:00
Chris Lattner	8119a1f70d	Add a testcase for GCC PR 23455, which lpre handles now. Add some comments about why we're not getting other cases. llvm-svn: 61032	2008-12-15 07:49:24 +00:00
Chris Lattner	30c1871282	gvn now hoists this load out of the hot non-call path. llvm-svn: 61028	2008-12-15 06:34:48 +00:00
Chris Lattner	ea2933ff07	Adjust testcase to make it more stable across visitation order changes, unbreaking it after r61024. llvm-svn: 61025	2008-12-15 04:42:00 +00:00
Chris Lattner	22cfa14eed	make GVN try to rename inputs to the resultant replaced values, which cleans up the generated code a bit. This should have the added benefit of not randomly renaming functions/globals like my previous patch did. :) llvm-svn: 61023	2008-12-15 03:46:38 +00:00
Chris Lattner	c92b131639	Implement initial support for PHI translation in memdep. This means that memdep keeps track of how PHIs affect the pointer in dep queries, which allows it to eliminate the load in cases like rle-phi-translate.ll, which basically end up being: BB1: X = load P br BB3 BB2: Y = load Q br BB3 BB3: R = phi [P] [Q] load R turning "load R" into a phi of X/Y. In addition to additional exposed opportunities, this makes memdep safe in many cases that it wasn't before (which is required for load PRE) and also makes it substantially more efficient. For example, consider: bb1: // has many predecessors. P = some_operator() load P In this example, previously memdep would scan all the predecessors of BB1 to see if they had something that would mustalias P. In some cases (e.g. test/Transforms/GVN/rle-must-alias.ll) it would actually find them and end up eliminating something. In many other cases though, it would scan and not find anything useful. MemDep now stops at a block if the pointer is defined in that block and cannot be phi translated to predecessors. This causes it to miss the (rare) cases like rle-must-alias.ll, but makes it faster by not scanning tons of stuff that is unlikely to be useful. For example, this speeds up GVN as a whole from 3.928s to 2.448s (60%)!. IMO, scalar GVN should be enhanced to simplify the rle-must-alias pointer base anyway, which would allow the loads to be eliminated. In the future, this should be enhanced to phi translate through geps and bitcasts as well (as indicated by FIXMEs) making memdep even more powerful. llvm-svn: 61022	2008-12-15 03:35:32 +00:00
Chris Lattner	8f6a8a85a3	another random testcase that shouldn't crash gvn and is good for coverage with future changes. llvm-svn: 61011	2008-12-14 21:20:46 +00:00
Chris Lattner	af4007b39f	RLE isn't smart enough to eliminate this safely yet. llvm-svn: 60994	2008-12-13 21:04:20 +00:00
Chris Lattner	cc5ee569a3	rename some tests to be more uniform in naming convention. llvm-svn: 60988	2008-12-13 18:47:40 +00:00
Chris Lattner	5cb658f43c	gvn should never crash on this. llvm-svn: 60987	2008-12-13 18:39:44 +00:00
Bill Wendling	34182ae3ae	Temporarily revert r60973. It's inexplicably causing a failure when self-hosting LLVM: llvm[2]: Linking Release executable opt (without symbols) ... Undefined symbols: "llvm::APFloat::IEEEsingle", referenced from: __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(Constants.o) __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o) __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o) "llvm::APFloat::IEEEdouble", referenced from: __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(Constants.o) __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o) __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o) ld: symbol(s) not found This is in release mode. To replicate, compile llvm and llvm-gcc in optimized mode. Then build llvm, in optimized mode, with the newly created compiler. llvm-svn: 60977	2008-12-13 09:28:44 +00:00
Chris Lattner	8753175cd6	make RLE preserve the name of the load that it replaces. This is just a pretification of the IR. llvm-svn: 60973	2008-12-13 07:22:47 +00:00
Chris Lattner	2550938060	loosen up an assertion that isn't valid when called from invalidateCachedPointerInfo. Thanks to Bill for sending me a testcase. llvm-svn: 60805	2008-12-09 22:45:32 +00:00
Chris Lattner	6a5e9eaa36	Teach BasicAA::getModRefInfo(CallSite, CallSite) some tricks based on readnone/readonly functions. Teach memdep to look past readonly calls when analyzing deps for a readonly call. This allows elimination of a few more calls from 403.gcc: before: 63 gvn - Number of instructions PRE'd 153986 gvn - Number of instructions deleted 50069 gvn - Number of loads deleted after: 63 gvn - Number of instructions PRE'd 153991 gvn - Number of instructions deleted 50069 gvn - Number of loads deleted 5 calls isn't much, but this adds plumbing for the next change. llvm-svn: 60794	2008-12-09 21:19:42 +00:00
Devang Patel	0ef5e583cd	Actually test something. Use PR3170 test case. llvm-svn: 60727	2008-12-08 23:44:46 +00:00
Devang Patel	82fb6bc606	Undo previous patch. llvm-svn: 60701	2008-12-08 17:02:37 +00:00

1 2 3 4 5 ...

892 Commits