mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
Commit Graph

3010 Commits

Author SHA1 Message Date
Benjamin Kramer
b8389165be PR13095: Give an inline cost bonus to functions using byval arguments.
We give a bonus for every argument because the argument setup is not needed
anymore when the function is inlined. With this patch we interpret byval
arguments as a compact representation of many arguments. The byval argument
setup is implemented in the backend as an inline memcpy, so to model the
cost as accurately as possible we take the number of pointer-sized elements
in the byval argument and give a bonus of 2 instructions for every one of
those. The bonus is capped at 8 elements, which is the number of stores
at which the x86 backend switches from an expanded inline memcpy to a real
memcpy. It would be better to use the real memcpy threshold from the backend,
but it's not available via TargetData.

This change brings the performance of c-ray in line with gcc 4.7. The included
test case tries to reproduce the c-ray problem to catch regressions for this
benchmark early, its performance is dominated by the inline decision of a
specific call.

This only has a small impact on most code, more on x86 and arm than on x86_64
due to the way the ABI works. When building LLVM for x86 it gives a small
inline cost boost to virtually any function using StringRef or STL allocators,
but only a 0.01% increase in overall binary size. The size of gcc compiled by
clang actually shrunk by a couple bytes with this patch applied, but not
significantly.
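
As an illustrative sketch (not the included c-ray-derived test case), a call
like the following passes a struct byval; with this patch each pointer-sized
element of %struct.vec earns the callee a bonus of 2 instructions, capped at
8 elements:

  %struct.vec = type { double, double, double }

  define double @length(%struct.vec* byval %v) {
    %xp = getelementptr %struct.vec* %v, i32 0, i32 0
    %x = load double* %xp
    ret double %x
  }

  define double @caller(%struct.vec* %p) {
    ; inlining @length removes the implicit memcpy done for the byval copy
    %r = call double @length(%struct.vec* byval %p)
    ret double %r
  }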

llvm-svn: 161413
2012-08-07 11:13:19 +00:00
Chandler Carruth
ca6b087618 Fix PR13412, a nasty miscompile due to the interleaved
instsimplify+inline strategy.

The crux of the problem is that instsimplify was reasonably relying on
an invariant that is true within any single function, but is no longer
true mid-inline the way we use it. This invariant is that an argument
pointer != a local (alloca) pointer.

The fix is really lightweight though, and allows instsimplify to be
resilient to these situations: when checking the relationships to
function arguments, ensure that the arguments come from the same
function. If they come from different functions, then none of these
assumptions hold. All credit to Benjamin Kramer for coming up with this
clever solution to the problem.
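
As a hedged illustration of the invariant in question (not the PR13412
reproducer): inside a single function, instsimplify may fold the comparison
below to false, because a caller cannot pass in the address of an alloca
that did not exist before the call. Mid-inline, the operands handed to
instsimplify can come from different functions, where that reasoning no
longer applies:

  define i1 @f(i8* %arg) {
    %local = alloca i8
    %c = icmp eq i8* %arg, %local   ; folds to false within one function
    ret i1 %c
  }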

llvm-svn: 161410
2012-08-07 10:59:59 +00:00
Chandler Carruth
49d4e3f282 Add a much more conservative strategy for aligning branch targets.
Previously, MBP essentially aligned every branch target it could. This
bloats code quite a bit, especially non-looping code which has no real
reason to prefer aligned branch targets so heavily.

As Andy said in review, it's still a bit odd to do this without a real
cost model, but this at least has much more plausible heuristics.

Fixes PR13265.

llvm-svn: 161409
2012-08-07 09:45:24 +00:00
Nadav Rotem
1fbb339620 When constant folding GEP expressions, keep the address space information of pointers.
Together with Ran Chachick <ran.chachick@intel.com>
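
A sketch of the kind of fold affected (illustrative, not the committed test):
the folded GEP constant expression must stay in the pointer's original
address space instead of silently becoming an addrspace(0) pointer:

  @g = addrspace(1) global [4 x i32] zeroinitializer

  define i32 addrspace(1)* @elt2() {
    ; the folded constant keeps addrspace(1)
    ret i32 addrspace(1)* getelementptr ([4 x i32] addrspace(1)* @g, i32 0, i32 2)
  }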

llvm-svn: 160954
2012-07-30 07:25:20 +00:00
Nick Lewycky
a1eb0b5f2e Add testcases for GlobalOpt changes in r160693 and r160757.
llvm-svn: 160925
2012-07-29 01:15:37 +00:00
Nuno Lopes
a4d7ce1441 fix PR13390: do not loop forever with self-referencing instructions
llvm-svn: 160876
2012-07-27 18:21:15 +00:00
Nuno Lopes
7ec9936cb2 fix infinite loop in instcombine in the presence of a (malformed) self-referencing select inst.
This can happen as long as the instruction is not reachable. Instcombine does generate these unreachable malformed selects when doing RAUW.
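
An assumed shape for such a malformed select (not the exact reproducer); it
can only occur in unreachable code, where the verifier's dominance rules do
not apply:

  define void @f() {
  entry:
    ret void

  dead:                                  ; no predecessors, unreachable
    %s = select i1 true, i32 %s, i32 0   ; the select uses its own result
    br label %dead
  }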

llvm-svn: 160874
2012-07-27 18:03:57 +00:00
Pete Cooper
ddb89a91ca Simplify demanded bits of select sources where the condition is a constant vector
llvm-svn: 160835
2012-07-26 23:10:24 +00:00
Pete Cooper
8d971d19cb Teach SimplifyDemandedBits how to look through fpext and fptrunc to simplify their operand
llvm-svn: 160823
2012-07-26 22:37:04 +00:00
Duncan Sands
c785ace7fd Stop reassociate from looking through expressions of arbitrary complexity. This
is a temporary measure until my fix for PR13021 is ready.

llvm-svn: 160778
2012-07-26 09:26:40 +00:00
Duncan Sands
9f6bce9180 Don't perform an overaligned load in this test, since that's undefined
behaviour that might be exploited one day.

llvm-svn: 160714
2012-07-25 09:45:37 +00:00
Duncan Sands
8080fe449f When folding a load from a global constant, if the load started in the middle
of an array element (rather than at the beginning of the element) and extended
into the next element, then the load from the second element was being handled
wrong due to incorrect updating of the notion of which byte to load next.  This
fixes PR13442.  Thanks to Chris Smowton for reporting the problem, analyzing it
and providing a fix.
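
A sketch of the situation (illustrative values, not the PR13442 test case):
a constant-folded load that starts partway through one array element and
spills into the next must step to the correct next element:

  @g = constant [2 x i16] [i16 257, i16 258]

  define i16 @f() {
    ; read 2 bytes starting at byte offset 1: the low byte comes from
    ; element 0 and the high byte from element 1
    %p = bitcast [2 x i16]* @g to i8*
    %q = getelementptr i8* %p, i32 1
    %r = bitcast i8* %q to i16*
    %v = load i16* %r, align 1
    ret i16 %v
  }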

llvm-svn: 160711
2012-07-25 09:14:54 +00:00
Nuno Lopes
06ac861756 teach objectsize about strdup() and strndup()
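
A minimal sketch of what this enables (assumed constants, not the committed
test): objectsize can now see through strdup of a known string:

  @str = private constant [4 x i8] c"abc\00"

  declare i8* @strdup(i8*)
  declare i64 @llvm.objectsize.i64(i8*, i1)

  define i64 @f() {
    %s = call i8* @strdup(i8* getelementptr ([4 x i8]* @str, i32 0, i32 0))
    %n = call i64 @llvm.objectsize.i64(i8* %s, i1 false)   ; now folds to 4
    ret i64 %n
  }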
llvm-svn: 160676
2012-07-24 16:28:13 +00:00
Nick Lewycky
6644694650 Teach globalopt to not nuke all stores to globals. Keep them around if they
might be deliberate "one time" leaks, so that leak checkers can find them.
This is a reapply of r160602 with the fix that this time I'm committing the
code I thought I was committing last time; the I->eraseFromParent() goes
*after* the break out of the loop.
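
An assumed example of such a deliberate "one time" leak (illustrative, not
the committed test):

  @cache = internal global i8* null

  declare i8* @malloc(i64)

  define void @init() {
    %p = call i8* @malloc(i64 64)
    ; the pointer is never read back; previously globalopt could delete the
    ; store and the global, hiding the allocation from leak checkers
    store i8* %p, i8** @cache
    ret void
  }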

llvm-svn: 160664
2012-07-24 07:21:08 +00:00
Dan Gohman
b9b982cd41 An objc_retain can serve as a may-use for a different pointer.
rdar://11931823.

llvm-svn: 160637
2012-07-23 19:27:31 +00:00
Nick Lewycky
f1a5d95995 Revert r160602.
llvm-svn: 160603
2012-07-21 09:03:15 +00:00
Nick Lewycky
9d1d5bfd50 Teach globalopt to play nice with leak checkers. This is a reapplication of
r160529 that was subsequently reverted. The fix was to not call
GV->eraseFromParent() right before the caller does the same. The existing
testcases already caught this bug if run under valgrind.

llvm-svn: 160602
2012-07-21 08:29:45 +00:00
Nuno Lopes
66a3934c7a move the bounds checking pass to the instrumentation folder, where it belongs. I dunno why in the world I dropped it in the Scalar folder in the first place.
No functionality change.

llvm-svn: 160587
2012-07-20 22:39:33 +00:00
Richard Osborne
f82086baa5 Fix assertion in jump threading (PR13405).
GetBestDestForJumpOnUndef() assumes there is at least 1 successor, which isn't
true if the block ends in an indirect branch with no successors. Fix this by
bailing out earlier in this case.
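
The failing shape was roughly a block terminated by an indirect branch with
an empty destination list (a sketch, not the PR13405 test case):

  define void @f(i8* %addr) {
  entry:
    indirectbr i8* %addr, []    ; a terminator with zero successors
  }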

llvm-svn: 160546
2012-07-20 10:36:17 +00:00
Nick Lewycky
62064c6cc3 Revert r160529 due to crashes.
llvm-svn: 160532
2012-07-19 23:59:21 +00:00
Nick Lewycky
8a31eaccbd Don't wipe out global variables that are probably storing pointers to heap
memory. This makes clang play nice with leak checkers.

llvm-svn: 160529
2012-07-19 22:35:28 +00:00
Andrew Trick
db674bed44 Added unit test for PR13361: LSR + SCEV "hangs" on reasonably sized test.
llvm-svn: 160439
2012-07-18 18:07:52 +00:00
Andrew Trick
612785f908 indvars: Linear function test replace should avoid reusing undef.
Fixes PR13371: indvars pass incorrectly substitutes 'undef' values.

I do not like this fix. It's needed until/unless the meaning of undef
changes. It attempts to be complete according to the IR spec, but I
don't have much confidence in the implementation given the difficulty
testing undefined behavior. Worse, this invalidates some of my
hard-fought work on indvars and LSR to optimize pointer induction
variables. It results in benchmark regressions, which I'll track
internally. On x86_64 no LTO I see:

-3% huffbench
-3% 400.perlbench
-8% fhourstones

My only suggestion for recovering is to change the meaning of
undef. If we could trust an arbitrary instruction to produce some
real value that can be manipulated (e.g. incremented) according to
non-undef rules, then this case could be easily handled with SCEV.

llvm-svn: 160421
2012-07-18 04:35:10 +00:00
Evan Cheng
5e82ad04d5 Back out r160101 and instead implement a dag combine to recover from instcombine transformation.
llvm-svn: 160387
2012-07-17 18:54:11 +00:00
NAKAMURA Takumi
dff0927cea llvm/test/Transforms/LoopRotate/PhiRename-1.ll: FileCheck-ize. It fixes PR13301.
It began choking after Chandler's r159547, possibly due to a grep expression that was not translated properly from the Tcl parser to the Sh parser.

llvm-svn: 160367
2012-07-17 15:43:17 +00:00
Nuno Lopes
97c381ea93 fix PR13339 (remove the predecessor from the unwind BB when removing an invoke)
llvm-svn: 160325
2012-07-16 22:49:40 +00:00
Andrew Trick
8030f89de4 LSR Fix: check SCEV expression safety before expansion.
All SCEV expressions used by LSR formulae must be safe to
expand. i.e. they may not contain UDiv unless we can prove nonzero
denominator.

Fixes PR11356: LSR hoists UDiv.

llvm-svn: 160205
2012-07-13 23:33:10 +00:00
Evan Cheng
e6c5349fcd Instcombine was transforming:
  %shr = lshr i64 %key, 3
  %0 = load i64* %val, align 8
  %sub = add i64 %0, -1
  %and = and i64 %sub, %shr
  ret i64 %and

to:
  %shr = lshr i64 %key, 3
  %0 = load i64* %val, align 8
  %sub = add i64 %0, 2305843009213693951
  %and = and i64 %sub, %shr
  ret i64 %and

The demanded bit optimization is actually a pessimization because add -1 would
be codegen'ed as a sub 1. Teach the demanded constant shrinking optimization
to check for negated constant to make sure it is actually reducing the width
of the constant.

rdar://11793464

llvm-svn: 160101
2012-07-12 01:45:35 +00:00
Nuno Lopes
c676931bb9 instcombine: merge the functions that remove dead allocas and dead mallocs/callocs/...
This patch removes ~70 lines in InstCombineLoadStoreAlloca.cpp and makes both functions a bit more aggressive than before :)
In theory, we can be more aggressive when removing an alloca than a malloc, because an alloca pointer should never escape, but we are not taking advantage of this anyway

llvm-svn: 159952
2012-07-09 18:38:20 +00:00
Nuno Lopes
f3ba9a4d21 teach instcombine to remove allocated buffers even if there are stores, memcpy/memmove/memset, and objectsize users.
This means we can do cheap DSE for heap memory.
Nothing is done if the pointer escapes or has a load.

The churn in the tests is mostly due to objectsize, since we want to make sure we
don't delete the malloc call before evaluating the objectsize (otherwise it becomes -1/0)
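
A hedged sketch of the new capability (not one of the updated tests): a
buffer that is only written and then freed can now be deleted outright:

  declare i8* @malloc(i64)
  declare void @free(i8*)
  declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i32, i1)

  define void @f() {
    ; no load and no escape: the whole allocation is dead and is removed
    %buf = call i8* @malloc(i64 16)
    call void @llvm.memset.p0i8.i64(i8* %buf, i8 0, i64 16, i32 1, i1 false)
    store i8 42, i8* %buf
    call void @free(i8* %buf)
    ret void
  }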

llvm-svn: 159876
2012-07-06 23:09:25 +00:00
Nuno Lopes
eac3a6d03c BoundsChecking: optimize out the check for offset < 0 if size is known to be >= 0 (signed).
(LLVM optimizers cannot do this optimization by themselves)

llvm-svn: 159668
2012-07-03 17:30:18 +00:00
Eric Christopher
185293d560 Revert "IntRange:" as it appears to be breaking self hosting.
This reverts commit b2833d9dcba88c6f0520cad760619200adc0442c.

llvm-svn: 159618
2012-07-02 23:22:21 +00:00
Chandler Carruth
5d3a0ce4e5 Fix the remaining TCL-style quotes found in the testsuite. This is
another mechanical change accomplished though the power of terrible Perl
scripts.

I have manually switched some "s to 's to make escaping simpler.

While I started this to fix tests that aren't run in all configurations,
the massive number of tests is due to a really frustrating fragility of
our testing infrastructure: things like 'grep -v', 'not grep', and
'expected failures' can mask broken tests all too easily.

Essentially, I'm deeply disturbed that I can change the testsuite so
radically without causing any change in results for most platforms. =/

llvm-svn: 159547
2012-07-02 19:09:46 +00:00
Duncan Sands
16235b1885 GlobalOpt forgot to handle bitcast when analyzing globals. Found by inspection.
llvm-svn: 159546
2012-07-02 18:55:39 +00:00
Chandler Carruth
d200829a4f Convert the uses of '|&' to use '2>&1 |' instead, which works on old
versions of Bash. In addition, I can back out the change to the lit
built-in shell test runner to support this.

This should fix the majority of fallout on Darwin, but I suspect there
will be a few straggling issues.

llvm-svn: 159544
2012-07-02 18:37:59 +00:00
Nuno Lopes
e967ebe7bb fix the regression I introduced in r159385 (it's necessary to update PHI nodes in the unwind BB)
llvm-svn: 159534
2012-07-02 16:14:47 +00:00
Stepan Dyatkovskiy
fff4579249 IntRange:
- Changed isSingleNumber method behaviour. Now this flag is calculated on demand.
IntegersSubsetMapping
  - Optimized diff operation.
  - Replaced the type of the Items field from std::list to std::map.
  - Added new methods:
    bool isOverlapped(self &RHS)
    void add(self& RHS, SuccessorClass *S)
    void detachCase(self& NewMapping, SuccessorClass *Succ)
    void removeCase(SuccessorClass *Succ)
    SuccessorClass *findSuccessor(const IntTy& Val)
    const IntTy* getCaseSingleNumber(SuccessorClass *Succ)
IntegersSubsetTest
  - DiffTest: Added checks for successors.
SimplifyCFG
  Updated SwitchInst usage (now it is case-ranges compatible) for
    - SimplifyEqualityComparisonWithOnlyPredecessor
    - FoldValueComparisonIntoPredecessors

llvm-svn: 159527
2012-07-02 13:02:18 +00:00
Chandler Carruth
8a358b3669 Convert all tests using TCL-style quoting to use shell-style quoting.
This was done through the aid of a terrible Perl creation. I will not
paste any of the horrors here. Suffice to say, it required multiple
staged rounds of replacements, state carried between, and a few
nested-construct-parsing hacks that I'm not proud of. It happens, by
luck, to be able to deal with all the TCL-quoting patterns in evidence
in the LLVM test suite.

If anyone is maintaining large out-of-tree test trees, feel free to poke
me and I'll send you the steps I used to convert things, as well as
answer any painful questions etc. IRC works best for this type of thing
I find.

Once converted, switch the LLVM lit config to use ShTests the same as
Clang. In addition to being able to delete large amounts of Python code
from 'lit', this will also simplify the entire test suite and some of
lit's architecture.

Finally, the test suite runs 33% faster on Linux now. ;]
For my 16-hardware-thread (2x 4-core xeon e5520): 36s -> 24s

llvm-svn: 159525
2012-07-02 12:47:22 +00:00
Duncan Sands
64b10a65e1 Fix a reassociate crash on sozefx when compiling with dragonegg+gcc-4.7 due to
the optimizers producing a multiply expression with more multiplications than
the original (!).

llvm-svn: 159426
2012-06-29 13:25:06 +00:00
Nuno Lopes
66896bbd47 make simplifyCFG erase invokes to readonly/readnone functions
llvm-svn: 159385
2012-06-28 22:32:27 +00:00
Nuno Lopes
b0d4abe297 make instcombine produce calls to llvm.donothing instead of a random intrinsic
llvm-svn: 159384
2012-06-28 22:31:24 +00:00
Nuno Lopes
52920835c9 make LazyValueInfo analyze the default case of switch statements (we know that in the default branch the value cannot be any of the switch cases)
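
A small sketch of the new reasoning (not the committed test):

  define i1 @f(i32 %x) {
  entry:
    switch i32 %x, label %default [
      i32 0, label %zero
      i32 1, label %one
    ]

  default:
    ; LazyValueInfo now knows %x is neither 0 nor 1 here,
    ; so clients can fold this comparison to false
    %c = icmp eq i32 %x, 0
    ret i1 %c

  zero:
    ret i1 true

  one:
    ret i1 false
  }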
llvm-svn: 159353
2012-06-28 16:13:37 +00:00
Hal Finkel
89ff4e2b47 Allow BBVectorize to form non-2^n-length vectors.
The original algorithm only used recursive pair fusion of equal-length
types. This is now extended to allow pairing of any types that share
the same underlying scalar type. Because we would still generally
prefer the 2^n-length types, those are formed first. Then a second
set of iterations form the non-2^n-length types.

Also, a call to SimplifyInstructionsInBlock has been added after each
pairing iteration. This takes care of DCE (and a few other things)
that make the following iterations execute somewhat faster. For the
same reason, some of the simple shuffle-combination cases are now
handled internally.

There is some additional refactoring work to be done, but I've had
many requests for this feature, so additional refactoring will come
soon in future commits (as will additional test cases).

llvm-svn: 159330
2012-06-28 05:42:42 +00:00
Nuno Lopes
873f05c3ff make LVI::getEdgeValue() always intersect the constraints of the edge with the range of the block. Previously it was only performing the intersection for a few cases, thus losing precision
llvm-svn: 159320
2012-06-28 01:16:18 +00:00
Matt Beaumont-Gay
93c66a3db1 Revert r159136 due to PR13124.
Original commit message:

If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it
hidden. Being linkonce_odr guarantees that it is available in every dso that
needs it. Being a constant/function with unnamed_addr guarantees that the
copies don't have to be merged.
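
As an illustrative sketch, the reverted transform would have rewritten a
global like this:

  @g = linkonce_odr unnamed_addr constant i32 42
  ; into:
  @g = linkonce_odr hidden unnamed_addr constant i32 42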

llvm-svn: 159272
2012-06-27 17:10:33 +00:00
Duncan Sands
1c87a20df1 Some reassociate optimizations create new instructions, which they insert just
before the expression root.  Any existing operators that are changed to use one
of them need to be moved between it and the expression root, and recursively
for the operators using that one.  When I rewrote RewriteExprTree I accidentally
inverted the logic, resulting in the compacting going down from operators to
operands rather than up from operands to the operators using them, oops.  Fix
this, resolving PR12963.

llvm-svn: 159265
2012-06-27 14:19:00 +00:00
Evan Cheng
9132bcf0e3 Remove a instcombine transform that (no longer?) makes sense:
// C - zext(bool) -> bool ? C - 1 : C
    if (ZExtInst *ZI = dyn_cast<ZExtInst>(Op1))
      if (ZI->getSrcTy()->isIntegerTy(1))
        return SelectInst::Create(ZI->getOperand(0), SubOne(C), C);

This ends up forming sext i1 instructions that codegen to terrible code. e.g.
int blah(_Bool x, _Bool y) {
  return (x - y) + 1;
}
=>
        movzbl  %dil, %eax
        movzbl  %sil, %ecx
        shll    $31, %ecx
        sarl    $31, %ecx
        leal    1(%rax,%rcx), %eax
        ret


Without the rule, llvm now generates:
        movzbl  %sil, %ecx
        movzbl  %dil, %eax
        incl    %eax
        subl    %ecx, %eax
        ret

It also helps with ARM (and pretty much any target that doesn't have a sext i1 :-).

The transformation was done as part of Eli's r75531. He has given the ok to
remove it.

rdar://11748024

llvm-svn: 159230
2012-06-26 22:03:13 +00:00
Duncan Sands
1770ae1ae4 Replacing zero-sized alloca's with a null pointer is too aggressive, instead
merge all zero-sized alloca's into one, fixing c43204g from the Ada ACATS
conformance testsuite.  What happened there was that a variable sized object
was being allocated on the stack, "alloca i8, i32 %size".  It was then being
passed to another function, which tested that the address was not null (raising
an exception if it was) then manipulated %size bytes in it (load and/or store).
The optimizers cleverly managed to deduce that %size was zero (congratulations
to them, as it isn't at all obvious), which made the alloca zero size, causing
the optimizers to replace it with null, which then caused the check mentioned
above to fail, and the exception to be raised, wrongly.  Note that no loads
and stores were actually being done to the alloca (the loop that does them is
executed %size times, i.e. is not executed), only the not-null address check.
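
A sketch of the shape involved (hypothetical names, not the ACATS test):
once %size is deduced to be zero the alloca becomes zero-sized, but its
address must still be non-null for the callee's check to pass:

  declare void @check_and_fill(i8*, i32)

  define void @f(i32 %size) {
    %obj = alloca i8, i32 %size        ; %size turns out to be 0
    call void @check_and_fill(i8* %obj, i32 %size)
    ret void
  }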

llvm-svn: 159202
2012-06-26 13:39:21 +00:00
Andrew Trick
c5e08120a4 Enable the new LoopInfo algorithm by default.
The primary advantage is that loop optimizations will be applied in a
stable order. This helps debugging and unit test creation. It is also
a better overall implementation without pathologically bad performance
on deep functions.

On large functions (llvm-stress --size=200000 | opt -loops)
Before: 0.1263s
After:  0.0225s

On deep functions (after tweaking llvm-stress, thanks Nadav):
Before: 0.2281s
After:  0.0227s

See r158790 for more comments.

The loop tree is now consistently generated in forward order, but loop
passes are applied in reverse order over the program. If we have a
loop optimization that prefers forward order, that can easily be
achieved by adding a different type of LoopPassManager.

llvm-svn: 159183
2012-06-26 04:11:38 +00:00
Nuno Lopes
bf0bd73d19 revert my previous commit (r159173), since as Eli pointed out, it's perfectly ok to mark realloc as noalias
llvm-svn: 159175
2012-06-25 23:26:10 +00:00