llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 22:12:57 +02:00

Author	SHA1	Message	Date
David Blaikie	b57edabf0c	Recommit r233116 better: Remove a redundant instcombine involving bitcasts of geps of bitcasts This just didn't need to be here at all, but the assertion I tried to add wasn't appropriate either - the circumstance isn't impossible, it's just not important to deal with it here - the gep-rooted version of this instcombine will handle this case, we don't need to duplicate it for the case where the gep happens to be used in a bitcast. llvm-svn: 233404	2015-03-27 20:13:55 +00:00
Anna Zaks	bb3446e865	[asan] Speed up isInterestingAlloca check We make many redundant calls to isInterestingAlloca in the AddressSanitzier pass. This is especially inefficient for allocas that have many uses. Let's cache the results to speed up compilation. The compile time improvements depend on the input. I did not see much difference on benchmarks; however, I have a test case where compile time goes from minutes to under a second. llvm-svn: 233397	2015-03-27 18:52:01 +00:00
Yaron Keren	3856893d6d	Remove superfluous .str() and replace std::string concatenation with Twine. llvm-svn: 233392	2015-03-27 17:51:30 +00:00
James Molloy	844dff224e	Reapply r233175 and r233183: float2int. This re-adds float2int to the tree, after fixing PR23038. It turns out the argument to APSInt() is true-if-unsigned, rather than true-if-signed :(. Added testcase and explanatory comment. llvm-svn: 233370	2015-03-27 10:36:57 +00:00
Sanjoy Das	a1908cd835	[NFC] Fix typo in comment. llvm-svn: 233363	2015-03-27 06:01:56 +00:00
Philip Reames	77522a75bc	Code cleanup [NFC] The assertion here was more expensive then it needed to be. We're only inserting allocas in the entry block, so we only need to consider ones in the entry block. llvm-svn: 233362	2015-03-27 05:53:16 +00:00
Philip Reames	92b4844b30	More code cleanup [NFC] llvm-svn: 233361	2015-03-27 05:47:00 +00:00
Philip Reames	96d87ad71c	More code cleanup [NFC] Minor naming, one potentially unsafe cast llvm-svn: 233359	2015-03-27 05:39:32 +00:00
Philip Reames	f28af3049b	Code simplification and style cleanup All the removed assertions are either implied locally by the assert at the top of the function or properties of the verifier. llvm-svn: 233358	2015-03-27 05:34:44 +00:00
Karthik Bhat	2c8e43726d	Refactor Code inside LoopVectorizer's function isInductionVariable. This patch exposes LoopVectorizer's isInductionVariable function as common a functionality. http://reviews.llvm.org/D8608 llvm-svn: 233352	2015-03-27 03:44:15 +00:00
Nick Lewycky	17b59ec505	Revert r233175 and r233183 with it. This pulls float2int back out of the tree, due to PR23038. llvm-svn: 233350	2015-03-27 02:00:11 +00:00
Benjamin Kramer	ab138a6078	InstCombine: fold (A << C) == (B << C) --> ((A^B) & (~0U >> C)) == 0 Anding and comparing with zero can be done in a single instruction on most archs so this is a bit cheaper. llvm-svn: 233291	2015-03-26 17:12:06 +00:00
Jingyue Wu	537669aad8	[SLSR] handle candidate form &B[i * S] Summary: This patch enhances SLSR to handle another candidate form &B[i * S]. If we found two candidates S1: X = &B[i * S] S2: Y = &B[i' * S] and S1 dominates S2, we can replace S2 with Y = &X[(i' - i) * S] Test Plan: slsr-gep.ll X86/no-slsr.ll: verify that we do not run SLSR on GEPs that already fit into an addressing mode Reviewers: eliben, atrick, meheff, hfinkel Reviewed By: hfinkel Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D7459 llvm-svn: 233286	2015-03-26 16:49:24 +00:00
Andrea Di Biagio	713ffecfc7	[optnone] Skip pass Float2Int on optnone functions. Added test Float2Int/float2int-optnone.ll to verify that pass Float2Int is not run on optnone functions. llvm-svn: 233183	2015-03-25 12:22:37 +00:00
James Molloy	17b6105997	Reapply r233062: "float2int": Add a new pass to demote from float to int where possible. Now with a fix for PR23008 and extra regression test. llvm-svn: 233175	2015-03-25 10:03:42 +00:00
David Blaikie	27c176e356	Opaque Pointer Types: GEP API migrations to specify the gep type explicitly The changes to InstCombine (& SCEV) do seem a bit silly - it doesn't make anything obviously better to have the caller access the pointers element type (the thing I'm trying to remove) than the GEP itself, but it's a helpful migration step. This will allow me to more obviously lock down GEP (& Load, etc) API usage, then fix all the code that accesses pointer element types except the places that need to be removed (most of the InstCombines) anyway - at which point I'll need to just remove all that code because it won't be meaningful anymore (there will be no pointer types, so no bitcasts to combine) SCEV looks like it'll need some restructuring - we'll have to do a bit more work for GEP canonicalization, since it'll depend on how it's used if we can even manage to canonicalize it to a non-ugly GEP. I guess we can do some fun stuff like voting (do 2 out of 3 load from the GEP with a certain type that gives a pretty GEP? Does every typed use of the GEP use either a specific type or a generic type (i8*, etc)?) llvm-svn: 233131	2015-03-24 23:34:31 +00:00
Sanjay Patel	1a17022bda	optimize the AVX2 (integer) version of vperm2 into a shuffle ...because this is what happens when an instruction set puts its underwear on after its pants. This is an extension of r232852, r233100, and 233110: http://llvm.org/viewvc/llvm-project?view=revision&revision=232852 http://llvm.org/viewvc/llvm-project?view=revision&revision=233100 http://llvm.org/viewvc/llvm-project?view=revision&revision=233110 llvm-svn: 233127	2015-03-24 22:39:29 +00:00
David Blaikie	0dc1532dd9	Opaque Pointer Types: GEP API migrations to specify the gep type explicitly The changes to InstCombine do seem a bit silly - it doesn't make anything obviously better to have the caller access the pointers element type (the thing I'm trying to remove) than the GEP itself, but it's a helpful migration step. This will allow me to more obviously lock down GEP (& Load, etc) API usage, then fix all the code that accesses pointer element types except the places that need to be removed (most of the InstCombines) anyway - at which point I'll need to just remove all that code because it won't be meaningful anymore (there will be no pointer types, so no bitcasts to combine) llvm-svn: 233126	2015-03-24 22:38:16 +00:00
Philip Reames	d57b6c2219	Merge empty landing pads in SimplifyCFG This patch tries to merge duplicate landing pads when they branch to a common shared target. Given IR that looks like this: lpad1: %exn = landingpad {i8, i32} personality i32 (...) @__gxx_personality_v0 cleanup br label %shared_resume lpad2: %exn2 = landingpad {i8, i32} personality i32 (...) @__gxx_personality_v0 cleanup br label %shared_resume shared_resume: call void @fn() ret void } We can rewrite the users of both landing pad blocks to use one of them. This will generally allow the shared_resume block to be merged with the common landing pad as well. Without this change, tail duplication would likely kick in - creating N (2 in this case) copies of the shared_resume basic block. Differential Revision: http://reviews.llvm.org/D8297 llvm-svn: 233125	2015-03-24 22:28:45 +00:00
David Blaikie	8b46a1bcb2	Revert "Remove an InstCombine that seems to have become redundant." Assertion fires in compiler-rt. Guess it does fire.. This reverts commit r233116. llvm-svn: 233121	2015-03-24 21:50:35 +00:00
David Blaikie	0914ef9c6b	Remove an InstCombine that seems to have become redundant. Assert that this doesn't fire - I'll remove all of this later, but just leaving it in for a while in case this is firing & we just don't have test coverage. llvm-svn: 233116	2015-03-24 21:31:31 +00:00
Sanjay Patel	c077161b46	[X86, AVX] instcombine vperm2 intrinsics with zero inputs into shuffles This is the IR optimizer follow-on patch for D8563: the x86 backend patch that converts this kind of shuffle back into a vperm2. This is also a continuation of the transform that started in D8486. In that patch, Andrea suggested that we could convert vperm2 intrinsics that use zero masks into a single shuffle. This is an implementation of that suggestion. Differential Revision: http://reviews.llvm.org/D8567 llvm-svn: 233110	2015-03-24 20:36:42 +00:00
Hans Wennborg	fbd19841f0	Revert r233062 ""float2int": Add a new pass to demote from float to int where possible." This caused PR23008, compiles failing with: "Use still stuck around after Def is destroyed: %.sroa.speculated" Also reverting follow-up r233064. llvm-svn: 233105	2015-03-24 20:07:08 +00:00
Sanjoy Das	5e0d47a08c	[IRCE] Fix how IRCE checks for no-sign-overflow. IRCE requires the induction variables it handles to not sign-overflow. The current scheme of checking if sext({X,+,S}) == {sext(X),+,sext(S)} fails when SCEV simplifies sext(X) too. After this change we //also// check no-signed-wrap by looking at the flags set on the SCEVAddRecExpr. llvm-svn: 233102	2015-03-24 19:29:22 +00:00
Sanjoy Das	867451d1ec	[IRCE] Fix a regression introduced in r232444. IRCE should not try to eliminate range checks that check an induction variable against a loop-varying length. llvm-svn: 233101	2015-03-24 19:29:18 +00:00
Benjamin Kramer	99a748be6f	[float2int] Sort includes and add missing raw_ostream include. llvm-svn: 233064	2015-03-24 11:28:47 +00:00
James Molloy	32530d77c8	"float2int": Add a new pass to demote from float to int where possible. It is possible to have code that converts from integer to float, performs operations then converts back, and the result is provably the same as if integers were used. This can come from different sources, but the most obvious is a helper function that uses floats but the arguments given at an inlined callsites are integers. This pass considers all integers requiring a bitwidth less than or equal to the bitwidth of the mantissa of a floating point type (23 for floats, 52 for doubles) as exactly representable in floating point. To reduce the risk of harming efficient code, the pass only attempts to perform complete removal of inttofp/fptoint operations, not just move them around. llvm-svn: 233062	2015-03-24 11:15:23 +00:00
Benjamin Kramer	6a9aa608f1	Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used. llvm-svn: 232998	2015-03-23 19:32:43 +00:00
Benjamin Kramer	36883e98af	[ctorutils] Update and sort includes. NFC. llvm-svn: 232995	2015-03-23 19:06:17 +00:00
Benjamin Kramer	e51698aa6c	Another set of missing raw_ostream.h. Still no functional change. llvm-svn: 232993	2015-03-23 18:45:56 +00:00
Benjamin Kramer	45a545b9c6	Purge unused includes throughout libSupport. NFC. llvm-svn: 232976	2015-03-23 18:07:13 +00:00
Benjamin Kramer	d52ec1c0ec	Move private classes into anonymous namespaces NFC. llvm-svn: 232944	2015-03-23 12:30:58 +00:00
Benjamin Kramer	b1b20b3003	[SimplifyLibCalls] Fix negative shifts being produced by the memchr -> bitfield transform. llvm-svn: 232903	2015-03-21 22:04:26 +00:00
Benjamin Kramer	facc0a564b	[SimplifyLibCalls] Turn memchr(const, C, const) into a bitfield check. strchr("123!", C) != nullptr is a common pattern to check if C is one of 1, 2, 3 or !. If the largest element of the string is smaller than the target's register size we can easily create a bitfield and just do a simple test for set membership. int foo(char C) { return strchr("123!", C) != nullptr; } now becomes cmpl $64, %edi ## range check sbbb %al, %al movabsq $0xE000200000001, %rcx btq %rdi, %rcx ## bit test sbbb %cl, %cl andb %al, %cl ## and the two conditions andb $1, %cl movzbl %cl, %eax ## returning an int ret (imho the backend should expand this into a series of branches, but that's a different story) The code is currently limited to bit fields that fit in a register, so usually 64 or 32 bits. Sadly, this misses anything using alpha chars or {}. This could be fixed by just emitting a i128 bit field, but that can generate really ugly code so we have to find a better way. To some degree this is also recreating switch lowering logic, but we can't simply emit a switch instruction and thus change the CFG within instcombine. llvm-svn: 232902	2015-03-21 21:09:33 +00:00
Benjamin Kramer	6c7bd16fc0	SimplifyLibCalls: Add basic optimization of memchr calls. This is just memchr(x, y, 0) -> nullptr and constant folding. llvm-svn: 232896	2015-03-21 15:36:21 +00:00
Kostya Serebryany	55ec403858	[sanitizer] experimental tracing for cmp instructions llvm-svn: 232873	2015-03-21 01:29:36 +00:00
Sanjay Patel	cf8a53f502	[X86, AVX] instcombine common cases of vperm2* intrinsics into shuffles vperm2* intrinsics are just shuffles. In a few special cases, they're not even shuffles. Optimizing intrinsics in InstCombine is better than handling this in the front-end for at least two reasons: 1. Optimizing custom-written SSE intrinsic code at -O0 makes vector coders really angry (and so I have regrets about some patches from last week). 2. Doing mask conversion logic in header files is hard to write and subsequently read. There are a couple of TODOs in this patch to complete this optimization. Differential Revision: http://reviews.llvm.org/D8486 llvm-svn: 232852	2015-03-20 21:47:56 +00:00
Andrew Kaylor	7b78ee54b5	Fixing a bug with WinEH PHI handling llvm-svn: 232851	2015-03-20 21:42:54 +00:00
Duncan P. N. Exon Smith	c31f1b0224	SanitizerCoverage: Check for null DebugLocs After a WIP patch to make `DIDescriptor` accessors more strict, this started asserting. llvm-svn: 232832	2015-03-20 18:48:45 +00:00
Duncan P. N. Exon Smith	34c48c3fe8	SampleProfile: Check for missing debug locations Don't use `DebugLoc` accessors if we're pointing at null, which will be a problem after a WIP patch to make the `DIDescriptor` accessors more strict. Caught by Frontend/profile-sample-use-loc-tracking.c (in clang). llvm-svn: 232792	2015-03-20 00:56:55 +00:00
Duncan P. N. Exon Smith	57ccff4630	Verifier: Remove the separate -verify-di pass Remove `DebugInfoVerifierLegacyPass` and the `-verify-di` pass. Instead, call into the `DebugInfoVerifier` from inside `VerifierLegacyPass::finalizeModule()`. This better matches the logic in `verifyModule()` (used by the new PassManager), avoids requiring two separate passes to verify the IR, and makes the API for "add a pass to verify the IR" simple. Note: the `-verify-debug-info` flag still works (for now, at least; eventually it might make sense to just remove it). llvm-svn: 232772	2015-03-19 22:24:17 +00:00
Peter Collingbourne	1254323d85	LowerBitSets: Avoid reusing byte set addresses. Each use of the byte array uses a different alias. This makes the backend less likely to reuse previously computed byte array addresses, improving the security of the CFI mechanism based on this pass. Differential Revision: http://reviews.llvm.org/D8455 llvm-svn: 232770	2015-03-19 22:02:10 +00:00
Peter Collingbourne	3bdbf14413	libLTO, llvm-lto, gold: Introduce flag for controlling optimization level. This change also introduces a link-time optimization level of 1. This optimization level runs only the globaldce pass as well as cleanup passes for passes that run at -O0, specifically simplifycfg which cleans up lowerbitsets. http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150316/266951.html llvm-svn: 232769	2015-03-19 22:01:00 +00:00
Duncan P. N. Exon Smith	e8a4b7fb5f	PassManagerBuilder: Remove effectively dead 'StripDebug' option `StripDebug` was only used by tools/opt/opt.cpp in `AddStandardLinkPasses()`, but opt.cpp adds the same pass based on its command-line flag before it calls `AddStandardLinkPasses()`. Stripping debug info twice isn't very useful. llvm-svn: 232765	2015-03-19 21:37:17 +00:00
Peter Collingbourne	c3999ab5f3	GlobalDCE: Improve performance for large modules containing comdats. When we encounter a global with a comdat, rather than iterating over every global in the module to find globals in the same comdat, store the members in a multimap. This effectively lowers the complexity to O(N log N), improving performance significantly for large modules such as might be encountered during LTO. It looks like we used to do something like this until r219191. No functional change. Differential Revision: http://reviews.llvm.org/D8431 llvm-svn: 232743	2015-03-19 18:23:29 +00:00
Daniel Jasper	7e5f7305cf	[InstCombine] Don't fold a GEP into itself through a PHI node This can only occur (I think) through the back-edge of the loop. However, folding a GEP into itself means that the value of the previous iteration needs to be stored in the meantime, thus requiring an additional register variable to be live, but not actually achieving anything (the gep still needs to be executed once per loop iteration). The attached test case is derived from: typedef unsigned uint32; typedef unsigned char uint8; inline uint8 f(uint32 value, uint8 target) { while (value >= 0x80) { value >>= 7; ++target; } ++target; return target; } uint8 g(uint32 b, uint8 target) { target = f(b, f(42, target)); return target; } What happens is that the GEP stored in incptr2 is folded into itself through the loop's back-edge and the phi-node stored in loopptr, effectively incrementing the ptr by "2" in each iteration instead of "1". In this case, it is actually increasing the number of GEPs required as the GEP before the loop can't be folded away anymore. For comparison: With this patch: define i8* @test4(i32 %value, i8* %buffer) { entry: %cmp = icmp ugt i32 %value, 127 br i1 %cmp, label %loop.header, label %exit loop.header: ; preds = %entry br label %loop.body loop.body: ; preds = %loop.body, %loop.header %buffer.pn = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ] %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ] %loopptr = getelementptr inbounds i8, i8* %buffer.pn, i64 1 %shr = lshr i32 %newval, 7 %cmp2 = icmp ugt i32 %newval, 16383 br i1 %cmp2, label %loop.body, label %loop.exit loop.exit: ; preds = %loop.body br label %exit exit: ; preds = %loop.exit, %entry %0 = phi i8* [ %loopptr, %loop.exit ], [ %buffer, %entry ] %incptr3 = getelementptr inbounds i8, i8* %0, i64 2 ret i8* %incptr3 } Without this patch: define i8* @test4(i32 %value, i8* %buffer) { entry: %incptr = getelementptr inbounds i8, i8* %buffer, i64 1 %cmp = icmp ugt i32 %value, 127 br i1 %cmp, label %loop.header, label %exit loop.header: ; preds = %entry br label %loop.body loop.body: ; preds = %loop.body, %loop.header %0 = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ] %loopptr = phi i8* [ %incptr, %loop.header ], [ %incptr2, %loop.body ] %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ] %shr = lshr i32 %newval, 7 %incptr2 = getelementptr inbounds i8, i8* %0, i64 2 %cmp2 = icmp ugt i32 %newval, 16383 br i1 %cmp2, label %loop.body, label %loop.exit loop.exit: ; preds = %loop.body br label %exit exit: ; preds = %loop.exit, %entry %ptr2 = phi i8* [ %incptr2, %loop.exit ], [ %incptr, %entry ] %incptr3 = getelementptr inbounds i8, i8* %ptr2, i64 1 ret i8* %incptr3 } Review: http://reviews.llvm.org/D8245 llvm-svn: 232718	2015-03-19 11:05:08 +00:00
Sanjoy Das	1f60e8293a	[ConstantRange] Split makeICmpRegion in two. Summary: This change splits `makeICmpRegion` into `makeAllowedICmpRegion` and `makeSatisfyingICmpRegion` with slightly different contracts. The first one is useful for determining what values some expression //may// take, given that a certain `icmp` evaluates to true. The second one is useful for determining what values are guaranteed to //satisfy// a given `icmp`. Reviewers: nlewycky Reviewed By: nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8345 llvm-svn: 232575	2015-03-18 00:41:24 +00:00
Michael Zolotukhin	4b02a3bee3	Try to fix a test broken by one of my previous commits. llvm-svn: 232536	2015-03-17 20:31:56 +00:00
Michael Zolotukhin	4658f05da2	LoopVectorize: teach loop vectorizer to vectorize calls. The tests would be committed in a commit for http://reviews.llvm.org/D8131 Review: http://reviews.llvm.org/D8095 llvm-svn: 232530	2015-03-17 19:46:50 +00:00
Michael Zolotukhin	65fd2f0883	LoopVectorizer: Add TargetTransformInfo. Review: http://reviews.llvm.org/D8092 llvm-svn: 232522	2015-03-17 19:17:18 +00:00

1 2 3 4 5 ...

12778 Commits