llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 22:42:46 +02:00

Author	SHA1	Message	Date
Yaron Keren	56a458d0dd	Move createEliminateAvailableExternallyPass earlier in the pass pipeline to save running many ModulePasses on available external functions that are thrown away anyhow. llvm-svn: 246619	2015-09-02 06:34:11 +00:00
Hans Wennborg	4ed67f023f	DeadArgElim: don't eliminate arguments from naked functions Differential Revision: http://reviews.llvm.org/D12534 llvm-svn: 246564	2015-09-01 18:06:46 +00:00
Hans Wennborg	601f6f5c64	Fix Windows build by including raw_ostream.h llvm-svn: 246486	2015-08-31 21:19:18 +00:00
Philip Reames	c1eca23247	[FunctionAttr] Infer nonnull attributes on returns Teach FunctionAttr to infer the nonnull attribute on return values of functions which never return a potentially null value. This is done both via a conservative local analysis for the function itself and an optimistic per-SCC analysis. If no function in the SCC returns anything which could be null (other than values from other functions in the SCC), we can conclude no function returned a null pointer. Even if some function within the SCC returns a null pointer, we may be able to locally conclude that some don't. Differential Revision: http://reviews.llvm.org/D9688 llvm-svn: 246476	2015-08-31 19:44:38 +00:00
Jingyue Wu	02e7637de1	[JumpThreading] make jump threading respect convergent annotation. Summary: JumpThreading shouldn't duplicate a convergent call, because that would move a convergent call into a control-inequivalent location. For example, if (cond) { ... } else { ... } convergent_call(); if (cond) { ... } else { ... } should not be optimized to if (cond) { ... convergent_call(); ... } else { ... convergent_call(); ... } Test Plan: test/Transforms/JumpThreading/basic.ll Patch by Xuetian Weng. Reviewers: resistor, arsenm, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12484 llvm-svn: 246415	2015-08-31 06:10:27 +00:00
Sanjoy Das	b13b002480	[InstCombine] Fix PR24605. PR24605 is caused due to an incorrect insert point in instcombine's IR builder. When simplifying %t = add X Y ... %m = icmp ... %t the replacement for %t should be placed before %t, not before %m, as there could be a use of %t between %t and %m. llvm-svn: 246315	2015-08-28 19:09:31 +00:00
Chad Rosier	85d4ab5305	Optimize memcmp(x,y,n)==0 for small n and suitably aligned x/y. http://reviews.llvm.org/D6952 PR20673 llvm-svn: 246313	2015-08-28 18:30:18 +00:00
JF Bastien	940c24167e	Remove Merge Functions pointer comparisons Summary: This patch removes two remaining places where pointer value comparisons are used to order functions: comparing range annotation metadata, and comparing block address constants. (These are both rare cases, and so no actual non-determinism was observed from either case). The fix for range metadata is simple: the annotation always consists of a pair of integers, so we just order by those integers. The fix for block addresses is more subtle. Two constants are the same if they are the same basic block in the same function, or if they refer to corresponding basic blocks in each respective function. Note that in the first case, merging is trivially correct. In the second, the correctness of merging relies on the fact that the values of block addresses cannot be compared. This change is actually an enhancement, as these functions could not previously be merged (see merge-block-address.ll). There is still a problem with cross function block addresses, in that constants pointing to a basic block in a merged function are not updated. This also more robustly compares floating point constants by all fields of their semantics, and fixes a dyn_cast/cast mixup. Author: jrkoenig Reviewers: dschuff, nlewycky, jfb Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D12376 llvm-svn: 246305	2015-08-28 16:49:09 +00:00
Chandler Carruth	ef9ebf2ab4	[SROA] Fix PR24463, a crash I introduced in SROA by allowing it to handle more allocas with loads past the end of the alloca. I suspect there are some related crashers with slightly different patterns, but I'll fix those and add test cases as I find them. Thanks to David Majnemer for the excellent test case reduction here. Made this super simple to debug and fix. llvm-svn: 246289	2015-08-28 09:03:52 +00:00
Steven Wu	6cf610af2e	Revert r246244 and r246243 These two commits cause clang/llvm bootstrap to hang. llvm-svn: 246279	2015-08-28 06:52:00 +00:00
Piotr Padlewski	27645c1742	Constant propagation after hitting assume(cmp) bugfix Last time the code ran into the assertion `BBE.isSingleEdge()` in lib/IR/Dominators.cpp:200. http://reviews.llvm.org/D12170 llvm-svn: 246244	2015-08-28 01:02:00 +00:00
Piotr Padlewski	a47d0bad33	Constant propagation after hitting llvm.assume After hitting @llvm.assume(X) we can: - propagate equality that X == true - if X is icmp/fcmp (with eq operation), and one of the operands is a constant, we can replace the variable with that constant within the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 246243	2015-08-28 01:01:57 +00:00
Tyler Nowicki	49268c1eff	Improve vectorization diagnostic messages and extend vectorize(enable) pragma. This patch changes the analysis diagnostics produced when loops with floating-point recurrences or memory operations are identified. The new messages say "cannot prove it is safe to reorder * operations; allow reordering by specifying #pragma clang loop vectorize(enable)". Depending on the type of diagnostic the message will include additional options such as ffast-math or __restrict__. This patch also allows the vectorize(enable) pragma to override the low pointer memory check threshold. When the hint is given a higher threshold is used. See the clang patch for the options produced for each diagnostic. llvm-svn: 246187	2015-08-27 18:56:49 +00:00
Chad Rosier	6e3e56c088	[LoopVectorize] Add Support for Small Size Reductions. Unlike scalar operations, we can perform vector operations on element types that are smaller than the native integer types. We type-promote scalar operations if they are smaller than a native type (e.g., i8 arithmetic is promoted to i32 arithmetic on Arm targets). This patch detects and removes type-promotions within the reduction detection framework, enabling the vectorization of small size reductions. In the legality phase, we look through the ANDs and extensions that InstCombine creates during promotion, keeping track of the smaller type. In the profitability phase, we use the smaller type and ignore the ANDs and extensions in the cost model. Finally, in the code generation phase, we truncate the result of the reduction to allow InstCombine to rewrite the entire expression in the smaller type. This fixes PR21369. http://reviews.llvm.org/D12202 Patch by Matt Simpson <mssimpso@codeaurora.org>! llvm-svn: 246149	2015-08-27 14:12:17 +00:00
James Molloy	d7310f7b46	[LoopVectorize] Extract InductionInfo into a helper class... ... and move it into LoopUtils where it can be used by other passes, just like ReductionDescriptor. The API is very similar to ReductionDescriptor - that is, not very nice at all. Sorting these both out will come in a followup. NFC llvm-svn: 246145	2015-08-27 09:53:00 +00:00
Alex Rosenberg	e7cdadc960	Whoops, remove trailing whitespace. llvm-svn: 246141	2015-08-27 05:37:12 +00:00
Philip Reames	d2d22a2dd9	Allow value forwarding past release fences in EarlyCSE A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-store forwarding across a release fence. We do need to make sure that stores before the fence can't be eliminated even if there's another store to the same location after the fence. In theory, we could reorder the second store above the fence and then eliminate the former, but we can't do this if the stores are on opposite sides of the fence. Note: While more aggressive than what's there, this patch is still implementing a really conservative ordering. In particular, I'm not trying to exploit undefined behavior via races, or the fact that the LangRef says only 'atomic' accesses are ordered w.r.t. fences. Differential Revision: http://reviews.llvm.org/D11434 llvm-svn: 246134	2015-08-27 01:32:33 +00:00
Philip Reames	258ea49e29	[RewriteStatepointsForGC] Reduce the number of new instructions for base pointers When computing base pointers, we introduce new instructions to propagate the base of existing instructions which might not be bases. However, the algorithm doesn't make any effort to recognize when the new instruction to be inserted is the same as an existing one already in the IR. Since this is happening immediately before rewriting, we don't really have a chance to fix it after the pass runs without teaching loop passes about statepoints. I'm really not thrilled with this patch. I've rewritten it 4 different ways now, but this is the best I've come up with. The case where the new instruction is just the original base defining value could be merged into the existing algorithm with some complexity. The problem is that we might have something like an extractelement from a phi of two vectors. It may be trivially obvious that the base of the 0th element is an existing instruction, but I can't see how to make the algorithm itself figure that out. Thus, I resort to the call to SimplifyInstruction instead. Note that we can only adjust the instructions we've inserted ourselves. The live sets are still being tracked in side structures at this point in the code. We can't easily muck with instructions which might be in them. Long term, I'm really thinking we need to materialize the live pointer sets explicitly in the IR somehow rather than using side structures to track them. Differential Revision: http://reviews.llvm.org/D12004 llvm-svn: 246133	2015-08-27 01:02:28 +00:00
Tyler Nowicki	35ff72ec4d	Improved printing of analysis diagnostics in the loop vectorizer. This patch ensures that every analysis diagnostic produced by the vectorizer will be printed if the loop has a vectorization hint on it. The condition has also been improved to prevent printing when a disabling hint is specified. llvm-svn: 246132	2015-08-27 01:02:04 +00:00
Philip Reames	1e50c760b7	[SimplifyCFG] Prune code from a provably unreachable switch default As Sanjoy pointed out over in http://reviews.llvm.org/D11819, a switch on an icmp should always be able to become a branch instruction. This patch generalizes that notion slightly to prove that the default case of a switch is unreachable if the cases completely cover all possible bit patterns in the condition. Once that's done, the switch to branch conversion kicks in just fine. Note: Duplicate case values are disallowed by the LangRef and verifier. Differential Revision: http://reviews.llvm.org/D11995 llvm-svn: 246125	2015-08-26 23:56:46 +00:00
Diego Novillo	1b494a160f	Fix memory leak in sample profile pass. The problem here was the function analyses invoked by the function pass manager from the new IPO pass. I looked at other IPO passes needing dominance information and the only one that requires it (partial inliner) does not use the standard dependency mechanism. This patch mimics what the partial inliner does to compute dominance, post-dominance and loop info. One thing I like about this approach is that I can delay the computation of all this until I actually need it. This should bring the ASAN buildbot back to green. If there's a better way to fix this, I'll do it in a follow-up patch. llvm-svn: 246066	2015-08-26 20:00:27 +00:00
David Majnemer	4e1829b473	[SimplifyLibCalls] Fix a typo cbrt(sqrt(x)) calculates the sixth root, not the ninth root. cbrt(cbrt(x)) calculates the ninth root. llvm-svn: 246046	2015-08-26 18:30:16 +00:00
Chandler Carruth	e9168c2be2	[SROA] Rip out all support for SSAUpdater in SROA. This was only added to preserve the old ScalarRepl's use of SSAUpdater which was originally to avoid use of dominance frontiers. Now, we only need a domtree, and we'll need a domtree right after this pass as well and so it makes perfect sense to always and only use the dom-tree powered mem2reg. This was flag-flipped earlier and has stuck reasonably so I wanted to gut the now-dead code out of SROA before we waste more time with it. Among other things, this will make passmanager porting easier. llvm-svn: 246028	2015-08-26 09:09:29 +00:00
Alex Rosenberg	3dbf34b74c	Modernize with range-based for loops. llvm-svn: 246018	2015-08-26 06:11:41 +00:00
Alex Rosenberg	9cb69096ff	Reduce code duplication. llvm-svn: 246017	2015-08-26 06:11:38 +00:00
Alex Rosenberg	8242d19f7c	Trailing whitespace llvm-svn: 246016	2015-08-26 06:11:36 +00:00
JF Bastien	c5d6ea096a	Comparing operands should not require the same ValueID Summary: When comparing basic blocks, there is an additional check that two Value*'s should have the same ID, which interferes with merging equivalent constants of different kinds (such as a ConstantInt and a ConstantPointerNull in the included testcase). The cmpValues function already ensures that the two values in each function are the same, so removing this check should not cause incorrect merging. Also, the type comparison is redundant, based on reviewing the code and testing on the test suite and several large LTO bitcodes. Author: jrkoenig Reviewers: nlewycky, jfb, dschuff Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D12302 llvm-svn: 246001	2015-08-26 03:02:58 +00:00
Charles Davis	a28ebc86c1	Make variable argument intrinsics behave correctly in a Win64 CC function. Summary: This change makes the variable argument intrinsics, `llvm.va_start` and `llvm.va_copy`, and the `va_arg` instruction behave as they do on Windows inside a `CallingConv::X86_64_Win64` function. It's needed for a Clang patch I have to add support for GCC's `__builtin_ms_va_list` constructs. Reviewers: nadav, asl, eugenis CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1622 llvm-svn: 245990	2015-08-25 23:27:41 +00:00
Evgeniy Stepanov	1c84ad1a9a	[msan] Precise instrumentation for icmp sgt %x, -1. Extend signed relational comparison instrumentation with a special case for comparisons with -1. This fixes an MSan false positive when such comparison is used as a sign bit test. https://llvm.org/bugs/show_bug.cgi?id=24561 llvm-svn: 245980	2015-08-25 22:19:11 +00:00
NAKAMURA Takumi	48e91d8a5c	Update libdeps in LLVMipo and LLVMScalarOpts, corresponding to r245940. llvm-svn: 245957	2015-08-25 17:11:17 +00:00
Matthias Braun	194ba5f3f6	Fix dependencies/shared library build llvm-svn: 245955	2015-08-25 17:07:40 +00:00
Wei Mi	f8e0f7a698	This patch replaces the overflow check in loop vectorization with the minimum loop iterations check. The minimum loop iterations check ensures the loop has a large enough trip count that the generated vector loop will likely be executed, and it covers the overflow check. Differential Revision: http://reviews.llvm.org/D12107. llvm-svn: 245952	2015-08-25 16:43:47 +00:00
Diego Novillo	776683223a	Convert SampleProfile pass into a Module pass. Eventually, we will need sample profiles to be incorporated into the inliner's cost models. To do this, we need the sample profile pass to be a module pass. This patch makes no functional changes beyond the mechanical adjustments needed to run SampleProfile as a module pass. llvm-svn: 245940	2015-08-25 15:25:11 +00:00
Piotr Padlewski	9757df9cfe	Assume intrinsic handling in global opt It doesn't solve the problem when, for example, we load something and then assume that it is the same as some constant value, because globalopt will fail on the unknown load instruction. The proposed solution would be to skip some instructions that we can't evaluate but that are safe to skip (e.g. load, assume and many others) and see if they are required to perform the optimization (e.g. we don't care about ephemeral instructions that may appear using @llvm.assume()) http://reviews.llvm.org/D12266 llvm-svn: 245919	2015-08-25 01:34:15 +00:00
Sanjay Patel	7c54263be5	fix typo; NFC llvm-svn: 245869	2015-08-24 20:11:14 +00:00
Adhemerval Zanella	4e03bf749d	[sanitizers] Add DFSan support for AArch64 42-bit VMA This patch adds support for dfsan on aarch64-linux with 42-bit VMA (current default config for 64K pagesize kernels). The support is enabled by defining the SANITIZER_AARCH64_VMA to 42 at build time for both clang/llvm and compiler-rt. The default VMA is 39 bits. llvm-svn: 245840	2015-08-24 13:48:10 +00:00
Mehdi Amini	490bf85c83	Require Dominator Tree For SROA, improve compile-time TL-DR: SROA is followed by EarlyCSE which requires the DominatorTree. There is no reason not to require it up-front for SROA. Some history is necessary to understand why we ended up here. r123437 switched the second (Legacy)SROA in the optimizer pipeline to use SSAUpdater in order to avoid recomputing the costly DominanceFrontier. The purpose was to speed up the compile-time. Later r123609 removed the need for the DominanceFrontier in (Legacy)SROA. Right after, some cleanup was made in r123724 to remove any reference to the DominanceFrontier. SROA existed in two flavors: SROA_SSAUp and SROA_DT (the latter replacing SROA_DF). The second argument of `createScalarReplAggregatesPass` was renamed from `UseDomFrontier` to `UseDomTree`. I believe this is where a mistake was made. The pipeline was not updated and the call site was still: PM->add(createScalarReplAggregatesPass(-1, false)); At that time, SROA was immediately followed in the pipeline by EarlyCSE which already required the DominatorTree. Not requiring the DominatorTree in SROA didn't save anything, but unfortunately it was lost at this point. When the new SROA Pass was introduced in r163965, I believe the goal was to have an exact replacement of the existing SROA, this bug slipped through. You can see currently: $ echo "" | clang -x c++ -O3 -c - -mllvm -debug-pass=Structure ... ... FunctionPass Manager SROA Dominator Tree Construction Early CSE After this patch: $ echo "" | clang -x c++ -O3 -c - -mllvm -debug-pass=Structure ... ... FunctionPass Manager Dominator Tree Construction SROA Early CSE This improves the compile time from 88s to 23s for PR17855. https://llvm.org/bugs/show_bug.cgi?id=17855 And from 113s to 12s for PR16756 https://llvm.org/bugs/show_bug.cgi?id=16756 Reviewers: chandlerc Differential Revision: http://reviews.llvm.org/D12267 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 245820	2015-08-23 22:15:49 +00:00
Joseph Tremoulet	56089ea65e	[WinEH] Require token linkage in EH pad/ret signatures Summary: WinEHPrepare is going to require that cleanuppad and catchpad produce values of token type which are consumed by any cleanupret or catchret exiting the pad. This change updates the signatures of those operators to require/enforce that the type produced by the pads is token type and that the rets have an appropriate argument. The catchpad argument of a `CatchReturnInst` must be a `CatchPadInst` (and similarly for `CleanupReturnInst`/`CleanupPadInst`). To accommodate that restriction, this change adds a notion of an operator constraint to both LLParser and BitcodeReader, allowing appropriate sentinels to be constructed for forward references and appropriate error messages to be emitted for illegal inputs. Also add a verifier rule (noted in LangRef) that a catchpad with a catchpad predecessor must have no other predecessors; this ensures that WinEHPrepare will see the expected linear relationship between sibling catches on the same try. Lastly, remove some superfluous/vestigial casts from instruction operand setters operating on BasicBlocks. Reviewers: rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12108 llvm-svn: 245797	2015-08-23 00:26:33 +00:00
JF Bastien	91f6300e91	Improve the determinism of MergeFunctions Summary: Merge functions previously relied on unsigned comparisons of pointer values to order functions. This caused observable non-determinism in the compiler for large bitcode programs. Basically, opt -mergefuncs program.bc | md5sum produces different hashes when run repeatedly on the same machine. Differing output was observed on three large bitcodes, but it was less frequent on the smallest file. It is possible that this only manifests on the large inputs, hence remaining undetected until now. This patch fixes this by removing (almost, see below) all places where comparisons between pointers are used to order functions. Most of these changes are local, but the comparison of global values requires assigning an identifier to each local in the order it is visited. This is very similar to the way the comparison function identifies Value's defined within a function. Because the order of visiting the functions and their subparts is deterministic, the identifiers assigned to the globals will be as well, and the order of functions will be deterministic. With these changes, there is no more observed non-determinism. There are also only minor slowdowns (negligible to 4%) compared to the baseline, which is likely a result of the fact that global comparisons involve hash lookups and not just pointer comparisons. The one caveat so far is that programs containing BlockAddress constants can still be non-deterministic. It is not clear what the right solution is here. In particular, even if the global numbers are used to order by function, we still need a way to order the BasicBlock's. Unfortunately, we cannot just bail out and fail to order the functions or consider them equal, because we require a total order over functions. Note that programs with BlockAddress constants are relatively rare, so the impact of leaving this in is minor as long as this pass is opt-in. Author: jrkoenig Reviewers: nlewycky, jfb, dschuff Subscribers: jevinskie, llvm-commits, chapuni Differential revision: http://reviews.llvm.org/D12168 llvm-svn: 245762	2015-08-21 23:27:24 +00:00
Tyler Nowicki	0df99a252e	Standardized 'failed' to 'Failed' in LoopVectorizationRequirements. llvm-svn: 245759	2015-08-21 23:03:24 +00:00
Sanjoy Das	ead6e3fe61	Re-apply r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0" The original checkin was buggy, this change has a fix. Original commit message: [InstCombine] Transform A & (L - 1) u< L --> L != 0 Summary: This transform is never a pessimization at the IR level (since it replaces an `icmp` with another), and has potential payoffs: 1. It may make the `icmp` fold away or become loop invariant. 2. It may make the `A & (L - 1)` computation dead. This shows up in Java, in range checks generated by array accesses of the form `a[i & (a.length - 1)]`. Reviewers: reames, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12210 llvm-svn: 245753	2015-08-21 22:22:37 +00:00
David Blaikie	98f759d4d6	[opaque pointer type]: Pass explicit pointee type when building a constant GEP. Gets a bit tricky in the ValueMapper, of course - not sure if we should just expose a list of explicit types for each Value so that the ValueMapper can be neutral to these special cases (it's OK for things like load, where the explicit type is the result type - but when that's not the case, it means plumbing through another "special" type... ) llvm-svn: 245728	2015-08-21 20:16:51 +00:00
NAKAMURA Takumi	9d82bd7d88	Revert r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0" It caused miscompilation in clang. llvm-svn: 245678	2015-08-21 07:46:07 +00:00
Peter Collingbourne	96b0ffdce8	TransformUtils: Introduce module splitter. The module splitter splits a module into linkable partitions. It will be used to implement parallel LTO code generation. This initial version of the splitter does not attempt to deal with the somewhat subtle symbol visibility issues around module splitting. These will be dealt with in a future change. Differential Revision: http://reviews.llvm.org/D12132 llvm-svn: 245662	2015-08-21 02:48:20 +00:00
Sanjoy Das	d7d7f13bd6	[InstCombine] Transform A & (L - 1) u< L --> L != 0 Summary: This transform is never a pessimization at the IR level (since it replaces an `icmp` with another), and has potential payoffs: 1. It may make the `icmp` fold away or become loop invariant. 2. It may make the `A & (L - 1)` computation dead. This shows up in Java, in range checks generated by array accesses of the form `a[i & (a.length - 1)]`. Reviewers: reames, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12210 llvm-svn: 245635	2015-08-20 22:31:55 +00:00
Michael Zolotukhin	977fe2551a	[SLP] Propagate 'nontemporal' attribute into vectorized instructions. llvm-svn: 245633	2015-08-20 22:28:15 +00:00
Michael Zolotukhin	3f340be0fb	[LoopVectorize] Propagate 'nontemporal' attribute into vectorized instructions. llvm-svn: 245632	2015-08-20 22:27:38 +00:00
Adrian Prantl	578302805a	Rename Instruction::dropUnknownMetadata() to dropUnknownNonDebugMetadata() and make it always preserve debug locations, since all callers wanted this behavior anyway. This is addressing a post-commit review feedback for r245589. NFC (inside the LLVM tree). llvm-svn: 245622	2015-08-20 22:00:30 +00:00
Adhemerval Zanella	c60acaac8a	[asan] Add ASAN support for AArch64 42-bit VMA This patch adds support for asan on aarch64-linux with 42-bit VMA (current default config for 64K pagesize kernels). The support is enabled by defining the SANITIZER_AARCH64_VMA to 42 at build time for both clang/llvm and compiler-rt. The default VMA is 39 bits. llvm-svn: 245594	2015-08-20 18:30:40 +00:00
Jingyue Wu	2a45313ac4	[ValueTracking] computeOverflowForSignedAdd and isKnownNonNegative Summary: Refactor, NFC Extracts computeOverflowForSignedAdd and isKnownNonNegative from NaryReassociate to ValueTracking in case others need it. Reviewers: reames Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D11313 llvm-svn: 245591	2015-08-20 18:27:04 +00:00
Adrian Prantl	bbd441b7ad	Fix a bug that caused SimplifyCFG to drop DebugLocs. Instruction::dropUnknownMetadata(KnownSet) is supposed to preserve all metadata in KnownSet, but the condition for DebugLocs was inverted. Most users of dropUnknownMetadata() actually worked around this by not adding LLVMContext::MD_dbg to their list of KnownIDs. This is now made explicit. llvm-svn: 245589	2015-08-20 18:24:02 +00:00
Adrian Prantl	8ccf2b1946	Fix a debug location handling bug in GVN. Caught by the famous "DebugLoc describes the correct SubProgram" assertion. When GVN is removing a nonlocal load it updates the debug location of the SSA value it replaced the load with, using the debug location of the load. In the testcase this actually overwrites a valid debug location with an empty one. In reality GVN has to make an arbitrary choice between two equally valid debug locations. This patch changes the behavior to only update the location if the value doesn't already have a debug location. llvm-svn: 245588	2015-08-20 18:23:56 +00:00
Adam Nemet	ace191f26c	[LVer] Fix FIXME: hide addPHINodes, NFC Since Ashutosh made findDefsUsedOutsideOfLoop public, we can clean this up. Now clients that don't compute DefsUsedOutsideOfLoop can just call versionLoop() and computing DefsUsedOutsideOfLoop will happen implicitly. With that there is no reason to expose addPHINodes anymore. Ashutosh, you can now drop the calls to findDefsUsedOutsideOfLoop and addPHINodes in LVerLICM and things should just work. llvm-svn: 245579	2015-08-20 17:22:29 +00:00
Balaram Makam	9086980655	Optimize bitwise even/odd test (-x&1 -> x&1) to not use negation. Summary: We know that -x & 1 is equivalent to x & 1, avoid using negation for testing if a negative integer is even or odd. Reviewers: majnemer Subscribers: junbuml, mssimpso, gberry, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D12156 llvm-svn: 245569	2015-08-20 15:35:00 +00:00
Benjamin Kramer	564947ae00	Make helper functions static. NFC. llvm-svn: 245549	2015-08-20 09:57:22 +00:00
Bjorn Steinbrink	268f3f5f81	Revert "[DSE] Enable removal of lifetime intrinsics in terminating blocks" llvm-svn: 245543	2015-08-20 08:58:47 +00:00
Bjorn Steinbrink	ea4fe2b310	[DSE] Enable removal of lifetime intrinsics in terminating blocks Usually DSE is not supposed to remove lifetime intrinsics, but it's actually ok to remove them for dead objects in terminating blocks, because they convey no extra information there. Until we hit a lifetime start that cannot be removed, that is. Because from that point on the lifetime intrinsics become interesting again, e.g. for stack coloring. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11710 llvm-svn: 245542	2015-08-20 08:25:28 +00:00
Chandler Carruth	e93d4cc138	[ARC] Pull the ObjC ARC components that really serve the role of analyses into LLVM's Analysis library rather than having them in a Transforms library. This is motivated by the need to have the core AliasAnalysis infrastructure be aware of the ObjCARCAliasAnalysis. However, it also seems like a nice and clean separation. Everything was very easy to move and this doesn't create much clutter in the analysis library IMO. Differential Revision: http://reviews.llvm.org/D12133 llvm-svn: 245541	2015-08-20 08:06:03 +00:00
David Majnemer	ee4740a5b5	Replace some calls to isa<LandingPadInst> with isEHPad() No functionality change is intended. llvm-svn: 245487	2015-08-19 19:54:02 +00:00
Nick Lewycky	fa6a61daaa	More clean up, still NFC. Remove dead variables now that the casts are gone. llvm-svn: 245420	2015-08-19 06:25:30 +00:00
Nick Lewycky	85f9af940c	Clean up this file a little. Remove dead casts, casting Values to Values. Adjust some comments for typos and whitespace. NFC. llvm-svn: 245419	2015-08-19 06:22:33 +00:00
Ashutosh Nema	d2b31a8acc	Exposed findDefsUsedOutsideOfLoop as a loop utility function Exposed findDefsUsedOutsideOfLoop as a loop utility function by moving it from LoopDistribute to LoopUtils. Reviewed By: anemet llvm-svn: 245416	2015-08-19 05:40:42 +00:00
Eric Christopher	0f80d201ba	Revert "Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks." This is causing bootstrap problems, e.g.: http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/2960 This reverts r245195. llvm-svn: 245402	2015-08-19 02:15:13 +00:00
Nick Lewycky	aacc91f91c	Fix three typos in comments; "easilly" -> "easily". llvm-svn: 245379	2015-08-18 22:41:58 +00:00
Chandler Carruth	bf271cc4e6	[PM/AA] Remove the last relics of the separate IPA library from LLVM, folding the code into the main Analysis library. There already wasn't much of a distinction between Analysis and IPA. A number of the passes in Analysis are actually IPA passes, and there doesn't seem to be any advantage to separating them. Moreover, it makes it hard to have interactions between analyses that are both local and interprocedural. In trying to make the Alias Analysis infrastructure work with the new pass manager, it becomes particularly awkward to navigate this split. I've tried to find all the places where we referenced this, but I may have missed some. I have also adjusted the C API to continue to be equivalently functional after this change. Differential Revision: http://reviews.llvm.org/D12075 llvm-svn: 245318	2015-08-18 17:51:53 +00:00
Sanjay Patel	1163b7b428	use minSize wrapper; NFCI These were missed when other uses were switched over: http://llvm.org/viewvc/llvm-project?view=revision&revision=243994 llvm-svn: 245311	2015-08-18 16:44:23 +00:00
Justin Bogner	82f54ac1cb	Revert "Constant propagation after hitting llvm.assume" This was also failing bootstrap: http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto_build This reverts r245265. llvm-svn: 245269	2015-08-18 07:00:34 +00:00
Piotr Padlewski	ef4e67a9b9	Constant propagation after hitting llvm.assume After hitting @llvm.assume(X) we can: - propagate equality that X == true - if X is icmp/fcmp (with eq operation), and one of the operands is a constant, we can replace the variable with that constant within the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 245265	2015-08-18 03:55:30 +00:00
Karthik Bhat	334796e2f8	Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks. PR24469 resulted because DeleteDeadInstruction in handleNonLocalStoreDeletion was deleting the next basic block iterator. Fixed the same by resetting the basic block iterator post call to DeleteDeadInstruction. llvm-svn: 245195	2015-08-17 05:51:39 +00:00
David Majnemer	fc4172a950	Revert "[InstCombinePHI] Partial simplification of identity operations." This reverts commit r244887, it caused PR24470. llvm-svn: 245194	2015-08-17 03:11:26 +00:00
Chandler Carruth	4d1e1851a4	[PM] Port ScalarEvolution to the new pass manager. This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings. I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses. But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never actually invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV that can uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't really called during normal runs of the pipeline as far as I can see. To make matters worse, there is actually a key that we don't update with value handles -- there is a map keyed off of Loops. Because LoopInfo does release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then lookup a Loop* from the second function but find an entry inserted for the first function! Ouch. To make matters still worse, there are plenty of updates that don't trip a value handle. 
It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in just such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug. With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best. Differential Revision: http://reviews.llvm.org/D12063 llvm-svn: 245193	2015-08-17 02:08:17 +00:00
Benjamin Kramer	3ecde27fac	[SimplifyLibCalls] Drop default template args. No functional change. llvm-svn: 245189	2015-08-16 21:16:37 +00:00
Sanjay Patel	138d4e067f	transform fmin/fmax calls when possible (PR24314) If we can ignore NaNs, fmin/fmax libcalls can become compare and select (this is what we turn std::min / std::max into). This IR should then be optimized in the backend to whatever is best for any given target. Eg, x86 can use minss/maxss instructions. This should solve PR24314: https://llvm.org/bugs/show_bug.cgi?id=24314 Differential Revision: http://reviews.llvm.org/D11866 llvm-svn: 245187	2015-08-16 20:18:19 +00:00
Sanjoy Das	7f59e40939	[LSR][NFC] Don’t duplicate entity name at the beginning of the comment. llvm-svn: 245183	2015-08-16 18:22:46 +00:00
Sanjoy Das	a96e244416	[LSR][NFC] Use camelCase for method names in Formula and RegUseTracker. llvm-svn: 245182	2015-08-16 18:22:43 +00:00
David Majnemer	0b72e6bfa5	Revert "Add support for cross block dse. This patch enables dead store elimination across basicblocks." This reverts commit r245025, it caused PR24469. llvm-svn: 245172	2015-08-16 07:11:59 +00:00
David Majnemer	d32daec92c	[InstCombine] Replace an and+icmp with a trunc+icmp Bitwise arithmetic can obscure a simple sign-test. Replacing the mask with a truncate is preferable if the type is legal because it permits us to rephrase the comparison more explicitly. llvm-svn: 245171	2015-08-16 07:09:17 +00:00
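A small sketch of the equivalence behind this kind of fold, written in Python rather than LLVM IR. The 0x80 mask, the 8-bit truncation width, and the helper names are assumptions chosen for illustration; the commit message does not spell out the exact pattern that is matched.

```python
def trunc_signed(x: int, bits: int) -> int:
    """Truncate x to `bits` bits and reinterpret the result as a signed integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def mask_test(x: int) -> bool:
    # and + icmp form: test whether bit 7 is clear.
    return (x & 0x80) == 0


def trunc_test(x: int) -> bool:
    # trunc + icmp form: the same question phrased as a sign test on the low byte.
    return trunc_signed(x, 8) >= 0
```

Both predicates agree for every input, which is why the masked form can be read as a sign test on the truncated value.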
NAKAMURA Takumi	f382e9735d	MergeFunc: Quick fix for r245140, Ignore second, aka Function*, in sorting. Don't assume second would be ordered in the module. llvm-svn: 245168	2015-08-16 02:41:23 +00:00
Yaron Keren	481afd2310	Try to appease VS 2015 warnings from http://reviews.llvm.org/D11890 ByteSize and BitSize should not be size_t but unsigned, considering 1) They are at most 2^16 and 2^19, respectively. 2) BitSize is an argument to Type::getIntNTy which takes unsigned. Also, use the correct utostr instead of itostr and cache the string result. Thanks to James Touton for reporting this! llvm-svn: 245167	2015-08-15 19:06:14 +00:00
David Majnemer	85a57db552	[IR] Give catchret an optional 'return value' operand Some personality routines require funclet exit points to be clearly marked, this is done by producing a token at the funclet pad and consuming it at the corresponding ret instruction. CleanupReturnInst already had a spot for this operand but CatchReturnInst did not. Other personality routines don't need to use this which is why it has been made optional. llvm-svn: 245149	2015-08-15 02:46:08 +00:00
JF Bastien	fe4c9948ee	Accelerate MergeFunctions with hashing This patch makes the Merge Functions pass faster by calculating and comparing a hash value which captures the essential structure of a function before performing a full function comparison. The hash is calculated by hashing the function signature, then walking the basic blocks of the function in the same order as the main comparison function. The opcode of each instruction is hashed in sequence, which means that different functions according to the existing total order cannot have the same hash, as the comparison requires the opcodes of the two functions to be the same order. The hash function is a static member of the FunctionComparator class because it is tightly coupled to the exact comparison function used. For example, functions which are equivalent modulo a single variant callsite might be merged by a more aggressive MergeFunctions, and the hash function would need to be insensitive to these differences in order to exploit this. The hashing function uses a utility class which accumulates the values into an internal state using a standard bit-mixing function. Note that this is a different interface than a regular hashing routine, because the values to be hashed are scattered amongst the properties of a llvm::Function, not linear in memory. This scheme is fast because only one word of state needs to be kept, and the mixing function is a few instructions. The main runOnModule function first computes the hash of each function, and only further processes functions which do not have a unique function hash. The hash is also used to order the sorted function set. If the hashes differ, their values are used to order the functions, otherwise the full comparison is done. Both of these are helpful in speeding up MergeFunctions. Together they result in speedups of 9% for mysqld (a mostly C application with little redundancy), 46% for libxul in Firefox, and 117% for Chromium. (These are all LTO builds.) 
In all three cases, the new speed of MergeFunctions is about half that of the module verifier, making it relatively inexpensive even for large LTO builds with hundreds of thousands of functions. The same functions are merged, so this change is free performance. Author: jrkoenig Reviewers: nlewycky, dschuff, jfb Subscribers: llvm-commits, aemerson Differential revision: http://reviews.llvm.org/D11923 llvm-svn: 245140	2015-08-15 01:18:18 +00:00
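The hash-then-compare scheme described above can be sketched roughly as follows. This is an illustration in Python, not the pass's code: the dictionary representation of a function, the `mix` constant, and the names `function_hash`, `equivalent`, and `merge_groups` are all hypothetical.

```python
import zlib
from collections import defaultdict

MASK64 = (1 << 64) - 1


def mix(state: int, value: int) -> int:
    """Fold one value into a single 64-bit word of state (illustrative mixer)."""
    state = ((state ^ value) * 0x9E3779B97F4A7C15) & MASK64
    return state ^ (state >> 32)


def function_hash(fn: dict) -> int:
    """Hash the signature first, then every opcode in block order."""
    state = mix(0, len(fn["params"]))
    for ty in fn["params"]:
        state = mix(state, zlib.crc32(ty.encode()))
    for block in fn["blocks"]:
        for opcode in block:
            state = mix(state, zlib.crc32(opcode.encode()))
    return state


def equivalent(a: dict, b: dict) -> bool:
    """Stand-in for the expensive full function comparison."""
    return a["params"] == b["params"] and a["blocks"] == b["blocks"]


def merge_groups(functions: dict) -> list:
    """Return groups of mergeable function names, comparing only within a hash bucket."""
    buckets = defaultdict(list)
    for name, fn in functions.items():
        buckets[function_hash(fn)].append(name)

    groups = []
    for names in buckets.values():
        if len(names) < 2:
            continue  # a unique hash cannot match any other function
        remaining = list(names)
        while remaining:
            head, rest = remaining[0], remaining[1:]
            same = [n for n in rest if equivalent(functions[head], functions[n])]
            if same:
                groups.append([head] + same)
            remaining = [n for n in rest if n not in same]
    return groups
```

Functions whose hash is unique never reach `equivalent`, which is where the saving comes from; a hash collision only costs an extra full comparison and cannot cause a wrong merge.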
Matt Arsenault	e214cd5c49	LoopStrengthReduce: Try to pass address space to isLegalAddressingMode This seems to only work some of the time. In some situations, this seems to use a nonsensical type and isn't actually aware of the memory being accessed. e.g. if branch condition is an icmp of a pointer, it checks the addressing mode of i1. llvm-svn: 245137	2015-08-15 00:53:06 +00:00
Nick Lewycky	465237af32	Fix a crash where a utility function wasn't aware of fcmp vectors and created a value with the wrong type. Fixes PR24458! llvm-svn: 245119	2015-08-14 22:46:49 +00:00
Evgeniy Stepanov	3606284dd4	[msan] Fix handling of musttail calls. MSan instrumentation for return values of musttail calls is not allowed by the IR constraints, and not needed at the same time. llvm-svn: 245106	2015-08-14 22:03:50 +00:00
Justin Bogner	40c0301b05	[sancov] Fix an unused variable warning introduced in r245067 llvm-svn: 245072	2015-08-14 17:03:45 +00:00
Reid Kleckner	575463d985	[sancov] Leave llvm.localescape in the entry block Summary: Similar to the change we applied to ASan. The same test case works. Reviewers: samsonov Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11961 llvm-svn: 245067	2015-08-14 16:45:42 +00:00
James Molloy	025f427f26	Separate out BDCE's analysis into a separate DemandedBits analysis. This allows other areas of the compiler to use BDCE's bit-tracking. NFCI. llvm-svn: 245039	2015-08-14 11:09:09 +00:00
Adam Nemet	93efd6c370	[LVer] Remove unused Pass parameter from versionLoop, NFC llvm-svn: 245032	2015-08-14 06:30:26 +00:00
David Majnemer	10f2d9234b	[IR] Add token types This introduces the basic functionality to support "token types". The motivation stems from the need to perform operations on a Value whose provenance cannot be obscured. There are several applications for such a type but my immediate motivation stems from WinEH. Our personality routine enforces a single-entry - single-exit regime for cleanups. After several rounds of optimizations, we may be left with a terminator whose "cleanup-entry block" is not entirely clear because control flow has merged two cleanups together. We have experimented with using labels as operands inside of instructions which are not terminators to indicate where we came from but found that LLVM does not expect such exotic uses of BasicBlocks. Instead, we can use this new type to clearly associate the "entry point" and "exit point" of our cleanup. This is done by having the cleanuppad yield a Token and consuming it at the cleanupret. The token type makes it impossible to obscure or otherwise hide the Value, making it trivial to track the relationship between the two points. What is the burden to the optimizer? Well, it turns out we have already paid down this cost by accepting that there are certain calls that we are not permitted to duplicate, optimizations have to watch out for such instructions anyway. There are additional places in the optimizer that we will probably have to update but early examination has given me the impression that this will not be heroic. Differential Revision: http://reviews.llvm.org/D11861 llvm-svn: 245029	2015-08-14 05:09:07 +00:00
Karthik Bhat	b20d3957b6	Add support for cross block dse. This patch enables dead store elimination across basicblocks. Example: define void @test_02(i32 %N) { %1 = alloca i32 store i32 %N, i32* %1 store i32 10, i32* @x %2 = load i32, i32* %1 %3 = icmp ne i32 %2, 0 br i1 %3, label %4, label %5 ; <label>:4 store i32 5, i32* @x br label %7 ; <label>:5 %6 = load i32, i32* @x store i32 %6, i32* @y br label %7 ; <label>:7 store i32 15, i32* @x ret void } In the above example dead store "store i32 5, i32* @x" is now eliminated. Differential Revision: http://reviews.llvm.org/D11143 llvm-svn: 245025	2015-08-14 04:17:23 +00:00
Chandler Carruth	df256e0563	[PM/AA] Run clang-format over the ObjCARC Alias Analysis code to normalize its formatting before I make more substantial changes. llvm-svn: 245024	2015-08-14 03:57:00 +00:00
Chandler Carruth	3ab3bc1349	[PM/AA] Don't bother forward declaring Function and Value, just include their headers. llvm-svn: 245023	2015-08-14 03:55:36 +00:00
Chandler Carruth	5effbccc9f	[PM/AA] Extract the interface for GlobalsModRef into a header along with its creation function. This required shifting a bunch of method definitions to be out-of-line so that we could leave most of the implementation guts in the .cpp file. llvm-svn: 245021	2015-08-14 03:48:20 +00:00
Chandler Carruth	ac11f6dc12	[PM/AA] Hoist the interface to TBAA into a dedicated header along with its creation function. Update the relevant includes accordingly. llvm-svn: 245019	2015-08-14 03:33:48 +00:00
Chandler Carruth	0e1aede735	[PM/AA] Hoist ScopedNoAliasAA's interface into a header and move the creation function there. Same basic refactoring as the other alias analyses. Nothing special required this time around. llvm-svn: 245012	2015-08-14 02:55:50 +00:00
Chandler Carruth	394485dd25	[PM/AA] Extract a minimal interface for CFLAA to its own header file. I've used forward declarations and reordered the source code some to make this reasonably clean and keep as much of the code as possible in the source file, including all the stratified set details. Just the basic AA interface and the create function are in the header file, and the header file is now included into the relevant locations. llvm-svn: 245009	2015-08-14 02:42:20 +00:00
Jingyue Wu	200cc080ad	[SeparateConstOffsetFromGEP] sext(a)+sext(b) => sext(a+b) when a+b can't sign-overflow. Summary: This patch implements my promised optimization to reunite certain sexts from operands after we extract the constant offset. See the header comment of reuniteExts for its motivation. One key building block that enables this optimization is Bjarke's poison value analysis (D11212). That helps to prove "a +nsw b" can't overflow. Reviewers: broune Subscribers: jholewinski, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12016 llvm-svn: 245003	2015-08-14 02:02:05 +00:00
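The identity in the title can be checked with a short fixed-width model. The following is a sketch in Python, with hypothetical helper names and an assumed 8-bit to 16-bit extension; it only illustrates why the no-signed-overflow condition is needed, not how the pass proves it.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sext(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend a from_bits-wide value to to_bits, returned as an unsigned bit pattern."""
    return to_signed(value, from_bits) & ((1 << to_bits) - 1)


def add(a: int, b: int, bits: int) -> int:
    """Wrapping addition at the given bit width."""
    return (a + b) & ((1 << bits) - 1)


def add_overflows_signed(a: int, b: int, bits: int) -> bool:
    """True when a + b does not fit in a signed `bits`-wide integer."""
    total = to_signed(a, bits) + to_signed(b, bits)
    return not (-(1 << (bits - 1)) <= total < (1 << (bits - 1)))
```

For example, with a = 0x7f and b = 1 the narrow add wraps to 0x80, so `sext(a+b)` is 0xff80 while `sext(a)+sext(b)` is 0x0080; the rewrite is only valid when the narrow add cannot sign-overflow.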
Chandler Carruth	c62f147131	[LIR] Re-instate r244880, reverted in r244884, factoring the handling of AliasAnalysis in LoopIdiomRecognize. The previous commit to LIR, r244879, exposed some scary bug in the loop pass pipeline with an assert failure that showed up on several bots. This patch got reverted as part of getting that revision reverted, but they're actually independent and unrelated. This patch has no functional change and should be completely safe. It is also useful for my current work on the AA infrastructure. llvm-svn: 244993	2015-08-14 00:21:10 +00:00
Sanjay Patel	2663ebf6a5	don't repeat function names in comments; NFC llvm-svn: 244977	2015-08-13 22:53:20 +00:00
Davide Italiano	3672dcbb24	[SimplifyLibCalls] Correctly set the is_zero_undef flag for llvm.cttz If <src> is non-zero we can safely set the flag to true, and this results in less code generated for, e.g. ffs(x) + 1 on FreeBSD. Thanks to majnemer for suggesting the fix and reviewing. Code generated before the patch was applied: 0: 0f bc c7 bsf %edi,%eax 3: b9 20 00 00 00 mov $0x20,%ecx 8: 0f 45 c8 cmovne %eax,%ecx b: 83 c1 02 add $0x2,%ecx e: b8 01 00 00 00 mov $0x1,%eax 13: 85 ff test %edi,%edi 15: 0f 45 c1 cmovne %ecx,%eax 18: c3 retq Code generated after the patch was applied: 0: 0f bc cf bsf %edi,%ecx 3: 83 c1 02 add $0x2,%ecx 6: 85 ff test %edi,%edi 8: b8 01 00 00 00 mov $0x1,%eax d: 0f 45 c1 cmovne %ecx,%eax 10: c3 retq It seems we can still use cmove and save another 'test' instruction, but that can be tackled separately. Differential Revision: http://reviews.llvm.org/D11989 llvm-svn: 244947	2015-08-13 20:34:26 +00:00
Jingyue Wu	7505d46dee	[SeparateConstOffsetFromGEP] strengthen the inbounds attribute We used to be over-conservative about preserving inbounds. Actually, the second GEP (which applies the constant offset) can inherit the inbounds attribute of the original GEP, because the resultant pointer is equivalent to that of the original GEP. For example, x = GEP inbounds a, i+5 => y = GEP a, i // inbounds removed x = GEP inbounds y, 5 // inbounds preserved llvm-svn: 244937	2015-08-13 18:48:49 +00:00
Erik Eckstein	724d66d0e1	[DeadStoreElimination] remove a redundant store even if the load is in a different block. DeadStoreElimination does eliminate a store if it stores a value which was loaded from the same memory location. So far this worked only if the store is in the same block as the load. Now we can also handle stores which are in a different block than the load. Example: define i32 @test(i1, i32*) { entry: %l2 = load i32, i32* %1, align 4 br i1 %0, label %bb1, label %bb2 bb1: br label %bb3 bb2: ; This store is redundant store i32 %l2, i32* %1, align 4 br label %bb3 bb3: ret i32 0 } Differential Revision: http://reviews.llvm.org/D11854 llvm-svn: 244901	2015-08-13 15:36:11 +00:00
Charlie Turner	ed8f5c45cb	[InstCombinePHI] Partial simplification of identity operations. Consider this code: BB: %i = phi i32 [ 0, %if.then ], [ %c, %if.else ] %add = add nsw i32 %i, %b ... In this common case the add can be moved to the %if.else basic block, because adding zero is an identity operation. If we go through the %if.then branch it's always a win, because add is not executed; if not, the number of instructions stays the same. This pattern applies also to other instructions like sub, shl, shr, ashr | 0, mul, sdiv, div | 1. Patch by Jakub Kuderski! llvm-svn: 244887	2015-08-13 12:38:58 +00:00
Renato Golin	8df6537f35	Revert "[LIR] Start leveraging the fundamental guarantees of a loop..." This reverts commit r244879, as it broke the test-suite on SingleSource/Regression/C/2004-03-15-IndirectGoto in AArch64. llvm-svn: 244885	2015-08-13 11:25:38 +00:00
Renato Golin	f6dc2a5b25	Revert "[LIR] Handle access to AliasAnalysis the same way as the other analysis in LoopIdiomRecognize." This reverts commit r244880, as it broke the test-suite on SingleSource/Regression/C/2004-03-15-IndirectGoto in AArch64. llvm-svn: 244884	2015-08-13 11:25:35 +00:00
Ashutosh Nema	8b2971be66	Test Commit. llvm-svn: 244883	2015-08-13 11:18:35 +00:00
Chandler Carruth	6716a0f08d	[LIR] Handle access to AliasAnalysis the same way as the other analysis in LoopIdiomRecognize. This is what started me staring at this code. Now migrating it with the new AA stuff will be trivial. llvm-svn: 244880	2015-08-13 10:00:53 +00:00
Chandler Carruth	6dc56a2bf7	[LIR] Start leveraging the fundamental guarantees of a loop in simplified form to remove redundant checks and simplify the code for popcount recognition. We don't actually need to handle all of these cases. I've left a FIXME for one in particular until I finish inspecting to make sure we don't actually rely on the predicate in any way. llvm-svn: 244879	2015-08-13 09:56:20 +00:00
Chandler Carruth	29ed2c0465	[LIR] Handle the LoopInfo the same as all the other analyses. No utility really in breaking pattern just for this analysis. llvm-svn: 244878	2015-08-13 09:27:01 +00:00
Simon Pilgrim	6b29a29fa2	[InstCombine] SSE/AVX vector shifts demanded shift amount bits Most SSE/AVX (non-constant) vector shift instructions only use the lower 64-bits of the 128-bit shift amount vector operand, this patch calls SimplifyDemandedVectorElts to optimize for this. I had to refactor some of my recent InstCombiner work on the vector shifts to avoid quite a bit of duplicate code, it means that SimplifyX86immshift now (re)decodes the type of shift. Differential Revision: http://reviews.llvm.org/D11938 llvm-svn: 244872	2015-08-13 07:39:03 +00:00
Chen Li	cb288bc6fb	[LoopUnswitch] Check OptimizeForSize before traversing over all basic blocks in current loop Summary: This patch moves the check of OptimizeForSize before traversing over all basic blocks in current loop. If OptimizeForSize is set to true, no non-trivial unswitch is ever allowed. Therefore, the early exit will help reduce compilation time. This patch should be NFC. Reviewers: reames, weimingz, broune Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11997 llvm-svn: 244868	2015-08-13 05:24:29 +00:00
Chandler Carruth	adb2df90fa	[LIR] Make the LoopIdiomRecognize pass get analyses essentially the same way as every other pass. This simplifies the code quite a bit and is also more idiomatic! <ba-dum!> llvm-svn: 244853	2015-08-13 01:03:26 +00:00
Chandler Carruth	3d8b7669f2	[LIR] Remove the dedicated class for popcount recognition and sink the code into methods on LoopIdiomRecognize. This simplifies the code somewhat and also makes it much easier to move the analyses around. Ultimately, the separate class wasn't providing significant value over methods -- it contained the precondition basic block and the current loop. The current loop is already available and the precondition block wasn't needed everywhere and is easy to pass around. In several cases I just moved things to be static functions because they already accepted most of their inputs as arguments. This doesn't fix the way we manage analyses yet, that will be the next patch, but it already makes the code over 50 lines shorter. No functionality changed. llvm-svn: 244851	2015-08-13 00:44:29 +00:00
Chandler Carruth	1c383505c5	[LIR] Move all the helpers to be private and re-order the methods in a way that groups things logically. No functionality changed. llvm-svn: 244845	2015-08-13 00:10:03 +00:00
Chandler Carruth	737d5e3ec5	[LIR] Remove the 'LIRUtils' abstraction which was unnecessary and adding complexity. There is only one function that was called from multiple locations, and that was 'getBranch' which has a reasonable one-line spelling already: dyn_cast<BranchInst>(BB->getTerminator). We could make this shorter, but it doesn't seem to add much value. Instead, we should avoid calling it so many times on the same basic blocks, but that will be in a subsequent patch. The other functions are only called in one location, so inline them there, and take advantage of this to use direct early exit and reduce indentation. This makes it much more clear what is being tested for, and in fact makes it clear now to me that there are simpler ways to do this work. However, this patch just does the mechanical inlining. I'll clean up the functionality of the code to leverage loop simplified form more effectively in a follow-up. Despite lots of early line breaks due to early-exit, this is still shorter than it was before. llvm-svn: 244841	2015-08-12 23:55:56 +00:00
Chandler Carruth	a59073732a	[LIR] Run clang-format over LoopIdiomRecognize in preparation for a significant code cleanup here. The handling of analyses in this pass is overly complex and can be simplified significantly, but the right way to do that is to simplify all of the code not just the analyses, and that'll require pretty extensive edits that would be noisy with formatting changes mixed into them. llvm-svn: 244828	2015-08-12 23:06:37 +00:00
Philip Reames	b755be339a	[RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints To be clear: this is an optimization not a correctness change. CodeGenPrep likes to duplicate icmps feeding branch instructions to take advantage of x86's ability to fuse many comparison/branch patterns into a single micro-op and to reduce the need for materializing i1s into general registers. PlaceSafepoints likes to place safepoint polls right at the end of basic blocks (immediately before terminators) when inserting entry and backedge safepoints. These two heuristics interact in a somewhat unfortunate way where the branch terminating the original block will be controlled by a condition driven by unrelocated pointers. This forces the register allocator to keep both the relocated and unrelocated values of the pointers feeding the icmp alive over the safepoint poll. One simple fix would have been to just adjust PlaceSafepoints to move one back in the basic block, but you can reach similar cases as a result of LICM or other hoisting passes. As a result, doing a post insertion fixup seems to be more robust. I considered doing this in CodeGenPrep itself, but having to update the live sets of already rewritten safepoints gets complicated fast. In particular, you can't just use def/use information because by moving the icmp, we're extending the live range of its inputs potentially. Instead, this patch teaches RewriteStatepointsForGC to make the required adjustments before making the relocations explicit in the IR. This change really highlights the fact that RSForGC is a CodeGenPrep-like pass which is performing target specific lowering. In the long run, we may even want to combine the two though this would require a lot more smarts to be integrated into RSForGC first. We currently rely on being able to run a set of cleanup passes post rewriting because the IR RSForGC generates is pretty damn ugly. 
Differential Revision: http://reviews.llvm.org/D11819 llvm-svn: 244821	2015-08-12 22:11:45 +00:00
Philip Reames	15b1aa2963	[RewriteStatepointsForGC] Handle extractelement fully in the base pointer algorithm When rewriting the IR such that base pointers are available for every live pointer, we potentially need to duplicate instructions to propagate the base. The original code had only handled PHI and Select under the belief those were the only instructions which would need duplicated. When I added support for vector instructions, I'd added a collection of hacks for ExtractElement which caught most of the common cases. Of course, I then found the one test case my hacks couldn't cover. :) This change removes all of the early hacks for extract element. By defining extractelement as a BDV (rather than trying to look through it), we can extend the rewriting algorithm to duplicate the extract as needed. Note that a couple of peephole optimizations were left in for the moment, because while we now handle extractelement as a first class citizen, we're not yet handling insertelement. That change will follow in the near future. llvm-svn: 244808	2015-08-12 21:00:20 +00:00
Sanjay Patel	4294d5f8bd	fix typo; NFC llvm-svn: 244805	2015-08-12 20:36:18 +00:00
Chandler Carruth	da8f360fe7	[PM/AA] Add missing static dependency edges from DSE and memdep to TLI. I forgot to add these in r244780 and r244778. Sorry about that. Also order the static dependencies in a lexicographical order. llvm-svn: 244787	2015-08-12 18:10:45 +00:00
Chandler Carruth	2e28175329	[PM/AA] Explicitly depend on TLI rather than getting it out of the AliasAnalysis. Same as the other commits, the TLI access from an alias analysis is going away and isn't very clean -- it is better to explicitly mark the dependencies. llvm-svn: 244785	2015-08-12 18:06:08 +00:00
Chandler Carruth	19b400baf8	[PM/AA] Stop getting the TargetLibraryInfo out of the AliasAnalysis and just depend on it directly. This was particularly frustrating because there was a really wide mixture of using a member variable and re-extracting it from the AA that happened to be around. I think the result is much more clear. I've also deleted all of the pointless null checks and used references across the APIs where I could to make it explicit that this cannot be null in a useful fashion. llvm-svn: 244780	2015-08-12 18:01:44 +00:00
Adam Nemet	df626a149f	[LoopVer] Optionally allow using memchecks from LAA r243382 changed the behavior to always require a set of memchecks to be passed to LoopVer. This change restores the prior behavior as an alternative to the new behavior. This allows the checks to be implicitly taken from the LAA object. Patch by Ashutosh Nema! llvm-svn: 244763	2015-08-12 16:51:19 +00:00
Simon Pilgrim	9c9b4332ca	unused variable warning fix. llvm-svn: 244725	2015-08-12 08:23:36 +00:00
Simon Pilgrim	45d6ddee89	[InstCombine] Move SSE/AVX vector blend folding to instcombiner As discussed in D11886, this patch moves the SSE/AVX vector blend folding to instcombiner from PerformINTRINSIC_WO_CHAINCombine (which allows us to remove this completely). InstCombiner already had partial support for this, I just had to add support for zero (ConstantAggregateZero) masks and also the case where both selection inputs were the same (allowing us to ignore the mask). I also moved all the relevant combine tests into InstCombine/blend_x86.ll Differential Revision: http://reviews.llvm.org/D11934 llvm-svn: 244723	2015-08-12 08:08:56 +00:00
Sanjoy Das	2078fb4d89	Fix PR24354. `InstCombiner::OptimizeOverflowCheck` was asserting an invariant (operands to binary operations are ordered by decreasing complexity) that wasn't really an invariant. Fix this by instead having `InstCombiner::OptimizeOverflowCheck` establish the invariant if it does not hold. llvm-svn: 244676	2015-08-11 21:33:55 +00:00
Sanjay Patel	33647178af	don't repeat function names in comments; NFC llvm-svn: 244672	2015-08-11 21:24:04 +00:00
Sanjay Patel	0510654efe	fix 80-cols; NFC llvm-svn: 244668	2015-08-11 21:11:56 +00:00
Chen Li	9199f20dfe	[LowerSwitch] Skip dead blocks for processSwitchInst() Summary: This patch adds a check for dead blocks and skips them in processSwitchInst(). This will help reduce compilation time. Reviewers: reames, hans Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11953 llvm-svn: 244656	2015-08-11 20:16:17 +00:00
Chen Li	13db1247c9	[LowerSwitch] Fix a bug when LowerSwitch deletes the default block Summary: LowerSwitch crashed with the attached test case after deleting the default block. This happened because the current implementation of deleting dead blocks is wrong. After the default block is deleted, it contains no instruction or terminator, and it should not be traversed anymore. However, since the iterator is advanced before the processSwitchInst() function is executed, the block advanced to could be deleted inside processSwitchInst(). The deleted block would then be visited next and crash dyn_cast<SwitchInst>(Cur->getTerminator()) because Cur->getTerminator() returns a nullptr. This patch fixes this problem by recording dead default blocks into a list, and deleting them after all processSwitchInst() calls have been done. It is still possible to visit dead default blocks and waste time processing them. But it is a compile time issue, and I plan to have another patch to add support to skip dead blocks. Reviewers: kariddi, resistor, hans, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11852 llvm-svn: 244642	2015-08-11 18:12:26 +00:00
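The fix described here, recording dead blocks during the walk and deleting them afterwards, can be sketched as follows. This is a Python illustration under assumptions, not the pass's code: `process_all`, the dict of blocks, and the `process_switch` callback are hypothetical. The skip of already-dead blocks corresponds to the follow-up change in r244656.

```python
def process_all(blocks: dict, process_switch) -> set:
    """Visit every block, deferring deletion of blocks that become dead.

    `process_switch(name, blocks)` is a hypothetical callback that returns the
    set of block names it made dead while lowering the switch in `name`.
    """
    dead = set()
    for name in list(blocks):      # iterate over a snapshot of the block names
        if name in dead:
            continue               # skip blocks already recorded as dead
        dead |= process_switch(name, blocks)
    for name in dead:              # delete only after every block was visited
        del blocks[name]
    return dead
```

Because nothing is removed while the loop is running, the traversal never lands on a block that has already been emptied.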
Teresa Johnson	a8fc4e7498	Enable EliminateAvailableExternally pass in the LTO pipeline. Summary: For LTO we need to enable this pass in the LTO pipeline, as it is skipped during the "-flto -c" compile step (when PrepareForLTO is set). Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11919 llvm-svn: 244622	2015-08-11 16:26:41 +00:00
Sanjay Patel	39f0213518	Variable names should start with an upper case letter; NFC llvm-svn: 244618	2015-08-11 16:05:43 +00:00
Sanjay Patel	feea9289bf	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244617	2015-08-11 15:56:31 +00:00
Sanjay Patel	2bbc3b14af	fix code that was accidentally commented out in previous commit llvm-svn: 244610	2015-08-11 15:08:29 +00:00
Sanjay Patel	67e99fd161	fix typos in comments; NFC llvm-svn: 244609	2015-08-11 15:04:51 +00:00
Sanjay Patel	7aac75b6df	fix typo in comment; NFC llvm-svn: 244607	2015-08-11 14:45:08 +00:00
James Molloy	ecd6525b24	Add support for floating-point minnum and maxnum The select pattern recognition in ValueTracking (as used by InstCombine and SelectionDAGBuilder) only knew about integer patterns. This teaches it about minimum and maximum operations. matchSelectPattern() has been extended to return a struct containing the existing Flavor and a new enum defining the pattern's behavior when given one NaN operand. C minnum() is defined to return the non-NaN operand in this case, but the idiomatic C "a < b ? a : b" would return the NaN operand. ARM and AArch64 at least have different instructions for these different cases. llvm-svn: 244580	2015-08-11 09:12:57 +00:00
David Majnemer	fd52148995	[WinEHPrepare] Add rudimentary support for the new EH instructions This adds somewhat basic preparation functionality including: - Formation of funclets via coloring basic blocks. - Cloning of polychromatic blocks to ensure that funclets have unique program counters. - Demotion of values used between different funclets. - Some amount of cleanup once we have removed predecessors from basic blocks. - Verification that we are left with a CFG that makes some amount of sense. N.B. Arguments and numbering still need to be done. Differential Revision: http://reviews.llvm.org/D11750 llvm-svn: 244558	2015-08-11 01:15:26 +00:00
Tyler Nowicki	12178daefd	Print vectorization analysis when loop hint is specified. This patch and a related clang patch solve the problem of having to explicitly enable analysis when specifying a loop hint pragma to get the diagnostics. Passing AlwaysPrint as the pass name (see below) causes the front-end to print the diagnostic if the user has specified '-Rpass-analysis' without an '=<target-pass>'. Users of loop hints can pass that compiler option without having to specify the pass and they will get diagnostics for only those loops with loop hints. llvm-svn: 244555	2015-08-11 01:09:15 +00:00
Tyler Nowicki	1bb9d68857	Moved LoopVectorizeHints and related functions before LoopVectorizationLegality and LoopVectorizationCostModel. llvm-svn: 244552	2015-08-11 00:52:54 +00:00
Tyler Nowicki	b772c3e968	Simplify processLoop() by moving loop hint verification into Hints::allowVectorization(). llvm-svn: 244550	2015-08-11 00:35:44 +00:00
Kostya Serebryany	c88d3123b8	[libFuzzer] don't crash if the condition in a switch has unusual type (e.g. i72) llvm-svn: 244544	2015-08-11 00:24:39 +00:00
Adam Nemet	3f19b8eced	[LAA] Change name from addRuntimeCheck to addRuntimeChecks, NFC This was requested by Hal in D11205. llvm-svn: 244540	2015-08-11 00:09:37 +00:00
Adam Nemet	97b2779195	[LoopVer] Remove unused pointer partition argument, NFC. llvm-svn: 244527	2015-08-10 23:05:31 +00:00
Tyler Nowicki	3f1d874bb9	Extend late diagnostics to include late test for runtime pointer checks. This patch moves checking the threshold of runtime pointer checks to the vectorization requirements (late diagnostics) and emits a diagnostic that informs the user the loop would be vectorized if not for exceeding the pointer-check threshold. Clang will also append the options that can be used to allow vectorization. llvm-svn: 244523	2015-08-10 23:01:55 +00:00
Simon Pilgrim	65266a8e22	[InstCombine] Move SSE2/AVX2 arithmetic vector shift folding to instcombiner As discussed in D11760, this patch moves the (V)PSRA(WD) arithmetic shift-by-constant folding to InstCombine to match the logical shift implementations. Differential Revision: http://reviews.llvm.org/D11886 llvm-svn: 244495	2015-08-10 20:21:15 +00:00
Tyler Nowicki	6edbef9016	Late evaluation of the fast-math vectorization requirement. This patch moves the verification of fast-math to just before vectorization is done. This way we can tell clang to append the command line options that would allow floating-point commutativity. Specifically those are enabling fast-math or specifying a loop hint. llvm-svn: 244489	2015-08-10 19:51:46 +00:00
Tyler Nowicki	455075e570	Modify diagnostic messages to clearly indicate why interleaving wasn't done. Sometimes interleaving is not beneficial, as determined by the cost-model and sometimes it is disabled by a loop hint (by the user). This patch modifies the diagnostic messages to make it clear why interleaving wasn't done. llvm-svn: 244485	2015-08-10 19:14:16 +00:00
Igor Laevsky	dc6b4a78a4	[IndVarSimplify] Make cost estimation in RewriteLoopExitValues smarter Differential Revision: http://reviews.llvm.org/D11687 llvm-svn: 244474	2015-08-10 18:23:58 +00:00
Mark Heffernan	ba9e336c90	Add new llvm.loop.unroll.enable metadata. This change adds the unroll metadata "llvm.loop.unroll.enable" which directs the optimizer to unroll a loop fully if the trip count is known at compile time, and unroll partially if the trip count is not known at compile time. This differs from "llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not known at compile time. The "llvm.loop.unroll.enable" is intended to be added for loops annotated with "#pragma unroll". llvm-svn: 244466	2015-08-10 17:28:08 +00:00
Silviu Baranga	fccd898d4b	[TTI] Add a hook for specifying per-target defaults for Interleaved Accesses Summary: This adds a hook to TTI which enables us to selectively turn on by default interleaved access vectorization for targets on which we have performed the required benchmarking. Reviewers: rengolin Subscribers: rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D11901 llvm-svn: 244449	2015-08-10 14:50:54 +00:00
Fraser Cormack	c174b92fa2	Prevent the scalarizer from caching incorrect entries The scalarizer can cache incorrect entries when walking up a chain of insertelement instructions. This occurs when it encounters more than one instruction that it is not actively searching for, as it unconditionally caches every element it finds. The fix is to only cache the first element that it isn't searching for so we don't overwrite correct entries. Reviewers: hfinkel Differential Revision: http://reviews.llvm.org/D11559 llvm-svn: 244448	2015-08-10 14:48:47 +00:00
Benjamin Kramer	69a3fdb314	Fix some comment typos. llvm-svn: 244402	2015-08-08 18:27:36 +00:00
David Majnemer	eabf812a59	[InstCombine] Don't try to sink EH pad instructions Found by inspection, this change should not affect the existing landingpad behavior. llvm-svn: 244391	2015-08-08 03:51:49 +00:00
Matt Arsenault	a7a9e929b5	Remove unnecessary includes llvm-svn: 244382	2015-08-08 00:41:53 +00:00
Adam Nemet	36760f098d	[LAA] Make the set of runtime checks part of the state of LAA, NFC This is the full set of checks that clients can further filter. IOW, it's client-agnostic. This makes LAA complete in the sense that it now provides the two main results of its analysis precomputed: 1. memory dependences via getDepChecker().getInsterestingDependences() 2. run-time checks via getRuntimePointerCheck().getChecks() However, as a consequence we now compute this information pro-actively. Thus if the client decides to skip the loop based on the dependences we've computed the checks unnecessarily. In order to see whether this was a significant overhead I checked compile time on SPEC2k6 LTO bitcode files. The change was in the noise. The checks are generated in canCheckPtrAtRT, at the same place where we used to call groupChecks to merge checks. llvm-svn: 244368	2015-08-07 22:44:15 +00:00
Chen Li	ef70e358a3	[ConstantFoldTerminator] Preserve make.implicit metadata when converting SwitchInst to BranchInst Summary: llvm::ConstantFoldTerminator function can convert SwitchInst with single case (and default) to a conditional BranchInst. This patch adds support to preserve make.implicit metadata on this conversion. Reviewers: sanjoy, weimingz, chenli Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D11841 llvm-svn: 244348	2015-08-07 19:30:12 +00:00
Simon Pilgrim	c31bcce147	[InstCombine] Fix SSE2/AVX2 vector logical shift by constant This patch fixes the sse2/avx2 vector shift by constant instcombine call to correctly deal with the fact that the shift amount is formed from the entire lower 64 bits and not just the lowest element as it currently assumes. e.g. %1 = tail call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %v, <4 x i32> <i32 15, i32 15, i32 15, i32 15>) In this case, (V)PSRLD doesn't perform a lshr by 15 but in fact attempts to shift by 64424509455 ((15 << 32) | 15) - giving a zero result. In addition, this review also recognizes shift-by-zero from a ConstantAggregateZero type (PR23821). Differential Revision: http://reviews.llvm.org/D11760 llvm-svn: 244341	2015-08-07 18:22:50 +00:00
Duncan P. N. Exon Smith	5607ee4237	ValueMapper: Resolve uniquing cycles more aggressively As a follow-up to r244181, resolve uniquing cycles underneath distinct nodes on the fly. This prevents uniquing cycles in early operands from affecting later operands. It also removes an iteration through distinct nodes' operands. No real functional change here, just more prompt resolution of temporary nodes. llvm-svn: 244302	2015-08-07 00:44:55 +00:00
Duncan P. N. Exon Smith	c55e88eaa6	ValueMapper: Pull out helper to resolve cycles, NFC Pull out a helper for resolving uniquing cycles of `Metadata` to remove the boiler-plate of downcasting to `MDNode`. llvm-svn: 244301	2015-08-07 00:39:26 +00:00
David Majnemer	16f420a9d2	Revert accidentally committed WinEHPrepare changes This reverts commit r244272, r244273, r244274, and r244275. llvm-svn: 244278	2015-08-06 21:13:51 +00:00
David Majnemer	16b27ceef8	Handle PHI nodes prefacing EH pads too llvm-svn: 244274	2015-08-06 21:08:32 +00:00
Sanjoy Das	46a633bcfd	[IndVars] Improved logging under DEBUG(); NFC. Before this, we'd print the modified comparison in the "Simplified comparison" case. That looked misleading. llvm-svn: 244264	2015-08-06 20:43:28 +00:00
Pete Cooper	598f1f2fd1	Convert a bunch of loops to foreach. NFC. After r244074, we now have a successors() method to iterate over all the successors of a TerminatorInst. This commit changes a bunch of eligible loops to use it. llvm-svn: 244260	2015-08-06 20:22:46 +00:00
Nico Rieck	4acab3dcc9	Rename inst_range() to instructions() for consistency. NFC llvm-svn: 244248	2015-08-06 19:10:45 +00:00
Quentin Colombet	4323bdacbd	[Reassociation] Fix miscompile for va_arg arguments. isUnmovableInstruction() had a list of instructions hardcoded which are considered unmovable. The list lacked (at least) an entry for the va_arg and cmpxchg instructions. Fix this by introducing a new Instruction::mayBeMemoryDependent() instead of maintaining another instruction list. Patch by Matthias Braun <matze@braunis.de>. Differential Revision: http://reviews.llvm.org/D11577 rdar://problem/22118647 llvm-svn: 244244	2015-08-06 18:44:34 +00:00
Chandler Carruth	a0655c50ee	[PM/AA] Hoist the interface for BasicAA into a header file. This is the first mechanical step in preparation for making this and all the other alias analysis passes available to the new pass manager. I'm factoring out all the totally boring changes I can so I'm moving code around here with no other changes. I've even minimized the formatting churn. I'll reformat and freshen comments on the interface now that it's located in the right place so that the substantive changes don't trigger this. llvm-svn: 244197	2015-08-06 07:33:15 +00:00
Chandler Carruth	595690977b	[PM/AA] Simplify the AliasAnalysis interface by removing a wrapper around a DataLayout interface in favor of directly querying DataLayout. This wrapper specifically helped handle the case where there is no DataLayout, but LLVM now requires it, simplifying all of this. I've updated callers to directly query DataLayout. This in turn exposed a bunch of places where we should have DataLayout readily available but don't, which I've fixed. This then in turn exposed that we were passing DataLayout around in a bunch of arguments rather than making it readily available, so I've also fixed that. No functionality changed. llvm-svn: 244189	2015-08-06 02:05:46 +00:00
Duncan P. N. Exon Smith	ee028e8d75	ValueMapper: Rotate distinct node remapping algorithm Rotate the algorithm for remapping distinct nodes in order to simplify how uniquing cycles get resolved. This removes some of the recursion, and, most importantly, exposes all uniquing cycles at the top-level. Besides being a little more efficient -- temporary MDNodes won't live as long -- the clearer logic should help protect against bugs like those fixed in r243961 and r243976. What are uniquing cycles? Why do they present challenges when remapping metadata? !0 = !{!1} !1 = !{!0} !0 and !1 form a simple uniquing cycle. When remapping from one metadata graph to another, every uniquing cycle gets "duplicated" through a dance: !0-temp = !{!1?} ; map(!0): clone !0, VM[!0] = !0-temp !1-temp = !{!0?} ; ..map(!1): clone !1, VM[!1] = !1-temp !1-temp = !{!0-temp} ; ..map(!1): remap !1's operands !2 = !{!0-temp} ; ..map(!1): uniquify: !1-temp => !2 !0-temp = !{!2} ; map(!0): remap !0's operands !3 = !{!2} ; map(!0): uniquify: !0-temp => !3 ; Result !2 = !{!3} !3 = !{!2} (In the two "uniquify" steps above, the operands of !X-temp are compared to the operands of !X. If they're the same, then !X-temp gets RAUW'ed to !X; if they're different, then !X-temp is promoted to a new unique node. The latter case always hits in for uniquing cycles, so we duplicate all the nodes involved.) Why is this a problem? Uniquable Metadata nodes that have temporary node as transitive operands keep RAUW support until the temporary nodes get finalized. With non-cycles, this happens automatically: when a uniquable node's count of unresolved operands drops to zero, it immediately sheds its own RAUW support (possibly triggering the same in any node that references it). However, uniquing cycles create a reference cycle, and uniqued nodes that transitively reference a uniquing cycle are "stuck" in an unresolved state until someone calls `MDNode::resolveCycles()` on a node in the unresolved subgraph. 
Distinct nodes should help here (and mostly do): since they aren't uniqued anywhere, they are guaranteed not to be RAUW'ed. They effectively form a barrier between uniqued nodes, breaking some uniquing cycles, and shielding uniqued nodes from uniquing cycles. Unfortunately, with this barrier in place, the unresolved subgraph(s) can be disjoint from the top-level node. The mapping algorithm needs to find at least one representative from each disjoint subgraph. But which nodes are stuck, and which will get resolved automatically? And which nodes are in the unresolved subgraph? The old logic was conservative. This commit rotates the logic for distinct nodes, so that we have access to unresolved nodes at the top-level call to `llvm::MapMetadata()`. Each time we return to the top-level, we know that all temporaries have been RAUW'ed away. Here, it's safe (and necessary) to call `resolveCycles()` immediately on unresolved operands. This should also perform better than the old algorithm. The recursion stack is shorter, temporary nodes don't live as long, and there are fewer tracking references to unresolved nodes. As the debug info graph introduces more 'distinct' nodes, remapping should incrementally get cheaper and cheaper. Aside from possible performance improvements (and reduced cruft in the `LLVMContext`), there should be no functionality change here. llvm-svn: 244181	2015-08-05 23:52:42 +00:00
Duncan P. N. Exon Smith	41ec46939f	ValueMapper: Simplify remap() helper function, NFC Rename `remap()` to `remapOperands()`, and restrict its contract to remapping operands. Previously, it also called `mapToMetadata()`, but this logic is hard to reason about externally. In particular, this refactors `mapUniquedNode()` to avoid redundant mapping calls, taking advantage of the RAUWs that are already in place. llvm-svn: 244168	2015-08-05 23:22:34 +00:00
Chen Li	349a850e5b	[LoopUnswitch] Preserve make.implicit metadata for unswitched conditions Summary: This patch adds support to preserve make.implicit metadata for unswitched conditions in loop pre-header. Reviewers: sanjoy, weimingz Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D11769 llvm-svn: 244132	2015-08-05 21:13:26 +00:00
David Blaikie	4b09d11d27	-Wdeprecated cleanup: Make CallGraph movable by default by using unique_ptr members rather than raw pointers. The only place that tries to return a CallGraph by value (CallGraphAnalysis::run) doesn't seem to be used right now, but it's a reasonable bit of cleanup anyway. llvm-svn: 244122	2015-08-05 20:55:50 +00:00
Chandler Carruth	53c34baf45	[Unroll] Switch to using 'int' cost types in preparation for a somewhat more involved change to the cost computation pattern. llvm-svn: 244095	2015-08-05 18:46:21 +00:00
Simon Pilgrim	ca06990ca7	Fixed line endings. llvm-svn: 244021	2015-08-05 08:18:00 +00:00
Tanya Lattner	a72d000c61	Rename all references to old mailing lists to new lists.llvm.org address. llvm-svn: 243999	2015-08-05 03:51:17 +00:00
Sanjay Patel	83e1c48540	wrap OptSize and MinSize attributes for easier and consistent access (NFCI) Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994	2015-08-04 15:49:57 +00:00
Duncan P. N. Exon Smith	bd33d09021	Fix 80-column llvm-svn: 243977	2015-08-04 13:24:26 +00:00
Duncan P. N. Exon Smith	ad4314c88a	Linker: Fix ASan failure from r243961 r243883 and r243961 made a use-after-free far more likely: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/6041/steps/check-llvm%20asan/logs/stdio Unresolved nodes get inserted into the `Cycles` array. If they later get resolved through RAUW, we need to update the reference. It's interesting that this never hit before (maybe an asan-ified clang bootstrap with `-flto -g` would have hit it, but I admit I haven't tried anything quite that crazy). llvm-svn: 243976	2015-08-04 13:23:30 +00:00
David Majnemer	cce4d2aeb3	Drive-by fixes for LandingPad -> EHPad This change was done as an audit and is by inspection. The new EH system is still very much a work in progress. NFC for the landingpad case. llvm-svn: 243965	2015-08-04 08:21:40 +00:00
Simon Pilgrim	828e3c79fa	[InstCombine] Moved SSE vector shift constant folding into its own helper function. NFCI. This will make some upcoming bugfixes + improvements easier to manage. llvm-svn: 243962	2015-08-04 07:49:58 +00:00
Duncan P. N. Exon Smith	cdcaadaa39	Linker: Fix references to uniqued nodes after r243883 r243883 started moving 'distinct' nodes instead of duplicating them in lib/Linker. This had the side-effect of sometimes not cloning uniqued nodes that reference them. I missed a corner case: !named = !{!0} !0 = !{!1} !1 = distinct !{!0} !0 is the entry point for "remapping", and a temporary clone (say, !0-temp) is created and mapped in case we need to model a uniquing cycle. Recursive descent into !1. !1 is distinct, so we leave it alone, but update its operand to !0-temp. Pop back out to !0. Its only operand, !1, hasn't changed, so we don't need to use !0-temp. !0-temp goes out of scope, and we're finished remapping, but we're left with: !named = !{!0} !0 = !{!1} !1 = distinct !{null} ; uh oh... Previously, if !0 and !0-temp ended up with identical operands, then !0-temp couldn't have been referenced at all. Now that distinct nodes don't get duplicated, that assumption is invalid. We need to !0-temp->replaceAllUsesWith(!0) before freeing !0-temp. I found this while running an internal `-flto -g` bootstrap. Strangely, there was no case of this in the open source bootstrap I'd done before commit... llvm-svn: 243961	2015-08-04 06:42:31 +00:00
Sanjoy Das	c6c98a4732	Revert "[LSR] Generate and use zero extends" This reverts commit r243348 and r243357. They caused PR24347. llvm-svn: 243939	2015-08-04 01:52:05 +00:00
Adam Nemet	d8634fc7f7	[LoopVer] Remove unused needsRuntimeChecks(), NFC The previous commits moved this functionality into the client. Also remove the now unused member variable. llvm-svn: 243920	2015-08-03 23:32:57 +00:00
Chandler Carruth	c7194d9e4d	[Unroll] Improve the brute force loop unroll estimate by propagating through PHI nodes across iterations. This patch teaches the new advanced loop unrolling heuristics to propagate constants into the loop from the preheader and around the backedge after simulating each iteration. This lets us brute force solve simple recurrences that aren't modeled effectively by SCEV. It also makes it more clear why we need to process the loop in-order rather than bottom-up which might otherwise make much more sense (for example, for DCE). This came out of an attempt I'm making to develop a principled way to account for dead code in the unroll estimation. When I implemented a forward-propagating version of that it produced incorrect results due to failing to propagate cost between loop iterations through the PHI nodes, and it occurred to me we really should at least propagate simplifications across those edges, and it is quite easy thanks to the loop being in canonical and LCSSA form. Differential Revision: http://reviews.llvm.org/D11706 llvm-svn: 243900	2015-08-03 20:32:27 +00:00
Duncan P. N. Exon Smith	a6c2e1e60b	Linker: Move distinct MDNodes instead of cloning Instead of cloning distinct `MDNode`s when linking in a module, just move them over. The module linker destroys the source module, so the old node would otherwise just be leaked on the context. Create the new node in place. This also reduces the number of cloned uniqued nodes (since it's less likely their operands have changed). This mapping strategy is only correct when we're discarding the source, so the linker turns it on via a ValueMapper flag, `RF_MoveDistinctMDs`. There's nothing observable in terms of `llvm-link` output here: the linked module should be semantically identical. I'll be adding more 'distinct' nodes to the debug info metadata graph in order to break uniquing cycles, so the benefits of this will partly come in future commits. However, we should get some gains immediately, since we have a fair number of 'distinct' `DILocation`s being linked in. llvm-svn: 243883	2015-08-03 17:09:38 +00:00
Duncan P. N. Exon Smith	e9c24965a6	ValueMapper: Only check for cycles if operands change This is a minor optimization to only check for unresolved operands inside `mapDistinctNode()` if the operands have actually changed. This shouldn't really cause any change in behaviour. I didn't actually see a slowdown in a profile, I was just poking around nearby and saw the opportunity. llvm-svn: 243866	2015-08-03 03:45:32 +00:00
Duncan P. N. Exon Smith	9ee5b983b5	ValueMapper: Use a range-based for, NFC llvm-svn: 243865	2015-08-03 03:27:12 +00:00
Duncan P. N. Exon Smith	3caddc3495	ValueMapper: Reuse local variable, NFC llvm-svn: 243864	2015-08-03 03:24:28 +00:00
Craig Topper	bbb2ce25cc	De-constify pointers to Type since they can't be modified. NFC This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842	2015-08-01 22:20:21 +00:00
David Majnemer	34ee3789f3	New EH representation for MSVC compatibility This introduces new instructions necessary to implement MSVC-compatible exception handling support. Most of the middle-end and none of the back-end haven't been audited or updated to take them into account. Differential Revision: http://reviews.llvm.org/D11097 llvm-svn: 243766	2015-07-31 17:58:14 +00:00
Kostya Serebryany	71a4e8ccbf	[libFuzzer] trace switch statements and apply mutations based on the expected case values llvm-svn: 243726	2015-07-31 01:33:06 +00:00
Adhemerval Zanella	ebb2e238a4	Enable dfsan for aarch64 This patch enables DFSan memory transformation for aarch64 (39-bit VMA). llvm-svn: 243684	2015-07-30 20:49:35 +00:00
Wei Mi	9dad2f2ad5	[SLP vectorizer]: Choose the best consecutive candidate to pair with a store instruction. The patch changes the SLPVectorizer::vectorizeStores to choose the immediate succeeding or preceding candidate for a store instruction when it has multiple consecutive candidates. In this way it has better chance to find more slp vectorization opportunities. Differential Revision: http://reviews.llvm.org/D10445 llvm-svn: 243666	2015-07-30 17:40:39 +00:00
Adam Nemet	cb086cfc65	[LoopVer] Add missing std::move The reason I was passing this vector by value in the constructor was so that I wouldn't have to copy when initializing the corresponding member, but then I forgot the std::move. The use-case is LoopDistribution which filters the checks then std::moves it to LoopVersioning's constructor. With this interface we can avoid any copies. llvm-svn: 243616	2015-07-30 04:21:13 +00:00
Adam Nemet	d8a3442dd6	[LDist] Filter the checks locally rather than in LAA, NFC Before, we were passing the pointer partitions to LAA. Now, we get all the checks from LAA and filter out the checks within partitions in LoopDistribution. This effectively concludes the steps to move filtering memchecks from LAA into its clients. There is still some cleanup left to remove the unused interfaces in LAA that still take PtrPartition. (Moving this functionality to LoopDistribution also requires needsChecking on pointers to be made public.) llvm-svn: 243613	2015-07-30 03:29:16 +00:00
Nick Lewycky	224087c041	Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the other files that have the same typo. All comments, no functionality change! (Merely a "fuctionality" change.) Bonus change to remove emacs major mode marker from SystemZMachineFunctionInfo.cpp because emacs already knows it's C++ from the extension. Also fix typo "appeary" in AMDGPUMCAsmInfo.h. llvm-svn: 243585	2015-07-29 22:32:47 +00:00
Alexey Samsonov	d5852a39f2	[ASan] Disable dynamic alloca and UAR detection in presence of returns_twice calls. Summary: returns_twice (most importantly, setjmp) functions are optimization-hostile: if local variable is promoted to register, and is changed between setjmp() and longjmp() calls, this update will be undone. This is the reason why "man setjmp" advises to mark all these locals as "volatile". This can not be enough for ASan, though: when it replaces static alloca with dynamic one, optionally called if UAR mode is enabled, it adds a whole lot of SSA values, and computations of local variable addresses, that can involve virtual registers, and cause unexpected behavior, when these registers are restored from buffer saved in setjmp. To fix this, just disable dynamic alloca and UAR tricks whenever we see a returns_twice call in the function. Reviewers: rnk Subscribers: llvm-commits, kcc Differential Revision: http://reviews.llvm.org/D11495 llvm-svn: 243561	2015-07-29 19:36:08 +00:00
Evgeniy Stepanov	af5c29d437	[asan] Remove special case mapping on Android/AArch64. ASan shadow on Android starts at address 0 for both historic and performance reasons. This is possible because the platform mandates -pie, which makes lower memory region always available. This is not such a good idea on 64-bit platforms because of MAP_32BIT incompatibility. This patch changes Android/AArch64 mapping to be the same as that of Linux/AArch64. llvm-svn: 243548	2015-07-29 18:22:25 +00:00
Peter Collingbourne	fa2563134a	LowerBitSets: Add debugging output. Differential Revision: http://reviews.llvm.org/D11583 llvm-svn: 243546	2015-07-29 18:12:36 +00:00
Michael Zolotukhin	cd83973ecc	[Unroll] Handle SwitchInst properly. Previously successor selection was simply wrong. llvm-svn: 243545	2015-07-29 18:10:33 +00:00
Michael Zolotukhin	b9c3487dc2	[Unroll] Don't crash when simplified branch condition is undef. llvm-svn: 243544	2015-07-29 18:10:29 +00:00
Sanjoy Das	04b4f7e9a4	[Statepoints] Let patchable statepoints have a symbolic call target. Summary: As added initially, statepoints required their call targets to be a constant pointer null if ``numPatchBytes`` was non-zero. This turns out to be a problem ergonomically, since there is no way to mark patchable statepoints as calling a (readable) symbolic value. This change removes the restriction of requiring ``null`` call targets for patchable statepoints, and changes PlaceSafepoints to maintain the symbolic call target through its transformation. Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11550 llvm-svn: 243502	2015-07-28 23:50:30 +00:00
Michael Zolotukhin	6f69b18d73	[Unroll] Add debug dumps to loop-unroll analyzer. llvm-svn: 243471	2015-07-28 20:07:29 +00:00
Michael Zolotukhin	88a9ab7f5a	[Unroll] Don't analyze blocks outside the loop. llvm-svn: 243466	2015-07-28 19:21:21 +00:00
Sanjay Patel	c11822e0e8	fix formatting; NFC llvm-svn: 243424	2015-07-28 15:38:43 +00:00
Adam Nemet	cdf7537068	[LDist][LVer] Explicitly pass the set of memchecks to LoopVersioning, NFC Before the patch, the checks were generated internally in addRuntimeCheck. Now, we use the new overloaded version of addRuntimeCheck that takes the ready-made set of checks as a parameter. The checks are now generated by the client (LoopDistribution) with the new RuntimePointerChecking::generateChecks API. Also the new printChecks API is used to print out the checks for debugging. This is to continue the transition over to the new model whereby clients will get the full set of checks from LAA, filter it and then pass it to LoopVersioning and in turn to addRuntimeCheck. llvm-svn: 243382	2015-07-28 05:01:53 +00:00
Sanjoy Das	35e5c86626	[LSR] Generate and use zero extends Summary: If a scale or a base register can be rewritten as "Zext({A,+,1})" then LSR will now consider a formula of that form in its normal cost computation. Depends on D9180 Reviewers: qcolombet, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9181 llvm-svn: 243348	2015-07-27 23:27:51 +00:00
Sanjoy Das	4c063981c7	[IndVars] Make loop varying predicates loop invariant. Summary: Was D9784: "Remove loop variant range check when induction variable is strictly increasing" This change re-implements D9784 with the two differences: 1. It does not use SCEVExpander and does not generate new instructions. Instead, it does a quick local search for existing `llvm::Value`s that it needs when modifying the `icmp` instruction. 2. It is more general -- it deals with both increasing and decreasing induction variables. I've added all of the tests included with D9784, and two more. As an example on what this change does (copied from D9784): Given C code: ``` for (int i = M; i < N; i++) // i is known not to overflow if (i < 0) break; a[i] = 0; } ``` This transformation produces: ``` for (int i = M; i < N; i++) if (M < 0) break; a[i] = 0; } ``` Which can be unswitched into: ``` if (!(M < 0)) for (int i = M; i < N; i++) a[i] = 0; } ``` I went back and forth on whether the top level logic should live in `SimplifyIndvar::eliminateIVComparison` or be put into its own routine. Right now I've put it under `eliminateIVComparison` because even though the `icmp` is not eliminated, it no longer is an IV comparison. I'm open to putting it in its own helper routine if you think that is better. Reviewers: reames, nicholas, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11278 llvm-svn: 243331	2015-07-27 21:42:49 +00:00
Simon Pilgrim	eddfa36b82	Fixed signed/unsigned comparison warning. llvm-svn: 243306	2015-07-27 19:07:15 +00:00
Simon Pilgrim	e713c640a4	[InstCombine][X86][SSE] Replace sign/zero extension intrinsics with native IR Now that we are generating sane codegen for vector sext/zext nodes on SSE targets, this patch uses instcombine to replace the SSE41/AVX2 pmovsx and pmovzx intrinsics with the equivalent native IR code. Differential Revision: http://reviews.llvm.org/D11503 llvm-svn: 243303	2015-07-27 18:52:15 +00:00
Pete Cooper	1c911fc71e	Revert "Remove unnecessary null check. NFC." This reverts commit r243167. Duncan pointed out that dyn_cast can return null in these cases, so this was an unsafe commit to make. Sorry for the noise. Worryingly there were no tests which fail... llvm-svn: 243302	2015-07-27 18:37:58 +00:00
Jingyue Wu	91cf96359e	Roll forward r243250 r243250 appeared to break clang/test/Analysis/dead-store.c on one of the build slaves, but I couldn't reproduce this failure locally. Probably a false positive as I saw this test was broken by r243246 or r243247 too but passed later without people fixing anything. llvm-svn: 243253	2015-07-26 19:10:03 +00:00
Jingyue Wu	61ee29a54f	Revert r243250 breaks tests llvm-svn: 243251	2015-07-26 18:30:13 +00:00
Jingyue Wu	f4362fe267	[TTI/CostModel] improve TTI::getGEPCost and use it in CostModel::getInstructionCost Summary: This patch updates TargetTransformInfoImplCRTPBase::getGEPCost to consider addressing modes. It now returns TCC_Free when the GEP can be completely folded to an addressing mode. I started this patch as I refactored SLSR. Function isGEPFoldable looks common and is indeed used by some WIP of mine. So I extracted that logic to getGEPCost. Furthermore, I noticed getGEPCost wasn't directly tested anywhere. The best testing bed seems CostModel, but its getInstructionCost method invokes getAddressComputationCost for GEPs which provides very coarse estimation. So this patch also makes getInstructionCost call the updated getGEPCost for GEPs. This change inevitably breaks some tests because the cost model changes, but nothing looks seriously wrong -- if we believe the new cost model is the right way to go, these tests should be updated. This patch is not perfect yet -- the comments in some tests need to be updated. I want to know whether this is a right approach before fixing those details. Reviewers: chandlerc, hfinkel Subscribers: aschwaighofer, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D9819 llvm-svn: 243250	2015-07-26 17:28:13 +00:00
Simon Pilgrim	80ca3df4ed	[InstCombine][SSE4A] Standardized references to Length/Width and Index/Start to match AMD docs. NFCI. llvm-svn: 243226	2015-07-25 20:41:00 +00:00
Chen Li	9a4c684e0c	[LoopUnswitch] Improve loop unswitch pass to find trivial unswitch conditions more effectively Summary: This patch improves trivial loop unswitch. The current trivial loop unswitch only checks if loop header's terminator contains a trivial unswitch condition. But if the loop header only has one reachable successor (due to intentionally or unintentionally missed code simplification), we should consider the successor as part of the loop header. Therefore, instead of stopping at loop header's terminator, we should keep traversing its successors within the loop until we reach a real conditional branch or switch (whose condition can not be constant folded). This change will enable a single -loop-unswitch pass to unswitch multiple trivial conditions (unswitching one trivial condition could open an opportunity to unswitch another one in the same loop), while the old implementation can unswitch only one per pass. Reviewers: reames, broune Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11481 llvm-svn: 243203	2015-07-25 03:21:06 +00:00
Lawrence Hu	a4603977bc	Handle loop with negative induction variable increment This patch extends the LoopReroll pass to handle loops similar to the following: while (len > 1) { sum4 += buf[len]; sum4 += buf[len-1]; len -= 2; } llvm-svn: 243171	2015-07-24 22:01:49 +00:00
Pete Cooper	e35eeac710	Remove unnecessary null check. NFC. Since both places which set this variable do so with dyn_cast, and not dyn_cast_or_null, it's impossible to get a nullptr here, so we can remove the check. llvm-svn: 243167	2015-07-24 21:38:01 +00:00
Pete Cooper	a8e3702859	Use make_range(rbegin(), rend()) to allow foreach loops. NFC. Instead of the pattern for (auto I = x.rbegin(), E = x.rend(); I != E; ++I) we can use make_range to construct the reverse range and iterate using that instead. llvm-svn: 243163	2015-07-24 21:13:43 +00:00
Diego Novillo	0a1bb40d4c	Remove unused variable. NFC. llvm-svn: 243145	2015-07-24 19:18:32 +00:00
Jingyue Wu	344082ead8	Remove the user-count threshold when analyzing read attributes Summary: This threshold limited FunctionAttrs' ability to prove arguments to be read-only. In NVPTX, a specialized instruction ld.global.nc can be used to load memory with non-coherent texture cache. We notice that in SHOC [1] benchmark, some function arguments are not marked with readonly because FunctionAttrs reaches a hardcoded threshold when analyzing uses. Removing this threshold won't cause significant regression in compilation time, because the worst-case time complexity of the algorithm is still O(# of instructions) for each parameter. Patched by Xuetian Weng. [1] https://github.com/vetter/shoc Reviewers: nlewycky, jingyue, nicholas Subscribers: nicholas, test, llvm-commits Differential Revision: http://reviews.llvm.org/D11311 llvm-svn: 243141	2015-07-24 19:05:53 +00:00
Philip Reames	48be953065	[RewriteStatepointsForGC] Adjust naming scheme to be more stable The names for instructions inserted were previously dependent on iteration order. By deriving the names from the original instructions, we can avoid instability in tests without resorting to ordered traversals. It also makes the IR mildly easier to read at large scale. llvm-svn: 243140	2015-07-24 19:01:39 +00:00
Pete Cooper	31257c8c3c	Use foreach loops for StructType::elements(). NFC. We had a few places where we did for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) { but those could instead do for (auto *EltTy : STy->elements()) { llvm-svn: 243136	2015-07-24 18:55:49 +00:00
Michael Zolotukhin	5aaea47e2b	Handle resolvable branches in complete loop unroll heuristic. Summary: Resolving a branch allows us to ignore blocks that won't be executed, and thus make our estimate more accurate. This patch is intended to be applied after D10205 (though it could be applied independently). Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10206 llvm-svn: 243084	2015-07-24 01:53:04 +00:00
Philip Reames	d193d66cf2	[RewriteStatepointsForGC] Fix release build warning llvm-svn: 243076	2015-07-24 00:42:55 +00:00
Philip Reames	0d86fad371	[RewriteStatepointsForGC] Use a worklist algorithm for first part of base pointer algorithm [NFC] The new code should hopefully be equivalent to the old code; it just uses a worklist to track instructions which need to be visited rather than iterating over all instructions visited each time. This should be faster, but the primary benefit is that the purpose should be more clear and the diff of adding another instruction type (forthcoming) much more obvious. Differential Revision: http://reviews.llvm.org/D11480 llvm-svn: 243071	2015-07-24 00:02:11 +00:00
Jingyue Wu	95e704b4b8	[NaryReassociate] remove redundant code This check is already done by findClosestMatchingDominator. llvm-svn: 243065	2015-07-23 23:13:37 +00:00
Philip Reames	2ce9369a79	[RewriteStatepointsForGC] Rename PhiState to reflect that it's associated w/more than just PHIs Today, Select instructions also have associated PhiStates. In the near future, so will ExtractElement and ShuffleVector. llvm-svn: 243056	2015-07-23 22:49:14 +00:00
Philip Reames	da3c027d6c	[RewriteStatepointsForGC] Use idiomatic mechanisms for debug tracing [NFC] Delete much of the code using trace-rewrite-statepoints and use idiomatic DEBUG statements instead. This includes adding operator<< to a helper class. llvm-svn: 243054	2015-07-23 22:25:26 +00:00
Philip Reames	78e7af318e	[RewriteStatepointsForGC] Simplify code around meet of PhiStates [NFC] We don't need to pass in the map from BDV to PhiStates; we can instead handle that externally and let the MeetPhiStates helper class just meet PhiStates. llvm-svn: 243045	2015-07-23 21:41:27 +00:00
Matt Wala	145c25bada	[Scalarizer] Fix potential for stale data in Scattered across invocations Summary: Scalarizer has two data structures that hold information about changes to the function, Gathered and Scattered. These are cleared in finish() at the end of runOnFunction() if finish() detects any changes to the function. However, finish() was checking for changes by only checking if Gathered was non-empty. The function visitStore() only modifies Scattered without touching Gathered. As a result, Scattered could have ended up having stale data if Scalarizer only scalarized store instructions. Since the data in Scattered is used during the execution of the pass, this introduced dangling pointer errors. The fix is to check whether both Scattered and Gathered are empty before deciding what to do in finish(). This also fixes a problem where the Function can be modified although the pass returns false. Reviewers: rnk Subscribers: rnk, srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D10459 llvm-svn: 243040	2015-07-23 20:53:46 +00:00
Kuba Brecka	eb91722cb9	[asan] Rename the ABI versioning symbol to '__asan_version_mismatch_check' instead of abusing '__asan_init' We currently version `__asan_init` and when the ABI version doesn't match, the linker gives a `undefined reference to '__asan_init_v5'` message. From this, it might not be obvious that it's actually a version mismatch error. This patch makes the error message much clearer by changing the name of the undefined symbol to be `__asan_version_mismatch_check_xxx` (followed by the version string). We obviously don't want the initializer to be named like that, so it's a separate symbol that is used only for the purpose of version checking. Reviewed at http://reviews.llvm.org/D11004 llvm-svn: 243003	2015-07-23 10:54:06 +00:00
Chandler Carruth	efdadcc65a	[GMR] Add a late run of GlobalsModRef to the main pass pipeline behind the general GMR-in-non-LTO flag. Without this, we have the global information during the CGSCC pipeline for GVN and such, but don't have it available during the late loop optimizations such as the vectorizer. Moreover, after the CGSCC pipeline has finished we have substantially more accurate and refined call graph information, function annotations, etc, which will make GMR even more powerful than it is early in the pipeline. Note that we have to play silly games with preserving AliasAnalysis (which is now trivially preserved) in order to let a module analysis magically be preserved into the entire function pass pipeline. Simultaneously we have to not make GMR an immutable pass in order to be able to re-run it and collect fresh data on the final call graph. llvm-svn: 242999	2015-07-23 09:34:01 +00:00
Chandler Carruth	2e896f4f08	[PM/AA] Extract the ModRef enums from the AliasAnalysis class in preparation for de-coupling the AA implementations. In order to do this, they had to become fake-scoped using the traditional LLVM pattern of a leading initialism. These can't be actual scoped enumerations because they're bitfields and thus inherently we use them as integers. I've also renamed the behavior enums that are specific to reasoning about the mod/ref behavior of functions when called. This makes it more clear that they have a very narrow domain of applicability. I think there is a significantly cleaner API for all of this, but I don't want to try to do really substantive changes for now, I just want to refactor the things away from analysis groups so I'm preserving the exact original design and just cleaning up the names, style, and lifting out of the class. Differential Revision: http://reviews.llvm.org/D10564 llvm-svn: 242963	2015-07-22 23:15:57 +00:00
Anthony Pesch	769ee8846e	Revert "Improve merging of stores from static constructors in GlobalOpt" This reverts commit 0a9dee959a30b81b9e7df64c9a58ff9898c24024. llvm-svn: 242954	2015-07-22 22:26:54 +00:00
Anthony Pesch	acd0c70ff9	Revert "IPO: Avoid brace initialization of a map, some versions of libc++ don't like it" This reverts commit fc2dad0c68f8d32273d3c2d790ed496961f829af. llvm-svn: 242953	2015-07-22 22:26:52 +00:00
Justin Bogner	e1590d2f22	IPO: Avoid brace initialization of a map, some versions of libc++ don't like it Should fix the build failure on these darwin bots: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_build/12427/ http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/10389/ llvm-svn: 242945	2015-07-22 21:41:12 +00:00
Anthony Pesch	7bfb3a910e	Improve merging of stores from static constructors in GlobalOpt Summary: While working on a project I wound up generating a fairly large lookup table (10k entries) of callbacks inside of a static constructor. Clang was taking upwards of 10 minutes to compile the lookup table. I generated a smaller test case (http://www.inolen.com/static_initializer_test.ll) that, after running with -ftime-report, pointed fingers at GlobalOpt and MemCpyOptimizer. Running globalopt took around 9 minutes. The slowdown came from how GlobalOpt merged stores from static constructors individually into the global initializer in EvaluateStaticConstructor. For each store it discovered and wanted to commit, it would copy the existing global initializer and then merge in the individual store. I changed this so that stores are now grouped by global, and sorted from most significant to least significant by their GEP indexes (e.g. a store to GEP 0, 0 comes before GEP 0, 0, 1). With this representation, the existing initializer can be copied and all new stores merged into it in a single pass. With this patch and http://reviews.llvm.org/D11198, the lookup table that was taking ~10 minutes to compile now compiles in around 5 seconds. I've run 'make check' and the test-suite, which all passed. I'm not really sure who to tag as a reviewer, Lang mentioned that Chandler may be appropriate. Reviewers: chandlerc, nlewycky Subscribers: nlewycky, llvm-commits Differential Revision: http://reviews.llvm.org/D11200 llvm-svn: 242935	2015-07-22 21:10:45 +00:00
Hans Wennborg	34fee45808	Fix -Wextra-semi warnings. Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D11400 llvm-svn: 242930	2015-07-22 20:46:11 +00:00
Anthony Pesch	78b72fd3c8	Test commit, added blank line llvm-svn: 242923	2015-07-22 18:50:10 +00:00
Chandler Carruth	1665e31f7b	[GMR] Add a flag to enable GlobalsModRef in the normal compilation pipeline. Even before I started improving its runtime, it was already crazy fast once the call graph exists, and if we can get it to be conservatively correct, will still likely catch a lot of interesting and useful cases. So it may well be useful to enable by default. But more importantly for me, this should make it easier for me to test that changes aren't breaking it in fundamental ways by enabling it for normal builds. llvm-svn: 242895	2015-07-22 11:57:28 +00:00
Michael Kuperstein	809b1c325f	Fix mem2reg to correctly handle allocas only used in a single block Currently, a load from an alloca that is used in a single block and is not preceded by a store is replaced by undef. This is not always correct if the single block is inside a loop. Fix the logic so that: 1) If there are no stores in the block, replace the load with an undef, as before. 2) If there is a store (regardless of where it is in the block w.r.t the load), bail out, and let the rest of mem2reg handle this alloca. Patch by: gil.rapaport@intel.com Differential Revision: http://reviews.llvm.org/D11355 llvm-svn: 242884	2015-07-22 10:29:29 +00:00
Kuba Brecka	cc9246c4cd	[asan] Improve moving of non-instrumented allocas In r242510, non-instrumented allocas are now moved into the first basic block. This patch limits that to only move allocas that are present after the first instrumented one (i.e. only move allocas up). A testcase was updated to show behavior in these two cases. Without the patch, an alloca could be moved down, and could cause an invalid IR. Differential Revision: http://reviews.llvm.org/D11339 llvm-svn: 242883	2015-07-22 10:25:38 +00:00
Chandler Carruth	ebae815d81	[PM/AA] Remove all of the dead AliasAnalysis pointers being threaded through APIs that are no longer necessary now that the update API has been removed. This will make changes to the AA interfaces significantly less disruptive (I hope). Either way, it seems like a really nice cleanup. llvm-svn: 242882	2015-07-22 09:52:54 +00:00
Chandler Carruth	cdb8301de0	[PM/AA] Remove the last of the legacy update API from AliasAnalysis as part of simplifying its interface and usage in preparation for porting to work with the new pass manager. Note that this will likely expose that we have dead arguments, members, and maybe even pass requirements for AA. I'll be cleaning those up in separate patches. This just zaps the actual update API. Differential Revision: http://reviews.llvm.org/D11325 llvm-svn: 242881	2015-07-22 09:49:59 +00:00
Chandler Carruth	4237391aee	[PM/AA] Switch to an early-exit. NFC. This was split out of another change because the diff is useless. I assure you, I just switched to early-return in this function. Cleanup in preparation for my next commit, as requested in code review! llvm-svn: 242880	2015-07-22 09:44:54 +00:00
Chen Li	ca56183986	[LoopUnswitch] Code refactoring to separate trivial loop unswitch and non-trivial loop unswitch in processCurrentLoop() Summary: The current code in LoopUnswitch::processCurrentLoop() mixes trivial loop unswitch and non-trivial loop unswitch together. It goes over all basic blocks in the loop and checks if a condition is a trivial or non-trivial unswitch condition. However, a trivial unswitch condition can only occur in the loop header basic block (where it controls whether or not the loop does something at all). This refactoring separates trivial loop unswitch and non-trivial loop unswitch. Before going over all basic blocks in the loop, it checks if the loop header contains a trivial unswitch condition. If so, unswitch it. Otherwise, go over all blocks like before but don't check for trivial conditions any more since they cannot occur in the other blocks. This code has no functionality change. Reviewers: meheff, reames, broune Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11276 llvm-svn: 242873	2015-07-22 05:26:29 +00:00
Chandler Carruth	c205d0f3c2	[SROA] Fix a nasty pile of bugs to do with big-endian, different alloca types and loads, loads or stores widened past the size of an alloca, etc. This started off with a bug report about big-endian behavior with bitfields and loads and stores to a { i32, i24 } struct. An initial attempt to fix this was sent for review in D10357, but that didn't really get to the root of the problem. The core issue was that canConvertValue and convertValue in SROA were handling different bitwidth integers by doing a zext of the integer. It wouldn't do a trunc though, only a zext! This would in turn lead SROA to form an i24 load from an i24 alloca, zext it to i32, and then use it. This would at least produce the wrong value for big-endian systems. One of my many false starts here was to correct the computation for big-endian systems by shifting. But this doesn't actually work because the original code has a 64-bit store to the entire 8 bytes, and a 32-bit load of the last 4 bytes, and because the alloc size is 8 bytes, we can't lose that last (least significant if bigendian) byte! The real problem here is that we're forming an i24 load in SROA which is actually not sufficiently wide to load all of the necessary bits here. The source has an i32 load, and SROA needs to form that as well. The straightforward way to do this is to disable the zext logic in canConvertValue and convertValue, forcing us to actually load all 32-bits. This seems like a really good change, but it in turn breaks several other parts of SROA. First in the chain of knock-on failures, we had places where we were doing integer-widening promotion even though some of the integer loads or stores extended past the end of the alloca's memory! There was even a comment about preventing this, but it only prevented the case where the type had a different bit size from its store size. 
So I added checks to handle the cases where we actually have a widened load or store and to avoid trying to special integer widening promotion in those cases. Second, we actually rely on the ability to promote in the face of loads past the end of an alloca! This is important so that we can (for example) speculate loads around PHI nodes to do more promotion. The bits loaded are garbage, but as long as they aren't used and the alignment is suitably high (which it wasn't in the test case!) this is "fine". And we can't stop promoting here, lots of things stop working well if we do. So we need to add specific logic to handle the extension (and truncation) case, but only where that extension or truncation are over bytes that are outside the alloca's allocated storage and thus totally bogus to load or store. And of course, once we add back this correct handling of extension or truncation, we need to correctly handle big-endian systems to avoid re-introducing the exact bug that started us off on this chain of misery in the first place, but this time even more subtle as it only happens along speculated loads atop a PHI node. I've ported an existing test for PHI speculation to the big-endian test file and checked that we get that part correct, and I've added several more interesting big-endian test cases that should help check that we're getting this correct. Fun times. llvm-svn: 242869	2015-07-22 03:32:42 +00:00
Nick Lewycky	72bee17899	Fix a performance problem in memcpyopt by removing a linear scan over ranges when inserting a new range. No functionality change intended. Patch by Anthony Pesch! llvm-svn: 242843	2015-07-21 21:56:26 +00:00
Philip Reames	233eb1db14	[RewriteStatepointsForGC] minor style cleanup Use a named lambda for readability, common some code, remove stale comments, and use llvm style variable names. llvm-svn: 242827	2015-07-21 19:04:38 +00:00

13661 Commits