llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00

Author	SHA1	Message	Date
Adam Nemet	33fcad4825	[LoopDist] Also emit optimization remark on success (-Rpass=) The option -Rpass=loop-distribute now reports the loops that were distributed. llvm-svn: 268006	2016-04-29 07:10:46 +00:00
David Majnemer	2075b35f09	[SLPVectorizer] Add operand bundles to vectorized functions SLPVectorizing a call site should result in further propagation of its bundles. llvm-svn: 268004	2016-04-29 07:09:51 +00:00
David Majnemer	f7012da323	[LoopVectorize] Add operand bundles to vectorized functions Also, do not crash when calculating a cost model for loop-invariant token values. llvm-svn: 268003	2016-04-29 07:09:48 +00:00
Matt Arsenault	1b74f4bf74	AMDGPU: Stop reporting an addressing mode for unknown addrspace This was being treated the same as private, which has an immediate offset. For unknown, it probably means it's for a computation not actually being used for accessing memory, so it should not have a nontrivial addressing mode. llvm-svn: 268002	2016-04-29 06:25:10 +00:00
David Majnemer	e17dff606d	[ArgumentPromotion] Propagate operand bundles to promoted call sites We neglected to transfer operand bundles when performing argument promotion. This fixes PR27568. llvm-svn: 267986	2016-04-29 04:56:12 +00:00
Michael Zolotukhin	6be07401c1	[PR25281] Remove AAResultsWrapper from preserved analyses of loop vectorizer. We don't preserve AAResults, because, for one, we don't preserve SCEV-AA. That fixes PR25281. llvm-svn: 267980	2016-04-29 03:31:25 +00:00
Hal Finkel	52782fda9f	[LoopVectorize] Keep hints from original loop on the vector loop We need to keep loop hints from the original loop on the new vector loop. Failure to do this meant that, for example: void foo(int *b) { #pragma clang loop unroll(disable) for (int i = 0; i < 16; ++i) b[i] = 1; } this loop would be unrolled. Why? Because we'd vectorize it, thus dropping the hints that unrolling should be disabled, and then we'd unroll it. llvm-svn: 267970	2016-04-29 01:27:40 +00:00
Adam Nemet	b7cc6fd680	[LoopDist] Emit optimization remarks (-Rpass) I closely followed the precedents set by the vectorizer: With -Rpass-missed, the loop is reported with further details pointing to -Rpass--analysis. * -Rpass-analysis reports the details why distribution has failed. * Regardless of -Rpass*, when distribution fails for a loop where distribution was forced with the pragma, a warning is produced according to -Wpass-failed. In this case the analysis info is also printed even without -Rpass-analysis. llvm-svn: 267952	2016-04-28 23:08:32 +00:00
Hal Finkel	5a6e2ec855	[Inliner] Preserve llvm.mem.parallel_loop_access metadata When inlining a call site with llvm.mem.parallel_loop_access metadata, this metadata needs to be propagated to all cloned memory-accessing instructions. Otherwise, inlining parts of the loop body will invalidate the annotation. With this functionality, we now vectorize the following as expected: void Body(int res, int c, int d, int p, int i) { res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; } void Test(int res, int c, int d, int p, int n) { int i; #pragma clang loop vectorize(assume_safety) for (i = 0; i < 1600; i++) { Body(res, c, d, p, i); } } llvm-svn: 267949	2016-04-28 23:00:04 +00:00
Arch D. Robison	606517b553	[SLPVectorizer] Extend SLP Vectorizer to deal with aggregates. The refactoring portion part was done as r267748. http://reviews.llvm.org/D14185 llvm-svn: 267899	2016-04-28 16:11:45 +00:00
Simon Pilgrim	04933872c3	[InstCombine][SSE] Add MOVMSK support to SimplifyDemandedUseBits The MOVMSK instructions copies a vector elements' sign bits to the low bits of a scalar register and zeros the high bits. This patch adds MOVMSK support to SimplifyDemandedUseBits so that its aware that the upper bits are known to be zero. It also removes the call to MOVMSK if none of the lower bits are actually required and just returns zero. Differential Revision: http://reviews.llvm.org/D19614 llvm-svn: 267873	2016-04-28 12:22:53 +00:00
Sanjay Patel	9fec0afb7a	Update test to use FileCheck Also, add some metadata to show what that currently looks like. llvm-svn: 267827	2016-04-28 00:29:27 +00:00
Rong Xu	3a8084117b	more buildbot failure fix to r267792 __llvm_prf_nm length is embedded in llvm_used. Relax llvm_used check. llvm-svn: 267816	2016-04-27 23:23:53 +00:00
Rong Xu	b6a36f2009	[PGO] Promote indirect calls to conditional direct calls with value-profile This patch implements the transformation that promotes indirect calls to conditional direct calls when the indirect-call value profile meta-data is available. Differential Revision: http://reviews.llvm.org/D17864 llvm-svn: 267815	2016-04-27 23:20:27 +00:00
Rong Xu	db7372e42c	Fix buildbot failure due to r267792 Relax the test check as some targets do not have name compression. llvm-svn: 267803	2016-04-27 22:06:35 +00:00
Rong Xu	724ca3e5db	[PGO] Prohibit address recording if the function is both internal and COMDAT Differential Revision: http://reviews.llvm.org/D19515 llvm-svn: 267792	2016-04-27 21:17:30 +00:00
Simon Pilgrim	7ece5e4f29	[InstCombine][AVX2] Add AVX2 per-element vector shift tests At the moment we don't simplify PSRAV/PSRLV/PSLLV intrinsics to generic IR for constant shift amounts, but we could. llvm-svn: 267777	2016-04-27 20:25:34 +00:00
David Majnemer	221bf8bc4d	[CodeGenPrepare] Don't sink a cast past its user The sink cast machinery is supposed to sink casts as close to their user as possible. However, an EH pad is the first instruction in it's basic block. Don't sink if the user is an EH pad. This fixes PR27536. llvm-svn: 267767	2016-04-27 19:36:38 +00:00
Ahmed Bougacha	591dc29993	[LIR] Set attributes on memset_pattern16. "inferattrs" will deduce the attribute, but it will be too late for many optimizations. Set it ourselves when creating the call. Differential Revision: http://reviews.llvm.org/D17598 llvm-svn: 267762	2016-04-27 19:04:50 +00:00
Ahmed Bougacha	efd557cecd	[InferAttrs] Mark memset_pattern16 params nocapture. Differential Revision: http://reviews.llvm.org/D19471 llvm-svn: 267760	2016-04-27 19:04:43 +00:00
Matthew Simpson	71a9d56171	[LV] Reallow positive-stride interleaved load groups with gaps We previously disallowed interleaved load groups that may cause us to speculatively access memory out-of-bounds (r261331). We did this by ensuring each load group had an access corresponding to the first and last member. Instead of bailing out for these interleaved groups, this patch enables us to peel off the last vector iteration, ensuring that we execute at least one iteration of the scalar remainder loop. This solution was proposed in the review of the previous patch. Differential Revision: http://reviews.llvm.org/D19487 llvm-svn: 267751	2016-04-27 18:21:36 +00:00
Gerolf Hoflehner	832835b3e9	[InstCombine] Sharpended test case in pr21210.ll llvm-svn: 267742	2016-04-27 17:19:54 +00:00
Matthew Simpson	995eceaf0c	[TTI] Add hook for vector extract with extension This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725	2016-04-27 15:20:21 +00:00
Simon Pilgrim	3f41257831	[InstCombine][SSE] Regenerated vector shift tests llvm-svn: 267699	2016-04-27 12:04:44 +00:00
Simon Pilgrim	b8826cf36d	[InstCombine][SSE] Added DemandedBits tests for MOVMSK instructions MOVMSK zeros the upper bits of the gpr - we should be able to use this. llvm-svn: 267686	2016-04-27 09:53:09 +00:00
Adam Nemet	ededcfa020	[LoopDist] Add llvm.loop.distribute.enable loop metadata Summary: D19403 adds a new pragma for loop distribution. This change adds support for the corresponding metadata that the pragma is translated to by the FE. As part of this I had to rethink the flag -enable-loop-distribute. My goal was to be backward compatible with the existing behavior: A1. pass is off by default from the optimization pipeline unless -enable-loop-distribute is specified A2. pass is on when invoked directly from opt (e.g. for unit-testing) The new pragma/metadata overrides these defaults so the new behavior is: B1. A1 + enable distribution for individual loop with the pragma/metadata B2. A2 + disable distribution for individual loop with the pragma/metadata The default value whether the pass is on or off comes from the initiator of the pass. From the PassManagerBuilder the default is off, from opt it's on. I moved -enable-loop-distribute under the pass. If the flag is specified it overrides the default from above. Then the pragma/metadata can further modifies this per loop. As a side-effect, we can now also use -enable-loop-distribute=0 from opt to emulate the default from the optimization pipeline. So to be precise this is the new behavior: C1. pass is off by default from the optimization pipeline unless -enable-loop-distribute or the pragma/metadata enables it C2. pass is on when invoked directly from opt unless -enable-loop-distribute=0 or the pragma/metadata disables it Reviewers: hfinkel Subscribers: joker.eph, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D19431 llvm-svn: 267672	2016-04-27 05:28:18 +00:00
Evgeny Stupachenko	9d35c64dc2	The patch fixes PR27392. Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can potentially overflow (when BECount is max unsigned integer). While comparing BECount with (Count - 1) is overflow safe and therefore correct. Reviewer: hfinkel Differential Revision: http://reviews.llvm.org/D19256 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 267662	2016-04-27 03:04:54 +00:00
Philip Reames	0926f356af	[LVI] Reduce compile time by lazily scanning blocks if needed When encountering a non-local pointer, LVI would eagerly scan the block for dereferences of the given object to prove the pointer to be non null. That's all well and good, but then we'd go recurse through our input blocks. As a result, we could end up scanning each and every block we traverse, even if the final definition was obviously non null or we found a constant value somewhere up the chain. The previous code papered over this by using the isKnownNonNull routine from value tracking. This made the duplication less painful in the common case. Instead, we know do the block scan only after we've gotten the recursive results back. This lets us stop scanning individual blocks as soon as we've determined it to be non-null in any predecessor block and use our usual merge rules to propagate that information cheaply through successor blocks. For a pointer which can be found non-null, this does strictly less work and sometimes substaintially so. Note that the case where we can't prove something non-null is still the really expensive case. We end up scanning each and every block looking for a dereference and never end up finding one. llvm-svn: 267642	2016-04-27 00:30:55 +00:00
Justin Bogner	1caa3a1a82	PM: Port Reassociate to the new pass manager llvm-svn: 267631	2016-04-26 23:39:29 +00:00
Sanjay Patel	51a5a5649b	[SimplifyCFG] propagate branch metadata when creating select llvm-svn: 267624	2016-04-26 23:15:48 +00:00
Philip Reames	d559635ec4	[LVI] Apply transfer rule for overdefine inputs for binary operators As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well. This patch builds on 267609 which did the same thing for unary casts. llvm-svn: 267620	2016-04-26 23:10:35 +00:00
Philip Reames	368d1c8275	[LVI] A better fix for the assertion error introduced by 267609 Essentially, I was using the wrong size function. For types which were sized, but not primitive, I wasn't getting a useful size for the operand and failed an assert. I fixed this, and also added a guard that the input is a sized type. Test case is for the original mistake. I'm not sure how to actually exercise the sized type check. llvm-svn: 267618	2016-04-26 22:52:30 +00:00
Sanjay Patel	65b8c3537e	[LowerExpectIntrinsic] make default likely/unlikely ratio bigger We need the default ratio to be sufficiently large that it triggers transforms based on block frequency info (BFI) and plays well with the recently introduced BranchProbability used by CGP. Differential Revision: http://reviews.llvm.org/D19435 llvm-svn: 267615	2016-04-26 22:23:38 +00:00
Philip Reames	fadf8b424b	[LVI] Infer local facts from unary expressions As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well. This patch implements only the unary operation case. Once this is in, I'll implement the same for the binary operations. Differential Revision: http://reviews.llvm.org/D19492 llvm-svn: 267609	2016-04-26 21:48:16 +00:00
David Majnemer	67f60eaa89	Revert "[SimplifyLibCalls] sprintf doesn't copy null bytes" The destination buffer that sprintf uses is restrict qualified, we do not need to worry about derived pointers referenced via format specifiers. This reverts commit r267580. llvm-svn: 267605	2016-04-26 21:04:47 +00:00
Elena Demikhovsky	c7f80a2ab8	Masked Store in Loop Vectorizer - bugfix Fixed a bug in loop vectorization with conditional store. Differential Revision: http://reviews.llvm.org/D19532 llvm-svn: 267597	2016-04-26 20:18:04 +00:00
Justin Bogner	72de028cc8	PM: Port Internalize to the new pass manager llvm-svn: 267596	2016-04-26 20:15:52 +00:00
David Majnemer	5541fe7f24	[SimplifyLibCalls] sprintf doesn't copy null bytes sprintf doesn't read or copy the terminating null byte from it's string operands. sprintf will append it's own after processing all of the format specifiers. This fixes PR27526. llvm-svn: 267580	2016-04-26 18:16:49 +00:00
Dehao Chen	23b3e63a01	Tune basic block annotation algorithm. Summary: Instead of using maximum IR weight as the basic block weight, this patch uses the voting algorithm to find the most likely weight for the basic block. This can effectively avoid the cases when some IRs are annotated incorrectly due to code motion of the profiled binary. This patch also updates propagate.ll unittest to include discriminator in the input file so that it is testing something meaningful. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19301 llvm-svn: 267519	2016-04-26 04:59:11 +00:00
Hal Finkel	bee4e0db1c	[SimplifyCFG] Preserve !llvm.mem.parallel_loop_access when merging When SimplifyCFG merges identical instructions from both sides of a diamond, it can preserve !llvm.mem.parallel_loop_access (as it does with most of the other metadata). There's no real data or control dependency change in this case. llvm-svn: 267515	2016-04-26 02:06:06 +00:00
Hal Finkel	7956f36dac	[LoopVectorize] Don't consider conditional-load dereferenceability for marked parallel loops I really thought we were doing this already, but we were not. Given this input: void Test(int res, int c, int d, int p) { for (int i = 0; i < 16; i++) res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; } we did not vectorize the loop. Even with "assume_safety" the check that we don't if-convert conditionally-executed loads (to protect against data-dependent deferenceability) was not elided. One subtlety: As implemented, it will still prefer to use a masked-load instrinsic (given target support) over the speculated load. The choice here seems architecture specific; the best option depends on how expensive the masked load is compared to a regular load. Ideally, using the masked load still reduces unnecessary memory traffic, and so should be preferred. If we'd rather do it the other way, flipping the order of the checks is easy. The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also implies that if conversion is okay. Differential Revision: http://reviews.llvm.org/D19512 llvm-svn: 267514	2016-04-26 02:00:36 +00:00
Sanjay Patel	b5dda51232	[CodeGenPrepare] don't convert an unpredictable select into control flow Suggested in the review of D19488: http://reviews.llvm.org/D19488 llvm-svn: 267504	2016-04-26 00:47:39 +00:00
Justin Bogner	4168e588c8	PM: Port GlobalOpt to the new pass manager llvm-svn: 267499	2016-04-26 00:28:01 +00:00
Justin Bogner	504105c02c	GlobalOpt: Convert a bunch of tests from grep to FileCheck llvm-svn: 267493	2016-04-25 23:36:50 +00:00
Sanjay Patel	4f931e53db	Add check for "branch_weights" with prof metadata While we're here, fix the comment and variable names to make it clear that these are raw weights, not percentages. llvm-svn: 267491	2016-04-25 23:15:16 +00:00
Arch D. Robison	d92b3378c4	Optimize store of "bitcast" from vector to aggregate. This patch is what was the "instcombine" portion of D14185, with an additional test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll). The patch causes instcombine to replace sequences of extractelement-insertvalue-store that act essentially like a bitcast followed by a store. Differential review: http://reviews.llvm.org/D14260 llvm-svn: 267482	2016-04-25 22:22:39 +00:00
Chad Rosier	879234fa76	Fix typo from r267432. llvm-svn: 267436	2016-04-25 18:20:27 +00:00
Chad Rosier	95ab8b0612	[ValueTracking] Add an additional test case for r266767 where one operand is a const. llvm-svn: 267432	2016-04-25 17:41:48 +00:00
Chad Rosier	5a0c6f041b	[ValueTracking] Improve isImpliedCondition when the dominating cond is false. llvm-svn: 267430	2016-04-25 17:23:36 +00:00
James Molloy	4181657e21	[GlobalOpt] Allow constant globals to be SRA'd The current logic assumes that any constant global will never be SRA'd. I presume this is because normally constant globals can be pushed into their uses and deleted. However, that sometimes can't happen (which is where you really want SRA, so the elements that can be eliminated, are!). There seems to be no reason why we can't SRA constants too, so let's do it. llvm-svn: 267393	2016-04-25 10:48:29 +00:00
Simon Pilgrim	4943bd650f	[InstCombine][SSE] Reduce DIVSS/DIVSD to FDIV if only first element is required As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns. llvm-svn: 267359	2016-04-24 18:35:59 +00:00
Simon Pilgrim	1b73eef3b4	[InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 2 of 2) Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics: 1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND). 2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass through elements 3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass through elements) directly We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass through patterns). Differential Revision: http://reviews.llvm.org/D19318 llvm-svn: 267357	2016-04-24 18:23:14 +00:00
Simon Pilgrim	cd433f92ba	[InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 1 of 2) This patch improves support for determining the demanded vector elements through SSE scalar intrinsics: 1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input) 2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input Differential Revision: http://reviews.llvm.org/D17490 llvm-svn: 267356	2016-04-24 18:12:42 +00:00
Duncan P. N. Exon Smith	94512f24de	DebugInfo: Remove MDString-based type references Eliminate DITypeIdentifierMap and make DITypeRef a thin wrapper around DIType*. It is no longer legal to refer to a DICompositeType by its 'identifier:', and DIBuilder no longer retains all types with an 'identifier:' automatically. Aside from the bitcode upgrade, this is mainly removing logic to resolve an MDString-based reference to an actualy DIType. The commits leading up to this have made the implicit type map in DICompileUnit's 'retainedTypes:' field superfluous. This does not remove DITypeRef, DIScopeRef, DINodeRef, and DITypeRefArray, or stop using them in DI-related metadata. Although as of this commit they aren't serving a useful purpose, there are patchces under review to reuse them for CodeView support. The tests in LLVM were updated with deref-typerefs.sh, which is attached to the thread "[RFC] Lazy-loading of debug info metadata": http://lists.llvm.org/pipermail/llvm-dev/2016-April/098318.html llvm-svn: 267296	2016-04-23 21:08:00 +00:00
Nico Weber	6dca3ae366	Revert r267210, it makes clang assert (PR27490). llvm-svn: 267232	2016-04-22 22:08:42 +00:00
Peter Collingbourne	d04766ba20	Introduce llvm.load.relative intrinsic. This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that value and returns it. The constant folder specifically recognizes the form of this intrinsic and the constant initializers it may load from; if a loaded constant initializer is known to have the form ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``. LLVM provides that the calculation of such a constant initializer will not overflow at link time under the medium code model if ``x`` is an ``unnamed_addr`` function. However, it does not provide this guarantee for a constant initializer folded into a function body. This intrinsic can be used to avoid the possibility of overflows when loading from such a constant. Differential Revision: http://reviews.llvm.org/D18367 llvm-svn: 267223	2016-04-22 21:18:02 +00:00
Philip Reames	53e3092198	[unordered] sink unordered stores at end of blocks The existing code turned out to be completely correct when auditted. Thus, only minor code changes and adding a couple of tests. llvm-svn: 267215	2016-04-22 20:53:32 +00:00
Sanjoy Das	cbed8d8247	Fold compares for distinct allocations Summary: We can fold compares to false when two distinct allocations within a function are compared for equality. Patch by Anna Thomas! Reviewers: majnemer, reames, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19390 llvm-svn: 267214	2016-04-22 20:52:25 +00:00
Philip Reames	29f7cc186f	[unordered] Extend load/store type canonicalization to handle unordered operations Extend the type canonicalization logic to work for unordered atomic loads and stores. Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before. Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered. If you see problems, feel free to revert this change, but please make sure you collect a test case. llvm-svn: 267210	2016-04-22 20:33:48 +00:00
Justin Bogner	0bc097ad38	PM: Port SinkingPass to the new pass manager llvm-svn: 267199	2016-04-22 19:54:10 +00:00
Jun Bum Lim	b80664f84c	[DeadStoreElimination] Shorten beginning of memset overwritten by later stores Summary: This change will shorten memset if the beginning of memset is overwritten by later stores. Reviewers: hfinkel, eeckstein, dberlin, mcrosier Subscribers: mgrang, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18906 llvm-svn: 267197	2016-04-22 19:51:29 +00:00
Justin Bogner	fad7547388	PM: Port DCE to the new pass manager Also add a very basic test, since apparently there aren't any tests for DCE whatsoever to add the new pass version to. llvm-svn: 267196	2016-04-22 19:40:41 +00:00
Adam Nemet	089177d9da	[LoopVersioningLICM] Add test coverage for llvm.loop.licm_versioning.disable In the next change, I am generalizing the function findStringMetadataForLoop and I want to make sure I don't break this. Looks like there was no coverage for this so far. llvm-svn: 267182	2016-04-22 18:34:50 +00:00
Chad Rosier	44f5b9d174	[SimplifyCFG] Add final missing implications to isImpliedTrueByMatchingCmp. Summary: eq imply [u\|s]ge and [u\|s]le are true. Remove redundant logic by implementing isImpliedFalseByMatchingCmp(Pred1, Pred2) as isImpliedTrueByMatchingCmp(Pred1, getInversePredicate(Pred2)). llvm-svn: 267177	2016-04-22 17:57:34 +00:00
Chad Rosier	492b4554a0	[SimplifyCFG] Add missing implications to isImpliedTrueByMatchingCmp. Summary: [u\|s]gt and [u\|s]lt imply [u\|s]ge and [u\|s]le are true, respectively. I've simplified the existing tests and added additional tests to cover the new cases mentioned above. I've also added tests for all the cases where the first compare doesn't imply anything about the second compare. llvm-svn: 267171	2016-04-22 17:14:12 +00:00
Chad Rosier	aa97cc7ffd	[SimplifyCFG] Simplify code review by temporarily removing this test file. A followup commit will replace these tests with simplified and more inclusive tests. The diff is unreadable if this were to be done in a single commit. llvm-svn: 267170	2016-04-22 17:14:08 +00:00
David Majnemer	0c47b59618	[EarlyCSE] Don't add the overflow flags to the hash We take the intersection of overflow flags while CSE'ing. This permits us to consider two instructions with different overflow behavior to be replaceable. llvm-svn: 267153	2016-04-22 14:12:50 +00:00
Silviu Baranga	52b925e4c9	[InstCombine] Preserve fast math flags when combining PHIs Summary: When optimizing PHIs which have inputs floating point binary operators, we preserve all IR flags except the fast math flags. This change removes the logic which tracked some of the IR flags (no wrap, exact) and replaces it by doing an and on the IR flags of all inputs to the PHI - which will also handle the fast math flags. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19370 llvm-svn: 267139	2016-04-22 11:21:36 +00:00
David Majnemer	0b1305ba18	[GVN] Respect fast-math-flags on fcmps We assumed that flags were only present on binary operators. This is not true, they may also be present on calls and fcmps. llvm-svn: 267113	2016-04-22 06:37:51 +00:00
David Majnemer	56b0cb8635	[EarlyCSE] Take the intersection of flags on instructions EarlyCSE had inconsistent behavior with regards to flag'd instructions: - In some cases, it would pessimize if the available instruction had different flags by not performing CSE. - In other cases, it would miscompile if it replaced an instruction which had no flags with an instruction which has flags. Fix this by being more consistent with our flag handling by utilizing andIRFlags. llvm-svn: 267111	2016-04-22 06:37:45 +00:00
Sanjoy Das	d82c062070	Folding compares with unescaped allocations Summary: If we know that the pointer allocated within a function does not escape, we can fold away comparisons that are done with global pointers Patch by Anna Thomas! Reviewers: reames, majnemer, sanjoy Subscribers: mgrang, mcrosier, majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D19276 llvm-svn: 267035	2016-04-21 19:26:45 +00:00
Philip Reames	1c0459c510	[instcombine][unordered] Extend load(select) transform to handle unordered loads llvm-svn: 267023	2016-04-21 17:59:40 +00:00
Philip Reames	7e0ae29aca	[unordered] unordered loads from null are still unreachable llvm-svn: 267019	2016-04-21 17:45:05 +00:00
Philip Reames	e04aebaa98	[instcombine][unordered] Implement *-load forwarding for unordered atomics This builds on 266999 which made FindAvailableValue do the right thing. Tests included show the newly enabled transforms and those which disabled either due to conservatism or correctness requirements. llvm-svn: 267006	2016-04-21 17:03:33 +00:00
Philip Reames	7e4f2fa4ac	[unordered] Add tests and conservative handling in support of future changes [NFCI] This change adds a couple of test cases to make sure FindAvailableLoadedValue does the right thing. At the moment, the code added is dead, but separating it makes follow on changes far more obvious. llvm-svn: 266999	2016-04-21 16:51:08 +00:00
Sanjoy Das	785700a50c	[SimplifyCFG] Fold `llvm.guard(false)` to unreachable Summary: `llvm.guard(false)` always bails out of the current compilation unit, so we can prune any control flow following it. Reviewers: hfinkel, pcc, reames Subscribers: majnemer, reames, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19245 llvm-svn: 266955	2016-04-21 05:09:12 +00:00
Mehdi Amini	857b9f0901	ThinLTO/ModuleLinker: add a flag to not always pull-in linkonce when performing importing Summary: The function importer already decided what symbols need to be pulled in. Also these magically added ones will not be in the export list for the source module, which can confuse the internalizer for instance. Reviewers: tejohnson, rafael Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19096 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266948	2016-04-21 01:59:39 +00:00
Nick Lewycky	39244705f7	Add optimization for 'icmp slt (or A, B), A' and some related idioms based on knowledge of the sign bit for A and B. No matter what value you OR in to A, the result of (or A, B) is going to be UGE A. When A and B are positive, it's SGE too. If A is negative, OR'ing a value into it can't make it positive, but can increase its value closer to -1, therefore (or A, B) is SGE A. Working through all possible combinations produces this truth table: ``` A is +, -, +/- F F F + B is T F ? - ? F ? +/- ``` The related optimizations are flipping the 'slt' for 'sge' which always NOTs the result (if the result is known), and swapping the LHS and RHS while swapping the comparison predicate. There are more idioms left to implement (aren't there always!) but I've stopped here because any more would risk becoming unreasonable for reviewers. llvm-svn: 266939	2016-04-21 00:53:14 +00:00
Vedant Kumar	d3e6694129	[test/PGOProfile] Make tests independent of the raw profile version (NFC) Differential Revision: http://reviews.llvm.org/D19290 llvm-svn: 266928	2016-04-20 22:24:01 +00:00
Teresa Johnson	e24ec3840e	[ThinLTO] Prevent importing of "llvm.used" values Summary: This patch prevents importing from (and therefore exporting from) any module with a "llvm.used" local value. Local values need to be promoted and renamed when importing, and their presense on the llvm.used variable indicates that there are opaque uses that won't see the rename. One such example is a use in inline assembly. See also the discussion at: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098047.html As part of this, move collectUsedGlobalVariables out of Transforms/Utils and into IR/Module so that it can be used more widely. There are several other places in LLVM that used copies of this code that can be cleaned up as a follow on NFC patch. Reviewers: joker.eph Subscribers: pcc, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18986 llvm-svn: 266877	2016-04-20 14:39:45 +00:00
Mandeep Singh Grang	28ddad394b	[LLVM] Remove unwanted --check-prefix=CHECK from unit tests. NFC. Summary: Removed unwanted --check-prefix=CHECK from numerous unit tests. Reviewers: t.p.northover, dblaikie, uweigand, MatzeB, tstellarAMD, mcrosier Subscribers: mcrosier, dsanders Differential Revision: http://reviews.llvm.org/D19279 llvm-svn: 266834	2016-04-19 23:51:52 +00:00
Marcin Koscielnicki	0919e582e6	[AArch64] [ARM] Make a target-independent llvm.thread.pointer intrinsic. Both AArch64 and ARM support llvm.<arch>.thread.pointer intrinsics that just return the thread pointer. I have a pending patch that does the same for SystemZ (D19054), and there are many more targets that could benefit from one. This patch merges the ARM and AArch64 intrinsics into a single target independent one that will also be used by subsequent targets. Differential Revision: http://reviews.llvm.org/D19098 llvm-svn: 266818	2016-04-19 20:51:05 +00:00
Chad Rosier	6c28047766	[ValueTracking] Improve isImpliedCondition for conditions with matching operands. This patch improves SimplifyCFG to catch cases like: if (a < b) { if (a > b) <- known to be false unreachable; } Phabricator Revision: http://reviews.llvm.org/D18905 llvm-svn: 266767	2016-04-19 17:19:14 +00:00
Simon Pilgrim	2006c6eadb	[InstCombine][X86] Added extra tests introduced for D17490 llvm-svn: 266732	2016-04-19 12:59:52 +00:00
Simon Pilgrim	501f5ba3b6	[InstCombine][X86] Regenerate SSE combine tests as part of setup for D17490 Regenerated with utils/update_test_checks.py llvm-svn: 266731	2016-04-19 12:56:46 +00:00
Tim Northover	5f6de253c5	ARM: use a pseudo-instruction for cmpxchg at -O0. The fast register-allocator cannot cope with inter-block dependencies without spilling. This is fine for ldrex/strex loops coming from atomicrmw instructions where any value produced within a block is dead by the end, but not for cmpxchg. So we lower a cmpxchg at -O0 via a pseudo-inst that gets expanded after regalloc. Fortunately this is at -O0 so we don't have to care about performance. This simplifies the various axes of expansion considerably: we assume a strong seq_cst operation and ensure ordering via the always-present DMB instructions rather than v8 acquire/release instructions. Should fix the 32-bit part of PR25526. llvm-svn: 266679	2016-04-18 21:48:55 +00:00
Chad Rosier	067146c0fb	[ValueTracking] Correct lit test comments. NFC. llvm-svn: 266657	2016-04-18 19:11:45 +00:00
Eric Liu	4b68ab25ac	Revert "Replace the use of MaxFunctionCount module flag" This reverts commit r266477. This commit introduces cyclic dependency. This commit has "Analysis" depend on "ProfileData", while "ProfileData" depends on "Object", which depends on "BitCode", which depends on "Analysis". llvm-svn: 266619	2016-04-18 15:31:11 +00:00
Renato Golin	4628935835	[ARM] AArch32 v8 NEON is still not IEEE-754 compliant llvm-svn: 266603	2016-04-18 12:06:47 +00:00
Sanjoy Das	157d1c4576	Fix a typo in rL265762 I accidentally replaced `mayBeOverridden` with `!isInterposable`. Remove the negation and add a test case that would've caught this. Many thanks to Håkan Hjort for spotting this! llvm-svn: 266551	2016-04-17 04:30:43 +00:00
Mehdi Amini	d3e6360b27	ThinLTO: Make aliases explicit in the summary To be able to work accurately on the reference graph when taking decision about internalizing, promoting, renaming, etc. We need to have the alias information explicit. Differential Revision: http://reviews.llvm.org/D18836 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266517	2016-04-16 06:56:44 +00:00
Evgeniy Stepanov	4c8a1f75c0	[cfi] Support explicit sections for functions in cfi-icall. Allow explicit section for indirectly called functions in cfi-icall. Jumptables for functions in the same type class must be contiguous, so they always go to the default text section. Fixes PR25079. llvm-svn: 266486	2016-04-15 22:55:38 +00:00
Adrian Prantl	b940fef6b9	Convert this sample-based-profiling testcase to use a NoDebug CU. llvm-svn: 266481	2016-04-15 22:05:38 +00:00
Easwaran Raman	086e6e3f9e	Replace the use of MaxFunctionCount module flag Adds an interface to get ProfileSummary for a module and makes InlineCost use ProfileSummary to get max function count. Differential Revision: http://reviews.llvm.org/D18622 llvm-svn: 266477	2016-04-15 21:39:58 +00:00
Tim Northover	4396c4b8cf	ARM: don't try to hoist constant RHS out of a division. Divisions by a constant can be converted into multiplies which are usually cheaper, but this isn't possible if the constant gets separated (particularly in loops). Fix this by telling ConstantHoisting that the immediate in a DIV is cheap. I considered making the check generic, but neither AArch64 (strangely) nor x86 showed any benefit on the tests I had. llvm-svn: 266464	2016-04-15 18:17:18 +00:00
David Majnemer	77f3549d90	[InstCombine] Don't transform compares of calls to functions named fabs{f,l,} InstCombine wants to optimize compares of calls to fabs with zero. However, we didn't have the necessary legality checking to verify that the function call had the same behavior as fabs. llvm-svn: 266452	2016-04-15 17:21:03 +00:00
Adrian Prantl	fb3abba237	[PR27284] Reverse the ownership between DICompileUnit and DISubprogram. Currently each Function points to a DISubprogram and DISubprogram has a scope field. For member functions the scope is a DICompositeType. DIScopes point to the DICompileUnit to facilitate type uniquing. Distinct DISubprograms (with isDefinition: true) are not part of the type hierarchy and cannot be uniqued. This change removes the subprograms list from DICompileUnit and instead adds a pointer to the owning compile unit to distinct DISubprograms. This would make it easy for ThinLTO to strip unneeded DISubprograms and their transitively referenced debug info. Motivation ---------- Materializing DISubprograms is currently the most expensive operation when doing a ThinLTO build of clang. We want the DISubprogram to be stored in a separate Bitcode block (or the same block as the function body) so we can avoid having to expensively deserialize all DISubprograms together with the global metadata. If a function has been inlined into another subprogram we need to store a reference the block containing the inlined subprogram. Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script that updates LLVM IR testcases to the new format. http://reviews.llvm.org/D19034 <rdar://problem/25256815> llvm-svn: 266446	2016-04-15 15:57:41 +00:00
Sanjay Patel	d65ab200a1	[SimplifyCFG] propagate branch metadata when creating select (PR27344) This is almost identical to: http://reviews.llvm.org/rL264527 This doesn't solve PR27344; it just allows the profile weights to survive. To solve the bug, we need to use the profile weights in the backend. llvm-svn: 266442	2016-04-15 15:32:12 +00:00
Sanjay Patel	737ce217d1	[SimplifyCFG] add metadata to show failure to propagate (PR27344) llvm-svn: 266435	2016-04-15 14:53:35 +00:00
Justin Lebar	d4f9b5931f	Move divergent-target test into CodeGen/NVPTX because it requires an NVPTX target. llvm-svn: 266403	2016-04-15 01:20:52 +00:00
Justin Lebar	c6bd85bac2	[Speculation] Add a SpeculativeExecution mode where the pass does nothing unless TTI::hasBranchDivergence() is true. Summary: This lets us add this pass to the IR pass manager unconditionally; it will simply not do anything on targets without branch divergence. Reviewers: tra Subscribers: llvm-commits, jingyue, rnk, chandlerc Differential Revision: http://reviews.llvm.org/D18625 llvm-svn: 266398	2016-04-15 00:32:09 +00:00
Vedant Kumar	8b67388584	[test] Require 'asserts' for a test which uses -debug-only Without this line, bots which run check-all on Release compilers will break. llvm-svn: 266386	2016-04-14 23:32:40 +00:00
Michael Kuperstein	80e3983b83	[AliasSetTracker] Correctly handle changing the size of an entry If the size of an AST entry changes, we also need to make sure we perform necessary alias set merges, as the new size may overlap pointers in other sets. We happen to run into this with memset, because memset allows an entry for a i8* pointer to have a decidedly non-i8 size. This fixes PR27262. Differential Revision: http://reviews.llvm.org/D18939 llvm-svn: 266381	2016-04-14 22:00:11 +00:00
Renato Golin	bf4b959200	[ARM] Adding IEEE-754 SIMD detection to loop vectorizer Some SIMD implementations are not IEEE-754 compliant, for example ARM's NEON. This patch teaches the loop vectorizer to only allow transformations of loops that either contain no floating-point operations or have enough allowance flags supporting lack of precision (ex. -ffast-math, Darwin). For that, the target description now has a method which tells us if the vectorizer is allowed to handle FP math without falling into unsafe representations, plus a check on every FP instruction in the candidate loop to check for the safety flags. This commit makes LLVM behave like GCC with respect to ARM NEON support, but it stops short of fixing the underlying problem: sub-normals. Neither GCC nor LLVM have a flag for allowing sub-normal operations. Before this patch, GCC only allows it using unsafe-math flags and LLVM allows it by default with no way to turn it off (short of not using NEON at all). As a first step, we push this change to make it safe and in sync with GCC. The second step is to discuss a new sub-normal's flag on both communitues and come up with a common solution. The third step is to improve the FastMath flags in LLVM to encode sub-normals and use those flags to restrict NEON FP. Fixes PR16275. llvm-svn: 266363	2016-04-14 20:42:18 +00:00
Sanjay Patel	1d8eaf1ecc	[InstCombine] remove constant by inverting compare + logic (PR27105) https://llvm.org/bugs/show_bug.cgi?id=27105 We can check if all bits outside of a constant mask are set with a single constant. As noted in the bug report, although this form should be considered the canonical IR, backends may want to transform this into an 'andn' / 'andc' comparison against zero because that could be a single machine instruction. Differential Revision: http://reviews.llvm.org/D18842 llvm-svn: 266362	2016-04-14 20:17:40 +00:00
Dehao Chen	5e2f05be2a	Update discriminator assignment algorithm to handle nested call correctly. Summary: Add discriminator for nested call correctly. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19127 llvm-svn: 266354	2016-04-14 18:37:18 +00:00
Adam Nemet	aa40b369a6	Revert "Support arbitrary addrspace pointers in masked load/store intrinsics" This reverts commit r266086. It breaks the LTO build of gcc in SPEC2000. llvm-svn: 266282	2016-04-14 08:47:17 +00:00
Tim Northover	3d26dcef22	ARM: override cost function to re-enable ConstantHoisting (& fix it). At some point, ARM stopped getting any benefit from ConstantHoisting because the pass called a different variant of getIntImmCost. Reimplementing the correct variant revealed some problems, however: + ConstantHoisting was modifying switch statements. This is simply invalid, the cases must remain integer constants no matter the notional cost. + ConstantHoisting was mangling alloca instructions in the entry block. These should be handled by FrameLowering, so constants actually have a cost of 0. Worse, the resulting bitcasts meant they became dynamic allocas. rdar://25707382 llvm-svn: 266260	2016-04-13 23:08:27 +00:00
Easwaran Raman	0251d7cd34	Test case for r265852. llvm-svn: 266237	2016-04-13 19:43:31 +00:00
Betul Buyukkurt	0fc66f4b8f	[PGO] Remove redundant VP instrumentation LLVM optimization passes may reduce a profiled target expression to a constant. Removing runtime calls at such instrumentation points would help speedup the runtime of the instrumented program. llvm-svn: 266229	2016-04-13 18:52:19 +00:00
Mehdi Amini	6968b7bc43	Revert "Make aliases explicit in the summary" Inadvertently commited... This reverts commit e618ec93786d99df2ddf280ad2d5e02f5516cecf. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266215	2016-04-13 17:20:07 +00:00
Mehdi Amini	d96d756a2f	Make aliases explicit in the summary Summary: To be able to work accurately on the reference graph when taking decision about internalizing, promoting, renaming, etc. We need to have the alias information explicit. Reviewers: tejohnson Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18836 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266214	2016-04-13 17:18:42 +00:00
David L Kreitzer	cc655b3f3d	Simplify strlen to a subtraction for certain cases. Patch by Li Huang (li1.huang@intel.com) Differential Revision: http://reviews.llvm.org/D18230 llvm-svn: 266200	2016-04-13 14:31:06 +00:00
Petar Jovanovic	9f56ef2b97	Calculate __builtin_object_size when pointer depends on a condition This patch fixes calculating of builtin_object_size if it depends on a condition. Before this patch compiler did not know how to calculate the object size when it finds a condition that cannot be eliminated. This patch enables calculating of builtin_object_size even in case when condition cannot be eliminated by choosing minimum or maximum value as a result from condition. Choosing minimum or maximum value from condition is based on the second argument of __builtin_object_size function. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D18438 llvm-svn: 266193	2016-04-13 12:25:25 +00:00
David Majnemer	9e0160358c	[InstCombine] We folded an fcmp to an i1 instead of a vector of i1 Remove an ad-hoc transform in InstCombine and replace it with more general machinery (ValueTracking, InstructionSimplify and VectorUtils). This fixes PR27332. llvm-svn: 266175	2016-04-13 06:55:52 +00:00
Matt Arsenault	6d1a51615f	AMDGPU: Remove leftover ShaderType attributes in tests llvm-svn: 266155	2016-04-13 00:39:48 +00:00
Sanjay Patel	ba900a7bb3	[x86, InstCombine] fix masked load pass-through operand to be a zero vector This bug was introduced with: http://reviews.llvm.org/rL262269 AVX masked loads are specified to set vector lanes to zero when the high bit of the mask element for that lane is zero: "If the mask is 0, the corresponding data element is set to zero in the load form of these instructions, and unmodified in the store form." --Intel manual Differential Revision: http://reviews.llvm.org/D19017 llvm-svn: 266148	2016-04-12 23:16:23 +00:00
Mehdi Amini	6875fa5607	Add a pass to name anonymous/nameless function Summary: For correct handling of alias to nameless function, we need to be able to refer them through a GUID in the summary. Here we name them using a hash of the non-private global names in the module. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18883 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266132	2016-04-12 21:35:28 +00:00
Mehdi Amini	03a2cdacee	Move summary creation out of llvm-as into opt Summary: Let keep llvm-as "dumb": it converts textual IR to bitcode. This commit removes the dependency from llvm-as to libLLVMAnalysis. We'll add back summary in llvm-as if we get to a textual representation for it at some point. In the meantime, opt seems like a better place for that. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19032 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266131	2016-04-12 21:35:18 +00:00
James Y Knight	a55b68a75e	Add __atomic_* lowering to AtomicExpandPass. (Recommit of r266002, with r266011, r266016, and not accidentally including an extra unused/uninitialized element in LibcallRoutineNames) AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266115	2016-04-12 20:18:48 +00:00
Artur Pilipenko	d9a8d9a1af	Support arbitrary addrspace pointers in masked load/store intrinsics This is a resubmittion of 263158 change. This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 266086	2016-04-12 15:58:04 +00:00
Rafael Espindola	b04bc032f0	This reverts commit r266002, r266011 and r266016. They broke the msan bot. Original message: Add __atomic_* lowering to AtomicExpandPass. AtomicExpandPass can now lower atomic load, atomic store, atomicrmw,and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266062	2016-04-12 12:30:25 +00:00
George Burgess IV	4e2d8dca99	Add the allocsize attribute to LLVM. `allocsize` is a function attribute that allows users to request that LLVM treat arbitrary functions as allocation functions. This patch makes LLVM accept the `allocsize` attribute, and makes `@llvm.objectsize` recognize said attribute. The review for this was split into two patches for ease of reviewing: D18974 and D14933. As promised on the revisions, I'm landing both patches as a single commit. Differential Revision: http://reviews.llvm.org/D14933 llvm-svn: 266032	2016-04-12 01:05:35 +00:00
JF Bastien	457191ce8d	MergeFunctions: test alloca better r237193 fix handling of alloca size / align in MergeFunctions, but only tested one and didn't follow FunctionComparator::cmpOperations's usual comparison pattern. It also didn't update Instruction.cpp:haveSameSpecialState which I'll do separately. llvm-svn: 266022	2016-04-12 00:03:26 +00:00
Mehdi Amini	bf78d0ed58	ThinLTO renaming: use module hash instead of position in the summary This is more robust to changes in the link ordering. Differential Revision: http://reviews.llvm.org/D18946 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266018	2016-04-11 23:26:46 +00:00
Evgeniy Stepanov	b10de1bbaa	[safestack] Add canary to unsafe stack frames Add StackProtector to SafeStack. This adds limited protection against data corruption in the caller frame. Current implementation treats all stack protector levels as -fstack-protector-all. llvm-svn: 266004	2016-04-11 22:27:48 +00:00
James Y Knight	003ee915ba	Add __atomic_* lowering to AtomicExpandPass. AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266002	2016-04-11 22:22:33 +00:00
Davide Italiano	eff78dd58c	[DebugInfo/Test] Add CU as required. llvm-svn: 265999	2016-04-11 21:16:48 +00:00
Matthew Simpson	94955b1eeb	[LoopUtils, LV] Fix PR27246 (first-order recurrences) This patch ensures that when we detect first-order recurrences, we reject a phi node if its previous value is also a phi node. During vectorization the initial and previous values of the recurrence are shuffled together to create the value for the current iteration. However, phi nodes are not widened like other instructions. This fixes PR27246. Differential Revision: http://reviews.llvm.org/D18971 llvm-svn: 265983	2016-04-11 19:48:18 +00:00
Davide Italiano	50a1bb5adb	[DebugInfo] Fix even more tests to include DICompileunit. llvm-svn: 265980	2016-04-11 18:53:27 +00:00
Adrian Prantl	b333fd2831	Fix missing DICompileUnits in testcases llvm-svn: 265974	2016-04-11 18:15:44 +00:00
Sanjay Patel	1bd30061be	[InstCombine] consolidate tests for related bugs llvm-svn: 265973	2016-04-11 17:58:37 +00:00
Adrian Prantl	b0e1a93e69	Update discriminator testcases to use proper NoDebug CUs instead of omitting !llvm.dbg.cu. llvm-svn: 265961	2016-04-11 16:58:35 +00:00
Sanjay Patel	c3b4340d01	[InstCombine] don't try to shift an illegal amount (PR26760) This is the straightforward fix for PR26760: https://llvm.org/bugs/show_bug.cgi?id=26760 But we still need to make some changes to generalize this helper function and then send the lshr case into here. llvm-svn: 265960	2016-04-11 16:50:32 +00:00
Adrian Prantl	dac3cdd613	More upgrading of old- and very-old-style debug info in testcases. llvm-svn: 265953	2016-04-11 15:53:44 +00:00
Sanjoy Das	23c7d74ce2	This reverts commit r265913 and r265912 See PR27315 r265913: "[IndVars] Eliminate op.with.overflow when possible" r265912: "[SCEV] See through op.with.overflow intrinsics" llvm-svn: 265950	2016-04-11 15:26:18 +00:00
Sanjay Patel	f2c766cf68	[InstCombine] replace test that no longer works as intended This is step 1 of refactoring to solve PR26760: https://llvm.org/bugs/show_bug.cgi?id=26760 llvm-svn: 265946	2016-04-11 15:19:44 +00:00
Teresa Johnson	d7c9485243	[ThinLTO] Move summary computation from BitcodeWriter to new pass Summary: This is the first step in also serializing the index out to LLVM assembly. The per-module summary written to bitcode is moved out of the bitcode writer and to a new analysis pass (ModuleSummaryIndexWrapperPass). The pass itself uses a new builder class to compute index, and the builder class is used directly in places where we don't have a pass manager (e.g. llvm-as). Because we are computing summaries outside of the bitcode writer, we no longer can use value ids created by the bitcode writer's ValueEnumerator. This required changing the reference graph edge type to use a new ValueInfo class holding a union between a GUID (combined index) and Value* (permodule index). The Value* are converted to the appropriate value ID during bitcode writing. Also, this enables removal of the BitWriter library's dependence on the Analysis library that was previously required for the summary computation. Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18763 llvm-svn: 265941	2016-04-11 13:58:45 +00:00
Sanjoy Das	979adb799a	[IndVars] Eliminate op.with.overflow when possible Summary: If we can prove that an op.with.overflow intrinsic does not overflow, we can get rid of the intrinsic, and replace it with non-wrapping arithmetic. Reviewers: atrick, regehr Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18685 llvm-svn: 265913	2016-04-10 22:50:31 +00:00
Elena Demikhovsky	8410bf33a6	Loop vectorization with uniform load Vectorization cost of uniform load wasn't correctly calculated. As a result, a simple loop that loads a uniform value wasn't vectorized. Differential Revision: http://reviews.llvm.org/D18940 llvm-svn: 265901	2016-04-10 16:53:19 +00:00
Sanjoy Das	85af6d6f07	Maintain calling convention when inling calls to llvm.deoptimize The behavior here was buggy -- we'd forget the calling convention after inlining a callsite calling llvm.deoptimize. llvm-svn: 265867	2016-04-09 00:22:59 +00:00
Sanjoy Das	224e72eaad	Propagate Undef in llvm.cos Intrinsic Summary: The llvm cos intrinsic currently does not propagate undef's. This change transforms cos(undef) to null value or 0. There are 2 test cases added as well. Patch by Anna Thomas! Reviewers: sanjoy Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D18863 llvm-svn: 265825	2016-04-08 18:21:11 +00:00
David Majnemer	08e9ca8730	[InstCombine] Fix miscompile in FoldSPFofSPF We had a select of a cast of a select but attempted to replace the outer select with the inner select dispite their incompatible types. Patch by Anton Korobeynikov! This fixes PR27236. llvm-svn: 265805	2016-04-08 16:51:49 +00:00
David Majnemer	a4716b737d	[LoopVectorize] Register cloned assumptions InstCombine cannot effectively remove redundant assumptions without them registered in the assumption cache. The vectorizer can create identical assumptions but doesn't register them with the cache, resulting in slower compile times because InstCombine tries to reason about a lot more assumptions. Fix this by registering the cloned assumptions. llvm-svn: 265800	2016-04-08 16:37:10 +00:00
Silviu Baranga	a0999051a2	Re-commit [SCEV] Introduce a guarded backedge taken count and use it in LAA and LV This re-commits r265535 which was reverted in r265541 because it broke the windows bots. The problem was that we had a PointerIntPair which took a pointer to a struct allocated with new. The problem was that new doesn't provide sufficient alignment guarantees. This pattern was already present before r265535 and it just happened to work. To fix this, we now separate the PointerToIntPair from the ExitNotTakenInfo struct into a pointer and a bool. Original commit message: Summary: When the backedge taken codition is computed from an icmp, SCEV can deduce the backedge taken count only if one of the sides of the icmp is an AddRecExpr. However, due to sign/zero extensions, we sometimes end up with something that is not an AddRecExpr. However, we can use SCEV predicates to produce a 'guarded' expression. This change adds a method to SCEV to get this expression, and the SCEV predicate associated with it. In HowManyGreaterThans and HowManyLessThans we will now add a SCEV predicate associated with the guarded backedge taken count when the analyzed SCEV expression is not an AddRecExpr. Note that we only do this as an alternative to returning a 'CouldNotCompute'. We use new feature in Loop Access Analysis and LoopVectorize to analyze and transform more loops. Reviewers: anemet, mzolotukhin, hfinkel, sanjoy Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17201 llvm-svn: 265786	2016-04-08 14:29:09 +00:00
Duncan P. N. Exon Smith	89c4487fee	Reapply "ValueMapper: Treat LocalAsMetadata more like function-local Values" This reverts commit r265765, reapplying r265759 after changing a call from LocalAsMetadata::get to ValueAsMetadata::get (and adding a unit test). When a local value is mapped to a constant (like "i32 %a" => "i32 7"), the new debug intrinsic operand may no longer be pointing at a local. http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/19020/ The previous coommit message follows: -- This is a partial re-commit -- maybe more of a re-implementation -- of r265631 (reverted in r265637). This makes RF_IgnoreMissingLocals behave (almost) consistently between the Value and the Metadata hierarchy. In particular: - MapValue returns nullptr or "metadata !{}" for missing locals in MetadataAsValue/LocalAsMetadata bridging paris, depending on the RF_IgnoreMissingLocals flag. - MapValue doesn't memoize LocalAsMetadata-related results. - MapMetadata no longer deals with LocalAsMetadata or RF_IgnoreMissingLocals at all. (This wasn't in r265631 at all, but I realized during testing it would make the patch simpler with no loss of generality.) r265631 went too far, making both functions universally ignore RF_IgnoreMissingLocals. This broke building (e.g.) compiler-rt. Reassociate (and possibly other passes) don't currently maintain dominates-use invariants for metadata operands, resulting in IR like this: define void @foo(i32 %arg) { call void @llvm.some.intrinsic(metadata i32 %x) %x = add i32 1, i32 %arg } If the inliner chooses to inline @foo into another function, then RemapInstruction will call `MapValue(metadata i32 %x)` and assert that the return is not nullptr. I've filed PR27273 to add a Verifier check and fix the underlying problem in the optimization passes. As a workaround, return `!{}` instead of nullptr for unmapped LocalAsMetadata when RF_IgnoreMissingLocals is unset. Otherwise, match the behaviour of r265631. Original commit message: ValueMapper: Make LocalAsMetadata match function-local Values Start treating LocalAsMetadata similarly to function-local members of the Value hierarchy in MapValue and MapMetadata. - Don't memoize them. - Return nullptr if they are missing. This also cleans up ConstantAsMetadata to stop listening to the RF_IgnoreMissingLocals flag. llvm-svn: 265768	2016-04-08 03:13:22 +00:00
Duncan P. N. Exon Smith	957a52adbf	Revert "ValueMapper: Treat LocalAsMetadata more like function-local Values" This reverts commit r265759, since even this limited version breaks some bots: http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/3311 http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/17696 This also reverts r265761 "ValueMapper: Unduplicate RF_NoModuleLevelChanges check, NFC", since I had trouble separating it from r265759. llvm-svn: 265765	2016-04-08 00:56:21 +00:00
Sanjoy Das	b20d278ebd	Don't IPO over functions that can be de-refined Summary: Fixes PR26774. If you're aware of the issue, feel free to skip the "Motivation" section and jump directly to "This patch". Motivation: I define "refinement" as discarding behaviors from a program that the optimizer has license to discard. So transforming: ``` void f(unsigned x) { unsigned t = 5 / x; (void)t; } ``` to ``` void f(unsigned x) { } ``` is refinement, since the behavior went from "if x == 0 then undefined else nothing" to "nothing" (the optimizer has license to discard undefined behavior). Refinement is a fundamental aspect of many mid-level optimizations done by LLVM. For instance, transforming `x == (x + 1)` to `false` also involves refinement since the expression's value went from "if x is `undef` then { `true` or `false` } else { `false` }" to "`false`" (by definition, the optimizer has license to fold `undef` to any non-`undef` value). Unfortunately, refinement implies that the optimizer cannot assume that the implementation of a function it can see has all of the behavior an unoptimized or a differently optimized version of the same function can have. This is a problem for functions with comdat linkage, where a function can be replaced by an unoptimized or a differently optimized version of the same source level function. For instance, FunctionAttrs cannot assume a comdat function is actually `readnone` even if it does not have any loads or stores in it; since there may have been loads and stores in the "original function" that were refined out in the currently visible variant, and at the link step the linker may in fact choose an implementation with a load or a store. As an example, consider a function that does two atomic loads from the same memory location, and writes to memory only if the two values are not equal. The optimizer is allowed to refine this function by first CSE'ing the two loads, and the folding the comparision to always report that the two values are equal. Such a refined variant will look like it is `readonly`. However, the unoptimized version of the function can still write to memory (since the two loads //can// result in different values), and selecting the unoptimized version at link time will retroactively invalidate transforms we may have done under the assumption that the function does not write to memory. Note: this is not just a problem with atomics or with linking differently optimized object files. See PR26774 for more realistic examples that involved neither. This patch: This change introduces a new set of linkage types, predicated as `GlobalValue::mayBeDerefined` that returns true if the linkage type allows a function to be replaced by a differently optimized variant at link time. It then changes a set of IPO passes to bail out if they see such a function. Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18634 llvm-svn: 265762	2016-04-08 00:48:30 +00:00
Duncan P. N. Exon Smith	1f42c70b4f	ValueMapper: Treat LocalAsMetadata more like function-local Values This is a partial re-commit -- maybe more of a re-implementation -- of r265631 (reverted in r265637). This makes RF_IgnoreMissingLocals behave (almost) consistently between the Value and the Metadata hierarchy. In particular: - MapValue returns nullptr or "metadata !{}" for missing locals in MetadataAsValue/LocalAsMetadata bridging paris, depending on the RF_IgnoreMissingLocals flag. - MapValue doesn't memoize LocalAsMetadata-related results. - MapMetadata no longer deals with LocalAsMetadata or RF_IgnoreMissingLocals at all. (This wasn't in r265631 at all, but I realized during testing it would make the patch simpler with no loss of generality.) r265631 went too far, making both functions universally ignore RF_IgnoreMissingLocals. This broke building (e.g.) compiler-rt. Reassociate (and possibly other passes) don't currently maintain dominates-use invariants for metadata operands, resulting in IR like this: define void @foo(i32 %arg) { call void @llvm.some.intrinsic(metadata i32 %x) %x = add i32 1, i32 %arg } If the inliner chooses to inline @foo into another function, then RemapInstruction will call `MapValue(metadata i32 %x)` and assert that the return is not nullptr. I've filed PR27273 to add a Verifier check and fix the underlying problem in the optimization passes. As a workaround, return `!{}` instead of nullptr for unmapped LocalAsMetadata when RF_IgnoreMissingLocals is unset. Otherwise, match the behaviour of r265631. Original commit message: ValueMapper: Make LocalAsMetadata match function-local Values Start treating LocalAsMetadata similarly to function-local members of the Value hierarchy in MapValue and MapMetadata. - Don't memoize them. - Return nullptr if they are missing. This also cleans up ConstantAsMetadata to stop listening to the RF_IgnoreMissingLocals flag. llvm-svn: 265759	2016-04-08 00:33:44 +00:00
Ulrich Weigand	4c20bac964	[GVN] Fix handling of sub-byte types in big-endian mode When GVN wants to re-interpret an already available value in a smaller type, it needs to right-shift the value on big-endian systems to ensure the correct bytes are accessed. The shift value is the difference of the sizes of the two types. This is correct as long as both types occupy multiples of full bytes. However, when one of them is a sub-byte type like i1, this no longer holds true: we still need to shift, but only to access the correct byte. Accessing bits within the byte requires no shift in either endianness; e.g. an i1 resides in the least-significant bit of its containing byte on both big- and little-endian systems. Therefore, the appropriate shift value to be used is the difference of the storage sizes of the two types. This is already handled correctly in one place where such a shift takes place (GetStoreValueForLoad), but is incorrect in two other places: GetLoadValueForLoad and CoerceAvailableValueToLoadType. This patch changes both places to use the storage size as well. Differential Revision: http://reviews.llvm.org/D18662 llvm-svn: 265684	2016-04-07 15:45:02 +00:00

1 2 3 4 5 ...

6664 Commits