llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-27 22:12:47 +01:00

Author	SHA1	Message	Date
Serge Pavlov	978a422aed	Fix type of shuffle resulted from shuffle merge. This fix resolves PR19730. llvm-svn: 208666	2014-05-13 06:07:21 +00:00
Serge Pavlov	1ff1d49a09	Fix type of shuffle obtained from reordering with binary operation In transformation: BinOp(shuffle(v1,undef), shuffle(v2,undef)) -> shuffle(BinOp(v1, v2),undef) type of the undef argument must be same as type of BinOp. llvm-svn: 208531	2014-05-12 10:11:27 +00:00
Serge Pavlov	8e3b52d51f	Fix reordering of shuffles and binary operations Do not apply transformation: BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2)) if operands v1 and v2 are of different size. This change fixes PR19717, which was caused by r208488. llvm-svn: 208518	2014-05-12 05:44:53 +00:00
Benjamin Kramer	bbd9bf5d58	SLPVectorizer: Instead of just performing CSE on dead blocks ignore them completely. Turns out that there is a very cheap way of testing whether a block is dead, just look it up in the DomTree. We have to do this anyways so just ignore unreachable blocks before sorting by domination. This restores a proper ordering for std::stable_sort when dead code is present. Covered by existing tests & buildbots running in STL debug mode (MSVC). llvm-svn: 208492	2014-05-11 10:28:58 +00:00
Serge Pavlov	d043cc92e5	Reorder shuffle and binary operation. This patch enables transformations: BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2)) BinOp(shuffle(v1), const1) -> shuffle(BinOp, const2) They allow to eliminate extra shuffles in some cases. Differential Revision: http://reviews.llvm.org/D3525 llvm-svn: 208488	2014-05-11 08:46:12 +00:00
Benjamin Kramer	2039813d0e	SLPVectorizer: When sorting by domination for CSE don't assert on unreachable code. There is no total ordering if the CFG is disconnected. We don't care if we catch all CSE opportunities in dead code either so just ignore them in the assert. PR19646 llvm-svn: 208461	2014-05-09 23:28:49 +00:00
Louis Gerbarg	011e17086c	Add ExtractValue instruction to SimplifyCFG's ComputeSpeculationCost Since ExtractValue is not included in ComputeSpeculationCost CFGs containing ExtractValueInsts cannot be simplified. In particular this interacts with InstCombineCompare's tendency to insert add.with.overflow intrinsics for certain idiomatic math operations, preventing optimization. This patch adds ExtractValue to the ComputeSpeculationCost. Test case included rdar://14853450 llvm-svn: 208434	2014-05-09 17:02:46 +00:00
Rafael Espindola	421423af16	Use auto and clang-format this snippet. llvm-svn: 208421	2014-05-09 16:01:06 +00:00
Nick Lewycky	4fd21ddfc8	Improve wording to make it sound more like a change than an analysis. llvm-svn: 208370	2014-05-08 23:04:46 +00:00
Michael Zolotukhin	de5b0a206d	[InstCombine] Some cleanup in optimization of redundant insertvalue instructions. And one more test added. llvm-svn: 208355	2014-05-08 19:50:24 +00:00
Richard Smith	90c10bd484	Simplify and fix incorrect comment. No functionality change. llvm-svn: 208272	2014-05-08 01:08:43 +00:00
Duncan P. N. Exon Smith	f5258e4d2c	GlobalValue: Assert symbols with local linkage have default visibility The change to ExtractGV.cpp has no functionality change except to avoid the asserts. Existing testcases already cover this, so I didn't add a new one. llvm-svn: 208264	2014-05-07 23:00:22 +00:00
Chandler Carruth	36ad1ec2b0	Tidy up whitespace with clang-format prior to making significant changes. llvm-svn: 208229	2014-05-07 17:36:59 +00:00
Michael Zolotukhin	5fe056ff21	[InstCombine] Add optimization of redundant insertvalue instructions. rdar://problem/11861387 llvm-svn: 208214	2014-05-07 14:30:18 +00:00
Evgeniy Stepanov	c02f1a9f96	[msan] Fix -fsanitize=memory -fno-integrated-as. llvm-svn: 208211	2014-05-07 14:10:51 +00:00
Stepan Dyatkovskiy	328bcb73da	MergeFunctions Pass, introduced total ordering among values. This is a third patch of patch series that improves MergeFunctions performance time from O(N*N) to O(N*log(N)). This patch description: When comparing functions we need to compare the values we meet at left and right sides. It's easy to sort things out for external values. It just should be the same value at left and right. But for local values (those introduced inside the function body) we have to ensure they were introduced at exactly the same place, and play the same role. In short, the patch introduces serial numbering of values and a comparison routine. The latter compares two values by their serial numbers. llvm-svn: 208189	2014-05-07 11:11:39 +00:00
Zinovy Nis	ce225593e1	[BUG][REFACTOR] 1) Fix for printing debug locations for absolute paths. 2) Location printing is moved into public method DebugLoc::print() to avoid re-inventing the wheel. Differential Revision: http://reviews.llvm.org/D3513 llvm-svn: 208177	2014-05-07 09:51:22 +00:00
Stepan Dyatkovskiy	7984a44afe	Second patch of patch series that improves MergeFunctions performance time from O(N*N) to O(N*log(N)). The idea is to introduce a total ordering among the set of functions. It allows building a binary tree and performing function look-up in O(log(N)) time. This patch description: Introduced total ordering among constants implemented in cmpConstants method. Method performs lexicographical comparison between constants represented as hypothetical numbers of the following format: <bitcastability-trait><raw-bit-contents> Please, read cmpConstants declaration comments for more details. llvm-svn: 208173	2014-05-07 09:05:10 +00:00
Nico Weber	01b4c1bca8	Fix ASan init function detection after clang r208128. llvm-svn: 208141	2014-05-06 23:17:26 +00:00
Richard Smith	55381ccf0a	Re-commit r208025, reverted in r208030, with a fix for a conformance issue which GCC detects and Clang does not! llvm-svn: 208033	2014-05-06 01:44:26 +00:00
Richard Smith	b38145eb67	Revert r208025, which made buildbots unhappy for unknown reasons. llvm-svn: 208030	2014-05-06 01:26:00 +00:00
Richard Smith	e9d2d57a7c	Add llvm::function_ref (and a couple of uses of it), representing a type-erased reference to a callable object. llvm-svn: 208025	2014-05-06 01:01:29 +00:00
Nick Lewycky	75f47267ef	Detabify. llvm-svn: 208019	2014-05-06 00:46:20 +00:00
Nick Lewycky	d7c0e22d5f	Improve 'tail' call marking in TRE. A bootstrap of clang goes from 375k calls marked tail in the IR to 470k, however this improvement does not carry into an improvement of the call/jmp ratio on x86. The most common pattern is a tail call + br to a block with nothing but a 'ret'. The number of tail call to loop conversions remains the same (1618 by my count). The new algorithm does a local scan over the use-def chains to identify local "alloca-derived" values, as well as points where the alloca could escape. Then, a visit over the CFG marks blocks as being before or after the allocas have escaped, and annotates the calls accordingly. llvm-svn: 208017	2014-05-05 23:59:03 +00:00
Yi Jiang	8e66fa3904	Reapply: Add slp vectorization to LTO passes. The bug it exposed has been fixed by r207983. <radar://16641956> llvm-svn: 208013	2014-05-05 23:14:46 +00:00
Yi Jiang	86dccb7e97	Always set alignment of vectorized LD/ST in SLP-Vectorizer. <rdar://problem/16812145> llvm-svn: 207983	2014-05-05 17:59:14 +00:00
Duncan P. N. Exon Smith	a7513de2e4	LTO: -internalize sets visibility to default Visibility is meaningless when the linkage is local. Change `-internalize` to reset the visibility to `default`. <rdar://problem/16141113> llvm-svn: 207979	2014-05-05 17:40:44 +00:00
Timur Iskhodzhanov	16ebb61afc	[ASan/Win] Fix issue 305 -- don't instrument .CRT initializer/terminator callbacks See https://code.google.com/p/address-sanitizer/issues/detail?id=305 Reviewed at http://reviews.llvm.org/D3607 llvm-svn: 207968	2014-05-05 14:28:38 +00:00
Benjamin Kramer	8125aa2cd4	LoopUnroll: If we're doing partial unrolling, use the PartialThreshold to limit unrolling. Otherwise we use the same threshold as for complete unrolling, which is way too high. This made us unroll any loop smaller than 150 instructions by 8 times, but only if someone specified -march=core2 or better, which happens to be the default on darwin. llvm-svn: 207940	2014-05-04 19:12:38 +00:00
Arnold Schwaighofer	d752a09f8b	SLPVectorizer: Bring back the insertelement patch (r205965) with fixes We can't assume a vectorized tree is rooted in an instruction. The IRBuilder could have constant folded it. When we rebuild the build_vector (the series of InsertElement instructions) use the last original InsertElement instruction. The vectorized tree root is guaranteed to be before it. Also, we can't assume that the n-th InsertElement inserts the n-th element into a vector. This reverts r207746 which reverted the revert of the revert of r205018 or so. Fixes the test case in PR19621. llvm-svn: 207939	2014-05-04 17:10:15 +00:00
Benjamin Kramer	fbeb105fa6	SLPVectorizer: Lazily allocate the map for block numbering. There is no point in creating it if we're not going to vectorize anything. Creating the map is expensive as it creates large values. No functionality change. llvm-svn: 207916	2014-05-03 15:50:37 +00:00
Karthik Bhat	4591b173c7	Vectorize intrinsic math function calls in SLPVectorizer. This patch adds support to recognize and vectorize intrinsic math functions in SLPVectorizer. Review: http://reviews.llvm.org/D3560 and http://reviews.llvm.org/D3559 llvm-svn: 207901	2014-05-03 09:59:54 +00:00
Eric Christopher	b8d435f9a6	Clean up constructor logic and member access for LoopVectorizeHints. There are public functions that mutate various members as well as another private member already, so make all the members private to avoid the discontinuity and add accessors for the values. Should be no functional change. llvm-svn: 207868	2014-05-02 20:40:04 +00:00
Nico Weber	fdced47a40	Teach GlobalDCE how to remove empty global_ctor entries. This moves most of GlobalOpt's constructor optimization code out of GlobalOpt into Transforms/Utils/CDtorUtils.{h,cpp}. The public interface is a single function OptimizeGlobalCtorsList() that takes a predicate returning which constructors to remove. GlobalOpt calls this with a function that statically evaluates all constructors, just like it did before. This part of the change is behavior-preserving. Also add a call to this from GlobalDCE with a filter that removes global constructors that contain a "ret" instruction and nothing else – this fixes PR19590. llvm-svn: 207856	2014-05-02 18:35:25 +00:00
Akira Hatanaka	f64b00f1d7	[GVN] Pass the phi-translated address of a load instead of the untranslated address to AnalyzeLoadFromClobberingLoad. This fixes a bug in load-PRE where PRE is applied to a load that is not partially redundant. <rdar://problem/16638765>. llvm-svn: 207853	2014-05-02 17:59:17 +00:00
Nick Lewycky	e4f0d18fe0	Fold strlen(expr ? "str1" : "str2") to expr ? len1 : len2. This fires about 330 times in a bootstrap of clang. llvm-svn: 207828	2014-05-02 04:11:45 +00:00
Benjamin Kramer	040af9d6b4	Update and sort CMakeLists. llvm-svn: 207785	2014-05-01 18:59:11 +00:00
Eli Bendersky	0602e236ae	Add an optimization that does CSE in a group of similar GEPs. This optimization merges the common part of a group of GEPs, so we can compute each pointer address by adding a simple offset to the common part. The optimization is currently only enabled for the NVPTX backend, where it has a large payoff on some benchmarks. Review: http://reviews.llvm.org/D3462 Patch by Jingyue Wu. llvm-svn: 207783	2014-05-01 18:38:36 +00:00
Chandler Carruth	fa8592dc61	Revert r205965, which essentially reverts r205018 for the second time. =[ Turns out that this was the root cause of PR19621. We found a crasher only recently (likely due to improvements elsewhere in the SLP vectorizer) but the reduced test case failed all the way back to here. I've confirmed that reverting this patch both fixes the reduced test case in PR19621 and the actual source file that led to it, so it seems to really be rooted here. I've replied to the commit thread with discussion of my (feeble) attempts to debug this. Didn't make it very far, so reverting now that we have a good test case so that things can get back to healthy while the debugging carries on. llvm-svn: 207746	2014-05-01 11:24:11 +00:00
Gerolf Hoflehner	aec10987f2	Patch for function cloning to inline all blocks whose address is taken Not all address taken blocks get inlined. The reason is that a block's new address is known only when it is cloned. But e.g. a branch instruction in a different block could need that address earlier while it gets cloned. The solution is to collect the set of all blocks that can potentially get inlined and compute a new block address up front. Then clone and cleanup. rdar://16427209 llvm-svn: 207713	2014-04-30 22:05:02 +00:00
Yi Jiang	5faccf9d45	Revert r207571 - Add slp vectorization to LTO passes llvm-svn: 207693	2014-04-30 19:27:24 +00:00
Carlo Kok	3c208bd22a	[IPO/MergeFunctions] changes so it doesn't try to bitcast a struct return type but instead recreates it with insert/extract value. llvm-svn: 207679	2014-04-30 17:53:04 +00:00
Benjamin Kramer	805d786f28	Add a <tuple> include to more files that aren't getting it transitively on MSVC. llvm-svn: 207617	2014-04-30 07:21:01 +00:00
NAKAMURA Takumi	45cffbdde3	ConstantHoisting.cpp: Add <tuple> for std::tie, since r207593 removed FileSystem.h, it includes <tuple>. llvm-svn: 207614	2014-04-30 06:44:50 +00:00
Jim Grosbach	c9daf08519	Tidy up. llvm-svn: 207585	2014-04-29 22:41:58 +00:00
Jim Grosbach	d4614a7a4c	Spelling. llvm-svn: 207584	2014-04-29 22:41:55 +00:00
Rafael Espindola	f399f03ccc	Also handle ConstantAggregateZero when optimizing vpermilvar*. llvm-svn: 207582	2014-04-29 22:20:40 +00:00
Rafael Espindola	b4468db535	Remove tabs. Sorry, new machine and I forgot to change the editor setting. llvm-svn: 207578	2014-04-29 21:02:37 +00:00
Rafael Espindola	7372093fcc	Two fixes to the vpermilvar optimization. The instcombine logic to handle vpermilvar's pd and 256 variants was incorrect. The _256 variants have indexes into the individual 128 bit lanes and in all cases it also has to mask out unused bits. llvm-svn: 207577	2014-04-29 20:41:54 +00:00
Diego Novillo	210bb92cec	Fix vectorization remarks. This patch changes the vectorization remarks to also inform when vectorization is possible but not beneficial. Added tests to exercise some loop remarks. llvm-svn: 207574	2014-04-29 20:06:10 +00:00
Yi Jiang	f658582852	Continue slp vectorization even if the BB already has vectorized store radar://16641956 llvm-svn: 207572	2014-04-29 19:37:20 +00:00
Yi Jiang	1c75ac2f65	Add slp vectorization to LTO passes llvm-svn: 207571	2014-04-29 19:35:39 +00:00
Adam Nemet	109f756317	Reapply r207271 without the testcase PR19608 was filed to find a suitable testcase. llvm-svn: 207569	2014-04-29 18:25:28 +00:00
Diego Novillo	81a83e3d17	Add optimization remarks to the loop unroller and vectorizer. Summary: This calls emitOptimizationRemark from the loop unroller and vectorizer at the point where they make a positive transformation. For the vectorizer, it reports vectorization and interleave factors. For the loop unroller, it reports all the different supported types of unrolling. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3456 llvm-svn: 207528	2014-04-29 14:27:31 +00:00
Zinovy Nis	26320a9150	[BUG] Fix -Wunused-variable warning in Release mode. Thnx to Kostya Serebryany for pointing. llvm-svn: 207516	2014-04-29 09:45:08 +00:00
Kostya Serebryany	6630aee422	fix -Wunused-variable warning in Release mode llvm-svn: 207514	2014-04-29 09:33:02 +00:00
Zinovy Nis	a7f05bffeb	[OPENMP][LV][D3423] Respect Hints.Force meta-data for loops in LoopVectorizer llvm-svn: 207512	2014-04-29 08:55:11 +00:00
Michael Zolotukhin	354ce4fae7	Fix a typo in comment llvm-svn: 207499	2014-04-29 07:35:33 +00:00
Chandler Carruth	718a254f79	Revert r207271 for now. This commit introduced a test case that ran clang directly from the LLVM test suite! That doesn't work. I've followed up on the review thread to try and get a viable solution sorted out, but trying to get the tree clean here. llvm-svn: 207462	2014-04-28 23:07:49 +00:00
Hans Wennborg	6405294846	InstCombine: don't drop 'inalloca' in PromoteCastOfAllocation (PR19569) llvm-svn: 207426	2014-04-28 17:40:03 +00:00
Chandler Carruth	d81a0d614d	Fix rampant quadratic behavior in UpdatePHINodes. The operation of mapping from a basic block to an incoming value, either for removal or just lookup, is linear in the number of predecessors, and we were doing this for every entry in the 'Preds' list which is in many cases almost all of them! Unfortunately, the fixes are quite ugly. PHI nodes just don't make this operation easy. The efficient way to fix this is to have a clever 'remove_if' operation on PHI nodes that lets us do a single pass over all the incoming values of the original PHI node, extracting the ones we care about. Then we could quickly construct the new phi node from this list. This would remove the remaining underlying quadratic movement of unrelated incoming values and the need for silly backwards looping to "minimize" how often we hit the quadratic case. This is the last obvious fix for PR19499. It shaves another 20% off the compile time for me, and while UpdatePHINodes remains in the profile, most of the time is now stemming from the well known inefficiencies of LVI and jump threading. llvm-svn: 207409	2014-04-28 10:37:30 +00:00
Craig Topper	b663bffa27	[C++] Use 'nullptr'. llvm-svn: 207394	2014-04-28 04:05:08 +00:00
Gerolf Hoflehner	13cf626de6	RecursivelyDeleteTriviallyDeadInstructions() could remove more than 1 instruction. The caller needs to be aware of this and adjust instruction iterators accordingly. rdar://16679376 Repaired r207302. llvm-svn: 207309	2014-04-26 05:58:11 +00:00
Gerolf Hoflehner	c780c786a3	Restore CloneFunction.cpp which got accidentally overwritten by previous backout of r207303 llvm-svn: 207308	2014-04-26 05:43:41 +00:00
Gerolf Hoflehner	99a3e5b9f5	Revert commit r207302 since build failures have been reported. llvm-svn: 207303	2014-04-26 02:03:17 +00:00
Gerolf Hoflehner	54bf53d166	RecursivelyDeleteTriviallyDeadInstructions() could remove more than 1 instruction. The caller needs to be aware of this and adjust instruction iterators accordingly. rdar://16679376 llvm-svn: 207302	2014-04-26 01:19:16 +00:00
Andrea Di Biagio	815dfb7574	[InstCombine][X86] Teach how to fold calls to SSE2/AVX2 packed logical shift right intrinsics. A packed logical shift right with a shift count bigger than or equal to the element size always produces a zero vector. In all other cases, it can be safely replaced by a 'lshr' instruction. llvm-svn: 207299	2014-04-26 01:03:22 +00:00
Adrian Prantl	ca946260a4	Unbreak the gdb buildbot by not lowering dbg.declare intrinsics for arrays. llvm-svn: 207284	2014-04-25 23:00:25 +00:00
Adam Nemet	c4071d355a	[LoopStrengthReduce] Don't trim formula that uses a subset of required registers Consider this use from the new testcase: LSR Use: Kind=ICmpZero, Offsets={0}, widest fixup type: i32 reg({1000,+,-1}<nw><%for.body>) -3003 + reg({3,+,3}<nw><%for.body>) -1001 + reg({1,+,1}<nuw><nsw><%for.body>) -1000 + reg({0,+,1}<nw><%for.body>) -3000 + reg({0,+,3}<nuw><%for.body>) reg({-1000,+,1}<nw><%for.body>) reg({-3000,+,3}<nsw><%for.body>) This is the last use we consider for a solution in SolveRecurse, so CurRegs is a large set. (CurRegs is the set of registers that are needed by the previously visited uses in the in-progress solution.) ReqRegs is { {3,+,3}<nw><%for.body>, {1,+,1}<nuw><nsw><%for.body> } This is the intersection of the regs used by any of the formulas for the current use and CurRegs. Now, the code requires a formula to contain all these regs (the comment is simply wrong), otherwise the formula is immediately disqualified. Obviously, no formula for this use contains two regs so they will all get disqualified. The fix modifies the check to allow the formula in this case. The idea is that neither of these formulae is introducing any new registers which is the point of this early pruning as far as I understand. In terms of set arithmetic, we now allow formulas whose used regs are a subset of the required regs not just the other way around. There are few more loops in the test-suite that are now successfully LSRed. I have benchmarked those and found very minimal change. Fixes <rdar://problem/13965777> llvm-svn: 207271	2014-04-25 21:02:21 +00:00
Adrian Prantl	7566e72bb8	This reapplies r207235 with an additional bugfix caught by the msan buildbot - do not insert debug intrinsics before phi nodes. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into indirect DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca where they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207269	2014-04-25 20:49:25 +00:00
Duncan P. N. Exon Smith	ea68e6a3d5	SCC: Change clients to use const, NFC It's fishy to be changing the `std::vector<>` owned by the iterator, and no one actually does it, so I'm going to remove the ability in a subsequent commit. First, update the users. <rdar://problem/14292693> llvm-svn: 207252	2014-04-25 18:24:50 +00:00
Adrian Prantl	319db7c542	Revert "This reapplies r207130 with an additional testcase+and a missing check for" This reverts commit 207235 to investigate msan buildbot breakage. llvm-svn: 207250	2014-04-25 18:18:09 +00:00
Manman Ren	5d62183fae	[inline cold threshold] Command line argument for inline threshold will override the default cold threshold. When we use command line argument to set the inline threshold, the default cold threshold will not be used. This is in line with how we use OptSizeThreshold. When we want a higher threshold for all functions, we do not have to set both inline threshold and cold threshold. llvm-svn: 207245	2014-04-25 17:34:55 +00:00
Adrian Prantl	c3a2bdef85	Reapply r207135 without modifications. Debug info: Let dbg.values inserted by LowerDbgDeclare inherit the location of the dbg.value. This gets rid of tons of redundant variable DIEs in subscopes. rdar://problem/14874886, rdar://problem/16679936 llvm-svn: 207236	2014-04-25 17:01:04 +00:00
Adrian Prantl	7f9d1e9fd6	This reapplies r207130 with an additional testcase+and a missing check for AllocaInst that was missing in one location. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into indirect DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca where they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207235	2014-04-25 17:01:00 +00:00
Craig Topper	c0a2a29f4e	[C++] Use 'nullptr'. Transforms edition. llvm-svn: 207196	2014-04-25 05:29:35 +00:00
Karthik Bhat	a6070a9b75	Allow vectorization of bit intrinsics in BB Vectorizer. This patch adds support for vectorization of bit intrinsics such as bswap,ctpop,ctlz,cttz. llvm-svn: 207174	2014-04-25 03:33:48 +00:00
Adrian Prantl	0338f80f17	Revert "This reapplies r207130 with an additional testcase+and a missing check for" Typo in testcase. llvm-svn: 207166	2014-04-25 00:42:50 +00:00
Adrian Prantl	bf019d19e9	This reapplies r207130 with an additional testcase+and a missing check for AllocaInst that was missing in one location. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into indirect DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca where they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207165	2014-04-25 00:38:40 +00:00
Adrian Prantl	0b669e8f79	Revert "Debug info for optimized code: Support variables that are on the stack and" This reverts commit 207130 for buildbot breakage. llvm-svn: 207162	2014-04-25 00:04:49 +00:00
Adrian Prantl	ff50aa7bca	Revert "Debug info: Let dbg.values inserted by LowerDbgDeclare inherit the location" This reverts commit 207135 for buildbot breakage. llvm-svn: 207159	2014-04-24 23:53:29 +00:00
Adrian Prantl	daee7e4c17	Debug info: Let dbg.values inserted by LowerDbgDeclare inherit the location of the dbg.value. This gets rid of tons of redundant variable DIEs in subscopes. rdar://problem/14874886, rdar://problem/16679936 llvm-svn: 207135	2014-04-24 18:44:15 +00:00
Adrian Prantl	807e5d8a9a	Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into indirect DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca where they are (see comment) Other - regenerated/updated instcombine-intrinsics testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207130	2014-04-24 17:41:45 +00:00
Karthik Bhat	fd6c53ce06	Allow vectorization of few missed llvm intrinsic calls in BBVectorizer by handling them in isVectorizableIntrinsic function. llvm-svn: 207085	2014-04-24 07:29:55 +00:00
Michael J. Spencer	65065bf94f	[InstCombine][x86] Constant fold psll intrinsics. This excludes avx512 as I don't have hardware to verify. It excludes _dq variants because they are represented in the IR as <{2,4} x i64> when it's actually a byte shift of the entire i{128,256}. This also excludes _dq_bs as they aren't at all supported by the backend. There are also no corresponding instructions in the ISA. I have no idea why they exist... llvm-svn: 207058	2014-04-24 00:58:18 +00:00
Filipe Cabecinhas	696e2aae90	Optimize some special cases for SSE4a insertqi Summary: Since the upper 64 bits of the destination register are undefined when performing this operation, we can substitute it and let the optimizer figure out that only a copy is needed. Also added range merging, if an instruction copies a range that can be merged with a previous copied range. Added test cases for both optimizations. Reviewers: grosbach, nadav CC: llvm-commits Differential Revision: http://reviews.llvm.org/D3357 llvm-svn: 207055	2014-04-24 00:38:14 +00:00
Matt Arsenault	bc017b7e11	Handle addrspacecast when looking at memcpys from globals llvm-svn: 207054	2014-04-24 00:01:09 +00:00
Matt Arsenault	b37a455d1d	Remove more default address space argument usage. These places are inconsequential in practice. llvm-svn: 207021	2014-04-23 20:58:57 +00:00
Matt Arsenault	71fe4a9cad	Don't use default address space arguments in GlobalOpt llvm-svn: 207019	2014-04-23 20:36:10 +00:00
Alexander Potapenko	eb29afd291	[ASan] Move the shadow range on 32-bit iOS (and iOS Simulator) to 0x40000000-0x60000000 to avoid address space clash with system libraries. The solution has been proposed by tahabekireren@gmail.com in https://code.google.com/p/address-sanitizer/issues/detail?id=210 This is also known to fix some Chromium iOS tests. llvm-svn: 207002	2014-04-23 17:14:45 +00:00
Matt Arsenault	457de98b62	Remove dead code in instcombine. Don't replace shifts greater than the type with the maximum shift. This isn't hit anywhere in the tests, and somewhere else is replacing these with undef. llvm-svn: 207000	2014-04-23 16:48:40 +00:00
Evgeniy Stepanov	87ddc3340b	Fix handling of missing DataLayout in sanitizers. Pass::doInitialization is supposed to return False when it did not change the program, not when a fatal error occurs. llvm-svn: 206975	2014-04-23 12:51:32 +00:00
Alexander Musman	721e76d2fa	[LV] Statistics numbers for LoopVectorize introduced: a number of analyzed loops & a number of vectorized loops. Use -stats to see how many loops were analyzed for possible vectorization and how many of them were actually vectorized. Patch by Zinovy Nis Differential Revision: http://reviews.llvm.org/D3438 llvm-svn: 206956	2014-04-23 08:40:37 +00:00
Juergen Ributzka	e6dc2fa469	[Constant Hoisting] Materialize the constant before the cloned cast instruction. In the case where the constant comes from a cloned cast instruction, the materialization code has to go before the cloned cast instruction. This commit fixes the method that finds the materialization insertion point by making it aware of this case. This fixes <rdar://problem/15532441> llvm-svn: 206913	2014-04-22 18:06:58 +00:00
Juergen Ributzka	7287943879	[Constant Hoisting] Print the instructions in the correct order for debugging. No functional change. llvm-svn: 206912	2014-04-22 18:06:51 +00:00
Kostya Serebryany	291de5fd50	[asan] Support outline instrumentation for wide types and delete dead code, patch by Yuri Gribov llvm-svn: 206883	2014-04-22 11:19:45 +00:00
Chandler Carruth	6f9ba6a633	[Modules] Fix potential ODR violations by sinking the DEBUG_TYPE definition below all of the header #include lines, lib/Transforms/... edition. This one is tricky for two reasons. We again have a couple of passes that define something else before the includes as well. I've sunk their name macros with the DEBUG_TYPE. Also, InstCombine contains headers that need DEBUG_TYPE, so now those headers #define and #undef DEBUG_TYPE around their code, leaving them well formed modular headers. Fixing these headers was a large motivation for all of these changes, as "leaky" macros of this form are hard on the modules implementation. llvm-svn: 206844	2014-04-22 02:55:47 +00:00
Chandler Carruth	15c7b91ac2	[Modules] Make Support/Debug.h modular. This requires it to not change behavior based on other files defining DEBUG_TYPE, which means it cannot define DEBUG_TYPE at all. This is actually better IMO as it forces folks to define relevant DEBUG_TYPEs for their files. However, it requires all files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't already. I've updated all such files in LLVM and will do the same for other upstream projects. This still leaves one important change in how LLVM uses the DEBUG_TYPE macro going forward: we need to only define the macro after header files have been #include-ed. Previously, this wasn't possible because Debug.h required the macro to be pre-defined. This commit removes that. By defining DEBUG_TYPE after the includes two things are fixed: - Header files that need to provide a DEBUG_TYPE for some inline code can do so by defining the macro before their inline code and undef-ing it afterward so the macro does not escape. - We no longer have rampant ODR violations due to including headers with different DEBUG_TYPE definitions. This may be mostly an academic violation today, but with modules these types of violations are easy to check for and potentially very relevant. Where necessary to support headers with DEBUG_TYPE, I have moved the definitions below the includes in this commit. I plan to move the rest of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big enough. The comments in Debug.h, which were hilariously out of date already, have been updated to reflect the recommended practice going forward. llvm-svn: 206822	2014-04-21 22:55:11 +00:00
Rafael Espindola	5bfe46aee5	Simplify a vpermil* with constant mask. With a constant mask a vpermil* is just a shufflevector. This patch implements that simplification. This allows us to produce denser code. It should also allow more folding down the line. llvm-svn: 206801	2014-04-21 22:06:04 +00:00
David Blaikie	091f1b9444	Use unique_ptr to handle GlobalOpt's Evaluator members llvm-svn: 206790	2014-04-21 20:49:36 +00:00
Reid Kleckner	5013d7ca2d	Fix PR7272 in -tailcallelim instead of the inliner The -tailcallelim pass should be checking if byval or inalloca args can be captured before marking calls as tail calls. This was the real root cause of PR7272. With a better fix in place, revert the inliner change from r105255. The test case it introduced still passes and has been moved to test/Transforms/Inline/byval-tail-call.ll. Reviewers: chandlerc Differential Revision: http://reviews.llvm.org/D3403 llvm-svn: 206789	2014-04-21 20:48:47 +00:00
David Blaikie	057cb4407d	Simplify expression that was explicitly naming an operator overload in a call. llvm-svn: 206788	2014-04-21 20:43:51 +00:00
David Blaikie	f04826c7d5	Use unique_ptr to handle ownership of GCOVFunctions in GCOVProfiler. llvm-svn: 206786	2014-04-21 20:41:55 +00:00
Chandler Carruth	e07407deea	[Modules] Sink all the DEBUG_TYPE defines for InstCombine out of the header files and into the cpp files. These files will require more touches as the header files actually use DEBUG(). Eventually, I'll have to introduce a matched #define and #undef of DEBUG_TYPE for the header files, but that comes as step N of many to clean all of this up. llvm-svn: 206777	2014-04-21 19:51:41 +00:00
Evgeniy Stepanov	b8b4d1d879	[msan] Enable out-of-line instrumentation for large functions by default. llvm-svn: 206759	2014-04-21 15:04:05 +00:00
Kostya Serebryany	0ca459b956	[asan] add a run-time flag detect_container_overflow=true/false llvm-svn: 206756	2014-04-21 14:35:00 +00:00
Kostya Serebryany	8369b857f6	[asan] instead of inserting inline instrumentation around memset/memcpy/memmove, replace the intrinsic with __asan_memset/etc. This makes the memset/etc handling more complete and consistent with what we do in msan. It may slowdown some cases (when the intrinsic was actually inlined) and speedup other cases (when it was not inlined) llvm-svn: 206746	2014-04-21 11:50:42 +00:00
Kostya Serebryany	25679a433d	[asan] temporarily disable generating __asan_loadN/__asan_storeN llvm-svn: 206741	2014-04-21 10:28:13 +00:00
Kostya Serebryany	0405013a8c	[asan] insert __asan_loadN/__asan_storeN as out-lined asan checks, llvm part llvm-svn: 206734	2014-04-21 07:10:43 +00:00
Alp Toker	faee7c31dd	Remove some empty statements Cleanup only. llvm-svn: 206710	2014-04-19 23:56:35 +00:00
Nick Lewycky	bd9ff641e7	Check whether functions have any lines associated before emitting coverage info for them. This isn't just a size/time saving, gcov may crash on these. llvm-svn: 206671	2014-04-18 23:32:28 +00:00
Evgeniy Stepanov	de38078fd6	[msan] Add -msan-instrumentation-with-call-threshold. This flag replaces inline instrumentation for checks and origin stores with calls into MSan runtime library. This is a workaround for PR17409. Disabled by default. llvm-svn: 206585	2014-04-18 12:17:20 +00:00
Kostya Serebryany	2b02920109	[asan] one more workaround for PR17409: don't do BB-level coverage instrumentation if there are more than N (=1500) basic blocks. This makes ASanCoverage work on libjpeg_turbo/jchuff.c used by Chrome, which has 1824 BBs llvm-svn: 206564	2014-04-18 08:02:42 +00:00
Duncan P. N. Exon Smith	7063f2846a	PMBuilder: Expose an option to disable tail calls Adds API to allow frontends to disable tail calls in PassManagerBuilder. <rdar://problem/16050591> llvm-svn: 206542	2014-04-18 01:05:15 +00:00
Diego Novillo	45811c5ea3	Fix bug 19437 - Only add discriminators for DWARF 4 and above. Summary: This prevents the discriminator generation pass from triggering if the DWARF version being used in the module is prior to 4. Reviewers: echristo, dblaikie CC: llvm-commits Differential Revision: http://reviews.llvm.org/D3413 llvm-svn: 206507	2014-04-17 22:33:50 +00:00
Nuno Lopes	4a36b584a3	remove some dead code lib/Analysis/IPA/InlineCost.cpp \| 18 ------------------ lib/Analysis/RegionPass.cpp \| 1 - lib/Analysis/TypeBasedAliasAnalysis.cpp \| 1 - lib/Transforms/Scalar/LoopUnswitch.cpp \| 21 --------------------- lib/Transforms/Utils/LCSSA.cpp \| 2 -- lib/Transforms/Utils/LoopSimplify.cpp \| 6 ------ utils/TableGen/AsmWriterEmitter.cpp \| 13 ------------- utils/TableGen/DFAPacketizerEmitter.cpp \| 7 ------- utils/TableGen/IntrinsicEmitter.cpp \| 2 -- 9 files changed, 71 deletions(-) llvm-svn: 206506	2014-04-17 22:26:44 +00:00
NAKAMURA Takumi	51c35adf06	Inliner::OptimizationRemark: Fix crash in clang/test/Frontend/optimization-remark.c on some hosts, including --vg. DebugLoc in Callsite would not live after Inliner. It should be copied before Inliner. llvm-svn: 206459	2014-04-17 12:22:14 +00:00
Kostya Serebryany	9bec638044	[asan] add two new hidden compile-time flags for asan: asan-instrumentation-with-call-threshold and asan-memory-access-callback-prefix. This is part of the workaround for PR17409 (instrument huge functions with callbacks instead of inlined code). These flags will also help us experiment with kasan (kernel-asan) and clang llvm-svn: 206383	2014-04-16 12:12:19 +00:00
Julien Lerouge	dd5842a2e5	Add lifetime markers for allocas created to hold byval arguments, make them appear in the InlineFunctionInfo. llvm-svn: 206308	2014-04-15 18:06:46 +00:00
Julien Lerouge	9ecdd0ce5b	Split byval argument initialization so the memcpy(s) are injected at the beginning of the first new block after inlining. llvm-svn: 206307	2014-04-15 18:01:54 +00:00
Duncan P. N. Exon Smith	571c11d959	LTO: Add more loop simplification passes to LTO Similar to r202051, add missing loop simplification passes to the LTO optimization pipeline. Patch by Rafael Espindola. llvm-svn: 206306	2014-04-15 17:48:15 +00:00
Duncan P. N. Exon Smith	58154f2238	verify-di: Implement DebugInfoVerifier Implement DebugInfoVerifier, which steals verification relying on DebugInfoFinder from Verifier. - Adds LegacyDebugInfoVerifierPassPass, a ModulePass which wraps DebugInfoVerifier. Uses -verify-di command-line flag. - Change verifyModule() to invoke DebugInfoVerifier as well as Verifier. - Add a call to createDebugInfoVerifierPass() wherever there was a call to createVerifierPass(). This implementation as a module pass should sidestep efficiency issues, allowing us to turn debug info verification back on. <rdar://problem/15500563> llvm-svn: 206300	2014-04-15 16:27:38 +00:00
Alexey Bataev	135cfee77c	D3348 - [BUG] "Rotate Loop" pass kills "llvm.vectorizer.enable" metadata llvm-svn: 206266	2014-04-15 09:37:30 +00:00
Matt Arsenault	2a6aada789	Revert "Revert r206045, "Fix shift by constants for vector."" Fix cases where the Value itself is used, and not the constant value. llvm-svn: 206214	2014-04-14 21:50:37 +00:00
NAKAMURA Takumi	1a21e608ca	Whitespace. llvm-svn: 206154	2014-04-14 07:03:13 +00:00
NAKAMURA Takumi	c6fb0494ea	Revert r206045, "Fix shift by constants for vector." It broke some builders, at least, i686. llvm-svn: 206153	2014-04-14 07:02:57 +00:00
Serge Pavlov	c145d314a1	Use APInt arithmetic, fixed typo. Thanks to Benjamin Kramer for noticing that. llvm-svn: 206144	2014-04-14 02:20:19 +00:00
Serge Pavlov	816d014c52	Recognize test for overflow in integer multiplication. If multiplication involves zero-extended arguments and the result is compared as in the patterns: %mul32 = trunc i64 %mul64 to i32 %zext = zext i32 %mul32 to i64 %overflow = icmp ne i64 %mul64, %zext or %overflow = icmp ugt i64 %mul64 , 0xffffffff then the multiplication may be replaced by call to umul.with.overflow. This change fixes PR4917 and PR4918. Differential Revision: http://llvm-reviews.chandlerc.com/D2814 llvm-svn: 206137	2014-04-13 18:23:41 +00:00
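The two overflow tests named in r206137 can be compared numerically. The following is a minimal sketch in Python that models i32 and i64 values with bit masks; the function names are hypothetical and are not LLVM APIs. It only illustrates why both patterns carry the same information as the overflow bit of `umul.with.overflow`.

```python
MASK32 = 0xFFFFFFFF  # all-ones i32 value


def overflow_by_roundtrip(a: int, b: int) -> bool:
    """First pattern: trunc the 64-bit product to i32, zext it back, icmp ne."""
    mul64 = (a & MASK32) * (b & MASK32)  # product of zero-extended operands
    mul32 = mul64 & MASK32               # trunc i64 -> i32
    zext = mul32                         # zext i32 -> i64 keeps the value
    return mul64 != zext


def overflow_by_compare(a: int, b: int) -> bool:
    """Second pattern: icmp ugt i64 %mul64, 0xffffffff."""
    mul64 = (a & MASK32) * (b & MASK32)
    return mul64 > MASK32


def umul_with_overflow(a: int, b: int) -> tuple[int, bool]:
    """Model of a 32-bit unsigned multiply returning (wrapped product, overflow bit)."""
    full = (a & MASK32) * (b & MASK32)
    return full & MASK32, full > MASK32
```

Both hand-written tests agree with the overflow bit for any pair of 32-bit inputs, which is what makes the replacement possible.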
Matt Arsenault	c399a3f659	Fix shift by constants for vector. ashr <N x iM>, <N x iM> M -> undef llvm-svn: 206045	2014-04-11 17:57:53 +00:00
David Blaikie	1573e6e09f	Implement depth_first and inverse_depth_first range factory functions. Also updated as many loops as I could find using df_begin/idf_begin - strangely I found no uses of idf_begin. Is that just used out of tree? Also a few places couldn't use df_begin because either they used the member functions of the depth first iterators or had specific ordering constraints (I added a comment in the latter case). Based on a patch by Jim Grosbach. (Jim - you just had iterator_range<T> where you needed iterator_range<idf_iterator<T>>) llvm-svn: 206016	2014-04-11 01:50:01 +00:00
Arnold Schwaighofer	c65ae6074a	Reapply "SLPVectorizer: Ignore users that are insertelements we can reschedule them" This commit reapplies 205018. After 205855 we should correctly vectorize intrinsics. llvm-svn: 205965	2014-04-10 13:41:35 +00:00
Alp Toker	111bd28e59	Fix some doc and comment typos llvm-svn: 205899	2014-04-09 14:47:27 +00:00
Arnold Schwaighofer	1a503c9322	SLPVectorizer: Only vectorize intrinsics whose operands are widened equally The vectorizer only knows how to vectorize intrinsics by widening all operands by the same factor. Patch by Tyler Nowicki! llvm-svn: 205855	2014-04-09 14:20:47 +00:00
Diego Novillo	224c7b79fe	Add support for optimization reports. Summary: This patch adds backend support for -Rpass=, which indicates the name of the optimization pass that should emit remarks stating when it made a transformation to the code. Pass names are taken from their DEBUG_NAME definitions. When emitting an optimization report diagnostic, the lack of debug information causes the diagnostic to use "<unknown>:0:0" as the location string. This is the back end counterpart for http://llvm-reviews.chandlerc.com/D3226 Reviewers: qcolombet CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D3227 llvm-svn: 205774	2014-04-08 16:42:34 +00:00
Eric Christopher	23b79cb873	Add NDEBUG markers around debug only function. llvm-svn: 205706	2014-04-07 12:46:30 +00:00
Eric Christopher	06a9cfdefa	Add debug location information to the vectorizer debug statements. Patch by Zinovy Nis. llvm-svn: 205705	2014-04-07 12:32:17 +00:00
David Blaikie	e0b9857e92	Fixing typo. Differential Revision: http://reviews.llvm.org/D3154 llvm-svn: 205674	2014-04-05 20:30:31 +00:00
Eli Bendersky	be453afe99	Fix PR19270 - type mismatch caused by invalid optimization. Patch by Jingyue Wu. llvm-svn: 205547	2014-04-03 17:51:58 +00:00
Juergen Ributzka	da301c01ab	Revert "[Constant Hoisting] Lazily compute the idom and cache the result." This code is no longer useful, because we only compute and use the IDom once. There is no benefit in caching it anymore. llvm-svn: 205498	2014-04-03 01:38:47 +00:00
Duncan P. N. Exon Smith	7f2af7c18b	Revert "Reapply "LTO: add API to set strategy for -internalize"" This reverts commit r199244. Conflicts: include/llvm-c/lto.h include/llvm/LTO/LTOCodeGenerator.h lib/LTO/LTOCodeGenerator.cpp llvm-svn: 205471	2014-04-02 22:05:57 +00:00
Tim Northover	466b3a39e1	SLPVectorizer: compare entire intrinsic for SLP compatibility. Some Intrinsics are overloaded to the extent that return type equality (all that's been checked up to now) does not guarantee that the arguments are the same. In these cases SLP vectorizer should not recurse into the operands, which can be achieved by comparing them as "Function *" rather than simply the ID. llvm-svn: 205424	2014-04-02 14:39:02 +00:00
Hal Finkel	5a327230eb	[LoopVectorizer] Count dependencies of consecutive pointers as uniforms For the purpose of calculating the cost of the loop at various vectorization factors, we need to count dependencies of consecutive pointers as uniforms (which means that the VF = 1 cost is used for all overall VF values). For example, the TSVC benchmark function s173 has: ... %3 = add nsw i64 %indvars.iv, 16000 %arrayidx8 = getelementptr inbounds %struct.GlobalData* @global_data, i64 0, i32 0, i64 %3 ... and we must realize that the add will be a scalar in order to correctly deduce it to be profitable to vectorize this on PowerPC with VSX enabled. In fact, all dependencies of a consecutive pointer must be a scalar (uniform), and so we simply need to add all consecutive pointers to the worklist that currently collects uniforms. Fixes PR19296. llvm-svn: 205387	2014-04-02 02:34:49 +00:00
Hal Finkel	b3f2a21eed	Add some additional fields to TTI::UnrollingPreferences In preparation for an upcoming commit implementing unrolling preferences for x86, this adds additional fields to the UnrollingPreferences structure: - PartialThreshold and PartialOptSizeThreshold - Like Threshold and OptSizeThreshold, but used when not fully unrolling. These are necessary because we need different thresholds for full unrolling from those used when partially unrolling (the full unrolling thresholds are generally going to be larger). - MaxCount - A cap on the unrolling factor when partially unrolling. This can be used by a target to prevent the unrolled loop from exceeding some resource limit independent of the loop size (such as number of branches). There should be no functionality change for any in-tree targets. llvm-svn: 205347	2014-04-01 18:50:30 +00:00
Hal Finkel	dc0e116444	Move partial/runtime unrolling late in the pipeline The generic (concatenation) loop unroller is currently placed early in the standard optimization pipeline. This is a good place to perform full unrolling, but not the right place to perform partial/runtime unrolling. However, most targets don't enable partial/runtime unrolling, so this never mattered. However, even some x86 cores benefit from partial/runtime unrolling of very small loops, and follow-up commits will enable this. First, we need to move partial/runtime unrolling late in the optimization pipeline (importantly, this is after SLP and loop vectorization, as vectorization can drastically change the size of a loop), while keeping the full unrolling where it is now. This change does just that. llvm-svn: 205264	2014-03-31 23:23:51 +00:00
Arnold Schwaighofer	219f6a43e0	Revert "SLPVectorizer: Ignore users that are insertelements we can reschedule them" This reverts commit r205018. Conflicts: lib/Transforms/Vectorize/SLPVectorizer.cpp test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll This is breaking libclc build. llvm-svn: 205260	2014-03-31 23:05:56 +00:00
Rafael Espindola	18c992ab85	Add a missing break. Patch by Tobias Güntner. I tried to write a test, but the only difference is the Changed value that gets returned. It can be tested with "opt -debug-pass=Executions -functionattrs", but that doesn't seem worth it. llvm-svn: 205121	2014-03-30 03:26:17 +00:00
Tim Northover	2f13163a84	ARM64: initial backend import This adds a second implementation of the AArch64 architecture to LLVM, accessible in parallel via the "arm64" triple. The plan over the coming weeks & months is to merge the two into a single backend, during which time thorough code review should naturally occur. Everything will be easier with the target in-tree though, hence this commit. llvm-svn: 205090	2014-03-29 10:18:08 +00:00
Arnold Schwaighofer	bf6c68c0be	SLPVectorizer: Take credit for free extractelement instructions Extract element instructions that will be removed when vectorizing lower the cost. Patch by Arch D. Robison! llvm-svn: 205020	2014-03-28 17:21:32 +00:00
Arnold Schwaighofer	ffb5e31163	SLPVectorizer: Fix typos Patch by Arch D. Robison! llvm-svn: 205019	2014-03-28 17:21:27 +00:00
Arnold Schwaighofer	8510d16f52	SLPVectorizer: Ignore users that are insertelements we can reschedule them Patch by Arch D. Robison! llvm-svn: 205018	2014-03-28 17:21:22 +00:00
Erik Verbruggen	11e61b79e5	Revert "InstCombine: merge constants in both operands of icmp." This reverts commit r204912, and follow-up commit r204948. This introduced a performance regression, and the fix is not completely clear yet. llvm-svn: 205010	2014-03-28 14:50:57 +00:00
Erik Verbruggen	06cf5cbf74	Revert "GVN: merge overflow intrinsics with non-overflow instructions." This reverts commit r203553, and follow-up commits r203558 and r203574. I will follow this up on the mailinglist to do it in a way that won't cause subtle PRE bugs. llvm-svn: 205009	2014-03-28 14:42:34 +00:00
Adrian Prantl	3324d55193	C++11: convert verbose loops to range-based loops. llvm-svn: 204981	2014-03-27 23:30:04 +00:00
Reid Kleckner	c826dce075	InstCombine: Don't combine constants on unsigned icmps Fixes a miscompile introduced in r204912. It would miscompile code like (unsigned)(a + -49) <= 5U. The transform would turn this into (unsigned)a < 55U, which would return true for values in [0, 49], when it should not. llvm-svn: 204948	2014-03-27 17:49:27 +00:00
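The miscompile described in r204948 can be reproduced with plain integer arithmetic. The sketch below is an illustration in Python that models 32-bit unsigned wraparound with a mask; the function names are hypothetical. In this model the two forms disagree for every `a` from 0 through 48, and both are true from 49 through 54.

```python
MASK32 = 0xFFFFFFFF  # models unsigned 32-bit wraparound


def original(a: int) -> bool:
    """(unsigned)(a + -49) <= 5U, evaluated with 32-bit wraparound."""
    return ((a - 49) & MASK32) <= 5


def miscompiled(a: int) -> bool:
    """(unsigned)a < 55U, the form produced by the faulty transform."""
    return (a & MASK32) < 55


# Inputs in a small window where the two forms give different answers.
disagreements = [a for a in range(100) if original(a) != miscompiled(a)]
```

For small `a` the subtraction wraps to a very large unsigned value, so the original comparison is false while the rewritten one is true.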
Rafael Espindola	5c8926deed	Prevent alias from pointing to weak aliases. This adds back r204781. Original message: Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given define void @my_func() { ret void } @my_alias = alias weak void ()* @my_func @my_alias2 = alias void ()* @my_alias We produce without this patch: .weak my_alias my_alias = my_func .globl my_alias2 my_alias2 = my_alias That is, in the resulting ELF file my_alias, my_func and my_alias2 are just 3 names pointing to offset 0 of .text. That is not the semantics of IR linking. For example, linking in a @my_alias = alias void ()* @other_func would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func. There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error. llvm-svn: 204934	2014-03-27 15:26:56 +00:00
Erik Verbruggen	5e4efd4306	InstCombine: merge constants in both operands of icmp. Transform: icmp X+Cst2, Cst into: icmp X, Cst-Cst2 when Cst-Cst2 does not overflow, and the add has nsw. llvm-svn: 204912	2014-03-27 11:16:05 +00:00
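The side conditions in r204912 (no overflow in `Cst-Cst2`, and `nsw` on the add) can be checked exhaustively on a tiny integer width. This is a sketch under stated assumptions: it uses a hypothetical 4-bit signed model and the signed less-than predicate only, and the helper names are not LLVM APIs.

```python
def wrap4(v: int) -> int:
    """Wrap an integer into the 4-bit signed range [-8, 7]."""
    return ((v + 8) % 16) - 8


def fits4(v: int) -> bool:
    """True if v is representable as a 4-bit signed value."""
    return -8 <= v <= 7


def forms_agree(x: int, c2: int, c: int) -> bool:
    """Compare 'icmp slt (X + Cst2), Cst' with 'icmp slt X, (Cst - Cst2)'."""
    before = wrap4(x + c2) < c
    after = x < wrap4(c - c2)
    return before == after


values = range(-8, 8)
# Counterexamples that remain when both side conditions hold.
guarded_failures = [
    (x, c2, c)
    for x in values for c2 in values for c in values
    if fits4(x + c2) and fits4(c - c2) and not forms_agree(x, c2, c)
]
```

With both conditions enforced no counterexample remains in this model, whereas dropping the no-overflow condition on the add admits cases such as `x = 7, c2 = 1, c = 0`.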
Nick Lewycky	a6e0e1eae1	Treat lifetime.start'd memory like we treat freshly alloca'd memory. Patch by Björn Steinbrink! llvm-svn: 204876	2014-03-26 23:45:15 +00:00
Reid Kleckner	509530b2ae	CloneFunction: Clone all attributes, including the CC Summary: Tested with a unit test because we don't appear to have any transforms that use this other than ASan, I think. Fixes PR17935. Reviewers: nicholas CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D3194 llvm-svn: 204866	2014-03-26 22:26:35 +00:00
Rafael Espindola	63a8ff6883	Revert "Prevent alias from pointing to weak aliases." This reverts commit r204781. I will follow up with the msan folks to see what they were trying to do with aliases to weak aliases. llvm-svn: 204784	2014-03-26 06:14:40 +00:00
Rafael Espindola	c9179b8b50	Prevent alias from pointing to weak aliases. Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given define void @my_func() { ret void } @my_alias = alias weak void ()* @my_func @my_alias2 = alias void ()* @my_alias We produce without this patch: .weak my_alias my_alias = my_func .globl my_alias2 my_alias2 = my_alias That is, in the resulting ELF file my_alias, my_func and my_alias2 are just 3 names pointing to offset 0 of .text. That is not the semantics of IR linking. For example, linking in a @my_alias = alias void ()* @other_func would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func. There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error. llvm-svn: 204781	2014-03-26 04:48:47 +00:00
Juergen Ributzka	822c198051	[Constant Hoisting] Make the constant candidate map local to the collectConstantCandidates method. llvm-svn: 204758	2014-03-25 21:21:10 +00:00
Richard Osborne	fd123c2caf	[InstCombine] Don't fold bitcast into store if it would need addrspacecast Summary: Previously the code didn't check if the before and after types for the store were pointers to different address spaces. This resulted in instcombine using a bitcast to convert between pointers to different address spaces, causing an assertion due to the invalid cast. It is not appropriate to use addrspacecast in this case because it is not guaranteed to be a no-op cast. Instead bail out and do not do the transformation. CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D3117 llvm-svn: 204733	2014-03-25 17:21:41 +00:00
Richard Osborne	db5d56840b	Reuse earlier variables to make it clear the types involved in the cast. No functionality change. llvm-svn: 204732	2014-03-25 17:21:35 +00:00
Evgeniy Stepanov	ad64faed33	[msan] More precise instrumentation of select IR. Some bits of select result may be initialized even if select condition is not. https://code.google.com/p/memory-sanitizer/issues/detail?id=50 llvm-svn: 204716	2014-03-25 13:08:34 +00:00
Andrew Trick	16d04697fd	SLP vectorizer: Don't hoist vector extracts of phis. Extracts coming from phis were being hoisted, while all others were sunk to their uses. This was inconsistent and didn't seem to serve a purpose. Changing all extracts to be sunk to uses is a prerequisite for adding block frequency to the SLP vectorizer's cost model. I benchmarked the change in isolation (without block frequency). I only saw noise on x86 and some potentially significant improvements on ARM. No major regressions is good enough for me. llvm-svn: 204699	2014-03-25 02:18:47 +00:00
Nuno Lopes	79d18a66ec	remove a bunch of unused private methods found with a smarter version of -Wunused-member-function that I'm playing with. Apologies in advance if I removed someone's WIP code. include/llvm/CodeGen/MachineSSAUpdater.h \| 1 include/llvm/IR/DebugInfo.h \| 3 lib/CodeGen/MachineSSAUpdater.cpp \| 10 -- lib/CodeGen/PostRASchedulerList.cpp \| 1 lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp \| 10 -- lib/IR/DebugInfo.cpp \| 12 -- lib/MC/MCAsmStreamer.cpp \| 2 lib/Support/YAMLParser.cpp \| 39 --------- lib/TableGen/TGParser.cpp \| 16 --- lib/TableGen/TGParser.h \| 1 lib/Target/AArch64/AArch64TargetTransformInfo.cpp \| 9 -- lib/Target/ARM/ARMCodeEmitter.cpp \| 12 -- lib/Target/ARM/ARMFastISel.cpp \| 84 -------------------- lib/Target/Mips/MipsCodeEmitter.cpp \| 11 -- lib/Target/Mips/MipsConstantIslandPass.cpp \| 12 -- lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp \| 21 ----- lib/Target/NVPTX/NVPTXISelDAGToDAG.h \| 2 lib/Target/PowerPC/PPCFastISel.cpp \| 1 lib/Transforms/Instrumentation/AddressSanitizer.cpp \| 2 lib/Transforms/Instrumentation/BoundsChecking.cpp \| 2 lib/Transforms/Instrumentation/MemorySanitizer.cpp \| 1 lib/Transforms/Scalar/LoopIdiomRecognize.cpp \| 8 - lib/Transforms/Scalar/SCCP.cpp \| 1 utils/TableGen/CodeEmitterGen.cpp \| 2 24 files changed, 2 insertions(+), 261 deletions(-) llvm-svn: 204560	2014-03-23 17:09:26 +00:00
Lang Hames	75ea6aebf8	Revert r204076 for now - it caused significant regressions in a number of benchmarks. <rdar://problem/16368461> llvm-svn: 204558	2014-03-23 04:22:31 +00:00
Juergen Ributzka	9a985a07a3	[Constant Hoisting] Erase dead cast instructions. The cleanup code that removes dead cast instructions only removed them from the basic block, but didn't delete them. This fix erases them now too. llvm-svn: 204538	2014-03-22 01:49:30 +00:00
Juergen Ributzka	a05de79cbb	[Constant Hoisting] Fix multiple entries for the same basic block in PHI nodes. A PHI node usually has only one value/basic block pair per incoming basic block. In the case of a switch statement it is possible that a following PHI node may have more than one such pair per incoming basic block. E.g.: %0 = phi i64 [ 123456, %case2 ], [ 654321, %Entry ], [ 654321, %Entry ] This is valid and the verifier doesn't complain, because both values are the same. Constant hoisting materializes the constant for each operand separately and the value is still the same, but the variable names have changed. As a result the verifier can't recognize anymore that they are the same value and complains. This fix adds special update code for PHI node in constant hoisting to prevent this corner case. This fixes <rdar://problem/16394449> llvm-svn: 204537	2014-03-22 01:49:27 +00:00
Arnaud A. de Grandmaison	c417a5d328	Remove some dead assignments found by scan-build llvm-svn: 204526	2014-03-21 21:54:46 +00:00
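The PHI corner case in r204537 can be modelled without LLVM. The sketch below is a Python illustration with hypothetical helper names; the commit does not spell out how its "special update code" works, so reusing one materialized name per (constant, block) pair is an assumption chosen only to show why per-operand materialization trips the verifier.

```python
from itertools import count


def verifier_accepts(incoming):
    """A PHI may repeat a predecessor only if every entry for it has the same value."""
    seen = {}
    for value, block in incoming:
        if seen.setdefault(block, value) != value:
            return False
    return True


def materialize_per_operand(incoming):
    """Naive rewrite: every operand gets its own freshly named hoisted value."""
    ids = count()
    return [(f"%const{next(ids)}", block) for _, block in incoming]


def materialize_per_block(incoming):
    """Assumed fix: reuse one hoisted name per (constant, incoming block) pair."""
    ids = count()
    cache = {}
    rewritten = []
    for value, block in incoming:
        key = (value, block)
        if key not in cache:
            cache[key] = f"%const{next(ids)}"
        rewritten.append((cache[key], block))
    return rewritten


# The PHI from the commit message: %Entry appears twice with the same constant.
phi = [(123456, "%case2"), (654321, "%Entry"), (654321, "%Entry")]
```

The original PHI passes the modelled check, the per-operand rewrite fails it because the two `%Entry` entries now carry different names, and the cached rewrite passes again.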
Tom Stellard	fe1239b8cb	Sink: Don't sink static allocas from the entry block CodeGen treats allocas outside the entry block as dynamically sized stack objects. llvm-svn: 204473	2014-03-21 15:51:51 +00:00
Juergen Ributzka	4470e9c92d	[Constant Hoisting] Make the constant materialization cost operand dependent Extend the target hook to take also the operand index into account when calculating the cost of the constant materialization. Related to <rdar://problem/16381500> llvm-svn: 204435	2014-03-21 06:04:45 +00:00
Juergen Ributzka	a65ce371c7	[Constant Hoisting] Lazily compute the idom and cache the result. Related to <rdar://problem/16381500> llvm-svn: 204434	2014-03-21 06:04:39 +00:00
Juergen Ributzka	b52c2c678c	[Constant Hoisting] Change the algorithm to only track constants for instructions. Originally the algorithm would search for expensive constants and track their users, which could be instructions and constant expressions. This change only tracks the constants for instructions, but constant expressions are indirectly covered too. If an operand is a constant expression, then we look through the expression to find any expensive constants. The algorithm now keeps track of the instruction and the operand index where the constant is used. This allows more precise hoisting of constant materialization code for PHI instructions, because we only hoist to the basic block of the incoming operand. Before we had to find the idom of all PHI operands and hoist the materialization code there. This also makes updating of instructions easier. Before we had to keep track of the original constant, find it in the instructions, and then replace it. Now we can just simply update the operand. Related to <rdar://problem/16381500> llvm-svn: 204433	2014-03-21 06:04:36 +00:00
Juergen Ributzka	2e77fbe182	[Constant Hoisting] Fix capitalization of function names. llvm-svn: 204432	2014-03-21 06:04:33 +00:00
Juergen Ributzka	60d9807b0e	[Constant Hoisting] Replace the MapVector with a separate Map and Vector to keep track of constant candidates. This simplifies working with the constant candidates and removes the tight coupling between the map and the vector. Related to <rdar://problem/16381500> llvm-svn: 204431	2014-03-21 06:04:30 +00:00
Juergen Ributzka	c55e0f3fc7	Revert "[Constant Hoisting] Extend coverage of the constant hoisting pass." I will break this up into smaller pieces for review and recommit. llvm-svn: 204393	2014-03-20 20:17:13 +00:00
Juergen Ributzka	7dae5f7baa	[Constant Hoisting] Extend coverage of the constant hoisting pass. This commit extends the coverage of the constant hoisting pass, adds additional debug output and updates the function names according to the style guide. Related to <rdar://problem/16381500> llvm-svn: 204389	2014-03-20 19:55:52 +00:00
Mark Seaborn	ade468f2c3	Remove LowerInvoke's obsolete "-enable-correct-eh-support" option This option caused LowerInvoke to generate code using SJLJ-based exception handling, but there is no code left that interprets the jmp_buf stack that the resulting code maintained (llvm.sjljeh.jblist). This option has been obsolete for a while, and replaced by SjLjEHPrepare. This leaves the default behaviour of LowerInvoke, which is to convert invokes to calls. Differential Revision: http://llvm-reviews.chandlerc.com/D3136 llvm-svn: 204388	2014-03-20 19:54:47 +00:00
Alexander Potapenko	0eb130e34f	[ASan] Do not instrument globals from the llvm.metadata section. Fixes https://code.google.com/p/address-sanitizer/issues/detail?id=279. llvm-svn: 204331	2014-03-20 10:48:34 +00:00
Evgeniy Stepanov	17d50b69f6	Set debug info for instructions inserted in SplitBlockAndInsertIfThen. llvm-svn: 204230	2014-03-19 12:56:38 +00:00
Duncan P. N. Exon Smith	6fc0b70a7b	Fix use_iterator crash in ObjCArc from r203364 The use_iterator redesign in r203364 introduced an increment past the end of a range in -objc-arc-contract. Added an explicit check for the end of the range. <rdar://problem/16333235> llvm-svn: 204195	2014-03-18 22:32:43 +00:00
Chandler Carruth	536fb8893d	[LV] While I'm here, use range based for loops which are so much cleaner for this kind of walk. llvm-svn: 204188	2014-03-18 22:00:32 +00:00
Chandler Carruth	9fa85c489d	[LV] The actual change I intended to commit in r204148. Sorry for the noise. Original commit log: Replace some dead code with an assert. When I first ported this pass from a loop pass to a function pass I did so in the naive, recursive way. It doesn't actually work, we need a worklist instead. When I switched to the worklist I didn't delete the naive recursion. That recursion was also buggy because it was dead and never really exercised. llvm-svn: 204187	2014-03-18 21:58:38 +00:00
Chandler Carruth	4ac3e74751	[LV] Replace some dead code with an assert. When I first ported this pass from a loop pass to a function pass I did so in the naive, recursive way. It doesn't actually work, we need a worklist instead. When I switched to the worklist I didn't delete the naive recursion. That recursion was also buggy because it was dead and never really exercised. llvm-svn: 204184	2014-03-18 21:51:46 +00:00
Evgeniy Stepanov	4e42dcfe00	[msan] Origin tracking with history. LLVM part of MSan implementation of advanced origin tracking, when we record not only creation point, but all locations where an uninitialized value was stored to memory, too. llvm-svn: 204151	2014-03-18 13:30:56 +00:00
Diego Novillo	119221ccbc	Tolerate unmangled names in sample profiles. Summary: The compiler does not always generate linkage names. If a function has been inlined and its body elided, its linkage name may not be generated. When the binary executes, the profiler will use its unmangled name when attributing samples. This results in unmangled names in the input profile. We are currently failing hard when this happens. However, in this case all that happens is that we fail to attribute samples to the inlined function. While this means fewer optimization opportunities, it should not cause a compilation failure. This patch accepts all valid function names, regardless of whether they were mangled or not. Reviewers: chandlerc CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D3087 llvm-svn: 204142	2014-03-18 12:03:12 +00:00
Evgeniy Stepanov	cfd1cf2b01	[msan] Kill -msan-store-clean-origin flag. Not only is it slower than the alternative, but also subtly broken. This commit does not change the default behavior. llvm-svn: 204131	2014-03-18 09:47:06 +00:00
Alon Mishne	70ba46ff38	[C++11] Change DebugInfoFinder to use range-based loops Also changes the iterators to return actual DI type over MDNode. llvm-svn: 204130	2014-03-18 09:41:07 +00:00
Evgeniy Stepanov	7ad8a1f5a2	[msan] Remove unused code. llvm-svn: 204125	2014-03-18 08:29:42 +00:00
Dan Gohman	b0339af0e1	Use range metadata instead of introducing selects. When GlobalOpt has determined that a GlobalVariable only ever has two values, it would convert the GlobalVariable to a boolean, and introduce SelectInsts at every load, to choose between the two possible values. These SelectInsts introduce overhead and other unpleasantness. This patch makes GlobalOpt just add range metadata to loads from such GlobalVariables instead. This enables the same main optimization (as seen in test/Transforms/GlobalOpt/integer-bool.ll), without introducing selects. The main downside is that it doesn't get the memory savings of shrinking such GlobalVariables, but this is expected to be negligible. llvm-svn: 204076	2014-03-17 19:57:04 +00:00
Eli Bendersky	631277f3dd	Consistent use of the noduplicate attribute. The "noduplicate" attribute of call instructions is sometimes queried directly and sometimes through the cannotDuplicate() predicate. This patch streamlines all queries to use the cannotDuplicate() predicate. It also adds this predicate to InvokeInst, to mirror what CallInst has. llvm-svn: 204049	2014-03-17 16:19:07 +00:00
David Blaikie	60ddd2b93c	Remove named Twine. While technically correct, we generally disallow any instance of named Twines due to their subtlety. llvm-svn: 204016	2014-03-16 01:36:18 +00:00
Arnaud A. de Grandmaison	4544f80f7c	Remove some dead assignments found by scan-build llvm-svn: 204013	2014-03-15 22:13:15 +00:00
Benjamin Kramer	8e0892b4f3	LSR: Compress a pair (and get rid of the DenseMapInfo for it). Also convert a horrible hash function to use our hashing infrastructure. No functionality change. llvm-svn: 204008	2014-03-15 17:17:48 +00:00
NAKAMURA Takumi	2d8e94aadc	SampleProfile.cpp: Fix take #2 . The issue was abuse of StringRef here. llvm-svn: 203996	2014-03-15 01:56:17 +00:00
NAKAMURA Takumi	8cb5938310	SampleProfile.cpp: Quick fix to r203976 about abuse of Twine. The life of Twine was too short. FIXME: DiagnosticInfoSampleProfile should not hold Twine&. llvm-svn: 203990	2014-03-15 00:10:12 +00:00
Diego Novillo	6368420655	Re-format SampleProfile.cpp with clang-format. No functional changes. llvm-svn: 203977	2014-03-14 22:07:18 +00:00
Diego Novillo	a9a26c6236	Use DiagnosticInfo facility. Summary: The sample profiler pass emits several error messages. Instead of just aborting the compiler with report_fatal_error, we can emit better messages using DiagnosticInfo. This adds a new sub-class of DiagnosticInfo to handle the sample profiler. Reviewers: chandlerc, qcolombet CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D3086 llvm-svn: 203976	2014-03-14 21:58:59 +00:00
Alexander Potapenko	a22163b83f	[ASan] Fix https://code.google.com/p/address-sanitizer/issues/detail?id=274 by ignoring globals from __TEXT,__cstring,cstring_literals during instrumentation. Add a regression test. llvm-svn: 203916	2014-03-14 10:41:49 +00:00
Stepan Dyatkovskiy	2789f696ba	MergeFunctions, cmpType: fixed variable names from XXTy1 and XXTy2 to XXTyL and XXTyR. llvm-svn: 203907	2014-03-14 08:48:52 +00:00
Stepan Dyatkovskiy	0504c3145a	MergeFunctions, cmpType: Fixed comments wrapping. llvm-svn: 203905	2014-03-14 08:17:19 +00:00
Owen Anderson	3a006737fe	Fix a bug in InstCombine where we would incorrectly attempt to construct a bitcast between pointers of two different address spaces if they happened to have the same pointer size. llvm-svn: 203862	2014-03-13 22:51:43 +00:00
Evgeniy Stepanov	04442bc559	[msan] Fix handling of byval arguments in VarArg calls. llvm-svn: 203794	2014-03-13 13:17:11 +00:00
Stepan Dyatkovskiy	47660351ae	First patch of patch series that improves MergeFunctions performance time from O(N*N) to O(N*log(N)). The idea is to introduce total ordering among functions set. That allows to build binary tree and perform function look-up procedure in O(log(N)) time. This patch description: Introduced total ordering among Type instances. Actually it is improvement for existing isEquivalentType. 0. Coerce pointer of 0 address space to integer. 1. If left and right types are equal (the same Type* value), return 0 (means equal). 2. If types are of different kind (different type IDs). Return result of type IDs comparison, treating them as numbers. 3. If types are vectors or integers, return result of its pointers comparison (casted to numbers). 4. Check whether type ID belongs to the next group: * Void * Float * Double * X86_FP80 * FP128 * PPC_FP128 * Label * Metadata If so, return 0. 5. If left and right are pointers, return result of address space comparison (numbers comparison). 6. If types are complex. Then both LEFT and RIGHT will be expanded and their element types will be checked with the same way. If we get Res != 0 on some stage, return it. Otherwise return 0. 7. For all other cases put llvm_unreachable. llvm-svn: 203788	2014-03-13 11:54:50 +00:00
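The ordering rules listed in the commit above can be illustrated with a small three-way comparator. This is only a sketch over a toy type model: `ToyType`, `Kind`, `cmpNumbers`, and `cmpToyType` are hypothetical names, not LLVM's API. It also simplifies two steps: integers are ordered by bit width rather than by pointer value, and the coercion of address-space-0 pointers to integers is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a type system; none of these names come from LLVM.
enum class Kind { Void, Float, Integer, Pointer, Struct };

struct ToyType {
  Kind K;
  unsigned Payload;                       // integer width or pointer address space
  std::vector<const ToyType *> Elements;  // member types when K == Kind::Struct
};

// Three-way comparison of two numbers: -1, 0, or 1.
int cmpNumbers(unsigned long long L, unsigned long long R) {
  return L < R ? -1 : (L > R ? 1 : 0);
}

// Total ordering over ToyType, following the shape of the listed steps.
int cmpToyType(const ToyType *L, const ToyType *R) {
  if (L == R)
    return 0;  // identical object: equal
  // Different kinds: order by kind ID treated as a number.
  if (int Res = cmpNumbers(static_cast<unsigned>(L->K),
                           static_cast<unsigned>(R->K)))
    return Res;
  switch (L->K) {
  case Kind::Void:
  case Kind::Float:
    return 0;  // kinds with no further distinguishing data
  case Kind::Integer:
  case Kind::Pointer:
    return cmpNumbers(L->Payload, R->Payload);
  case Kind::Struct:
    // Compound types: compare member counts, then members pairwise.
    if (int Res = cmpNumbers(L->Elements.size(), R->Elements.size()))
      return Res;
    for (std::size_t I = 0; I < L->Elements.size(); ++I)
      if (int Res = cmpToyType(L->Elements[I], R->Elements[I]))
        return Res;
    return 0;
  }
  return 0;  // not reached for valid Kind values
}
```

Because the comparator returns a consistent negative, zero, or positive result, it could back an ordered container, which is what makes O(log(N)) look-up possible.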
Mark Seaborn	91085966b7	Fix typo in comment: "inwoke" -> "invoke" llvm-svn: 203739	2014-03-13 00:04:17 +00:00
Raul E. Silvera	1c39640e2d	Resubmit "[SLPV] Recognize vectorizable intrinsics during SLP vectorization ..." This reverts commit 86cb795388643710dab34941ddcb5a9470ac39d8. The problems previously found have been resolved through other CLs. llvm-svn: 203707	2014-03-12 20:21:50 +00:00
Hans Wennborg	a2aafbb7b5	Allow switch-to-lookup table for tables with holes by adding bitmask check This allows us to generate table lookups for code such as: unsigned test(unsigned x) { switch (x) { case 100: return 0; case 101: return 1; case 103: return 2; case 105: return 3; case 107: return 4; case 109: return 5; case 110: return 6; default: return f(x); } } Since cases 102, 104, etc. are not constants, the lookup table has holes in those positions. We therefore guard the table lookup with a bitmask check. Patch by Jasper Neumann! llvm-svn: 203694	2014-03-12 18:35:40 +00:00
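What a bitmask-guarded lookup table might look like at the source level, for the `switch` quoted in the commit above, can be sketched by hand. This is not the code the transformation emits: `fallback` is a hypothetical stand-in for `f(x)`, and `testWithTable`, `Table`, and `ValidMask` are illustrative names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for f(x) in the commit message.
unsigned fallback(unsigned x) { return x + 1000; }

// Hand-written equivalent of the switch: cases 100..110 with holes.
unsigned testWithTable(unsigned x) {
  // Index = x - 100. Slots 2, 4, 6 and 8 (x = 102, 104, 106, 108) are
  // holes; their table contents are never read.
  static const unsigned Table[11] = {0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 6};
  // Bit i is set only if slot i corresponds to a real case label.
  const std::uint32_t ValidMask = (1u << 0) | (1u << 1) | (1u << 3) |
                                  (1u << 5) | (1u << 7) | (1u << 9) |
                                  (1u << 10);
  unsigned Idx = x - 100;  // wraps to a large value when x < 100
  if (Idx <= 10 && ((ValidMask >> Idx) & 1u))
    return Table[Idx];
  return fallback(x);  // default case, also taken for holes
}
```

The range check handles values outside 100..110, and the mask test sends the holes to the default path, so a single table load replaces the chain of comparisons.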
Evan Cheng	f2d3d2bf92	Revert r203488 and r203520. llvm-svn: 203687	2014-03-12 18:09:37 +00:00
Eli Bendersky	3af4500090	Revive SizeOptLevel-explaining comments that were dropped in r203669 llvm-svn: 203675	2014-03-12 16:44:17 +00:00
Eli Bendersky	fa2b4f20f2	Move duplicated code into a helper function (exposed through overload). There's a bit of duplicated "magic" code in opt.cpp and Clang's CodeGen that computes the inliner threshold from opt level and size opt level. This patch moves the code to a function that lives alongside the inliner itself, providing a convenient overload to the inliner creation. A separate patch can be committed to Clang to use this once it's committed to LLVM. Standalone tools that use the inlining pass can also avoid duplicating this code and fearing it will go out of sync. Note: this patch also restructures the conditional logic of the computation to be cleaner. llvm-svn: 203669	2014-03-12 16:12:36 +00:00
Alon Mishne	00d720ff32	Cloning a function now also clones its debug metadata if 'ModuleLevelChanges' is true. llvm-svn: 203662	2014-03-12 14:42:51 +00:00
Erik Verbruggen	11cc704d2c	Fix crash in PRE. After r203553 overflow intrinsics and their non-intrinsic (normal) instruction get hashed to the same value. This patch prevents PRE from moving an instruction into a predecessor block, and trying to add a phi node that gets two different types (the intrinsic result and the non-intrinsic result), resulting in a failing assert. llvm-svn: 203574	2014-03-11 15:07:32 +00:00
Tim Northover	68c567a38a	IR: add a second ordering operand to cmpxchg for failure The syntax for "cmpxchg" should now look something like: cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic where the second ordering argument gives the required semantics in the case that no exchange takes place. It should be no stronger than the first ordering constraint and cannot be either "release" or "acq_rel" (since no store will have taken place). rdar://problem/15996804 llvm-svn: 203559	2014-03-11 10:48:52 +00:00
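The separate success and failure orderings described in the commit above have a close analogue in C++11's `std::atomic`, where `compare_exchange_strong` accepts two memory orders. The sketch below mirrors the `acquire monotonic` example, on the assumption that `std::memory_order_relaxed` is the C++ counterpart of LLVM's `monotonic`; `tryReplace` is a hypothetical helper name.

```cpp
#include <atomic>
#include <cassert>

// Attempts to swap Expected for Desired. Uses acquire ordering when the
// exchange happens and relaxed ordering when it does not.
bool tryReplace(std::atomic<int> &Addr, int Expected, int Desired) {
  return Addr.compare_exchange_strong(Expected, Desired,
                                      std::memory_order_acquire,   // success
                                      std::memory_order_relaxed);  // failure
}
```

As with the IR rule, the C++ failure ordering may not be `release` or `acq_rel`, since no store takes place on the failure path.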
Erik Verbruggen	c2bf18261b	GVN: fix hashing of extractvalue. My last commit did not add the indexes to the hashed value for extractvalue. Adding that back in. llvm-svn: 203558	2014-03-11 10:21:30 +00:00
Erik Verbruggen	638ff95018	GVN: merge overflow intrinsics with non-overflow instructions. When an overflow intrinsic is followed by a non-overflow instruction, replace the latter with an extract. For example: %sadd = tail call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b) %sadd3 = add i32 %a, %b Here the add statement will be replaced by an extract. When an overflow intrinsic follows a non-overflow instruction, a clone of the intrinsic is inserted before the normal instruction, which makes it the same as the previous case. Subsequent runs of GVN can then clean up the duplicate instructions and insert the extract. This fixes PR8817. llvm-svn: 203553	2014-03-11 09:36:48 +00:00
Duncan P. N. Exon Smith	f9624311ce	Cleanup whitespace llvm-svn: 203529	2014-03-11 02:44:45 +00:00
Evan Cheng	9a155c5f78	Follow up to r203488. Code clean up to eliminate a lot of copy+paste. llvm-svn: 203520	2014-03-11 00:24:20 +00:00
Diego Novillo	dd37be24ca	Use discriminator information in sample profiles. Summary: When the sample profiles include discriminator information, use the discriminator values to distinguish instruction weights in different basic blocks. This modifies the BodySamples mapping to map <line, discriminator> pairs to weights. Instructions on the same line but different blocks, will use different discriminator values. This, in turn, means that the blocks may have different weights. Other changes in this patch: - Add tests for positive values of line offset, discriminator and samples. - Change data types from uint32_t to unsigned and int and do additional validation. Reviewers: chandlerc CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2857 llvm-svn: 203508	2014-03-10 22:41:28 +00:00
Benjamin Kramer	108d24886e	MemCpyOpt: When merging memsets also merge the trivial case of two memsets with the same destination. The testcase is from PR19092, but I think the bug described there is actually a clang issue. llvm-svn: 203489	2014-03-10 21:05:13 +00:00
Evan Cheng	b0fdca31bc	For functions with ARM target specific calling convention, when simplify-libcall optimize a call to a llvm intrinsic to something that involves a call to a C library call, make sure it sets the right calling convention on the call. e.g. extern double pow(double, double); double t(double x) { return pow(10, x); } Compiles to something like this for AAPCS-VFP: define arm_aapcs_vfpcc double @t(double %x) #0 { entry: %0 = call double @llvm.pow.f64(double 1.000000e+01, double %x) ret double %0 } declare double @llvm.pow.f64(double, double) #1 Simplify libcall (part of instcombine) will turn the above into: define arm_aapcs_vfpcc double @t(double %x) #0 { entry: %__exp10 = call double @__exp10(double %x) #1 ret double %__exp10 } declare double @__exp10(double) The pre-instcombine code works because calls to LLVM builtins are special. Instruction selection will choose the right calling convention for the call. However, the code after instcombine is wrong. The call to __exp10 will use the C calling convention. I can think of 3 options to fix this. 1. Make "C" calling convention just work since the target should know what CC is being used. This doesn't work because each function can use different CC with the "pcs" attribute. 2. Have Clang add the right CC keyword on the calls to LLVM builtin. This will work but it doesn't match the LLVM IR specification which states these are "Standard C Library Intrinsics". 3. Fix simplify libcall so the resulting calls to the C routines will have the proper CC keyword. e.g. %__exp10 = call arm_aapcs_vfpcc double @__exp10(double %x) #1 This works and is the solution I implemented here. Both solutions #2 and #3 would work. After carefully considering the pros and cons, I decided to implement #3 for the following reasons. 1. It doesn't change the "spec" of the intrinsics. 2. It's a self-contained fix. There are a couple of potential downsides. 1. There could be other places in the optimizer that are broken in the same way and not addressed by this. 2. There could be other calling conventions that need to be propagated by simplify-libcall that are not handled. But for now, this is the fix that I'm most comfortable with. llvm-svn: 203488	2014-03-10 20:49:45 +00:00
Benjamin Kramer	488ab03435	SimplifyCFG: Simplify the weight scaling algorithm. No change in functionality. llvm-svn: 203413	2014-03-09 14:42:55 +00:00
Ahmed Charles	e4b10534bd	Fix build break. llvm-svn: 203366	2014-03-09 03:50:36 +00:00
Chandler Carruth	fad39ebe19	[C++11] Add range based accessors for the Use-Def chain of a Value. This requires a number of steps. 1) Move value_use_iterator into the Value class as an implementation detail 2) Change it to actually be a Use iterator rather than a User iterator. 3) Add an adaptor which is a User iterator that always looks through the Use to the User. 4) Wrap these in Value::use_iterator and Value::user_iterator typedefs. 5) Add the range adaptors as Value::uses() and Value::users(). 6) Update all of the callers to correctly distinguish between whether they wanted a use_iterator (and to explicitly dig out the User when needed), or a user_iterator which makes the Use itself totally opaque. Because #6 requires churning essentially everything that walked the Use-Def chains, I went ahead and added all of the range adaptors and switched them to range-based loops where appropriate. Also because the renaming requires at least churning every line of code, it didn't make any sense to split these up into multiple commits -- all of which would touch all of the same lines of code. The result is still not quite optimal. The Value::use_iterator is a nice regular iterator, but Value::user_iterator is an iterator over Users rather than over the User objects themselves. As a consequence, it fits a bit awkwardly into the range-based world and it has the weird extra-dereferencing 'operator->' that so many of our iterators have. I think this could be fixed by providing something which transforms a range of T&s into a range of Ts, but that can be separated into another patch, and it isn't yet 100% clear whether this is the right move. However, this change gets us most of the benefit and cleans up a substantial amount of code around Use and User. =] llvm-svn: 203364	2014-03-09 03:16:01 +00:00
Benjamin Kramer	aaa10dc26a	[C++11] Revert uses of lambdas with array_pod_sort. Looks like GCC implements the lambda->function pointer conversion differently. llvm-svn: 203294	2014-03-07 21:52:38 +00:00
Benjamin Kramer	f042a6ba0a	[C++11] Convert sort predicates into lambdas. No functionality change. llvm-svn: 203288	2014-03-07 21:35:39 +00:00
Tim Northover	b74aa030d9	InstCombine: form shuffles from wider range of insert/extractelements Sequences of insertelement/extractelements are sometimes used to build vectors; this code tries to put them back together into shuffles, but could only produce completely uniform shuffle types (<N x T> from two <N x T> sources). This should allow shuffles with different numbers of elements on the input and output sides as well. llvm-svn: 203229	2014-03-07 10:24:44 +00:00
Ahmed Charles	52ce0c101e	Replace OwningPtr<T> with std::unique_ptr<T>. This compiles with no changes to clang/lld/lldb with MSVC and includes overloads to various functions which are used by those projects and llvm which have OwningPtr's as parameters. This should allow out of tree projects some time to move. There are also no changes to libs/Target, which should help out of tree targets have time to move, if necessary. llvm-svn: 203083	2014-03-06 05:51:42 +00:00
Chandler Carruth	a48d15a676	[Layering] Move InstVisitor.h into the IR library as it is pretty obviously coupled to the IR. llvm-svn: 203064	2014-03-06 03:23:41 +00:00
Chandler Carruth	0873afae39	[Layering] Move DebugInfo.h into the IR library where its implementation already lives. llvm-svn: 203046	2014-03-06 00:46:21 +00:00
Chandler Carruth	2b135c4e9f	[Layering] Move DIBuilder.h into the IR library where its implementation already lives. llvm-svn: 203038	2014-03-06 00:22:06 +00:00
Arnold Schwaighofer	adebac793b	LoopVectorizer: Preserve fast-math flags Fixes PR19045. llvm-svn: 203008	2014-03-05 21:10:47 +00:00
Chandler Carruth	797ae6fd0d	[Layering] Move DebugLoc.h into the IR library. The implementation already lived there and it is where it belongs -- this is the in-memory debug location representation. This is just cleanup -- Modules can actually cope with this, but that doesn't make it right. After chatting with folks that have out-of-tree stuff, going ahead and moving the rest of the headers seems preferable. llvm-svn: 202960	2014-03-05 10:30:38 +00:00
Chandler Carruth	0e2a8390e0	[C++11] Make this interface accept const Use pointers and use override to ensure we don't mess up any of the overrides. Necessary for cleaning up the Value use iterators and enabling range-based traversing of use lists. llvm-svn: 202958	2014-03-05 10:21:48 +00:00
Ahmed Charles	4a96a15754	[C++11] Replace OwningPtr::take() with OwningPtr::release(). llvm-svn: 202957	2014-03-05 10:19:29 +00:00
Craig Topper	a3683ec835	[C++11] Add 'override' keyword to virtual methods that override their base class. llvm-svn: 202953	2014-03-05 09:10:37 +00:00
Chandler Carruth	436597fe00	[Modules] Move the ConstantRange class into the IR library. This is a bit surprising, as the class is almost entirely abstracted away from any particular IR, however it encodes the comparison predicates which mutate ranges as ICmp predicate codes. This is reasonable as they're used for both instructions and constants. Thus, it belongs in the IR library with instructions and constants. llvm-svn: 202838	2014-03-04 12:24:34 +00:00
Chandler Carruth	4b66708834	[Modules] Move the PredIteratorCache into the IR library -- it is hardcoded to use IR BasicBlocks. llvm-svn: 202835	2014-03-04 12:09:19 +00:00
Chandler Carruth	248195469c	[Modules] Move the NoFolder into the IR library as it creates instructions. llvm-svn: 202834	2014-03-04 12:05:47 +00:00
Chandler Carruth	b4f244209e	[Modules] Move the TargetFolder into the Analysis library. Historically, this would have been required because of the use of DataLayout, but that has moved into the IR proper. It is still required because this folder uses the constant folding in the analysis library (which uses the datalayout) as the more aggressive basis of its folder. llvm-svn: 202832	2014-03-04 11:59:06 +00:00
Chandler Carruth	075812f27c	[Modules] Move CFG.h to the IR library as it defines graph traits over IR types. llvm-svn: 202827	2014-03-04 11:45:46 +00:00
Chandler Carruth	63713e9f95	[Modules] Move ValueMap to the IR library. While this class does not directly care about the Value class (it is templated so that the key can be any arbitrary Value subclass), it is in fact concretely tied to the Value class through the ValueHandle's CallbackVH interface which relies on the key type being some Value subclass to establish the value handle chain. Ironically, the unittest is already in the right library. llvm-svn: 202824	2014-03-04 11:26:31 +00:00
Chandler Carruth	649f6270aa	[Modules] Move ValueHandle into the IR library where Value itself lives. Move the test for this class into the IR unittests as well. This uncovers that ValueMap too is in the IR library. Ironically, the unittest for ValueMap is useless in the Support library (honestly, so was the ValueHandle test) and so it already lives in the IR unittests. Mmmm, tasty layering. llvm-svn: 202821	2014-03-04 11:17:44 +00:00
Chandler Carruth	d0657fe39f	[Modules] Move the LLVM IR pattern match header into the IR library, it obviously is coupled to the IR. llvm-svn: 202818	2014-03-04 11:08:18 +00:00
Chandler Carruth	cfb81122cc	[Modules] Move CallSite into the IR library where it belongs. It is abstracting between a CallInst and an InvokeInst, both of which are IR concepts. llvm-svn: 202816	2014-03-04 11:01:28 +00:00
Chandler Carruth	0bf5689f06	[Modules] Move GetElementPtrTypeIterator into the IR library. As its name might indicate, it is an iterator over the types in an instruction in the IR.... You see where this is going. Another step of modularizing the support library. llvm-svn: 202815	2014-03-04 10:40:04 +00:00
Chandler Carruth	d7b36fdea7	[Modules] Move InstIterator out of the Support library, where it had no business. This header includes Function and BasicBlock and directly uses the interfaces of both classes. It has to do with the IR, it even has that in the name. =] Put it in the library it belongs to. This is one step toward making LLVM's Support library survive a C++ modules bootstrap. llvm-svn: 202814	2014-03-04 10:30:26 +00:00
Chandler Carruth	cd48c56575	[cleanup] Re-sort all the includes with utils/sort_includes.py. llvm-svn: 202811	2014-03-04 10:07:28 +00:00
Diego Novillo	2ccb22f509	Pass to emit DWARF path discriminators. DWARF discriminators are used to distinguish multiple control flow paths on the same source location. When this happens, instructions across basic block boundaries will share the same debug location. This pass detects this situation and creates a new lexical scope to one of the two instructions. This lexical scope is a child scope of the original and contains a new discriminator value. This discriminator is then picked up from MCObjectStreamer::EmitDwarfLocDirective to be written on the object file. This fixes http://llvm.org/bugs/show_bug.cgi?id=18270. llvm-svn: 202752	2014-03-03 20:06:11 +00:00
Benjamin Kramer	6b03dd4034	[C++11] Use std::tie to simplify compare operators. No functionality change. llvm-svn: 202751	2014-03-03 19:58:30 +00:00
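The `std::tie` idiom named in the commit above can be shown on a small key type. This is a generic illustration rather than the code the commit changed; `LineKey` and its fields are hypothetical.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical two-field key, compared lexicographically.
struct LineKey {
  unsigned Line;
  unsigned Discriminator;
};

// std::tie builds tuples of references, so the tuple's own operator<
// and operator== perform the field-by-field comparison.
inline bool operator<(const LineKey &L, const LineKey &R) {
  return std::tie(L.Line, L.Discriminator) <
         std::tie(R.Line, R.Discriminator);
}

inline bool operator==(const LineKey &L, const LineKey &R) {
  return std::tie(L.Line, L.Discriminator) ==
         std::tie(R.Line, R.Discriminator);
}
```

Compared with a hand-written chain of `if (L.Line != R.Line) return L.Line < R.Line;` checks, the tuple form keeps the field order in one place and is harder to get subtly wrong.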


11638 Commits