llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00

Author	SHA1	Message	Date
Quentin Colombet	48ec8d3046	[InstCombiner] Expose opportunities to merge subtract and comparison. Several architectures use the same instruction to perform both a comparison and a subtract. The instruction selection framework does not allow to consider different basic blocks to expose such fusion opportunities. Therefore, these instructions are “merged” by CSE at MI IR level. To increase the likelihood of CSE to apply in such situation, we reorder the operands of the comparison, when they have the same complexity, so that they matches the order of the most frequent subtract. E.g., icmp A, B ... sub B, A <rdar://problem/14514580> llvm-svn: 190352	2013-09-09 20:56:48 +00:00
Matt Arsenault	4c2083b14a	Use type helper functions. llvm-svn: 190113	2013-09-06 00:37:24 +00:00
Matt Arsenault	f022cb3c1a	Consistently use dbgs() in debug printing llvm-svn: 190093	2013-09-05 19:48:28 +00:00
Tim Northover	baf9697e72	InstCombine: allow unmasked icmps to be combined with logical ops "(icmp op i8 A, B)" is equivalent to "(icmp op i8 (A & 0xff), B)" as a degenerate case. Allowing this as a "masked" comparison when analysing "(icmp) &/\| (icmp)" allows us to combine them in more cases. rdar://problem/7625728 llvm-svn: 189931	2013-09-04 11:57:17 +00:00
Tim Northover	9045305fce	InstCombine: look for masked compares with subset relation Even in cases which aren't universally optimisable like "(A & B) != 0 && (A & C) != 0", the masks can make one of the comparisons completely redundant. In this case, since we've gone to the effort of spotting masked comparisons we should combine them. rdar://problem/7625728 llvm-svn: 189930	2013-09-04 11:57:13 +00:00
Matt Arsenault	469d672381	Teach InstCombineLoadCast about address spaces. This is another one that doesn't matter much, but uses the right GEP index types in the first place. llvm-svn: 189854	2013-09-03 21:05:48 +00:00
Matt Arsenault	306cb38abf	Use type form of getIntPtrType in alloca visitor. This doesn't actually matter, since alloca is always 0 address space, but this is more consistent. llvm-svn: 189853	2013-09-03 21:05:15 +00:00
Benjamin Kramer	3a81691558	InstCombine: Check for zero shift amounts before subtracting one causing integer overflow. PR17026. Also avoid undefined shifts and shift amounts larger than 64 bits (those are always undef because we can't represent integer types that large). llvm-svn: 189672	2013-08-30 14:35:35 +00:00
Matt Arsenault	60f6ad8d53	Fix typo. llvm-svn: 189524	2013-08-28 22:17:26 +00:00
Matt Arsenault	95d00423a7	Teach InstCombine about address spaces llvm-svn: 188926	2013-08-21 19:53:10 +00:00
Jakub Staszak	b4a62fef02	Use pop_back_val() instead of both back() and pop_back(). llvm-svn: 188723	2013-08-19 22:47:55 +00:00
Matt Arsenault	77bbadbfcb	Teach InstCombine visitGetElementPtr about address spaces llvm-svn: 188721	2013-08-19 22:17:40 +00:00
Matt Arsenault	2c55fe773b	Cleanup visitGetElementPtr to make address space change easier llvm-svn: 188720	2013-08-19 22:17:34 +00:00
Matt Arsenault	14a3f7be8d	commonPointerCast cleanups to make address space change easier llvm-svn: 188719	2013-08-19 22:17:18 +00:00
Matt Arsenault	22098dc46b	Revert non-test parts of r188507 Re-add the inboundsless tests I didn't add originally llvm-svn: 188710	2013-08-19 21:40:31 +00:00
Jim Grosbach	933ecf8022	InstCombine: Use isAllOnesValue() instead of explicit -1. llvm-svn: 188563	2013-08-16 17:03:36 +00:00
Jim Grosbach	72387340f5	InstCombine: Simplify if(x!=0 && x!=-1). When both constants are positive or both constants are negative, InstCombine already simplifies comparisons like this, but when it's exactly zero and -1, the operand sorting ends up reversed and the pattern fails to match. Handle that special case. Follow up for rdar://14689217 llvm-svn: 188512	2013-08-16 00:15:20 +00:00
Matt Arsenault	9594ef019c	Don't do FoldCmpLoadFromIndexedGlobal for non inbounds GEPs This path wasn't tested before without a datalayout, so add some more tests and re-run with and without one. llvm-svn: 188507	2013-08-15 23:11:07 +00:00
Matt Arsenault	cb3b478d91	Fix always creating GEP with i32 indices Use the pointer size if datalayout is available. Use i64 if it's not, which is consistent with what other places do when the pointer size is unknown. The test doesn't really test this in a useful way since it will be transformed to that later anyway, but this now tests it for non-zero arrays and when datalayout isn't available. The cases in visitGetElementPtrInst should save an extra re-visit to the newly created GEP since it won't need to cleanup after itself. llvm-svn: 188339	2013-08-14 00:24:38 +00:00
Matt Arsenault	6d1daebfff	Use type helper functions instead of cast llvm-svn: 188338	2013-08-14 00:24:34 +00:00
Matt Arsenault	734103e561	Use array initializer, space around operator llvm-svn: 188337	2013-08-14 00:24:05 +00:00
Richard Sandiford	c01b885f5d	Fix big-endian handling of integer-to-vector bitcasts in InstCombine These functions used to assume that the lsb of an integer corresponds to vector element 0, whereas for big-endian it's the other way around: the msb is in the first element and the lsb is in the last element. Fixes MultiSource/Benchmarks/mediabench/gsm/toast for z. llvm-svn: 188155	2013-08-12 07:26:09 +00:00
Matt Arsenault	de2f38a2db	Fix missing -- C++ --s llvm-svn: 187758	2013-08-06 00:16:21 +00:00
Owen Anderson	43c2c4c5e4	Preserve fast-math flags when folding (fsub x, (fneg y)) to (fadd x, y). llvm-svn: 187462	2013-07-30 23:53:17 +00:00
Matt Arsenault	c825ddd4ca	Change behavior of calling bitcasted alias functions. It will now only convert the arguments / return value and call the underlying function if the types are able to be bitcasted. This avoids using fp<->int conversions that would occur before. llvm-svn: 187444	2013-07-30 20:45:05 +00:00
Owen Anderson	e7a1434407	Fix variable name. llvm-svn: 187253	2013-07-26 22:06:21 +00:00
Owen Anderson	841856eb0b	When InstCombine tries to fold away (fsub x, (fneg y)) into (fadd x, y), it is also worthwhile for it to look through FP extensions and truncations, whose application commutes with fneg. llvm-svn: 187249	2013-07-26 21:40:29 +00:00
Stephen Lin	c7791e3e9b	Correct case of m_UIToFp to m_UIToFP to match instruction name, add m_SIToFP for consistency. llvm-svn: 187225	2013-07-26 17:55:00 +00:00
Stephen Lin	5a583ae2f7	InstCombine: call FoldOpIntoSelect for all floating binops, not just fmul llvm-svn: 186759	2013-07-20 07:13:13 +00:00
Stephen Lin	a7fd9682cd	Restore r181216, which was partially reverted in r182499. llvm-svn: 186533	2013-07-17 20:06:03 +00:00
Craig Topper	4e9457fd7d	Use llvm::array_lengthof to replace sizeof(array)/sizeof(array[0]). llvm-svn: 186301	2013-07-15 04:27:47 +00:00
Craig Topper	58fa7a9b4a	Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size. llvm-svn: 186274	2013-07-14 04:42:23 +00:00
Nick Lewycky	1897fc3733	Add a microoptimization for urem. llvm-svn: 186235	2013-07-13 01:16:47 +00:00
Joey Gouly	04b7300444	Fix a crash in EvaluateInDifferentElementOrder where it would generate an undef vector of the wrong type. LGTM'd by Nick Lewycky on IRC. llvm-svn: 186224	2013-07-12 23:08:06 +00:00
Benjamin Kramer	2880b5c20b	Don't use a potentially expensive shift if all we want is one set bit. No functionality change. llvm-svn: 186095	2013-07-11 16:05:50 +00:00
David Majnemer	f8f57aad0a	InstCombine: Fix typo in comment for visitICmpInstWithInstAndIntCst llvm-svn: 185916	2013-07-09 09:24:35 +00:00
David Majnemer	3bb8099e6d	InstCombine: variations on 0xffffffff - x >= 4 The following transforms are valid if -C is a power of 2: (icmp ugt (xor X, C), ~C) -> (icmp ult X, C) (icmp ult (xor X, C), -C) -> (icmp uge X, C) These are nice, they get rid of the xor. llvm-svn: 185915	2013-07-09 09:20:58 +00:00
David Majnemer	ebf98e0163	InstCombine: X & -C != -C -> X <= u ~C Tests were added in r185910 somehow. llvm-svn: 185912	2013-07-09 08:09:32 +00:00
David Majnemer	90d0b32c9e	Commit r185909 was a misapplied patch, fix it llvm-svn: 185910	2013-07-09 07:58:32 +00:00
David Majnemer	969e1f9c9f	InstCombine: add more transforms C1-X <u C2 -> (X\|(C2-1)) == C1 C1-X >u C2 -> (X\|C2) == C1 X-C1 <u C2 -> (X & -C2) == C1 X-C1 >u C2 -> (X & ~C2) == C1 llvm-svn: 185909	2013-07-09 07:50:59 +00:00
David Majnemer	c84fdc2727	InstCombine: Fold X-C1 <u 2 -> (X & -2) == C1 Back in r179493 we determined that two transforms collided with each other. The fix back then was to reorder the transforms so that the preferred transform would give it a try and then we would try the secondary transform. However, it was noted that the best approach would canonicalize one transform into the other, removing the collision and allowing us to optimize IR given to us in that form. llvm-svn: 185808	2013-07-08 11:53:08 +00:00
David Majnemer	2c72ce123d	InstCombine: (icmp eq B, 0) \| (icmp ult A, B) -> (icmp ule A, B-1) This transform allows us to turn IR that looks like: %1 = icmp eq i64 %b, 0 %2 = icmp ult i64 %a, %b %3 = or i1 %1, %2 ret i1 %3 into: %0 = add i64 %b, -1 %1 = icmp uge i64 %0, %a ret i1 %1 which means we go from lowering: cmpq %rsi, %rdi setb %cl testq %rsi, %rsi sete %al orb %cl, %al ret to lowering: decq %rsi cmpq %rdi, %rsi setae %al ret llvm-svn: 185677	2013-07-05 00:31:17 +00:00
David Majnemer	bd4154c43f	InstCombine: Reimplementation of visitUDivOperand This transform was originally added in r185257 but later removed in r185415. The original transform would create instructions speculatively and then discard them if the speculation was proved incorrect. This has been replaced with a scheme that splits the transform into two parts: preflight and fold. While we preflight, we build up fold actions that inform the folding stage on how to act. llvm-svn: 185667	2013-07-04 21:17:49 +00:00
Craig Topper	783617eba7	Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. llvm-svn: 185606	2013-07-04 01:31:24 +00:00
Hal Finkel	142a9fc3fb	Revert r185257 (InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms) I'm reverting this commit because: 1. As discussed during review, it needs to be rewritten (to avoid creating and then deleting instructions). 2. This is causing optimizer crashes. Specifically, I'm seeing things like this: While deleting: i1 % Use still stuck around after Def is destroyed: <badref> = select i1 <badref>, i32 0, i32 1 opt: /src/llvm-trunk/lib/IR/Value.cpp:79: virtual llvm::Value::~Value(): Assertion `use_empty() && "Uses remain when a value is destroyed!"' failed. I'd guess that these will go away once we're no longer creating/deleting instructions here, but just in case, I'm adding a regression test. Because the code is bring rewritten, I've just XFAIL'd the original regression test. Original commit message: InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms Real world code sometimes has the denominator of a 'udiv' be a 'select'. LLVM can handle such cases but only when the 'select' operands are symmetric in structure (both select operands are a constant power of two or a left shift, etc.). This falls apart if we are dealt a 'udiv' where the code is not symetric or if the select operands lead us to more select instructions. Instead, we should treat the LHS and each select operand as a distinct divide operation and try to optimize them independently. If we can to simplify each operation, then we can replace the 'udiv' with, say, a 'lshr' that has a new select with a bunch of new operands for the select. llvm-svn: 185415	2013-07-02 05:21:11 +00:00
Benjamin Kramer	57adb0eb90	InstCombine: Also turn selects fed by an and into arithmetic when the types don't match. Inserting a zext or trunc is sufficient. This pattern is somewhat common in LLVM's pointer mangling code. llvm-svn: 185270	2013-06-29 21:17:04 +00:00
David Majnemer	10db94fa84	InstCombine: FoldGEPICmp shouldn't change sign of base pointer comparison Changing the sign when comparing the base pointer would introduce all sorts of unexpected things like: %gep.i = getelementptr inbounds [1 x i8]* %a, i32 0, i32 0 %gep2.i = getelementptr inbounds [1 x i8]* %b, i32 0, i32 0 %cmp.i = icmp ult i8* %gep.i, %gep2.i %cmp.i1 = icmp ult [1 x i8]* %a, %b %cmp = icmp ne i1 %cmp.i, %cmp.i1 ret i1 %cmp into: %cmp.i = icmp slt [1 x i8]* %a, %b %cmp.i1 = icmp ult [1 x i8]* %a, %b %cmp = xor i1 %cmp.i, %cmp.i1 ret i1 %cmp By preserving the original sign, we now get: ret i1 false This fixes PR16483. llvm-svn: 185259	2013-06-29 10:28:04 +00:00
David Majnemer	bfa8c5b795	InstCombine: Small whitespace cleanup in FoldGEPICmp llvm-svn: 185258	2013-06-29 09:45:35 +00:00
David Majnemer	5a0bcf7c2c	InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms Real world code sometimes has the denominator of a 'udiv' be a 'select'. LLVM can handle such cases but only when the 'select' operands are symmetric in structure (both select operands are a constant power of two or a left shift, etc.). This falls apart if we are dealt a 'udiv' where the code is not symetric or if the select operands lead us to more select instructions. Instead, we should treat the LHS and each select operand as a distinct divide operation and try to optimize them independently. If we can to simplify each operation, then we can replace the 'udiv' with, say, a 'lshr' that has a new select with a bunch of new operands for the select. llvm-svn: 185257	2013-06-29 08:40:07 +00:00
David Majnemer	da449ec54b	InstCombine: Optimize (1 << X) Pred CstP2 to X Pred Log2(CstP2) We may, after other optimizations, find ourselves with IR that looks like: %shl = shl i32 1, %y %cmp = icmp ult i32 %shl, 32 Instead, we should just compare the shift count: %cmp = icmp ult i32 %y, 5 llvm-svn: 185242	2013-06-28 23:42:03 +00:00
Matt Arsenault	64654e8350	Fix using arg_end() - arg_begin() instead of arg_size() llvm-svn: 185121	2013-06-28 00:25:40 +00:00
Michael Gottesman	cbe62d543c	Revert "Revert "[APFloat] Removed APFloat constructor which initialized to either zero/NaN but allowed you to arbitrarily set the category of the float."" This reverts commit r185099. Looks like both the ppc-64 and mips bots are still failing after I reverted this change. Since: 1. The mips bot always performs a clean build, 2. The ppc64-bot failed again after a clean build (I asked the ppc-64 maintainers to clean the bot which they did... Thanks Will!), I think it is safe to assume that this change was not the cause of the failures that said builders were seeing. Thus I am recomitting. llvm-svn: 185111	2013-06-27 21:58:19 +00:00
Michael Gottesman	f4d4b7d828	Revert "[APFloat] Removed APFloat constructor which initialized to either zero/NaN but allowed you to arbitrarily set the category of the float." This reverts commit r185095. This is causing a FileCheck failure on the 3dnow intrinsics on at least the mips/ppc bots but not on the x86 bots. Reverting while I figure out what is going on. llvm-svn: 185099	2013-06-27 20:40:11 +00:00
Michael Gottesman	1b9f5c3f5a	[APFloat] Removed APFloat constructor which initialized to either zero/NaN but allowed you to arbitrarily set the category of the float. The category which an APFloat belongs to should be dependent on the actual value that the APFloat has, not be arbitrarily passed in by the user. This will prevent inconsistency bugs where the category and the actual value in APFloat differ. I also fixed up all of the references to this constructor (which were only in LLVM). llvm-svn: 185095	2013-06-27 19:50:52 +00:00
Michael Gottesman	98d0fadcd5	In InstCombine{AddSub,MulDivRem} convert APFloat.isFiniteNonZero() && !APFloat.isDenormal => APFloat.isNormal. llvm-svn: 185037	2013-06-26 23:17:31 +00:00
Michael Gottesman	649738ce71	[APFloat] Converted all references to APFloat::isNormal => APFloat::isFiniteNonZero. Turns out all the references were in llvm and not in clang. llvm-svn: 184356	2013-06-19 21:23:18 +00:00
Jakub Staszak	0c829ecc2b	Simplify code. No functionality change. llvm-svn: 183461	2013-06-06 23:34:59 +00:00
Jakub Staszak	51a9b956bf	Re-apply "Use IRBuilder instead of ConstantInt methods." with the fixed issues. llvm-svn: 183439	2013-06-06 20:18:46 +00:00
Rafael Espindola	e247267603	Revert "Use IRBuilder instead of ConstantInt methods. It simplifies code a little bit." This reverts commit 183328. It caused pr16244 and broke the bots. llvm-svn: 183422	2013-06-06 17:03:05 +00:00
Jakub Staszak	ae7795836e	Remove unneeded cast<>. llvm-svn: 183363	2013-06-06 00:49:57 +00:00
Jakub Staszak	f1847072a4	Use IRBuilder instead of ConstantInt methods. llvm-svn: 183360	2013-06-06 00:37:23 +00:00
Jakub Staszak	a8f4094724	Use IRBuilder instead of ConstantInt methods. It simplifies code a little bit. llvm-svn: 183328	2013-06-05 18:27:02 +00:00
Nick Lewycky	fd5f62b9db	Delete dead safety check. llvm-svn: 183167	2013-06-03 23:15:20 +00:00
Nick Lewycky	8e323b56b7	When determining the new index for an insertelement, we may not assume that an index greater than the size of the vector is invalid. The shuffle may be shrinking the size of the vector. Fixes a crash! Also drop the maximum recursion depth of the safety check for this optimization to five. llvm-svn: 183080	2013-06-01 20:51:31 +00:00
Rafael Espindola	093854c154	Simplify multiplications by vectors whose elements are powers of 2. Patch by Andrea Di Biagio. llvm-svn: 183005	2013-05-31 14:27:15 +00:00
Nick Lewycky	5d48f77ca0	Reapply with r182909 with a fix to the calculation of the new indices for insertelement instructions. llvm-svn: 182976	2013-05-31 00:59:42 +00:00
Evgeniy Stepanov	49b05aa055	Revert r182909. PR/16177 llvm-svn: 182919	2013-05-30 09:40:17 +00:00
Nick Lewycky	f69757c153	Swizzle vector inputs if it helps us eliminate shuffles. llvm-svn: 182909	2013-05-30 04:33:38 +00:00
Michael J. Spencer	c195b8a813	Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros. llvm-svn: 182680	2013-05-24 22:23:49 +00:00
Joey Gouly	fd81e4e361	Run clang-format over the scalarizePHI function. llvm-svn: 182640	2013-05-24 12:33:28 +00:00
Joey Gouly	a87f26a872	scalarizePHI needs to insert the next ExtractElement in the same block as the BinaryOperator, not in the block where the IRBuilder is currently inserting into. Fixes a bug where scalarizePHI would create instructions that would not dominate all uses. llvm-svn: 182639	2013-05-24 12:29:54 +00:00
Jean-Luc Duprat	eb45c18a9c	This is an update to a previous commit (r181216). The earlier change list introduced the following inst combines: B * (uitofp i1 C) —> select C, B, 0 A * (1 - uitofp i1 C) —> select C, 0, A select C, 0, B + select C, A, 0 —> select C, A, B Together these 3 changes would simplify : A * (1 - uitofp i1 C) + B * uitofp i1 C down to : select C, B, A In practice we found that the first two substitutions can have a negative effect on performance, because they reduce opportunities to use FMA contractions; between the two options FMAs are often the better choice. This change list amends the previous one to enable just these inst combines: select C, B, 0 + select C, 0, A —> select C, B, A A * (1 - uitofp i1 C) + B * uitofp i1 C —> select C, B, A llvm-svn: 182499	2013-05-22 18:29:31 +00:00
Matt Arsenault	7a573d2632	Add missing -- C++ -- to headers llvm-svn: 182164	2013-05-17 21:43:39 +00:00
Sylvestre Ledru	8b2c792cff	Fix two typo llvm-svn: 181848	2013-05-14 23:36:24 +00:00
David Majnemer	f01cfb5523	InstCombine: Flip the order of two urem transforms There are two transforms in visitUrem that conflict with each other. ) One, if a divisor is a power of two, subtracts one from the divisor and turns it into a bitwise-and. ) The other unwraps both operands if they are surrounded by zext instructions. Flipping the order allows the subtraction to go beneath the sign extension. llvm-svn: 181668	2013-05-12 00:07:05 +00:00
David Majnemer	6f959e00ef	InstCombine: Turn urem to bitwise-and more often Use isKnownToBeAPowerOfTwo in visitUrem so that we may more aggressively fold away urem instructions. llvm-svn: 181661	2013-05-11 09:01:28 +00:00
Benjamin Kramer	c8a8544b79	InstCombine: Don't claim to be able to evaluate any shl in a zexted type. The shift amount may be larger than the type leading to undefined behavior. Limit the transform to constant shift amounts. While there update the bits to clear in the result which may enable additional optimizations. PR15959. llvm-svn: 181604	2013-05-10 16:26:37 +00:00
Benjamin Kramer	bb162fb77a	InstCombine: Verify the type before transforming uitofp into select. PR15952. llvm-svn: 181586	2013-05-10 09:16:52 +00:00
Benjamin Kramer	54122b8028	InstCombine: Don't just copy known bits from the first operand of an srem. That's obviously wrong. Conservatively restrict it to the sign bit, which matches the original intention of this analysis. Fixes PR15940. llvm-svn: 181518	2013-05-09 16:32:32 +00:00
David Majnemer	68574fa9e6	InstCombine: (X ^ signbit) + C -> X + (signbit ^ C) llvm-svn: 181249	2013-05-06 21:21:31 +00:00
Jean-Luc Duprat	5607a72e21	Provide InstCombines for the following 3 cases: A * (1 - (uitofp i1 C)) -> select C, 0, A B * (uitofp i1 C) -> select C, B, 0 select C, 0, A + select C, B, 0 -> select C, B, A These come up in code that has been hand-optimized from a select to a linear blend, on platforms where that may have mattered. We want to undo such changes with the following transform: A(1 - uitofp i1 C) + B(uitofp i1 C) -> select C, A, B llvm-svn: 181216	2013-05-06 16:55:50 +00:00
Nadav Rotem	8564ccca8b	Revert r164763 because it introduces new shuffles. Thanks Nick Lewycky for pointing this out. llvm-svn: 181177	2013-05-06 02:39:09 +00:00
Dmitri Gribenko	82c92dc3dd	Add ArrayRef constructor from None, and do the cleanups that this constructor enables Patch by Robert Wilhelm. llvm-svn: 181138	2013-05-05 00:40:33 +00:00
Nick Lewycky	d0db52e3ac	Tabs to spaces. No functionality change. llvm-svn: 181082	2013-05-04 01:08:15 +00:00
Filip Pizlo	dd62846c56	This patch breaks up Wrap.h so that it does not have to include all of the things, and renames it to CBindingWrapping.h. I also moved CBindingWrapping.h into Support/. This new file just contains the macros for defining different wrap/unwrap methods. The calls to those macros, as well as any custom wrap/unwrap definitions (like for array of Values for example), are put into corresponding C++ headers. Doing this required some #include surgery, since some .cpp files relied on the fact that including Wrap.h implicitly caused the inclusion of a bunch of other things. This also now means that the C++ headers will include their corresponding C API headers; for example Value.h must include llvm-c/Core.h. I think this is harmless, since the C API headers contain just external function declarations and some C types, so I don't believe there should be any nasty dependency issues here. llvm-svn: 180881	2013-05-01 20:59:00 +00:00
Jim Grosbach	5002c3f17d	Revert "InstCombine: Fold more shuffles of shuffles." This reverts commit r180802 There's ongoing discussion about whether this is the right place to make this transformation. Reverting for now while we figure it out. llvm-svn: 180834	2013-05-01 00:25:27 +00:00
Jim Grosbach	940f9dc094	InstCombine: Fold more shuffles of shuffles. Always fold a shuffle-of-shuffle into a single shuffle when there's only one input vector in the first place. Continue to be more conservative when there's multiple inputs. rdar://13402653 PR15866 llvm-svn: 180802	2013-04-30 20:43:52 +00:00
David Majnemer	5f05aaa765	Fix a bug in foldSelectICmpAndOr. Differences in bitwidth between X and Y could exist even if C1 and C2 have the same Log2 representation. llvm-svn: 180779	2013-04-30 10:36:33 +00:00
David Majnemer	4b346f9a6e	Fix "Combine bit test + conditional or into simple math" This fixes the optimization introduced in r179748 and reverted in r179750. While the optimization was sound, it did not properly respect differences in bit-width. llvm-svn: 180777	2013-04-30 08:57:58 +00:00
Eric Christopher	beec5d09da	Move C++ code out of the C headers and into either C++ headers or the C++ files themselves. This enables people to use just a C compiler to interoperate with LLVM. llvm-svn: 180063	2013-04-22 22:47:22 +00:00
Anat Shemer	0d94c56dac	Changed back (relative to commit 179786) the operations executed when extract(cast) is transformed to cast(extract). It uses the Builder class as before. In addition the result node is added to the Worklist, so all the previous extract users will become the new scalar cast users. llvm-svn: 180045	2013-04-22 20:51:10 +00:00
Jakub Staszak	068c94ea78	Keep coding stanard. Don't use "else if" after "return". llvm-svn: 179826	2013-04-19 01:18:04 +00:00
Anat Shemer	ca5036302e	In the function InstCombiner::visitExtractElementInst() removed the limitation that extract is promoted over a cast only if the cast has only one use. llvm-svn: 179786	2013-04-18 19:56:44 +00:00
Anat Shemer	2d789b4b53	Added a function scalarizePHI() that sclarizes a vector phi instruction if it has only 2 uses: one to promote the vector phi in a loop and the other use is an extract operation of one element at a constant location. llvm-svn: 179783	2013-04-18 19:35:39 +00:00
David Majnemer	72034bc02f	Revert "Combine bit test + conditional or into simple math" It is causing stage2 builds to fail, let's get them running again. llvm-svn: 179750	2013-04-18 08:42:33 +00:00
David Majnemer	7dd2b94d65	Combine bit test + conditional or into simple math Simplify: (select (icmp eq (and X, C1), 0), Y, (or Y, C2)) Into: (or (shl (and X, C1), C3), y) Where: C3 = Log(C2) - Log(C1) If: C1 and C2 are both powers of two llvm-svn: 179748	2013-04-18 07:30:07 +00:00
David Majnemer	1dc3d3f7a0	Reorders two transforms that collide with each other One performs: (X == 13 \| X == 14) -> X-13 <u 2 The other: (A == C1 \|\| A == C2) -> (A & ~(C1 ^ C2)) == C1 The problem is that there are certain values of C1 and C2 that trigger both transforms but the first one blocks out the second, this generates suboptimal code. Reordering the transforms should be better in every case and allows us to do interesting stuff like turn: %shr = lshr i32 %X, 4 %and = and i32 %shr, 15 %add = add i32 %and, -14 %tobool = icmp ne i32 %add, 0 into: %and = and i32 %X, 240 %tobool = icmp ne i32 %and, 224 llvm-svn: 179493	2013-04-14 21:15:43 +00:00
Benjamin Kramer	9f30ffb4a7	InstCombine: Check the operand types before merging fcmp ord & fcmp ord. Fixes PR15737. llvm-svn: 179417	2013-04-12 21:56:23 +00:00
David Majnemer	eec2fe2c55	Simplify (A & ~B) in icmp if A is a power of 2 The transform will execute like so: (A & ~B) == 0 --> (A & B) != 0 (A & ~B) != 0 --> (A & B) == 0 llvm-svn: 179386	2013-04-12 17:25:07 +00:00
David Majnemer	82ec1d080e	Optimize icmp involving addition better Allows LLVM to optimize sequences like the following: %add = add nsw i32 %x, 1 %cmp = icmp sgt i32 %add, %y into: %cmp = icmp sge i32 %x, %y as well as: %add1 = add nsw i32 %x, 20 %add2 = add nsw i32 %y, 57 %cmp = icmp sge i32 %add1, %add2 into: %add = add nsw i32 %y, 37 %cmp = icmp sle i32 %cmp, %x llvm-svn: 179316	2013-04-11 20:05:46 +00:00
Benjamin Kramer	3b38288ea2	Fix for wrong instcombine on vector insert/extract When trying to collapse sequences of insertelement/extractelement instructions into single shuffle instructions, there is one specific case where the Instruction Combiner wrongly updates the resulting Mask of shuffle indexes. The problem is in function CollectShuffleElments. If we have a sequence of insert/extract element instructions like the one below: %tmp1 = extractelement <4 x float> %LHS, i32 0 %tmp2 = insertelement <4 x float> %RHS, float %tmp1, i32 1 %tmp3 = extractelement <4 x float> %RHS, i32 2 %tmp4 = insertelement <4 x float> %tmp2, float %tmp3, i32 3 Where: . %RHS will have a mask of [4,5,6,7] . %LHS will have a mask of [0,1,2,3] The Mask of shuffle indexes is wrongly computed to [4,1,6,7] instead of [4,0,6,7]. When analyzing %tmp2 in order to compute the Mask for the resulting shuffle instruction, the algorithm forgets to update the mask index at position 1 with the index associated to the element extracted from %LHS by instruction %tmp1. Patch by Andrea DiBiagio! llvm-svn: 179291	2013-04-11 15:10:09 +00:00
Jim Grosbach	e7766f7108	Tidy up a bit. No functional change. llvm-svn: 178915	2013-04-05 21:20:12 +00:00
Akira Hatanaka	724132bda3	Check if Type is a vector before calling function Type::getVectorNumElements. llvm-svn: 178208	2013-03-28 01:28:02 +00:00
Ulrich Weigand	ae3d13ab3f	Make InstCombineCasts.cpp:OptimizeIntToFloatBitCast endian safe. The OptimizeIntToFloatBitCast converts shift-truncate sequences into extractelement operations. The computation of the element index to be used in the resulting operation is currently only correct for little-endian targets. This commit fixes the element index computation to be correct for big-endian targets as well. If the target byte order is unknown, the optimization cannot be performed at all. llvm-svn: 178031	2013-03-26 15:36:14 +00:00
Shuxin Yang	9f502ba0a0	Fix a bug in fast-math fadd/fsub simplification. The problem is that the code mistakenly took for granted that following constructor is able to create an APFloat from a SIGNED integer: APFloat::APFloat(const fltSemantics &ourSemantics, integerPart value) rdar://13486998 llvm-svn: 177906	2013-03-25 20:43:41 +00:00
Arnaud A. de Grandmaison	019bd576ab	Address issues found by Duncan during post-commit review of r177856. llvm-svn: 177863	2013-03-25 11:47:38 +00:00
Arnaud A. de Grandmaison	1fdfeaba38	InstCombine: simplify comparisons to zero of (shl %x, Cst) or (mul %x, Cst) This simplification happens at 2 places : - using the nsw attribute when the shl / mul is used by a sign test - when the shl / mul is compared for (in)equality to zero llvm-svn: 177856	2013-03-25 09:48:49 +00:00
Arnaud A. de Grandmaison	7a4226244b	InstCombine: Improve the result bitvect type when folding (cmp pred (load (gep GV, i)) C) to a bit test. The original code used i32, and i64 if legal. This introduced unneeded casts when they aren't legal, or when the index variable i has another type. In order of preference: try to use i's type; use the smallest fitting legal type (using an added DataLayout method); default to i32. A testcase checks that this works when the index gep operand is i16. Patch by : Ahmed Bougacha <ahmed.bougacha@gmail.com> Reviewed by : Duncan llvm-svn: 177712	2013-03-22 08:25:01 +00:00
Shuxin Yang	55038cc0b2	Perform factorization as a last resort of unsafe fadd/fsub simplification. Rules include: 1)1 xy +/- xz => x*(y +/- z) (the order of operands dosen't matter) 2) y/x +/- z/x => (y +/- z)/x The transformation is disabled if the new add/sub expr "y +/- z" is a denormal/naz/inifinity. rdar://12911472 llvm-svn: 177088	2013-03-14 18:08:26 +00:00
Arnaud A. de Grandmaison	0810447275	Fix a performance regression when combining to smaller types in icmp (shl %v, C1), C2 : Only combine when the shl is only used by the icmp llvm-svn: 176950	2013-03-13 14:40:37 +00:00
Jakub Staszak	cf6be75b52	Simplify code. No functionality change. llvm-svn: 176765	2013-03-09 11:18:59 +00:00
Jim Grosbach	8dd9a160c8	InstCombine: Don't shrink allocas when combining with a bitcast. When considering folding a bitcast of an alloca into the alloca itself, make sure we don't shrink the amount of memory being allocated, or things rapidly go sideways. rdar://13324424 llvm-svn: 176547	2013-03-06 05:44:53 +00:00
Quentin Colombet	5b2d4c19c9	Fix a bug in instcombine for fmul in fast math mode. The instcombine recognized pattern looks like: a = b * c d = a +/- Cst or a = b * c d = Cst +/- a When creating the new operands for fadd or fsub instruction following the related fmul, the first operand was created with the second original operand (M0 was created with C1) and the second with the first (M1 with Opnd0). The fix consists in creating the new operands with the appropriate original operand, i.e., M0 with Opnd0 and M1 with C1. llvm-svn: 176300	2013-02-28 21:12:40 +00:00
Bill Wendling	cac4751d0b	The transform is: (or (bool?A:B),(bool?C:D)) --> (bool?(or A,C):(or B,D)) By the time the OR is visited, both the SELECTs have been visited and not optimized and the OR itself hasn't been transformed so we do this transform in the hopes that the new ORs will be optimized. The transform is explicitly disabled for vector-selects until "codegen matures to handle them better". Patch by Muhammad Tauqir! llvm-svn: 175380	2013-02-16 23:41:36 +00:00
Arnaud A. de Grandmaison	e63e196b5d	Fix refactoring mistake in "Teach InstCombine to work with smaller legal types..." llvm-svn: 175273	2013-02-15 15:18:17 +00:00
Arnaud A. de Grandmaison	6b2f7add7e	Teach InstCombine to work with smaller legal types in icmp (shl %v, C1), C2 It enables to work with a smaller constant, which is target friendly for those which can compare to immediates. It also avoids inserting a shift in favor of a trunc, which can be free on some targets. This used to work until LLVM-3.1, but regressed with the 3.2 release. llvm-svn: 175270	2013-02-15 14:35:47 +00:00
Arnaud A. de Grandmaison	2c730cd330	Fix comment visitSExt is an adapted copy of the related visitZExt method, so adapt the comment accordingly. llvm-svn: 175019	2013-02-13 00:19:19 +00:00
Michael Ilseman	61bd4eabc9	Optimization: bitcast (<1 x ...> insertelement ..., X, ...) to ... ==> bitcast X to ... llvm-svn: 174905	2013-02-11 21:41:44 +00:00
Andrew Trick	b8764716fa	Revert "Have InstCombine call SipmlifyCall when handling calls. Test case included." This reverts commit 3854a5d90fee52af1065edbed34521fff6cdc18d. This causes a clang unit test to hang: vtable-available-externally.cpp. llvm-svn: 174692	2013-02-08 01:55:39 +00:00
Michael Ilseman	63dd0ecb1e	Have InstCombine call SipmlifyCall when handling calls. Test case included. llvm-svn: 174675	2013-02-07 23:01:35 +00:00
Michael Ilseman	27c81fc400	Preserve fast-math flags after reassociation and commutation. Update test cases llvm-svn: 174571	2013-02-07 01:40:15 +00:00
Benjamin Kramer	7d9ae0ca14	InstCombine: Fix and simplify the inttoptr side too. llvm-svn: 174438	2013-02-05 20:22:40 +00:00
Benjamin Kramer	e615fc610e	InstCombine: Harden code to work with vectors of pointers and simplify it a bit. Found by running instcombine on a fabricated test case for the constant folder. llvm-svn: 174430	2013-02-05 19:21:56 +00:00
Nadav Rotem	696003f5f2	Revert r174152. The shift amount may overflow and in that case this transformation is illegal. llvm-svn: 174156	2013-02-01 07:59:33 +00:00
Nadav Rotem	8e40171f27	Optimize shift lefts of a constant by a value plus constant into a single shift. llvm-svn: 174152	2013-02-01 06:45:40 +00:00
Bill Wendling	c926e2a691	Convert typeIncompatible to return an AttributeSet. There are still places which treat the Attribute object as a collection of attributes. I'm systematically removing them. llvm-svn: 173990	2013-01-30 23:07:40 +00:00
Nadav Rotem	7b1c05a9b7	InstCombine: canonicalize sext-and --> select sext-not-and --> select. Patch by Muhammad Tauqir Ahmad. llvm-svn: 173901	2013-01-30 06:35:22 +00:00
Bill Wendling	e901c4cf0a	Use the AttributeSet instead of AttributeWithIndex. In the future, AttributeWithIndex won't be used anymore. Besides, it exposes the internals of the AttributeSet to outside users, which isn't goodness. llvm-svn: 173602	2013-01-27 02:08:22 +00:00
Bill Wendling	47efd8b988	Remove some introspection functions. The 'getSlot' function and its ilk allow introspection into the AttributeSet class. However, that class should be opaque. Allow access through accessor methods instead. llvm-svn: 173522	2013-01-25 23:09:36 +00:00
Bill Wendling	c5e0fb9b77	Use the new 'getSlotIndex' method to retrieve the attribute's slot index. llvm-svn: 173499	2013-01-25 21:46:52 +00:00
Craig Topper	f358937aff	Remove trailing whitespace. llvm-svn: 173322	2013-01-24 05:22:40 +00:00
Benjamin Kramer	90ac6c1a88	Revert "InstCombine: Clean up weird code that talks about a modulus that's long gone." This causes crashes during the build of compiler-rt during selfhost. Add a testcase for coverage. llvm-svn: 173279	2013-01-23 17:52:29 +00:00
Benjamin Kramer	cb0126b7dc	InstCombine: Clean up weird code that talks about a modulus that's long gone. This does the right thing unless the multiplication overflows, but the old code didn't handle that case either. llvm-svn: 173276	2013-01-23 17:16:22 +00:00
Bill Wendling	c31f99d129	Remove the last of uses that use the Attribute object as a collection of attributes. Collections of attributes are handled via the AttributeSet class now. This finally frees us up to make significant changes to how attributes are structured. llvm-svn: 173228	2013-01-23 06:14:59 +00:00
Bill Wendling	f8c0d4f9cd	Have AttributeSet::getRetAttributes() return an AttributeSet instead of Attribute. This further restricts the use of the Attribute class to the Attribute family of classes. llvm-svn: 173098	2013-01-21 22:44:49 +00:00
Bill Wendling	991cef6573	Make AttributeSet::getFnAttributes() return an AttributeSet instead of an Attribute. This is more code to isolate the use of the Attribute class to that of just holding one attribute instead of a collection of attributes. llvm-svn: 173094	2013-01-21 21:57:28 +00:00
Paul Redmond	3d611dcda9	Transform (sub 0, (zext bool to A)) to (sext bool to A) and (sub 0, (sext bool to A)) to (zext bool to A). Patch by Muhammad Ahmad Reviewed by Duncan Sands llvm-svn: 173093	2013-01-21 21:57:20 +00:00
Bill Wendling	7777bbbbf3	Use AttributeSet accessor methods instead of Attribute accessor methods. Further encapsulation of the Attribute object. Don't allow direct access to the Attribute object as an aggregate. llvm-svn: 172853	2013-01-18 21:53:16 +00:00
Bill Wendling	b5ddc9a5f8	Push some more methods down to hide the use of the Attribute class. Because the Attribute class is going to stop representing a collection of attributes, limit the use of it as an aggregate in favor of using AttributeSet. This replaces some of the uses for querying the function attributes. llvm-svn: 172844	2013-01-18 21:11:39 +00:00
Craig Topper	215a93298b	Check for less than 0 in shuffle mask instead of -1. It's more consistent with other code related to shuffles and easier to implement in compiled code. llvm-svn: 172788	2013-01-18 05:30:07 +00:00
Craig Topper	bf40e524fd	Remove trailing whitespace. Remove new lines between closing brace and 'else' llvm-svn: 172784	2013-01-18 05:09:16 +00:00
Nadav Rotem	24c96fa80c	Teach InstCombine to optimize extract of a value from a vector add operation with a constant zero. llvm-svn: 172576	2013-01-15 23:43:14 +00:00
Shuxin Yang	36c3cb1b3b	1. Hoist minus sign as high as possible in an attempt to reveal some optimization opportunities (in the enclosing supper-expressions). rule 1. (-0.0 - X ) * Y => -0.0 - (X * Y) if expression "-0.0 - X" has only one reference. rule 2. (0.0 - X ) * Y => -0.0 - (X * Y) if expression "0.0 - X" has only one reference, and the instruction is marked "noSignedZero". 2. Eliminate negation (The compiler was already able to handle these opt if the 0.0s are replaced with -0.0.) rule 3: (0.0 - X) * (0.0 - Y) => X * Y rule 4: (0.0 - X) * C => X * -C if the expr is flagged "noSignedZero". 3. Rule 5: (XY) X => (XX) Y if X!=Y and the expression is flagged with "UnsafeAlgebra". The purpose of this transformation is two-fold: a) to form a power expression (of X). b) potentially shorten the critical path: After transformation, the latency of the instruction Y is amortized by the expression of XX, and therefore Y is in a "less critical" position compared to what it was before the transformation. 4. Remove the InstCombine code about simplifiying "X select". The reasons are following: a) The "select" is somewhat architecture-dependent, therefore the higher level optimizers are not able to precisely predict if the simplification really yields any performance improvement or not. b) The "select" operator is bit complicate, and tends to obscure optimization opportunities. It is btter to keep it as low as possible in expr tree, and let CodeGen to tackle the optimization. llvm-svn: 172551	2013-01-15 21:09:32 +00:00
Jakub Staszak	24a461467e	Remove trailing spaces. llvm-svn: 172489	2013-01-14 23:16:36 +00:00
Shuxin Yang	974390c4aa	This change is to implement following rules under the condition C_A and/or C_R --------------------------------------------------------------------------- C_A: reassociation is allowed C_R: reciprocal of a constant C is appropriate, which means - 1/C is exact, or - reciprocal is allowed and 1/C is neither a special value nor a denormal. ----------------------------------------------------------------------------- rule1: (X/C1) / C2 => X / (C2C1) (if C_A) => X (1/(C2C1)) (if C_A && C_R) rule 2: XC1 / C2 => X * (C1/C2) if C_A rule 3: (X/Y)/Z = > X/(YZ) (if C_A && at least one of Y and Z is symbolic value) rule 4: Z/(X/Y) = > (ZY)/X (similar to rule3) rule 5: C1/(XC2) => (C1/C2) / X (if C_A) rule 6: C1/(X/C2) => (C1C2) / X (if C_A) rule 7: C1/(C2/X) => (C1/C2) * X (if C_A) llvm-svn: 172488	2013-01-14 22:48:41 +00:00
David Greene	a7b1d4d895	Fix Casting Bug Add a const version of getFpValPtr to avoid a cast-away-const warning. llvm-svn: 172467	2013-01-14 21:04:40 +00:00
Nick Lewycky	8ca069daaa	Fix typo in comment. llvm-svn: 172460	2013-01-14 20:56:10 +00:00
Owen Anderson	9e81fe53cf	Teach InstCombine to hoist FABS and FNEG through FPTRUNC instructions. The application of these operations commutes with the truncation, so we should prefer to do them in the smallest size we can, to save register space, use smaller constant pool entries, etc. llvm-svn: 172117	2013-01-10 22:06:52 +00:00
Shuxin Yang	2dc1fb7889	Consider expression "0.0 - X" as the negation of X if - this expression is explicitly marked no-signed-zero, or - no-signed-zero of this expression can be derived from some context. llvm-svn: 171922	2013-01-09 00:13:41 +00:00
Shuxin Yang	1c3b5e4f89	Cosmetical changne in order to conform to coding std. Thank Eric Christopher for figuring out these problems! llvm-svn: 171805	2013-01-07 22:41:28 +00:00

1 2 3 4 5 ...

1027 Commits