llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-27 14:02:50 +01:00

Author	SHA1	Message	Date
Chandler Carruth	49a7633378	[PM] Move TargetLibraryInfo into the Analysis library. While the term "Target" is in the name, it doesn't really have to do with the LLVM Target library -- this isn't an abstraction which LLVM targets generally need to implement or extend. It has much more to do with modeling the various runtime libraries on different OSes and with different runtime environments. The "target" in this sense is the more general sense of a target of cross compilation. This is in preparation for porting this analysis to the new pass manager. No functionality changed, and updates inbound for Clang and Polly. llvm-svn: 226078	2015-01-15 02:16:27 +00:00
David Majnemer	b89ab3fd88	InstCombine: Don't take A-B<0 into A<B if A-B has other uses This fixes PR22226. llvm-svn: 226023	2015-01-14 19:26:56 +00:00
Matt Arsenault	4b66850c40	Fix fcmp + fabs instcombines when using the intrinsic This was only handling the libcall. This is another example of why only the intrinsic should ever be used when it exists. llvm-svn: 225465	2015-01-08 20:09:34 +00:00
David Majnemer	eb61de555d	Analysis: Reformulate WillNotOverflowUnsignedAdd for reusability WillNotOverflowUnsignedAdd's smarts will live in ValueTracking as computeOverflowForUnsignedAdd. It now returns a tri-state result: never overflows, always overflows and sometimes overflows. llvm-svn: 225329	2015-01-07 00:39:50 +00:00
David Majnemer	d02481ebf3	InstCombine: Just a small tidy-up llvm-svn: 225328	2015-01-07 00:39:42 +00:00
Matt Arsenault	416be52fbf	Convert fcmp with 0.0 from casted integers to icmp This is already handled in general when it is known the conversion can't lose bits with smaller integer types casted into wider floating point types. This pattern happens somewhat often in GPU programs that cast workitem intrinsics to float, which are often compared with 0. Specifically handle the special case of compares with zero which should also be known to not lose information. I had a more general version of this which allows equality compares if the casted float is exactly representable in the integer, but I'm not 100% confident that is always correct. Also fold cases that aren't integers to true / false. llvm-svn: 225265	2015-01-06 15:50:59 +00:00
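The compare-with-zero fold described in r225265 can be illustrated outside of LLVM. The following is a minimal standalone sketch, not the InstCombine code: the helper names are hypothetical, and it only checks that comparing a converted 32-bit integer against 0.0 agrees with comparing the integer itself, even for values a float cannot represent exactly.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of these are LLVM APIs.

// Original form: convert the integer to float, then compare with 0.0.
inline bool isZeroViaFloat(int32_t I) { return static_cast<float>(I) == 0.0f; }
inline bool isNegativeViaFloat(int32_t I) { return static_cast<float>(I) < 0.0f; }

// Folded form: compare the integer directly.
inline bool isZeroViaInt(int32_t I) { return I == 0; }
inline bool isNegativeViaInt(int32_t I) { return I < 0; }

// Checks that both forms agree on sample values, including ones that are
// rounded when converted to float (anything above 2^24 in magnitude).
inline bool zeroCompareFormsAgree() {
  const int32_t Samples[] = {0,         1,         -1,       16777217,
                             -16777217, INT32_MAX, INT32_MIN};
  for (int32_t S : Samples) {
    if (isZeroViaFloat(S) != isZeroViaInt(S))
      return false;
    if (isNegativeViaFloat(S) != isNegativeViaInt(S))
      return false;
  }
  return true;
}
```

Rounding during the conversion can change the magnitude of a large integer, but it never turns a nonzero value into zero or flips its sign, which is why the comparison with zero is safe to do on the integer.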
David Majnemer	82fa22459b	InstCombine: Bitcast call arguments from/to pointer/integer type Try harder to get rid of bitcast'd calls by ptrtoint/inttoptr'ing arguments and return values when DataLayout says it is safe to do so. llvm-svn: 225254	2015-01-06 08:41:31 +00:00
Chandler Carruth	c140bae640	[PM] Split the AssumptionTracker immutable pass into two separate APIs: a cache of assumptions for a single function, and an immutable pass that manages those caches. The motivation for this change is twofold. Immutable analyses are really hacks around the current pass manager design and don't exist in the new design. This is usually OK, but it requires that the core logic of an immutable pass be reasonably partitioned off from the pass logic. This change does precisely that. As a consequence it also paves the way for the many utility functions that deal in the assumptions to live in both pass manager worlds by creating a separate non-pass object with its own independent API that they all rely on. Now, the only bits of the system that deal with the actual pass mechanics are those that actually need to deal with the pass mechanics. Once this separation is made, several simplifications become pretty obvious in the assumption cache itself. Rather than using a set and callback value handles, it can just be a vector of weak value handles. The callers can easily skip the handles that are null, and eventually we can wrap all of this up behind a filter iterator. For now, this adds boilerplate to the various passes, but this kind of boilerplate will end up making it possible to port these passes to the new pass manager, and so it will end up factored away pretty reasonably. llvm-svn: 225131	2015-01-04 12:03:27 +00:00
David Majnemer	04c9e5e52d	InstCombine: match can find ConstantExprs, don't assume we have a Value We assumed the output of a match was a Value, this would cause us to assert because we would fail a cast<>. Instead, use a helper in the Operator family to hide the distinction between Value and Constant. This fixes PR22087. llvm-svn: 225127	2015-01-04 07:36:02 +00:00
David Majnemer	78198d7245	InstCombine: Detect when llvm.umul.with.overflow always overflows We know overflow always occurs if both ~LHSKnownZero * ~RHSKnownZero and LHSKnownOne * RHSKnownOne overflow. llvm-svn: 225077	2015-01-02 07:29:47 +00:00
David Majnemer	a7058e95b3	Analysis: Reformulate WillNotOverflowUnsignedMul for reusability WillNotOverflowUnsignedMul's smarts will live in ValueTracking as computeOverflowForUnsignedMul. It now returns a tri-state result: never overflows, always overflows and sometimes overflows. llvm-svn: 225076	2015-01-02 07:29:43 +00:00
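The tri-state result mentioned in r225076 and the "always overflows" rule from r225077 can be sketched in isolation. This is an illustration under stated assumptions, not the ValueTracking implementation: the `KnownBits32` struct, the function names, and the fixed 32-bit width are all hypothetical. It takes `~KnownZero` as the largest possible value of an operand and `KnownOne` as the smallest.

```cpp
#include <cstdint>

// Hypothetical types and names for illustration; not the LLVM API.
enum class OverflowResult { AlwaysOverflows, MayOverflow, NeverOverflows };

// Bits set in Zero are known to be 0; bits set in One are known to be 1.
struct KnownBits32 {
  uint32_t Zero;
  uint32_t One;
};

// True if A * B does not fit in 32 bits.
inline bool mulOverflows32(uint32_t A, uint32_t B) {
  return static_cast<uint64_t>(A) * B > UINT32_MAX;
}

inline OverflowResult classifyUnsignedMul(KnownBits32 L, KnownBits32 R) {
  uint32_t LMax = ~L.Zero, RMax = ~R.Zero; // largest values the operands can take
  uint32_t LMin = L.One, RMin = R.One;     // smallest values the operands can take
  // If even the largest operands fit, no pair of values can overflow.
  if (!mulOverflows32(LMax, RMax))
    return OverflowResult::NeverOverflows;
  // If even the smallest operands overflow, every pair of values overflows.
  if (mulOverflows32(LMin, RMin))
    return OverflowResult::AlwaysOverflows;
  return OverflowResult::MayOverflow;
}
```

For example, two operands whose upper 16 bits are known zero can never overflow a 32-bit multiply, while operands with nothing known fall into the "sometimes overflows" bucket.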
Sanjay Patel	657e61ccbf	InstCombine: fsub nsz 0, X ==> fsub nsz -0.0, X Some day the backend may handle instruction-level fast math flags and make this transform unnecessary, but it's still better practice to use the canonical representation of fneg when possible (use a -0.0). This is a partial fix for PR20870 ( http://llvm.org/bugs/show_bug.cgi?id=20870 ). See also http://reviews.llvm.org/D6723. Differential Revision: http://reviews.llvm.org/D6731 llvm-svn: 225050	2014-12-31 22:14:05 +00:00
David Majnemer	83939d3744	InstCombine: try to transform A-B < 0 into A < B We are allowed to move the 'B' to the right hand side if we can prove there is no signed overflow and if the comparison itself is signed. llvm-svn: 225034	2014-12-31 04:21:41 +00:00
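The no-signed-overflow condition in r225034 is what makes the rewrite valid. The sketch below is a standalone check on 8-bit values, with hypothetical helper names and no LLVM code involved: it confirms the two forms agree whenever the subtraction does not overflow, and the accompanying usage shows a pair where they disagree once it does.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not LLVM APIs.

// Two's-complement subtraction that wraps on overflow, like a plain `sub`.
// (The narrowing conversion is modular on mainstream compilers and is
// guaranteed to be so from C++20.)
inline int8_t wrappingSub8(int8_t A, int8_t B) {
  return static_cast<int8_t>(
      static_cast<uint8_t>(static_cast<uint8_t>(A) - static_cast<uint8_t>(B)));
}

// True if A - B does not fit in a signed 8-bit value.
inline bool subOverflows8(int8_t A, int8_t B) {
  int Wide = static_cast<int>(A) - static_cast<int>(B);
  return Wide < INT8_MIN || Wide > INT8_MAX;
}

// Exhaustively checks that (A - B < 0) == (A < B) when A - B does not overflow.
inline bool foldHoldsWithoutOverflow8() {
  for (int A = INT8_MIN; A <= INT8_MAX; ++A)
    for (int B = INT8_MIN; B <= INT8_MAX; ++B) {
      int8_t A8 = static_cast<int8_t>(A), B8 = static_cast<int8_t>(B);
      if (subOverflows8(A8, B8))
        continue;
      if ((wrappingSub8(A8, B8) < 0) != (A8 < B8))
        return false;
    }
  return true;
}
```

With A = -128 and B = 1 the subtraction wraps to 127, so `A - B < 0` is false while `A < B` is true; this is the case the overflow proof rules out.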
Philip Reames	4527f27ca2	Carry facts about nullness and undef across GC relocation This change implements four basic optimizations: If a relocated value isn't used, it doesn't need to be relocated. If the value being relocated is null, relocation doesn't change that. (Technically, this might be collector specific. I don't know of one which it doesn't work for though.) If the value being relocated is undef, the relocation is meaningless. If the value being relocated was known nonnull, the relocated pointer also isn't null. (Since it points to the same source language object.) I outlined other planned work in comments. Differential Revision: http://reviews.llvm.org/D6600 llvm-svn: 224968	2014-12-29 23:27:30 +00:00
Philip Reames	a4f427e6e7	Loading from null is valid outside of addrspace 0 This patch fixes a miscompile where we were assuming that loading from null is undefined and thus we could assume it doesn't happen. This transform is perfectly legal in address space 0, but is not necessarily legal in other address spaces. We really should introduce a hook to control this property on a per target per address space basis. We may be losing valuable optimizations in some address spaces by being too conservative. Original patch by Thomas P Raoux (submitted to llvm-commits), tests and formatting fixes by me. llvm-svn: 224961	2014-12-29 22:46:21 +00:00
David Majnemer	c57dbde8fa	InstCombine: Infer nuw for multiplies A multiply cannot unsigned wrap if there are bitwidth, or more, leading zero bits between the two operands. llvm-svn: 224849	2014-12-26 09:50:35 +00:00
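The leading-zeros rule in r224849 can be checked exhaustively on a narrow type. This is a standalone sketch with hypothetical helper names, using 8-bit operands instead of arbitrary-width integers: if the two operands have at least 8 leading zero bits between them, their product fits in 8 bits.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not LLVM APIs.

// Number of leading zero bits in an 8-bit value (8 for zero).
inline unsigned countLeadingZeros8(uint8_t V) {
  unsigned N = 0;
  for (unsigned Mask = 0x80; Mask != 0 && (V & Mask) == 0; Mask >>= 1)
    ++N;
  return N;
}

// Exhaustively checks: whenever the operands have 8 or more leading zero
// bits in total, the 8-bit product does not wrap.
inline bool leadingZerosRuleHolds8() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B) {
      bool RuleApplies =
          countLeadingZeros8(static_cast<uint8_t>(A)) +
              countLeadingZeros8(static_cast<uint8_t>(B)) >= 8;
      bool Wraps = A * B > 0xFF;
      if (RuleApplies && Wraps)
        return false;
    }
  return true;
}
```

The rule is sufficient but not necessary: 15 * 17 = 255 does not wrap even though the operands only have 4 + 3 = 7 leading zeros.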
David Majnemer	ce5bd510cb	InstCombine: Infer nsw for multiplies We already utilize this logic for reducing overflow intrinsics, it makes sense to reuse it for normal multiplies as well. llvm-svn: 224847	2014-12-26 09:10:14 +00:00
David Majnemer	3da9d34415	InstCombine: Squash an icmp+select into bitwise arithmetic (X & INT_MIN) == 0 ? X ^ INT_MIN : X into X | INT_MIN; (X & INT_MIN) != 0 ? X ^ INT_MIN : X into X & INT_MAX This fixes PR21993. llvm-svn: 224676	2014-12-20 04:45:35 +00:00
Bruno Cardoso Lopes	b25873afd1	Reapply: [InstCombine] Fix visitSwitchInst to use right operand types for sub cstexpr The visitSwitchInst generates SUB constant expressions to recompute the switch condition. When truncating the condition to a smaller type, SUB expressions should use the previous type (before trunc) for both operands. Also, fix code to also return the modified switch when only the truncation is performed. This fixes an assertion crash. Differential Revision: http://reviews.llvm.org/D6644 rdar://problem/19191835 llvm-svn: 224588	2014-12-19 17:12:35 +00:00
Sanjay Patel	b1bb7100db	use -0.0 when creating an fneg instruction Backends recognize (-0.0 - X) as the canonical form for fneg and produce better code. Eg, ppc64 with 0.0: lis r2, ha16(LCPI0_0) lfs f0, lo16(LCPI0_0)(r2) fsubs f1, f0, f1 blr vs. -0.0: fneg f1, f1 blr Differential Revision: http://reviews.llvm.org/D6723 llvm-svn: 224583	2014-12-19 16:44:08 +00:00
Bruno Cardoso Lopes	ba090bc5c2	Revert "[InstCombine] Fix visitSwitchInst to use right operand types for sub cstexpr" Reverts commit r224574 to appease buildbots: The visitSwitchInst generates SUB constant expressions to recompute the switch condition. When truncating the condition to a smaller type, SUB expressions should use the previous type (before trunc) for both operands. This fixes an assertion crash. llvm-svn: 224576	2014-12-19 14:36:24 +00:00
Bruno Cardoso Lopes	afb4b15ab3	[InstCombine] Fix visitSwitchInst to use right operand types for sub cstexpr The visitSwitchInst generates SUB constant expressions to recompute the switch condition. When truncating the condition to a smaller type, SUB expressions should use the previous type (before trunc) for both operands. This fixes an assertion crash. Differential Revision: http://reviews.llvm.org/D6644 rdar://problem/19191835 llvm-svn: 224574	2014-12-19 14:23:15 +00:00
Sanjay Patel	f3ae90d3db	fix formatting; NFC llvm-svn: 224542	2014-12-18 21:11:09 +00:00
Erik Eckstein	042c032147	Strength reduce intrinsics with overflow into regular arithmetic operations if possible. Some intrinsics, like s/uadd.with.overflow and umul.with.overflow, are already strength reduced. This change adds other arithmetic intrinsics: s/usub.with.overflow, smul.with.overflow. It completes the work on PR20194. llvm-svn: 224417	2014-12-17 07:29:19 +00:00
Steven Wu	17876438f0	More code format fix from r224133, NFC llvm-svn: 224140	2014-12-12 18:48:37 +00:00
Steven Wu	498e8ef334	Restructure code from r224097. NFC llvm-svn: 224133	2014-12-12 17:21:54 +00:00
Steven Wu	896d3dd47b	Fix another infinite loop in InstCombine Summary: InstCombine infinite-loops for the testcase added It is because InstCombine is generating instructions that can be optimized by itself. Fix by not optimizing frem if the optimized type is the same as original type. rdar://problem/19150820 Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D6634 llvm-svn: 224097	2014-12-12 04:34:07 +00:00
Andrea Di Biagio	6186490ec7	[InstCombine][X86] Improved folding of calls to Intrinsic::x86_sse4a_insertqi. This patch teaches the instruction combiner how to fold a call to 'insertqi' if the 'length field' (3rd operand) is set to zero, and if the sum between field 'length' and 'bit index' (4th operand) is bigger than 64. From the AMD64 Architecture Programmer's Manual: 1. If the sum of the bit index + length field is greater than 64, then the results are undefined; 2. A value of zero in the field length is defined as a length of 64. This patch improves the existing combining logic for intrinsic 'insertqi' adding extra checks to address both point 1. and point 2. Differential Revision: http://reviews.llvm.org/D6583 llvm-svn: 224054	2014-12-11 20:44:59 +00:00
Erik Eckstein	4078937e34	Refactor creation of overflow result tuples in InstCombineCalls. Extract the creation of overflow result tuples in a separate function. NFC. llvm-svn: 224006	2014-12-11 08:02:30 +00:00
Chandler Carruth	2100e2da37	Revert r223764 which taught instcombine about integer-based element extraction patterns. This is causing Clang to miscompile itself for 32-bit x86 somehow, and likely also on ARM and PPC. I really don't know how, but reverting now that I've confirmed this is actually the culprit. I have a reproduction as well and so should be able to restore this shortly. This reverts commit r223764. Original commit log follows: Teach instcombine to canonicalize "element extraction" from a load of an integer and "element insertion" into a store of an integer into actual element extraction, element insertion, and vector loads and stores. Previously various parts of LLVM (including instcombine itself) would introduce integer loads and stores into the code as a way of opaquely loading and storing "bits". In some cases (such as a memcpy of std::complex<float> object) we will eventually end up using those bits in non-integer types. In order for SROA to effectively promote the allocas involved, it splits these "store a bag of bits" integer loads and stores up into the constituent parts. However, for non-alloca loads and stores which remain, it uses integer math to recombine the values into a large integer to load or store. All of this would be "fine", except that it forces LLVM to go through integer math to combine and split up values. While this makes perfect sense for integers (and in fact is critical for bitfields to end up lowering efficiently) it is terrible for non-integer types, especially floating point types. We have a much more canonical way of representing the act of concatenating the bits of two SSA values in LLVM: a vector and insertelement. This patch teaches InstCombine to use this representation. With this patch applied, LLVM will no longer introduce integer math into the critical path of every loop over std::complex<float> operations such as those that make up the hot path of ... 
oh, most HPC code, Eigen, and any other heavy linear algebra library. For the record, I looked extensively at fixing this in other parts of the compiler, but it just doesn't work: - We really do want to canonicalize memcpy and other bit-motion to integer loads and stores. SSA values are tremendously more powerful than "copy" intrinsics. Not doing this regresses massive amounts of LLVM's scalar optimizer. - We really do need to split up integer loads and stores of this form in SROA or every memcpy of a trivially copyable struct will prevent SSA formation of the members of that struct. It essentially turns off SROA. - The closest alternative is to actually split the loads and stores when partitioning with SROA, but this has all of the downsides historically discussed of splitting up loads and stores -- the wide-store information is fundamentally lost. We would also see performance regressions for bitfield-heavy code and other places where the integers aren't really intended to be split without seemingly arbitrary logic to treat integers totally differently. - We can effectively fix this in instcombine, so it isn't that hard of a choice to make IMO. llvm-svn: 223813	2014-12-09 19:21:16 +00:00
Duncan P. N. Exon Smith	3d57886267	IR: Split Metadata from Value Split `Metadata` away from the `Value` class hierarchy, as part of PR21532. Assembly and bitcode changes are in the wings, but this is the bulk of the change for the IR C++ API. I have a follow-up patch prepared for `clang`. If this breaks other sub-projects, I apologize in advance :(. Help me compile it on Darwin and I'll try to fix it. FWIW, the errors should be easy to fix, so it may be simpler to just fix it yourself. This breaks the build for all metadata-related code that's out-of-tree. Rest assured the transition is mechanical and the compiler should catch almost all of the problems. Here's a quick guide for updating your code: - `Metadata` is the root of a class hierarchy with three main classes: `MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from the `Value` class hierarchy. It is typeless -- i.e., instances do not have a `Type`. - `MDNode`'s operands are all `Metadata *` (instead of `Value *`). - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively. If you're referring solely to resolved `MDNode`s -- post graph construction -- just use `MDNode*`. - `MDNode` (and the rest of `Metadata`) have only limited support for `replaceAllUsesWith()`. As long as an `MDNode` is pointing at a forward declaration -- the result of `MDNode::getTemporary()` -- it maintains a side map of its uses and can RAUW itself. Once the forward declarations are fully resolved RAUW support is dropped on the ground. This means that uniquing collisions on changing operands cause nodes to become "distinct". (This already happened fairly commonly, whenever an operand went to null.) If you're constructing complex (non self-reference) `MDNode` cycles, you need to call `MDNode::resolveCycles()` on each node (or on a top-level node that somehow references all of the nodes). Also, don't do that. 
Metadata cycles (and the RAUW machinery needed to construct them) are expensive. - An `MDNode` can only refer to a `Constant` through a bridge called `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`). As a side effect, accessing an operand of an `MDNode` that is known to be, e.g., `ConstantInt`, takes three steps: first, cast from `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`; third, cast down to `ConstantInt`. The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have metadata schema owners transition away from using `Constant`s when the type isn't important (and they don't care about referring to `GlobalValue`s). In the meantime, I've added transitional API to the `mdconst` namespace that matches semantics with the old code, in order to avoid adding the error-prone three-step equivalent to every call site. If your old code was: MDNode *N = foo(); bar(isa<ConstantInt>(N->getOperand(0))); baz(cast<ConstantInt>(N->getOperand(1))); bak(cast_or_null<ConstantInt>(N->getOperand(2))); bat(dyn_cast<ConstantInt>(N->getOperand(3))); bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4))); you can trivially match its semantics with: MDNode *N = foo(); bar(mdconst::hasa<ConstantInt>(N->getOperand(0))); baz(mdconst::extract<ConstantInt>(N->getOperand(1))); bak(mdconst::extract_or_null<ConstantInt>(N->getOperand(2))); bat(mdconst::dyn_extract<ConstantInt>(N->getOperand(3))); bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4))); and when you transition your metadata schema to `MDInt`: MDNode *N = foo(); bar(isa<MDInt>(N->getOperand(0))); baz(cast<MDInt>(N->getOperand(1))); bak(cast_or_null<MDInt>(N->getOperand(2))); bat(dyn_cast<MDInt>(N->getOperand(3))); bay(dyn_cast_or_null<MDInt>(N->getOperand(4))); - A `CallInst` -- specifically, intrinsic instructions -- can refer to metadata through a bridge called `MetadataAsValue`. This is a subclass of `Value` where `getType()->isMetadataTy()`. 
`MetadataAsValue` is the only class that can legally refer to a `LocalAsMetadata`, which is a bridged form of non-`Constant` values like `Argument` and `Instruction`. It can also refer to any other `Metadata` subclass. (I'll break all your testcases in a follow-up commit, when I propagate this change to assembly.) llvm-svn: 223802	2014-12-09 18:38:53 +00:00
Chandler Carruth	318b867199	Teach instcombine to canonicalize "element extraction" from a load of an integer and "element insertion" into a store of an integer into actual element extraction, element insertion, and vector loads and stores. Previously various parts of LLVM (including instcombine itself) would introduce integer loads and stores into the code as a way of opaquely loading and storing "bits". In some cases (such as a memcpy of std::complex<float> object) we will eventually end up using those bits in non-integer types. In order for SROA to effectively promote the allocas involved, it splits these "store a bag of bits" integer loads and stores up into the constituent parts. However, for non-alloca loads and stores which remain, it uses integer math to recombine the values into a large integer to load or store. All of this would be "fine", except that it forces LLVM to go through integer math to combine and split up values. While this makes perfect sense for integers (and in fact is critical for bitfields to end up lowering efficiently) it is terrible for non-integer types, especially floating point types. We have a much more canonical way of representing the act of concatenating the bits of two SSA values in LLVM: a vector and insertelement. This patch teaches InstCombine to use this representation. With this patch applied, LLVM will no longer introduce integer math into the critical path of every loop over std::complex<float> operations such as those that make up the hot path of ... oh, most HPC code, Eigen, and any other heavy linear algebra library. For the record, I looked extensively at fixing this in other parts of the compiler, but it just doesn't work: - We really do want to canonicalize memcpy and other bit-motion to integer loads and stores. SSA values are tremendously more powerful than "copy" intrinsics. Not doing this regresses massive amounts of LLVM's scalar optimizer. 
- We really do need to split up integer loads and stores of this form in SROA or every memcpy of a trivially copyable struct will prevent SSA formation of the members of that struct. It essentially turns off SROA. - The closest alternative is to actually split the loads and stores when partitioning with SROA, but this has all of the downsides historically discussed of splitting up loads and stores -- the wide-store information is fundamentally lost. We would also see performance regressions for bitfield-heavy code and other places where the integers aren't really intended to be split without seemingly arbitrary logic to treat integers totally differently. - We can effectively fix this in instcombine, so it isn't that hard of a choice to make IMO. Differential Revision: http://reviews.llvm.org/D6548 llvm-svn: 223764	2014-12-09 08:55:32 +00:00
Simon Pilgrim	0b4002e32c	[InstCombine] Minor optimization for bswap with binary ops Added instcombine optimizations for BSWAP with AND/OR/XOR ops: OP( BSWAP(x), BSWAP(y) ) -> BSWAP( OP(x, y) ) OP( BSWAP(x), CONSTANT ) -> BSWAP( OP(x, BSWAP(CONSTANT) ) ) Since it's just a one-liner, I've also added BSWAP to the DAGCombiner equivalent as well: fold (OP (bswap x), (bswap y)) -> (bswap (OP x, y)) Refactored bswap-fold tests to use FileCheck instead of just checking that the bswaps had gone. Differential Revision: http://reviews.llvm.org/D6407 llvm-svn: 223349	2014-12-04 09:44:01 +00:00
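Both bswap folds in r223349 rest on the fact that AND, OR and XOR act on each bit independently, so they commute with any permutation of bytes. The following standalone sketch, with hypothetical helper names and a hand-written 32-bit byte swap in place of the intrinsic, checks the two identities for given inputs.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not LLVM APIs.

// Portable 32-bit byte swap.
inline uint32_t byteSwap32(uint32_t V) {
  return (V >> 24) | ((V >> 8) & 0x0000FF00u) | ((V << 8) & 0x00FF0000u) |
         (V << 24);
}

// Checks OP(bswap(X), bswap(Y)) == bswap(OP(X, Y)) and
// OP(bswap(X), C) == bswap(OP(X, bswap(C))) for OP in {AND, OR, XOR}.
inline bool bswapLogicFoldsAgree(uint32_t X, uint32_t Y, uint32_t C) {
  uint32_t SX = byteSwap32(X), SY = byteSwap32(Y), SC = byteSwap32(C);
  bool TwoSwaps = (SX & SY) == byteSwap32(X & Y) &&
                  (SX | SY) == byteSwap32(X | Y) &&
                  (SX ^ SY) == byteSwap32(X ^ Y);
  bool SwapAndConstant = (SX & C) == byteSwap32(X & SC) &&
                         (SX | C) == byteSwap32(X | SC) &&
                         (SX ^ C) == byteSwap32(X ^ SC);
  return TwoSwaps && SwapAndConstant;
}
```

In the constant case the byte swap of `C` can be computed at compile time, so the rewritten form needs one runtime bswap instead of leaving the swap on the operand.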
Erik Eckstein	b61d06dbaf	InstCombine: simplify signed range checks Try to convert two compares of a signed range check into a single unsigned compare. Examples: (icmp sge x, 0) & (icmp slt x, n) --> icmp ult x, n (icmp slt x, 0) | (icmp sgt x, n) --> icmp ugt x, n llvm-svn: 223224	2014-12-03 10:39:15 +00:00
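The range-check folds in r223224 can be verified exhaustively on 8-bit values. This standalone sketch uses hypothetical helper names and assumes, as a precondition not spelled out in the log entry, that `n` is non-negative; the second helper shows the fold giving a different answer when that assumption is dropped.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not LLVM APIs.

// Signed two-compare form and unsigned single-compare form of "0 <= X < N".
inline bool inRangeSigned(int8_t X, int8_t N) { return (X >= 0) & (X < N); }
inline bool inRangeUnsigned(int8_t X, int8_t N) {
  return static_cast<uint8_t>(X) < static_cast<uint8_t>(N);
}

// Exhaustively checks both folds for every X and every non-negative N.
inline bool rangeCheckFoldsHold8() {
  for (int X = INT8_MIN; X <= INT8_MAX; ++X)
    for (int N = 0; N <= INT8_MAX; ++N) {
      int8_t X8 = static_cast<int8_t>(X), N8 = static_cast<int8_t>(N);
      uint8_t UX = static_cast<uint8_t>(X8), UN = static_cast<uint8_t>(N8);
      // (x >= 0) & (x < n)  -->  x u< n
      if (inRangeSigned(X8, N8) != inRangeUnsigned(X8, N8))
        return false;
      // (x < 0) | (x > n)  -->  x u> n
      if (((X8 < 0) | (X8 > N8)) != (UX > UN))
        return false;
    }
  return true;
}
```

A negative `x` reinterpreted as unsigned is larger than any non-negative `n`, which is what lets one unsigned compare cover both signed bounds. With `n = -1` the forms diverge: `x = 5` fails the signed check but passes `5 u< 255`.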
Philip Reames	02104421ff	[Statepoints 3/4] Statepoint infrastructure for garbage collection: SelectionDAGBuilder This is the third patch in a small series. It contains the CodeGen support for lowering the gc.statepoint intrinsic sequences (223078) to the STATEPOINT pseudo machine instruction (223085). The change also includes the set of helper routines and classes for working with gc.statepoints, gc.relocates, and gc.results since the lowering code uses them. With this change, gc.statepoints should be functionally complete. The documentation will follow in the fourth change, and there will likely be some cleanup changes, but interested parties can start experimenting now. I'm not particularly happy with the amount of code or complexity involved with the lowering step, but at least it's fairly well isolated. The statepoint lowering code is split into its own files and anyone not working on the statepoint support itself should be able to ignore it. During the lowering process, we currently spill aggressively to stack. This is not entirely ideal (and we have plans to do better), but it's functional, relatively straightforward, and matches closely the implementations of the patchpoint intrinsics. Most of the complexity comes from trying to keep relocated copies of values in the same stack slots across statepoints. Doing so avoids the insertion of pointless load and store instructions to reshuffle the stack. The current implementation isn't as effective as I'd like, but it is functional and 'good enough' for many common use cases. In the long term, I'd like to figure out how to integrate the statepoint lowering with the register allocator. In principle, we shouldn't need to eagerly spill at all. The register allocator should do any spilling required and the statepoint should simply record that fact. Depending on how challenging that turns out to be, we may invest in a smarter global stack slot assignment mechanism as a stopgap measure. 
Reviewed by: atrick, ributzka llvm-svn: 223137	2014-12-02 18:50:36 +00:00
David Majnemer	87a17a0975	InstCombine: FoldOrOfICmps harder We may be in a situation where the icmps might not be near each other in a tree of or instructions. Try to dig out related compare instructions and see if they combine. N.B. This won't fire on deep trees of compares because rewriting the tree might end up creating a net increase of IR. We may have to resort to something more sophisticated if this is a real problem. llvm-svn: 222928	2014-11-28 19:58:29 +00:00
Ankur Garg	0bf088f50d	Removed extra line from a comment to test first commit. NFC. llvm-svn: 222916	2014-11-28 10:38:18 +00:00
David Majnemer	9698d3487d	InstCombine: Restore optimizations lost in r210006 This restores our ability to optimize: (X & C) == 0 ? X ^ C : X into X | C; (X & C) != 0 ? X ^ C : X into X & ~C llvm-svn: 222871	2014-11-27 07:25:21 +00:00
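The select folds restored in r222871 can be checked exhaustively on 8-bit values. This is a standalone sketch with hypothetical helper names. It assumes `C` has exactly one bit set, in line with the single-bit tests that r210006 targeted; the log entry itself does not state the condition on `C`.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not LLVM APIs.

// (X & C) == 0 ? X ^ C : X
inline uint8_t selectSetForm(uint8_t X, uint8_t C) {
  return static_cast<uint8_t>((X & C) == 0 ? (X ^ C) : X);
}

// (X & C) != 0 ? X ^ C : X
inline uint8_t selectClearForm(uint8_t X, uint8_t C) {
  return static_cast<uint8_t>((X & C) != 0 ? (X ^ C) : X);
}

// Exhaustively checks both folds for every X and every single-bit C.
inline bool selectFoldsHold8() {
  for (unsigned Bit = 0; Bit < 8; ++Bit) {
    uint8_t C = static_cast<uint8_t>(1u << Bit);
    for (unsigned V = 0; V < 256; ++V) {
      uint8_t X = static_cast<uint8_t>(V);
      if (selectSetForm(X, C) != static_cast<uint8_t>(X | C))
        return false;
      if (selectClearForm(X, C) != static_cast<uint8_t>(X & ~C))
        return false;
    }
  }
  return true;
}
```

The single-bit assumption matters: with `C = 3` and `X = 1` the select keeps `X` (result 1) while `X | C` is 3, so the fold would be wrong for a multi-bit mask.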
David Majnemer	7e9b94486e	Revert "Added inst combine transforms for single bit tests from Chris's note" This reverts commit r210006, it miscompiled libapr which is used in who knows how many projects. A test has been added to ensure that we don't regress again. I'll work on a rewrite of what the optimization was trying to do later. llvm-svn: 222856	2014-11-26 23:00:38 +00:00
Chandler Carruth	6264bc6537	[InstCombine] Change LLVM to canonicalize toward the value type being stored rather than the pointer type. This change is analogous to r220138 which changed the canonicalization for loads. The rationale is the same: memory does not have a type, operations (and thus the values they produce) have a type. We should match that type as closely as possible rather than reading some form of semantics into the pointer type. With this change, loads and stores should no longer be made with nonsensical types for the values that they load and store. This is particularly important when trying to match specific loaded and stored types in the process of doing other instcombines, which is what led me down this twisty maze of miscanonicalization. I've put quite some effort into looking through IR to find places where LLVM's optimizer was being unreasonably conservative in the face of mismatched load and store types, however it is possible (let's say, likely!) I have missed some. If you see regressions here, or from r220138, the likely cause is some part of LLVM failing to cope with load and store types differing. Test cases appreciated, it is important that we root all of these out of LLVM. llvm-svn: 222748	2014-11-25 10:09:51 +00:00
Chandler Carruth	7feb19d89c	Revert r220349 to re-instate r220277 with a fix for PR21330 -- quite clearly only exactly equal width ptrtoint and inttoptr casts are no-op casts, it says so right there in the langref. Make the code agree. Original log from r220277: Teach the load analysis to allow finding available values which require inttoptr or ptrtoint cast provided there is datalayout available. Eventually, the datalayout can just be required but in practice it will always be there today. To go with the ability to expose available values requiring a ptrtoint or inttoptr cast, helpers are added to perform one of these three casts. These smarts are necessary to finish canonicalizing loads and stores to the operational type requirements without regressing fundamental combines. I've added some test cases. These should actually improve as the load combining and store combining improves, but they may fundamentally be highlighting some missing combines for select in addition to exercising the specific added logic to load analysis. llvm-svn: 222739	2014-11-25 08:20:27 +00:00
Matt Arsenault	454e837bd2	Bug 21610: Canonicalize min/max fcmp selects to use ordered comparisons llvm-svn: 222705	2014-11-24 23:15:18 +00:00
David Majnemer	291966cd3b	InstCombine: Don't create an unused instruction We would create an instruction but not insert it. Not inserting the unused instruction would lead us to verification failure. This fixes PR21653. llvm-svn: 222659	2014-11-24 16:41:13 +00:00
David Majnemer	2445c6caf5	InstCombine: Don't assume DataLayout is always available We tried to get the result of DataLayout::getLargestLegalIntTypeSize but we didn't have a DataLayout. This resulted in opt crashing. This fixes PR21651. llvm-svn: 222645	2014-11-24 07:26:20 +00:00
David Majnemer	0b413925f3	InstCombine: Propagate exact for (sdiv X, Pow2) -> (udiv X, Pow2) llvm-svn: 222625	2014-11-22 20:00:41 +00:00
David Majnemer	ba33e07fad	InstCombine: Propagate exact for (sdiv X, Y) -> (udiv X, Y) llvm-svn: 222624	2014-11-22 20:00:38 +00:00
David Majnemer	e3d9e29780	InstCombine: Propagate exact for (sdiv -X, C) -> (sdiv X, -C) llvm-svn: 222623	2014-11-22 20:00:34 +00:00
David Majnemer	23e1540ef9	InstCombine: Propagate exact in (udiv (lshr X,C1),C2) -> (udiv x,C1<<C2) llvm-svn: 222620	2014-11-22 18:16:54 +00:00
David Majnemer	1847177b9b	InstCombine: Propagate NSW/NUW for X*(1<<Y) -> X<<Y llvm-svn: 222613	2014-11-22 08:57:02 +00:00
David Majnemer	3c7153d5d6	InstCombine: Propagate NSW for -X * -Y -> X * Y llvm-svn: 222612	2014-11-22 07:25:19 +00:00
David Majnemer	6b5df7ef8d	InstCombine: Silence a parenthesis warning llvm-svn: 222609	2014-11-22 06:09:28 +00:00
David Majnemer	c405b87f53	InstCombine: Preserve nsw when folding X*(2^C) -> X << C llvm-svn: 222606	2014-11-22 04:52:55 +00:00
David Majnemer	96d9c67b69	InstCombine: Preserve nsw/nuw for ((X << C2)*C1) -> (X * (C1 << C2)) llvm-svn: 222605	2014-11-22 04:52:52 +00:00
David Majnemer	6191590b23	InstCombine: Preserve nsw for (mul %V, -1) -> (sub 0, %V) llvm-svn: 222604	2014-11-22 04:52:38 +00:00
Gerolf Hoflehner	cb87bd4853	[InstCombine] Re-commit of r218721 (Optimize icmp-select-icmp sequence) Fixes the self-host fail. Note that this commit activates dominator analysis in the combiner by default (like the original commit did). llvm-svn: 222590	2014-11-21 23:36:44 +00:00
David Blaikie	60e6c80905	Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This led to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions... llvm-svn: 222334	2014-11-19 07:49:26 +00:00
David Majnemer	e6cc1061cc	InstCombine: Fix another infinite loop caused by visitFPTrunc We would attempt to replace an frem's operand with the same operand. This would cause InstCombine to think real work was done, causing InstCombine to enter an infinite loop. This fixes the second part of PR21576. llvm-svn: 222265	2014-11-18 22:06:45 +00:00
David Majnemer	0c67e78132	Revert "Revert r222040 because of bot failure." This reverts commit r222203, reverting r222040 didn't end up turning the bot green. llvm-svn: 222261	2014-11-18 21:30:02 +00:00
David Majnemer	a43009a5dc	InstCombine: Fold away tautological masked compares It is impossible for (x & INT_MAX) == 0 && x == INT_MAX to ever be true. While this sort of reasoning should normally live in InstSimplify, the machinery that derives this result is not trivial to split out. llvm-svn: 222230	2014-11-18 09:31:41 +00:00
David Majnemer	fdefc8c778	InstCombine: Clean up foldLogOpOfMaskedICmps No functional change intended. llvm-svn: 222229	2014-11-18 09:31:36 +00:00
Manman Ren	3d4f707d60	Revert r222040 because of bot failure. http://lab.llvm.org:8080/green/job/clang-Rlto_master/298/ Hopefully, bot will be green. llvm-svn: 222203	2014-11-18 00:33:22 +00:00
David Majnemer	4c69b4d32f	InstCombine: Fix infinite loop caused by visitFPTrunc We would attempt to replace a fptrunc of an frem with an identical fptrunc. This would cause the new fptrunc to be added to the worklist. Of course, this results in an infinite loop because we will keep visiting the newly created fptruncs. This fixes PR21576. llvm-svn: 222040	2014-11-14 21:21:15 +00:00
Bill Schmidt	96b68de282	[PowerPC] Add vec_vsx_ld and vec_vsx_st intrinsics This patch enables the vec_vsx_ld and vec_vsx_st intrinsics for PowerPC, which provide programmer access to the lxvd2x, lxvw4x, stxvd2x, and stxvw4x instructions. New LLVM intrinsics are provided to represent these four instructions in IntrinsicsPowerPC.td. These are patterned after the similar intrinsics for lvx and stvx (Altivec). In PPCInstrVSX.td, these intrinsics are tied to the code gen patterns, with additional patterns to allow plain vanilla loads and stores to still generate these instructions. At -O1 and higher the intrinsics are immediately converted to loads and stores in InstCombineCalls.cpp. This will open up more optimization opportunities while still allowing the correct instructions to be generated. (Similar code exists for aligned Altivec loads and stores.) The new intrinsics are added to the code that checks for consecutive loads and stores in PPCISelLowering.cpp, as well as to PPCTargetLowering::getTgtMemIntrinsic(). There's a new test to verify the correct instructions are generated. The loads and stores tend to be reordered, so the test just counts their number. It runs at -O2, as it's not very effective to test this at -O0, when many unnecessary loads and stores are generated. I ended up having to modify vsx-fma-m.ll. It turns out this test case is slightly unreliable, but I don't know a good way to prevent problems with it. The xvmaddmdp instructions read and write the same register, which is one of the multiplicands. Commutativity allows either to be chosen. If the FMAs are reordered differently than expected by the test, the register assignment can be different as a result. Hopefully this doesn't change often. There is a companion patch for Clang. llvm-svn: 221767	2014-11-12 04:19:40 +00:00
Philip Reames	f88f796701	Canonicalize an assume(load != null) into !nonnull metadata We currently have two ways of informing the optimizer that the result of a load is never null: metadata and assume. This change converts the second into the former. This avoids a need to implement optimizations using both forms. We should probably extend this basic idea to metadata of other forms; in particular, range metadata. Our view is that assumes should be considered a "last resort" for when there isn't a more canonical way to represent something. Reviewed by: Hal Differential Revision: http://reviews.llvm.org/D5951 llvm-svn: 221737	2014-11-11 23:33:19 +00:00
Duncan P. N. Exon Smith	8770505e4e	Revert "IR: MDNode => Value" Instead, we're going to separate metadata from the Value hierarchy. See PR21532. This reverts commit r221375. This reverts commit r221373. This reverts commit r221359. This reverts commit r221167. This reverts commit r221027. This reverts commit r221024. This reverts commit r221023. This reverts commit r220995. This reverts commit r220994. llvm-svn: 221711	2014-11-11 21:30:22 +00:00
David Majnemer	3508fa333f	InstCombine: Rely on cmpxchg's return code when it's strong Comparing the result of a cmpxchg instruction can be replaced with an extractvalue of the cmpxchg success indicator. llvm-svn: 221498	2014-11-06 23:23:30 +00:00
Mark Heffernan	c918314d30	Revert earlier change removing setPreservesCFG from instcombine (r221223) and change LoopSimplifyPass to be !isCFGOnly. The motivation for the earlier patch (r221223) was that LoopSimplify is not preserved by instcombine though setPreservesCFG indicates that it is. This change fixes the issue by making setPreservesCFG no longer imply LoopSimplifyPass, and is therefore less invasive. llvm-svn: 221311	2014-11-04 23:02:09 +00:00
Mark Heffernan	271e57605b	Remove setPreservesCFG from instcombine. The pass, in particular, does not preserve LoopSimplify because instcombine may replace branch predicates with undef which loop simplify then replaces with always exit. Replace setPreservesCFG with the more constrained preservation of DomTree and LoopInfo. llvm-svn: 221223	2014-11-04 01:51:01 +00:00
David Majnemer	fff264d2c9	InstCombine: Remove infinite loop caused by FoldOpIntoPhi FoldOpIntoPhi could create an infinite loop if the PHI could potentially reach a BB it was considering inserting instructions into. The instructions it would insert would eventually lead to other combines firing which would, again, lead to FoldOpIntoPhi firing. The solution is to handicap FoldOpIntoPhi so that it doesn't attempt to insert instructions that the PHI might reach. This fixes PR21377. llvm-svn: 221187	2014-11-03 21:55:12 +00:00
David Majnemer	dff3e35d12	InstCombine: Combine (X | Y) - X to (~X & Y) This implements the transformation from (X | Y) - X to (~X & Y). Differential Revision: http://reviews.llvm.org/D5791 llvm-svn: 221129	2014-11-03 05:53:55 +00:00
David Majnemer	de57c8824b	InstCombine: Don't assume that m_ZExt matches an Instruction m_ZExt might bind against a ConstantExpr instead of an Instruction. Assuming this, using cast<Instruction>, results in InstCombine crashing. Instead, introduce ZExtOperator to bridge both Instruction and ConstantExpr ZExts. This fixes PR21445. llvm-svn: 221069	2014-11-01 23:46:05 +00:00
David Majnemer	31bb458ba0	InstCombine: Combine (X+cst) < 0 --> X < -cst This can happen pretty often in code that looks like: int foo = bar - 1; if (foo < 0) do stuff In this case, bar < 1 is an equivalent condition. This transform requires that the add instruction be annotated with nsw. llvm-svn: 221045	2014-11-01 09:09:51 +00:00
Duncan P. N. Exon Smith	a826079aaa	IR: MDNode => Value: Instruction::getAllMetadata() Change `Instruction::getAllMetadata()` to modify a vector of `Value` instead of `MDNode` and update call sites. This is part of PR21433. llvm-svn: 221027	2014-11-01 00:26:42 +00:00
Duncan P. N. Exon Smith	7004fd9aac	IR: MDNode => Value: Instruction::getMetadata() Change `Instruction::getMetadata()` to return `Value` as part of PR21433. Update most callers to use `Instruction::getMDNode()`, which wraps the result in a `cast_or_null<MDNode>`. llvm-svn: 221024	2014-11-01 00:10:31 +00:00
NAKAMURA Takumi	f2d570a79b	Untabify and whitespace cleanups. llvm-svn: 220771	2014-10-28 11:53:30 +00:00
David Majnemer	61455bd9bc	InstCombine: Fix a combine assuming that icmp operands were integers An icmp may have pointer arguments, it isn't limited to integers or vectors of integers. This fixes PR21388. llvm-svn: 220664	2014-10-27 05:47:49 +00:00
Benjamin Kramer	d9a51baf8c	Clean up assume intrinsic pattern matching, no need to check that the argument is a value. Also make it const safe and remove superfluous casting. NFC. llvm-svn: 220616	2014-10-25 18:09:01 +00:00
David Majnemer	c9904e0a8e	InstCombine: Remove overzealous asserts These asserts can trigger if the worklist iteration order is sufficiently unlucky. Instead of adding special case logic to handle these edge conditions, just bail out on trying to transform them: InstSimplify will get them when it reaches them on the worklist. This fixes PR21378. N.B. No test case is included because any test would rely on the fragile worklist iteration order. llvm-svn: 220612	2014-10-25 07:13:13 +00:00
Sanjay Patel	3046d570c6	Handle sqrt() shrinking in SimplifyLibCalls like any other call This patch removes a chunk of special case logic for folding (float)sqrt((double)x) -> sqrtf(x) in InstCombineCasts and handles it in the mainstream path of SimplifyLibCalls. No functional change intended, but I loosened the restriction on the existing sqrt testcases to allow for this optimization even without unsafe-fp-math because that's the existing behavior. I also added a missing test case for not shrinking the llvm.sqrt.f64 intrinsic in case the result is used as a double. Differential Revision: http://reviews.llvm.org/D5919 llvm-svn: 220514	2014-10-23 21:52:45 +00:00
Frederic Riss	fd891e8483	Assert that ValueHandleBase::ValueIsRAUWd doesn't change the tracked Value type. This invariant is enforced in Value::replaceAllUsesWith, thus it seems logical to apply it also to ValueHandles. This commit fixes InstCombine to not trigger the assertion during the removal of constant bitcasts in call instructions. Differential Revision: http://reviews.llvm.org/D5828 llvm-svn: 220468	2014-10-23 04:08:42 +00:00
Sanjay Patel	2e19fa34bb	Shrinkify libcalls: use float versions of double libm functions with fast-math (bug 17850) When a call to a double-precision libm function has fast-math semantics (via function attribute for now because there is no IR-level FMF on calls), we can avoid fpext/fptrunc operations and use the float version of the call if the input and output are both float. We already do this optimization using a command-line option; this patch just adds the ability for fast-math to use the existing functionality. I moved the cl::opt from InstructionCombining into SimplifyLibCalls because it's only ever used internally to that class. Modified the existing test cases to use the unsafe-fp-math attribute rather than repeating all tests. This patch should solve: http://llvm.org/bugs/show_bug.cgi?id=17850 Differential Revision: http://reviews.llvm.org/D5893 llvm-svn: 220390	2014-10-22 15:29:23 +00:00
Hans Wennborg	b55f26d3d1	Revert "Teach the load analysis to allow finding available values which require" (r220277) This seems to have caused PR21330. llvm-svn: 220349	2014-10-21 23:49:52 +00:00
Matt Arsenault	74dd906076	Add minnum / maxnum intrinsics These are named following the IEEE-754 names for these functions, rather than the libm fmin / fmax, to avoid possible ambiguities. Some languages may implement something resembling fmin / fmax which returns NaN if either operand is NaN, in order to propagate errors. These implement the IEEE-754 semantics of returning the other operand if either is a NaN representing missing data. llvm-svn: 220341	2014-10-21 23:00:20 +00:00
Philip Reames	d832b51fdd	Preserve 'nonnull' when changing type of the load. When changing the type of a load in Chandler's recent InstCombine changes, we can preserve the new 'nonnull' metadata. I considered adding an assert since 'nonnull' is only valid on pointer types, but casting a pointer to a non-pointer would involve more than a bitcast anyways. If someone extends this transform to handle more than bitcasts, the verifier will report the malformed IR, so a separate assertion isn't needed. Also, the fpmath flags would have the same problem. llvm-svn: 220324	2014-10-21 21:00:03 +00:00
David Majnemer	ebf53c54ae	InstCombine: Simplify FoldICmpCstShrCst This function was complicated by the fact that it tried to perform canonicalizations that were already performed by InstSimplify. Remove this extra code and move the tests over to InstSimplify. Add asserts to make sure our preconditions hold before we make any assumptions. llvm-svn: 220314	2014-10-21 19:51:55 +00:00
Chandler Carruth	eaa3d973ce	Teach the load analysis to allow finding available values which require inttoptr or ptrtoint cast provided there is datalayout available. Eventually, the datalayout can just be required but in practice it will always be there today. To go with the ability to expose available values requiring a ptrtoint or inttoptr cast, helpers are added to perform one of these three casts. These smarts are necessary to finish canonicalizing loads and stores to the operational type requirements without regressing fundamental combines. I've added some test cases. These should actually improve as the load combining and store combining improves, but they may fundamentally be highlighting some missing combines for select in addition to exercising the specific added logic to load analysis. llvm-svn: 220277	2014-10-21 09:00:40 +00:00
Philip Reames	c3e4c79873	Introduce enum values for previously defined metadata types. (NFC) Our metadata scheme lazily assigns IDs to string metadata, but we have a mechanism to preassign them as well. Using a preassigned ID is helpful since we get compile time type checking, and avoid some (minimal) string construction and comparison. This change adds enum values for three existing metadata types: + MD_nontemporal = 9, // "nontemporal" + MD_mem_parallel_loop_access = 10, // "llvm.mem.parallel_loop_access" + MD_nonnull = 11 // "nonnull" I went through and updated various uses as well. I made no attempt to get all uses; I focused on the ones which were easily grepable and easy to translate. For example, there were several items in LoopInfo.cpp I chose not to update. llvm-svn: 220248	2014-10-21 00:13:20 +00:00
Chandler Carruth	883dec8d65	Teach the load analysis driving core instcombine logic and other bits of logic to look through pointer casts, making them trivially stronger in the face of loads and stores with intervening pointer casts. I've included a few test cases that demonstrate the kind of folding instcombine can do without pointer casts and then variations which obfuscate the logic through bitcasts. Without this patch, the variations all fail to optimize fully. This is more important now than it has been in the past as I've started moving the load canonicalization to more closely follow the value type requirements rather than the pointer type requirements and thus this needs to be prepared for more pointer casts. When I made the same change to stores several test cases regressed without logic along these lines so I wanted to systematically improve matters first. llvm-svn: 220178	2014-10-20 00:24:14 +00:00
Chandler Carruth	4f93ec40bc	Do a better and more complete job of preserving metadata when combining loads. This handles many more cases than just the AA metadata, some of them suggested by Hal in his review of the AA metadata handling patch. I've tried to test this behavior where tractable to do so. I'll point out that I have specifically not included a test for debuginfo because it was going to require 2 or 3 times as much work to craft some input which would survive the "helpful" stripping of debug info metadata that doesn't match the desired schema. This is another good example of why the current state of write-ability for our debug info metadata is unacceptable. I spent over 30 minutes trying to conjure some test case that would survive, even copying from other debug info tests, but it always failed to survive with no explanation of why or how I might fix it. =[ llvm-svn: 220165	2014-10-19 10:46:46 +00:00
David Majnemer	6e9cd62474	InstCombine: (sub (or A B) (xor A B)) --> (and A B) The following implements the transformation: (sub (or A B) (xor A B)) --> (and A B). Patch by Ankur Garg! Differential Revision: http://reviews.llvm.org/D5719 llvm-svn: 220163	2014-10-19 08:32:32 +00:00
David Majnemer	4c15df3adb	InstCombine: Optimize icmp eq/ne (shl Const2, A), Const1 The following implements the optimization for sequences of the form: icmp eq/ne (shl Const2, A), Const1 Such sequences can be transformed to: icmp eq/ne A, (TrailingZeros(Const1) - TrailingZeros(Const2)) This handles only the equality operators for now. Other operators need to be handled. Patch by Ankur Garg! llvm-svn: 220162	2014-10-19 08:23:08 +00:00
Chandler Carruth	472e6c1ad4	Preserve AA metadata when combining (cast (load (...))) -> (load (cast (...))). llvm-svn: 220141	2014-10-18 11:00:12 +00:00
Chandler Carruth	b95f276385	[InstCombine] Do an about-face on how LLVM canonicalizes (cast (load ...)) and (load (cast ...)): canonicalize toward the former. Historically, we've tried to load using the type of the pointer, and tried to match that type as closely as possible removing as many pointer casts as we could and trading them for bitcasts of the loaded value. This is deeply and fundamentally wrong. Repeat after me: memory does not have a type! This was a hard lesson for me to learn working on SROA. There is only one thing that should actually drive the type used for a pointer, and that is the type which we need to use to load from that pointer. Matching up pointer types to the loaded value types is very useful because it minimizes the physical size of the IR required for no-op casts. Similarly, the only thing that should drive the type used for a loaded value is how that value is used! Again, this minimizes casts. And in fact, the only thing motivating types in any part of LLVM's IR are the types used by the operations in the IR. We should match them as closely as possible. I've ended up removing some tests here as they were testing bugs or behavior that is no longer present. Mostly though, this is just cleanup to let the tests continue to function as intended. The only fallout I've found so far from this change was SROA and I have fixed it to not be impeded by the different type of load. If you find more places where this change causes optimizations not to fire, those too are likely bugs where we are assuming that the type of pointers is "significant" for optimization purposes. llvm-svn: 220138	2014-10-18 06:36:22 +00:00
Akira Hatanaka	970d54d810	Reapply r219832 - InstCombine: Narrow switch instructions using known bits. The code committed in r219832 asserted when it attempted to shrink a switch statement whose type was larger than 64-bit. llvm-svn: 219902	2014-10-16 06:00:46 +00:00
Akira Hatanaka	0f1151e121	Revert r219832. llvm-svn: 219884	2014-10-16 01:17:02 +00:00
Akira Hatanaka	0f658614c0	InstCombine: Narrow switch instructions using known bits. Truncate the operands of a switch instruction to a narrower type if the upper bits are known to be all ones or zeros. rdar://problem/17720004 llvm-svn: 219832	2014-10-15 19:05:50 +00:00
David Majnemer	cd6d05c0a5	InstCombine: Don't miscompile X % ((Pow2 << A) >>u B) We assumed that A must be greater than B because the right hand side of a remainder operator must be nonzero. However, it is possible for A to be less than B if Pow2 is a power of two greater than 1. Take for example: i32 %A = 0, i32 %B = 31, i32 Pow2 = 2147483648. ((Pow2 << 0) >>u 31) is non-zero but A is less than B. This fixes PR21274. llvm-svn: 219713	2014-10-14 20:28:40 +00:00
Sanjay Patel	78f4588f8d	fix formatting; NFC llvm-svn: 219645	2014-10-14 00:33:23 +00:00
David Majnemer	4db8c93862	InstCombine: Fix miscompile in X % -Y -> X % Y transform We assumed that negation operations of the form (0 - %Z) resulted in a negative number. This isn't true if %Z was originally negative. Substituting the negative number into the remainder operation may result in undefined behavior because the dividend might be INT_MIN. This fixes PR21256. llvm-svn: 219639	2014-10-13 22:37:51 +00:00
David Majnemer	0e491f60c5	InstCombine: Don't miscompile (x lshr C1) udiv C2 We have a transform that changes: (x lshr C1) udiv C2 into: x udiv (C2 << C1) However, it is unsafe to do so if C2 << C1 discards any of C2's bits. This fixes PR21255. llvm-svn: 219634	2014-10-13 21:48:30 +00:00
Benjamin Kramer	8581af15aa	InstCombine: Turn (x != 0 & x <u C) into the canonical range check form (x-1 <u C-1) llvm-svn: 219585	2014-10-12 14:02:34 +00:00
David Majnemer	1c52e81b4a	InstCombine: Simplify commonIDivTransforms A helper routine, MultiplyOverflows, was a less efficient reimplementation of APInt's smul_ov and umul_ov. While we are here, clean up the code so it's more uniform. No functionality change intended. llvm-svn: 219583	2014-10-12 08:34:24 +00:00
David Majnemer	26e62e8e03	InstCombine: Don't fold (X <<s log(INT_MIN)) /s INT_MIN to X Consider the case where X is 2. (2 <<s 31)/s-2147483648 is zero but we would fold to X. Note that this is valid when we are in the unsigned domain because we require NUW: 2 <<u 31 results in poison. This fixes PR21245. llvm-svn: 219568	2014-10-11 10:20:04 +00:00
David Majnemer	475b056c9a	InstCombine, InstSimplify: (%X /s C1) /s C2 isn't always 0 when C1 * C2 overflow. Consider: C1 = INT_MIN, C2 = -1. C1 * C2 overflows without a doubt, but consider the following: %x = i32 INT_MIN. This means that (%X /s C1) is 1 and (%X /s C1) /s C2 is -1. N.B. Move the unsigned version of this transform to InstSimplify, it doesn't create any new instructions. This fixes PR21243. llvm-svn: 219567	2014-10-11 10:20:01 +00:00
David Majnemer	2d1353223b	InstCombine: mul to shl shouldn't preserve nsw. Consider: mul i32 nsw %x, -2147483648. This instruction will not result in poison if %x is 1. However, if we transform this into: shl i32 nsw %x, 31, then we will be generating poison because we just shifted into the sign bit. This fixes PR21242. llvm-svn: 219566	2014-10-11 10:19:52 +00:00
Andrea Di Biagio	ebacbba071	[InstCombine] Fix wrong folding of constant comparisons involving ashr and negative values. This patch fixes a bug in method InstCombiner::FoldCmpCstShrCst where we wrongly computed the distance between the highest bits set of two negative values. This fixes PR21222. Differential Revision: http://reviews.llvm.org/D5700 llvm-svn: 219406	2014-10-09 12:41:49 +00:00
Justin Bogner	885b53fc78	Revert "[InstCombine] re-commit r218721 with fix for pr21199" This seems to cause a miscompile when building clang, which causes a bootstrapped clang to fail or crash in several of its tests. See: http://lab.llvm.org:8013/builders/clang-x86_64-darwin11-RA/builds/1184 http://bb.pgr.jp/builders/clang-3stage-x86_64-linux/builds/7813 This reverts commit r219282. llvm-svn: 219317	2014-10-08 16:30:22 +00:00
Suyog Sarda	f043b064a7	Format spacing and remove extra lines to comply with standards. NFC. Differential Revision: http://reviews.llvm.org/D5649 llvm-svn: 219286	2014-10-08 08:37:49 +00:00
Gerolf Hoflehner	2a4154d00e	[InstCombine] re-commit r218721 with fix for pr21199 The icmp-select-icmp optimization targets select-icmp.eq only. This is now ensured by testing the branch predicate explicitly. This commit also includes the test case for pr21199. llvm-svn: 219282	2014-10-08 06:42:19 +00:00
Hans Wennborg	d68aaad991	Revert r219175 - [InstCombine] re-commit r218721 icmp-select-icmp optimization This seems to have caused PR21199. llvm-svn: 219264	2014-10-08 01:05:57 +00:00
Suyog Sarda	0861aeac17	Reformat if statement to comply with LLVM standards. NFC. Differential Revision: http://reviews.llvm.org/D5644 llvm-svn: 219203	2014-10-07 12:04:07 +00:00
Suyog Sarda	9d2ff217e8	Reformat to comply with LLVM coding standards using clang-format. NFC. Differential Revision: http://reviews.llvm.org/D5645 llvm-svn: 219202	2014-10-07 11:56:06 +00:00
Tilmann Scheller	4a15c9ecf0	[InstCombine] Reformat if statements to comply with LLVM Coding Standards. Patch by Sonam Kumari! Differential Revision: http://reviews.llvm.org/D5643 llvm-svn: 219198	2014-10-07 10:19:34 +00:00
Gerolf Hoflehner	c49d88c9db	[InstCombine] re-commit r218721 icmp-select-icmp optimization Takes care of the assert that caused build fails. Rather than asserting, the code now checks that the definition and use are in the same block, and does not attempt to optimize when that is not the case. llvm-svn: 219175	2014-10-07 00:16:12 +00:00
Hal Finkel	4acb4516d6	[InstCombine] Simplify the logic from r219067 using ValueTracking Joerg suggested on IRC that I look at generalizing the logic from r219067 to handle more general redundancies (like removing an assume(x > 3) dominated by an assume(x > 5)). The way to do this would be to ask ValueTracking to determine the value of the i1 argument. It turns out that ValueTracking is not very good at this right now (although it does get the trivial redundancy case) because it does not understand ICmps. Nevertheless, the resulting code in InstCombine is simpler than r219067, so we might as well do it now. llvm-svn: 219070	2014-10-05 00:53:02 +00:00
Hal Finkel	59318e2605	[InstCombine] Remove redundant @llvm.assume intrinsics For any @llvm.assume intrinsic, if there is another which dominates it and uses the same condition, then it is redundant and can be removed. While this does not alter the semantics of the @llvm.assume intrinsics, it makes subsequent handling more efficient (and the resulting IR easier to read). llvm-svn: 219067	2014-10-04 21:27:06 +00:00
Sanjay Patel	fedbe6dc4e	Optimize square root squared (PR21126). When unsafe-fp-math is enabled, we can turn sqrt(X) * sqrt(X) into X. This can happen in the real world when calculating x ** 3/2. This occurs in test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c. Differential Revision: http://reviews.llvm.org/D5584 llvm-svn: 218906	2014-10-02 21:10:54 +00:00
Sanjay Patel	0ba6197f0d	Use the local variable that other clauses around here are already using. llvm-svn: 218876	2014-10-02 15:20:45 +00:00
Evgeniy Stepanov	07f2031151	Revert r218721, r218735. Failing bootstrap on Linux (arm, x86). http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/13139/steps/bootstrap%20clang/logs/stdio http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost/builds/470 http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/8518 llvm-svn: 218752	2014-10-01 10:07:28 +00:00
Gerolf Hoflehner	a5e5769ed3	[InstCombine] Fix for assert build failures caused by r218721 The icmp-select-icmp optimization made the implicit assumption that the select-icmp instructions are in the same block and asserted on it. The fix explicitly checks for that condition and conservatively suppresses the optimization when it is violated. llvm-svn: 218735	2014-10-01 03:24:39 +00:00
Gerolf Hoflehner	6bbc18ceb7	[InstCombine] Optimize icmp-select-icmp In special cases select instructions can be eliminated by replacing them with a cheaper bitwise operation even when the select result is used outside its home block. The instances implemented are patterns like %x=icmp.eq %y=select %x,%r, null %z=icmp.eq|neq %y, null br %z,true, false ==> %x=icmp.ne %y=icmp.eq %r,null %z=or %x,%y br %z,true,false The optimization is integrated into the instruction combiner and performed only when all uses of the select result can be replaced by the select operand proper. For this dominator information is used and dominance is now a required analysis pass in the combiner. The optimization itself is iterative. The critical step is to replace the select result with the non-constant select operand. So the select becomes local and the combiner iteratively works out simpler code pattern and eventually eliminates the select. rdar://17853760 llvm-svn: 218721	2014-10-01 00:13:22 +00:00
David Blaikie	ddb5a46373	Reapply fix in r217988 (reverted in r217989) and remove the alternative fix committed in r217987. This type isn't owned polymorphically (as demonstrated by making the dtor protected and everything still compiling) so just address the warning by protecting the base dtor and making the derived class final. llvm-svn: 217990	2014-09-17 22:27:36 +00:00
David Blaikie	52fbb62253	Revert "Fix -Wnon-virtual-dtor warning introduced in r217982." An alternative fix was already committed. This reverts commit r217988. llvm-svn: 217989	2014-09-17 22:17:59 +00:00
David Blaikie	97537830d4	Fix -Wnon-virtual-dtor warning introduced in r217982. llvm-svn: 217988	2014-09-17 22:15:40 +00:00
Chris Bieneman	b486dadca0	Refactoring SimplifyLibCalls to remove static initializers and generally cleaning up the code. Summary: This eliminates ~200 lines of code mostly file scoped struct definitions that were unnecessary. Reviewers: chandlerc, resistor Reviewed By: resistor Subscribers: morisset, resistor, llvm-commits Differential Revision: http://reviews.llvm.org/D5364 llvm-svn: 217982	2014-09-17 20:55:46 +00:00
Andrea Di Biagio	99dc03a95d	[InstCombine] Fix wrong folding of constant comparison involving ashr and negative quantities (PR20945). Example: define i1 @foo(i32 %a) { %shr = ashr i32 -9, %a %cmp = icmp ne i32 %shr, -5 ret i1 %cmp } Before this fix, the instruction combiner wrongly thought that %shr could have never been equal to -5. Therefore, %cmp was always folded to 'true'. However, when %a is equal to 1, then %cmp evaluates to 'false'. Therefore, in this example, it is not valid to fold %cmp to 'true'. The problem was only affecting the case where the comparison was between negative quantities where one of the quantities was obtained from arithmetic shift of a negative constant. This patch fixes the problem with the wrong folding (fixes PR20945). With this patch, the 'icmp' from the example is now simplified to a comparison between %a and 1. This still allows us to get rid of the arithmetic shift (%shr). llvm-svn: 217950	2014-09-17 11:32:31 +00:00
Hal Finkel	5d195fd587	Check for all known bits on ret in InstCombine From a combination of @llvm.assume calls (and perhaps through other means, such as range metadata), it is possible that all bits of a return value might be known. Previously, InstCombine did not check for this (which is understandable given assumptions of constant propagation), but means that we'd miss simple cases where assumptions are involved. llvm-svn: 217346	2014-09-07 21:28:34 +00:00
Hal Finkel	be0364002a	Add additional patterns for @llvm.assume in ValueTracking This builds on r217342, which added the infrastructure to compute known bits using assumptions (@llvm.assume calls). That original commit added only a few patterns (to catch common cases related to determining pointer alignment); this change adds several other patterns for simple cases. r217342 contained that, for assume(v & b = a), bits in the mask that are known to be one, we can propagate known bits from the a to v. It also had a known-bits transfer for assume(a = b). This patch adds: assume(~(v & b) = a) : For those bits in the mask that are known to be one, we can propagate inverted known bits from the a to v. assume(v | b = a) : For those bits in b that are known to be zero, we can propagate known bits from the a to v. assume(~(v | b) = a): For those bits in b that are known to be zero, we can propagate inverted known bits from the a to v. assume(v ^ b = a) : For those bits in b that are known to be zero, we can propagate known bits from the a to v. For those bits in b that are known to be one, we can propagate inverted known bits from the a to v. assume(~(v ^ b) = a) : For those bits in b that are known to be zero, we can propagate inverted known bits from the a to v. For those bits in b that are known to be one, we can propagate known bits from the a to v. assume(v << c = a) : For those bits in a that are known, we can propagate them to known bits in v shifted to the right by c. assume(~(v << c) = a) : For those bits in a that are known, we can propagate them inverted to known bits in v shifted to the right by c. assume(v >> c = a) : For those bits in a that are known, we can propagate them to known bits in v shifted to the left by c. assume(~(v >> c) = a) : For those bits in a that are known, we can propagate them inverted to known bits in v shifted to the left by c. 
assume(v >=_s c) where c is non-negative: The sign bit of v is zero assume(v >_s c) where c is at least -1: The sign bit of v is zero assume(v <=_s c) where c is negative: The sign bit of v is one assume(v <_s c) where c is non-positive: The sign bit of v is one assume(v <=_u c): Transfer the known high zero bits assume(v <_u c): Transfer the known high zero bits (if c is known to be a power of 2, transfer one more) A small addition to InstCombine was necessary for some of the test cases. The problem is that when InstCombine was simplifying and, or, etc. it would fail to check the 'do I know all of the bits' condition before checking less specific conditions and would not fully constant-fold the result. I'm not sure how to trigger this aside from using assumptions, so I've just included the change here. llvm-svn: 217343	2014-09-07 19:21:07 +00:00
Hal Finkel	f8bb9b78cf	Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.) This change, which allows @llvm.assume to be used from within computeKnownBits (and other associated functions in ValueTracking), adds some (optional) parameters to computeKnownBits and friends. These functions now (optionally) take a "context" instruction pointer, an AssumptionTracker pointer, and also a DomTree pointer, and most of the changes are just to pass this new information when it is easily available from InstSimplify, InstCombine, etc. As explained below, the significant conceptual change is that known properties of a value might depend on the control-flow location of the use (because we care that the @llvm.assume dominates the use because assumptions have control-flow dependencies). This means that, when we ask if bits are known in a value, we might get different answers for different uses. The significant changes are all in ValueTracking. Two main changes: First, as with the rest of the code, new parameters need to be passed around. To make this easier, I grouped them into a structure, and I made internal static versions of the relevant functions that take this structure as a parameter. The new code does as you might expect, it looks for @llvm.assume calls that make use of the value we're trying to learn something about (often indirectly), attempts to pattern match that expression, and uses the result if successful. By making use of the AssumptionTracker, the process of finding @llvm.assume calls is not expensive. Part of the structure being passed around inside ValueTracking is a set of already-considered @llvm.assume calls. This is to prevent a query using, for example, the assume(a == b), to recurse on itself. The context and DT params are used to find applicable assumptions. An assumption needs to dominate the context instruction, or come after it deterministically. 
In this latter case we only handle the specific case where both the assumption and the context instruction are in the same block, and we need to exclude assumptions from being used to simplify their own ephemeral values (those which contribute only to the assumption) because otherwise the assumption would prove its feeding comparison trivial and would be removed. This commit adds the plumbing and the logic for a simple masked-bit propagation (just enough to write a regression test). Future commits add more patterns (and, correspondingly, more regression tests). llvm-svn: 217342	2014-09-07 18:57:58 +00:00
Hal Finkel	6122fb79cb	Add an Assumption-Tracking Pass This adds an immutable pass, AssumptionTracker, which keeps a cache of @llvm.assume call instructions within a module. It uses callback value handles to keep stale functions and intrinsics out of the map, and it relies on any code that creates new @llvm.assume calls to notify it of the new instructions. The benefit is that code needing to find @llvm.assume intrinsics can do so directly, without scanning the function, thus allowing the cost of @llvm.assume handling to be negligible when none are present. The current design is intended to be lightweight. We don't keep track of anything until we need a list of assumptions in some function. The first time this happens, we scan the function. After that, we add/remove @llvm.assume calls from the cache in response to registration calls and ValueHandle callbacks. There are no new direct test cases for this pass, but because it calls its validation function upon module finalization, we'll pick up detectable inconsistencies from the other tests that touch @llvm.assume calls. This pass will be used by follow-up commits that make use of @llvm.assume. llvm-svn: 217334	2014-09-07 12:44:26 +00:00
David Majnemer	86e601dd42	InstCombine: Remove a special case pattern The special case did not work when run under -reassociate and can easily be expressed by a further generalization of an existing pattern. llvm-svn: 217227	2014-09-05 06:09:24 +00:00
David Majnemer	b6abbc640c	Revert "Revert two GEP-related InstCombine commits" This reverts commit r216698 which reverted r216523 and r216598. We would attempt to perform the transformation even if the match() failed because, as a side effect, it would set V. This would trick us into believing that we correctly found a place to correctly apply the transform. An additional test case was added to getelementptr.ll so that we might not regress in the future. llvm-svn: 216890	2014-09-01 21:10:02 +00:00
David Majnemer	114418805a	InstCombine: Respect recursion depth in visitUDivOperand llvm-svn: 216817	2014-08-30 09:19:05 +00:00
David Majnemer	5cf3ee996f	InstCombine: Try harder to combine icmp instructions consider: (and (icmp X, Y), (and Z, (icmp A, B))) It may be possible to combine (icmp X, Y) with (icmp A, B). If we successfully combine, create an 'and' instruction with Z. This fixes PR20814. N.B. There is room for improvement after this change but I'm not convinced it's worth chasing yet. llvm-svn: 216814	2014-08-30 06:18:20 +00:00
David Majnemer	02f74ee06a	Revert two GEP-related InstCombine commits This reverts commit r216523 and r216598; people have reported regressions. llvm-svn: 216698	2014-08-29 00:06:43 +00:00
David Majnemer	1405bb84da	InstCombine: Remove redundant combines InstSimplify already handles icmp (X+Y), X (and things like it) appropriately. The first thing that InstCombine does is run InstSimplify on the instruction. llvm-svn: 216659	2014-08-28 10:08:37 +00:00
David Majnemer	e48fe8e34c	InstSimplify: Move a transform from InstCombine to InstSimplify Several combines involving icmp (shl C2, %X) C1 can be simplified without introducing any new instructions. Move them to InstSimplify; while we are at it, make them more powerful. llvm-svn: 216642	2014-08-28 03:34:28 +00:00
David Majnemer	fd14299661	InstCombine: Combine gep X, (Y-X) to Y We try to perform this transform in InstSimplify but we aren't always able to. Sometimes, we need to insert a bitcast if X and Y don't have the same type. llvm-svn: 216598	2014-08-27 20:08:37 +00:00
Craig Topper	43cee2f5fc	Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created. llvm-svn: 216525	2014-08-27 05:25:25 +00:00
David Majnemer	826f5eb297	InstCombine: Optimize GEP's involving ptrtoint better We supported transforming: (gep i8* X, -(ptrtoint Y)) to: (inttoptr (sub (ptrtoint X), (ptrtoint Y))) However, this only fired if 'X' had type i8*. Generalize this to support various types of different sizes. This results in much better CodeGen, especially for pointers to packed structs. llvm-svn: 216523	2014-08-27 05:16:04 +00:00
Dinesh Dwivedi	1b0080d8e1	This patch enables SimplifyUsingDistributiveLaws() to handle the following patterns. (X >> Z) & (Y >> Z) -> (X&Y) >> Z for all shifts. (X >> Z) \| (Y >> Z) -> (X\|Y) >> Z for all shifts. (X >> Z) ^ (Y >> Z) -> (X^Y) >> Z for all shifts. These patterns were previously handled separately in visitAnd()/visitOr()/visitXor(). Differential Revision: http://reviews.llvm.org/D4951 llvm-svn: 216443	2014-08-26 08:53:32 +00:00
David Majnemer	5a9ece39af	InstCombine: Properly optimize or'ing bittests together CFE, with -O3, would turn: bool f(unsigned x) { bool a = x & 1; bool b = x & 2; return a \| b; } into: %1 = lshr i32 %x, 1 %2 = or i32 %1, %x %3 = and i32 %2, 1 %4 = icmp ne i32 %3, 0 This sort of thing exposes a nasty pathology in GCC, ICC and LLVM. Instead, we would rather want: %1 = and i32 %x, 3 %2 = icmp ne i32 %1, 0 Things get a bit more interesting in the following case: %1 = lshr i32 %x, %y %2 = or i32 %1, %x %3 = and i32 %2, 1 %4 = icmp ne i32 %3, 0 Replacing it with the following sequence is better: %1 = shl nuw i32 1, %y %2 = or i32 %1, 1 %3 = and i32 %2, %x %4 = icmp ne i32 %3, 0 This sequence is preferable because %1 doesn't involve %x and could potentially be hoisted out of loops if it is invariant; only perform this transform in the non-constant case if we know we won't increase register pressure. llvm-svn: 216343	2014-08-24 09:10:57 +00:00
David Majnemer	c3f263a712	InstCombine: Don't unconditionally preserve 'nuw' when shrinking constants Consider: %add = add nuw i32 %a, -16777216 %and = and i32 %add, 255 Regardless of whether or not we demand the sign bit of %add, we cannot replace -16777216 with 2130706432 without also removing 'nuw' from the instruction. llvm-svn: 216273	2014-08-22 17:11:04 +00:00
David Majnemer	eb5b0c09b7	InstCombine: sub nsw %x, C -> add nsw %x, -C if C isn't INT_MIN We can preserve nsw during this transform if -C won't overflow. llvm-svn: 216269	2014-08-22 16:41:23 +00:00
David Majnemer	9edeeb29d3	InstCombine: Don't unconditionally preserve 'nsw' when shrinking constants Consider: %add = add nsw i32 %a, -16777216 %and = and i32 %add, 255 Regardless of whether or not we demand the sign bit of %add, we cannot replace -16777216 with 2130706432 without also removing 'nsw' from the instruction. This fixes PR20377. llvm-svn: 216261	2014-08-22 07:56:32 +00:00
Craig Topper	65775cc03d	Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. llvm-svn: 216158	2014-08-21 05:55:13 +00:00
David Majnemer	fb4e6230cf	InstCombine: Fold ((A \| B) & C1) ^ (B & C2) -> (A & C1) ^ B if C1^C2=-1 Adapted from a patch by Richard Smith, test-case written by me. llvm-svn: 216157	2014-08-21 05:14:48 +00:00
Yi Jiang	aee2472606	New InstCombine pattern: (icmp ult/ule (A + C1), C3) \| (icmp ult/ule (A + C2), C3) to (icmp ult/ule ((A & ~(C1 ^ C2)) + max(C1, C2)), C3) under certain condition llvm-svn: 216135	2014-08-20 22:55:40 +00:00
David Majnemer	03fa77d0ce	InstCombine: Annotate sub with nuw when we prove it's safe We can prove that a 'sub' can be a 'sub nuw' if the left-hand side is negative and the right-hand side is non-negative. llvm-svn: 216045	2014-08-20 07:17:31 +00:00
David Majnemer	b02f8f16bc	InstCombine: Annotate sub with nsw when we prove it's safe We can prove that a 'sub' can be a 'sub nsw' under certain conditions: - The sign bits of the operands are the same. - Both operands have more than 1 sign bit. The subtraction cannot be a signed overflow in either case. llvm-svn: 216037	2014-08-19 23:36:30 +00:00
Mayur Pandey	2a3606586c	InstCombine: ((A & ~B) ^ (~A & B)) to A ^ B Proof using CVC3 follows: $ cat t.cvc A, B : BITVECTOR(32); QUERY BVXOR((A & ~B),(~A & B)) = BVXOR(A,B); $ cvc3 t.cvc Valid. Differential Revision: http://reviews.llvm.org/D4898 llvm-svn: 215974	2014-08-19 08:19:19 +00:00
Mayur Pandey	720f33cd8e	test commit (spelling correction) llvm-svn: 215970	2014-08-19 06:41:55 +00:00
Craig Topper	aa7422b5a6	Revert "Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size." Getting a weird buildbot failure that I need to investigate. llvm-svn: 215870	2014-08-18 00:24:38 +00:00
Craig Topper	227456e133	Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. llvm-svn: 215868	2014-08-17 23:47:00 +00:00
Owen Anderson	7bafde2a45	Remove an InstCombine that transformed patterns like (x * uitofp i1 y) to (select y, x, 0.0) when the multiply has fast math flags set. While this might seem like an obvious canonicalization, there is one subtle problem with it. The result of the original expression is undef when x is NaN (remember, fast math flags), but the result of the select is always defined when x is NaN. This means that the new expression is strictly more defined than the original one. One unfortunate consequence of this is that the transform is not reversible! It's always legal to increase the defined-ness of an expression, but it's not legal to reduce it. Thus, targets that prefer the original form of the expression cannot reverse the transform to recover it. Another way to think of it is that the transform has lost source-level information (the fast math flags), which is undesirable. llvm-svn: 215825	2014-08-17 03:51:29 +00:00
David Majnemer	ec576cc6dc	InstCombine: Fix a potential bug in 0 - (X sdiv C) -> (X sdiv -C) While most (X sdiv 1) operations will get caught by InstSimplify, it is still possible for a sdiv to appear in the worklist which hasn't been simplified yet. This means that it is possible for 0 - (X sdiv 1) to get transformed into (X sdiv -1); dividing by -1 can make the transform produce undef values instead of the proper result. Sorry for the lack of testcase, it's a bit problematic because it relies on the exact order of operations in the worklist. llvm-svn: 215818	2014-08-16 09:23:42 +00:00
David Majnemer	797c585502	InstCombine: Combine mul with div. We can combine a mul with a div if one of the operands is a multiple of the other: %mul = mul nsw nuw %a, C1 %ret = udiv %mul, C2 => %ret = mul nsw %a, (C1 / C2) This can expose further optimization opportunities if we end up multiplying or dividing by a power of 2. Consider this small example: define i32 @f(i32 %a) { %mul = mul nuw i32 %a, 14 %div = udiv exact i32 %mul, 7 ret i32 %div } which gets CodeGen'd to: imull $14, %edi, %eax imulq $613566757, %rax, %rcx shrq $32, %rcx subl %ecx, %eax shrl %eax addl %ecx, %eax shrl $2, %eax retq We can now transform this into: define i32 @f(i32 %a) { %shl = shl nuw i32 %a, 1 ret i32 %shl } which gets CodeGen'd to: leal (%rdi,%rdi), %eax retq This fixes PR20681. llvm-svn: 215815	2014-08-16 08:55:06 +00:00
David Majnemer	8e04706b9d	InstCombine: ((A \| ~B) ^ (~A \| B)) to A ^ B Proof using CVC3 follows: $ cat t.cvc A, B : BITVECTOR(32); QUERY BVXOR((A \| ~B),(~A \|B)) = BVXOR(A,B); $ cvc3 t.cvc Valid. Patch by Mayur Pandey! Differential Revision: http://reviews.llvm.org/D4883 llvm-svn: 215621	2014-08-14 06:46:25 +00:00
David Majnemer	a878644a06	Added InstCombine Transform for ((B \| C) & A) \| B -> B \| (A & C) Transform ((B \| C) & A) \| B --> B \| (A & C) Z3 Link: http://rise4fun.com/Z3/hP6p Patch by Sonam Kumari! Differential Revision: http://reviews.llvm.org/D4865 llvm-svn: 215619	2014-08-14 06:41:38 +00:00
Benjamin Kramer	da144ed5a2	Canonicalize header guards into a common format. Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful) Changes made by clang-tidy with minor tweaks. llvm-svn: 215558	2014-08-13 16:26:38 +00:00
Karthik Bhat	d8ea66ecbf	InstCombine: Combine (xor (or %a, %b) (xor %a, %b)) to (and %a, %b) Correctness proof of the transform using CVC3- $ cat t.cvc A, B : BITVECTOR(32); QUERY BVXOR(A \| B, BVXOR(A,B) ) = A & B; $ cvc3 t.cvc Valid. llvm-svn: 215524	2014-08-13 05:13:14 +00:00
Matt Arsenault	9822384258	Allow bitcast + struct GEP transform to work with addrspacecast llvm-svn: 215467	2014-08-12 19:46:13 +00:00
David Majnemer	5a64cc5b28	InstCombine: Combine (add (and %a, %b) (or %a, %b)) to (add %a, %b) What follows below is a correctness proof of the transform using CVC3. $ < t.cvc A, B : BITVECTOR(32); QUERY BVPLUS(32, A & B, A \| B) = BVPLUS(32, A, B); $ cvc3 < t.cvc Valid. llvm-svn: 215400	2014-08-11 22:32:02 +00:00
Suyog Sarda	2935dc4c7e	This patch implements transform for pattern "(A & ~B) ^ (~A) -> ~(A & B)". Differential Revision: http://reviews.llvm.org/D4653 llvm-svn: 214479	2014-08-01 05:07:20 +00:00
Suyog Sarda	8e372e5c2f	This patch implements transform for pattern "(A \| B) & ((~A) ^ B) -> (A & B)". Differential Revision: http://reviews.llvm.org/D4628 llvm-svn: 214478	2014-08-01 04:59:26 +00:00
Suyog Sarda	e84d4ba7d3	This patch implements transform for pattern "( A & (~B)) \| (A ^ B) -> (A ^ B)" Differential Revision: http://reviews.llvm.org/D4652 llvm-svn: 214477	2014-08-01 04:50:31 +00:00
Suyog Sarda	b8765dbda2	This patch implements transform for pattern "(A & B) \| ((~A) ^ B) -> (~A ^ B)". Patch Credit to Ankit Jain ! Differential Revision: http://reviews.llvm.org/D4655 llvm-svn: 214476	2014-08-01 04:41:43 +00:00
David Majnemer	066fbe5798	InstCombine: Correctly propagate NSW/NUW for x-(-A) -> x+A We can only propagate the nsw bits if both subtraction instructions are marked with the appropriate bit. N.B. We only propagate the nsw bit in InstCombine because the nuw case is already handled in InstSimplify. This fixes PR20189. llvm-svn: 214385	2014-07-31 04:49:29 +00:00
David Majnemer	994e0d02b9	InstCombine: Simplify (A ^ B) or/and (A ^ B ^ C) While we can already transform A \| (A ^ B) into A \| B, things get bad once we have (A ^ B) \| (A ^ B ^ Cst) because reassociation will morph this into (A ^ B) \| ((A ^ Cst) ^ B). Our existing patterns fail once this happens. To fix this, we add a new pattern which looks through the tree of xor binary operators to see that, in fact, there exists a redundant xor operation. What follows below is a correctness proof of the transform using CVC3. $ cat t.cvc A, B, C : BITVECTOR(64); QUERY BVXOR(A, B) \| BVXOR(BVXOR(B, C), A) = BVXOR(A, B) \| C; QUERY BVXOR(BVXOR(A, C), B) \| BVXOR(A, B) = BVXOR(A, B) \| C; QUERY BVXOR(A, B) & BVXOR(BVXOR(B, C), A) = BVXOR(A, B) & ~C; QUERY BVXOR(BVXOR(A, C), B) & BVXOR(A, B) = BVXOR(A, B) & ~C; $ cvc3 < t.cvc Valid. Valid. Valid. Valid. llvm-svn: 214342	2014-07-30 21:26:37 +00:00
Hal Finkel	a14227ff6e	Canonicalization for @llvm.assume Adds simple logical canonicalization of assumption intrinsics to instcombine, currently: - invariant(a && b) -> invariant(a); invariant(b) - invariant(!(a \|\| b)) -> invariant(!a); invariant(!b) llvm-svn: 213977	2014-07-25 21:45:17 +00:00
Hal Finkel	9be4aefa57	AA metadata refactoring (introduce AAMDNodes) In order to enable the preservation of noalias function parameter information after inlining, and the representation of block-level __restrict__ pointer information (etc.), additional kinds of aliasing metadata will be introduced. This metadata needs to be carried around in AliasAnalysis::Location objects (and MMOs at the SDAG level), and so we need to generalize the current scheme (which is hard-coded to just one TBAA MDNode). This commit introduces only the necessary refactoring to allow for the introduction of other aliasing metadata types, but does not actually introduce any (that will come in a follow-up commit). What it does introduce is a new AAMDNodes structure to hold all of the aliasing metadata nodes associated with a particular memory-accessing instruction, and uses that structure instead of the raw MDNode in AliasAnalysis::Location, etc. No functionality change intended. llvm-svn: 213859	2014-07-24 12:16:19 +00:00
Suyog Sarda	959fecbe70	This patch implements the optimization mentioned in PR19753: Optimize comparisons with "ashr/lshr exact" of a constant. It handles the errors which were seen in PR19958 where wrong code was being emitted due to an earlier patch. Added code for lshr as well as non-exact right shifts. It implements: (icmp eq/ne (ashr/lshr const2, A), const1) -> (icmp eq/ne A, Log2(const2/const1)) -> (icmp eq/ne A, Log2(const2) - Log2(const1)) Differential Revision: http://reviews.llvm.org/D4068 llvm-svn: 213678	2014-07-22 19:19:36 +00:00
Suyog Sarda	2092947078	Added InstCombine transform for pattern "(A & B) ^ (A ^ B) -> (A \| B)" Patch idea by Ankit Jain ! Differential Revision: http://reviews.llvm.org/D4618 llvm-svn: 213677	2014-07-22 18:30:54 +00:00
Suyog Sarda	65dba610e3	Added InstCombine Transform for patterns: "((~A & B) \| A) -> (A \| B)" and "((A & B) \| ~A) -> (~A \| B)" Original Patch credit to Ankit Jain !! Differential Revision: http://reviews.llvm.org/D4591 llvm-svn: 213676	2014-07-22 18:09:41 +00:00
Suyog Sarda	7289a7b99e	This patch implements transform for pattern "(A \| B) ^ (~A) -> (A \| ~B)". Patch Credit to Ankit Jain !! Differential Revision: http://reviews.llvm.org/D4588 llvm-svn: 213662	2014-07-22 15:37:39 +00:00
Sanjay Patel	24f9331065	fixed typo in comment llvm-svn: 213614	2014-07-22 04:57:06 +00:00
Duncan P. N. Exon Smith	2ae51d315c	Revert "[C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges." This reverts commit r213474 (and r213475), which causes a miscompile on a stage2 LTO build. I'll reply on the list in a moment. llvm-svn: 213562	2014-07-21 17:06:51 +00:00
Manuel Jacob	8e924ddc40	[C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges. Summary: This patch introduces two new iterator ranges and updates existing code to use it. No functional change intended. Test Plan: All tests (make check-all) still pass. Reviewers: dblaikie Reviewed By: dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4481 llvm-svn: 213474	2014-07-20 09:10:11 +00:00
Suyog Sarda	e62b39fcd0	Move ashr optimization from InstCombineShift to InstSimplify. Refactor code, no functionality change, test case moved from instcombine to instsimplify. Differential Revision: http://reviews.llvm.org/D4102 llvm-svn: 213231	2014-07-17 06:28:15 +00:00
Suyog Sarda	fe6fdd5295	Fix Typo (first commit to test commit access) llvm-svn: 213228	2014-07-17 06:09:34 +00:00
Manuel Jacob	e41c2e7cde	Utilize CastInst::CreatePointerBitCastOrAddrSpaceCast here. llvm-svn: 213189	2014-07-16 20:13:45 +00:00
Manuel Jacob	a903d1b422	Fix comment in InstCombiner::visitAddrSpaceCast. In the original version of the patch the behaviour was like described in the comment. This behaviour was changed before committing it without updating the comment. llvm-svn: 213117	2014-07-16 01:34:21 +00:00
Matt Arsenault	0b1cd03db3	Use pointer type cast helpers. llvm-svn: 212963	2014-07-14 17:24:38 +00:00
Aditya Nandakumar	2b32e5e74c	When we sink an instruction, this can open up opportunity for the operands to be sunk - add them to the worklist llvm-svn: 212847	2014-07-11 21:49:39 +00:00
Duncan P. N. Exon Smith	44ec851704	InstCombine: Fix a crash in Descale for multiply-by-zero Fix a crash in `InstCombiner::Descale()` when a multiply-by-zero gets created as an argument to a GEP partway through an iteration, causing -instcombine to optimize the GEP before the multiply. rdar://problem/17615671 llvm-svn: 212742	2014-07-10 17:13:27 +00:00
Hal Finkel	661274e401	Feeding isSafeToSpeculativelyExecute its DataLayout pointer isSafeToSpeculativelyExecute can optionally take a DataLayout pointer. In the past, this was mainly used to make better decisions regarding divisions known not to trap, and so was not all that important for users concerned with "cheap" instructions. However, now it also helps look through bitcasts for dereferencable loads, and will also be important if/when we add a dereferencable pointer attribute. This is some initial work to feed a DataLayout pointer through to callers of isSafeToSpeculativelyExecute, generally where one was already available. llvm-svn: 212720	2014-07-10 14:41:31 +00:00
Sanjay Patel	38aa8c3b99	Fix for PR20059 (instcombine reorders shufflevector after instruction that may trap) In PR20059 ( http://llvm.org/pr20059 ), instcombine eliminates shuffles that are necessary before performing an operation that can trap (srem). This patch calls isSafeToSpeculativelyExecute() and bails out of the optimization in SimplifyVectorOp() if needed. Differential Revision: http://reviews.llvm.org/D4424 llvm-svn: 212629	2014-07-09 16:34:54 +00:00
Sanjay Patel	69a5950ba3	fixed some typos llvm-svn: 212495	2014-07-07 22:13:58 +00:00
Benjamin Kramer	195f0552f0	Make helper functions static. llvm-svn: 212460	2014-07-07 14:47:51 +00:00
Benjamin Kramer	065c70166c	InstCombine: Simplify code, no functionality change. llvm-svn: 212449	2014-07-07 11:01:16 +00:00
Benjamin Kramer	43d91888f7	InstCombine: Strength reduce sadd.with.overflow into a regular nsw add if we can prove that it cannot overflow. PR20194 llvm-svn: 212331	2014-07-04 10:22:21 +00:00
David Majnemer	68ed1a9119	InstCombine: Optimize x/INT_MIN to x==INT_MIN The result of x/INT_MIN is either 0 or 1, we can just use an icmp instead. llvm-svn: 212167	2014-07-02 06:42:13 +00:00
David Majnemer	5449bfbb6f	InstCombine: Don't turn -(x/INT_MIN) -> x/INT_MIN It is not safe to negate the smallest signed integer, doing so yields the same number back. This fixes PR20186. llvm-svn: 212164	2014-07-02 06:07:09 +00:00
Reid Kleckner	83d46a1307	Optimize InstCombine stack memory consumption This patch reduces the stack memory consumption of the InstCombine function "isOnlyCopiedFromConstantGlobal() ", that in certain conditions could overflow the stack because of excessive recursiveness. For example, in a case like this: %0 = alloca [50025 x i32], align 4 %1 = getelementptr inbounds [50025 x i32]* %0, i64 0, i64 0 store i32 0, i32* %1 %2 = getelementptr inbounds i32* %1, i64 1 store i32 1, i32* %2 %3 = getelementptr inbounds i32* %2, i64 1 store i32 2, i32* %3 %4 = getelementptr inbounds i32* %3, i64 1 store i32 3, i32* %4 %5 = getelementptr inbounds i32* %4, i64 1 store i32 4, i32* %5 %6 = getelementptr inbounds i32* %5, i64 1 store i32 5, i32* %6 ... This piece of code crashes llvm when trying to apply instcombine on desktop. On embedded devices this could happen with a much lower limit of recursiveness. Some instructions (getelementptr and bitcasts) make the function recursively call itself on their uses, which is what makes the example above consume so much stack (it becomes a recursive depth-first tree visit with a very big depth). The patch changes the algorithm to be semantically equivalent, but iterative instead of recursive and the visiting order to be from a depth-first visit to a breadth-first visit (visit all the instructions of the current level before the ones of the next one). Now if a lot of memory is required a heap allocation is done instead of the stack allocation, avoiding the possible crash. Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D4355 Patch by Marcello Maggioni! We don't generally commit large stress tests that look for out of memory conditions, so I didn't request that one be added to the patch. llvm-svn: 212133	2014-07-01 21:36:20 +00:00
Dinesh Dwivedi	73e3709b2c	Added instruction combine to transform few more negative values addition to subtraction (Part 3) This patch enables transforms for (x + (~(y \| c) + 1) --> x - (y \| c) if c is odd Differential Revision: http://reviews.llvm.org/D4210 llvm-svn: 211881	2014-06-27 07:47:35 +00:00
Dinesh Dwivedi	9d122cf780	This patch removed duplicate code for matching patterns which are now handled in SimplifyUsingDistributiveLaws() (after r211261) Differential Revision: http://reviews.llvm.org/D4253 llvm-svn: 211768	2014-06-26 08:57:33 +00:00
Dinesh Dwivedi	b98a2e9f49	Added instruction combine to transform few more negative values addition to subtraction (Part 2) This patch enables transforms for (x + (~(y \| c) + 1) --> x - (y \| c) if c is even Differential Revision: http://reviews.llvm.org/D4209 llvm-svn: 211765	2014-06-26 05:40:22 +00:00
Benjamin Kramer	66a50c1c4d	InstCombine: Disable umul.with.overflow recognition for vectors. It doesn't make a lot of sense on most targets and the code isn't ready for it. PR20113. llvm-svn: 211583	2014-06-24 10:47:52 +00:00
Benjamin Kramer	65c1072e77	InstCombine: Don't try to reorder shuffles where the mask is a ConstantExpr. We can't analyze the individual values of a vector expression. PR20114. llvm-svn: 211581	2014-06-24 10:38:10 +00:00
Dinesh Dwivedi	9d6bf38387	Added instruction combine to transform few more negative values addition to subtraction (Part 1) This patch enables transforms for following patterns. (x + (~(y & c) + 1) --> x - (y & c) (x + (~((y >> z) & c) + 1) --> x - ((y>>z) & c) Differential Revision: http://reviews.llvm.org/D3733 llvm-svn: 211266	2014-06-19 10:36:52 +00:00
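
Several of the InstCombine commits listed above justify a bitwise rewrite with a CVC3 or Z3 query. As a rough sanity check, the same identities can be brute-forced over a small bit width. The sketch below is illustrative only: the names `WIDTH`, `inv`, `IDENTITIES`, `holds` and `check_all` are hypothetical and not part of LLVM, and an exhaustive 4-bit search is evidence rather than a proof for 32-bit values (the purely bitwise identities act bit by bit, but the `add` identity involves carries).

```python
from itertools import product

WIDTH = 4                  # assumed small width so the search stays exhaustive
MASK = (1 << WIDTH) - 1


def inv(x):
    """Bitwise NOT truncated to WIDTH bits."""
    return ~x & MASK


# (label, left-hand side, right-hand side); every function takes (a, b, c).
IDENTITIES = [
    ("(A & ~B) ^ (~A & B) == A ^ B",
     lambda a, b, c: (a & inv(b)) ^ (inv(a) & b), lambda a, b, c: a ^ b),
    ("(A | ~B) ^ (~A | B) == A ^ B",
     lambda a, b, c: (a | inv(b)) ^ (inv(a) | b), lambda a, b, c: a ^ b),
    ("((B | C) & A) | B == B | (A & C)",
     lambda a, b, c: ((b | c) & a) | b, lambda a, b, c: b | (a & c)),
    ("(A | B) ^ (A ^ B) == A & B",
     lambda a, b, c: (a | b) ^ (a ^ b), lambda a, b, c: a & b),
    ("(A & B) + (A | B) == A + B",
     lambda a, b, c: (a & b) + (a | b), lambda a, b, c: a + b),
    ("(A & ~B) ^ ~A == ~(A & B)",
     lambda a, b, c: (a & inv(b)) ^ inv(a), lambda a, b, c: inv(a & b)),
    ("(A | B) & (~A ^ B) == A & B",
     lambda a, b, c: (a | b) & (inv(a) ^ b), lambda a, b, c: a & b),
    ("(A & B) ^ (A ^ B) == A | B",
     lambda a, b, c: (a & b) ^ (a ^ b), lambda a, b, c: a | b),
    ("(A | B) ^ ~A == A | ~B",
     lambda a, b, c: (a | b) ^ inv(a), lambda a, b, c: a | inv(b)),
]


def holds(lhs, rhs):
    """True if lhs and rhs agree (mod 2**WIDTH) for every input triple."""
    values = range(MASK + 1)
    return all((lhs(a, b, c) & MASK) == (rhs(a, b, c) & MASK)
               for a, b, c in product(values, repeat=3))


def check_all():
    """Map each identity label to whether it survived the exhaustive search."""
    return {label: holds(lhs, rhs) for label, lhs, rhs in IDENTITIES}


if __name__ == "__main__":
    for label, ok in check_all().items():
        print("ok  " if ok else "FAIL", label)
```

A check like this is cheap to run before reaching for a solver, but it does not replace the solver-backed proofs quoted in the commit messages.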

1431 Commits