llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 14:33:02 +02:00

Author	SHA1	Message	Date
Duncan Sands	9816d42357	If the type legalizer actually legalized anything (this doesn't happen that often, since most code does not use illegal types) then follow it by a DAG combiner run that is allowed to generate illegal operations but not illegal types. I didn't modify the target combiner code to distinguish like this between illegal operations and illegal types, so it will not produce illegal operations as well as not producing illegal types. llvm-svn: 59960	2008-11-24 14:53:14 +00:00
Duncan Sands	f9ea1124c9	Rename SetCCResultContents to BooleanContents. In practice these booleans are mostly produced by SetCC, however the concept is more general. llvm-svn: 59911	2008-11-23 15:47:28 +00:00
Bill Wendling	3175516f94	- Move conversion of [SU]ADDO from DAG combiner into legalizer. - Add "promote integer type" stuff to the legalizer for these nodes. llvm-svn: 59847	2008-11-22 00:22:52 +00:00
Bill Wendling	6c369d3a6f	Default to converting UADDO to the generic form that SADDO is converted to. llvm-svn: 59801	2008-11-21 07:44:30 +00:00
Bill Wendling	e4815d4f45	Remove chains. Unnecessary. llvm-svn: 59783	2008-11-21 02:22:59 +00:00
Bill Wendling	0f9b6c3524	Rename "ADDO" to "SADDO" and "UADDO". The "UADDO" isn't equivalent to "ADDC" because the boolean it returns to indicate an overflow may not be treated like as a flag. It could be stored to memory, for instance. llvm-svn: 59780	2008-11-21 02:12:42 +00:00
Bill Wendling	02db3b99bf	Implement the sadd_with_overflow intrinsic. This is converted into "ISD::ADDO". ISD::ADDO is lowered into a target-independent form that does the addition and then checks if the result is less than one of the operands. (If it is, then there was an overflow.) llvm-svn: 59779	2008-11-21 02:03:52 +00:00
Bill Wendling	9da4535062	Fix for PR3040: The CC was changed, but wasn't checked to see if it was legal if the DAG combiner was being run after legalization. Threw in a couple of checks just to make sure that it's okay. As far as the PR is concerned, no back-end target actually exhibited this problem, so there isn't an associated testcase. llvm-svn: 59035	2008-11-11 08:25:46 +00:00
Mon P Wang	911ee5bf8b	Added support for the following definition of shufflevector <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask> llvm-svn: 58964	2008-11-10 04:46:22 +00:00
Evan Cheng	1bde698192	Type of shuffle mask has changed. llvm-svn: 58751	2008-11-05 06:04:18 +00:00
Chris Lattner	508a62823e	Don't produce invalid comparisons after legalize. llvm-svn: 58320	2008-10-28 07:11:07 +00:00
Duncan Sands	65f39e9819	Use a legal integer type for vector shuffle mask elements. Otherwise LegalizeTypes will, reasonably enough, legalize the mask, which may result in it no longer being a BUILD_VECTOR node (LegalizeDAG simply ignores the legality or not of vector masks). llvm-svn: 57782	2008-10-19 14:58:05 +00:00
Dan Gohman	15597f07b2	Teach DAGCombine to fold constant offsets into GlobalAddress nodes, and add a TargetLowering hook for it to use to determine when this is legal (i.e. not in PIC mode, etc.) This allows instruction selection to emit folded constant offsets in more cases, such as the included testcase, eliminating the need for explicit arithmetic instructions. This eliminates the need for the C++ code in X86ISelDAGToDAG.cpp that attempted to achieve the same effect, but wasn't as effective. Also, fix handling of offsets in GlobalAddressSDNodes in several places, including changing GlobalAddressSDNode's offset from int to int64_t. The Mips, Alpha, Sparc, and CellSPU targets appear to be unaware of GlobalAddress offsets currently, so set the hook to false on those targets. llvm-svn: 57748	2008-10-18 02:06:02 +00:00
Dan Gohman	5d83bd89a5	Define patterns for shld and shrd that match immediate shift counts, and patterns that match dynamic shift counts when the subtract is obscured by a truncate node. Add DAGCombiner support for recognizing rotate patterns when the shift counts are defined by truncate nodes. Fix and simplify the code for commuting shld and shrd instructions to work even when the given instruction doesn't have a parent, and when the caller needs a new instruction. These changes allow LLVM to use the shld, shrd, rol, and ror instructions on x86 to replace equivalent code using two shifts and an or in many more cases. llvm-svn: 57662	2008-10-17 01:23:35 +00:00
Evan Cheng	3faedff2de	Rename LoadX to LoadExt. llvm-svn: 57526	2008-10-14 21:26:46 +00:00
Dale Johannesen	9e57068854	Rename APFloat::convertToAPInt to bitcastToAPInt to make it clearer what the function does. No functional change. llvm-svn: 57325	2008-10-09 18:53:47 +00:00
Dan Gohman	989db64c93	Rename ConstantSDNode's getSignExtended to getSExtValue, for consistancy with ConstantInt, and re-implement it in terms of ConstantInt's getSExtValue. llvm-svn: 56700	2008-09-26 21:54:37 +00:00
Bill Wendling	7c60c6e7bf	Reapplying r56550 llvm-svn: 56553	2008-09-24 10:25:02 +00:00
Eric Christopher	8ffa64fdb5	Temporarily revert r56550 until missing commit can be added. llvm-svn: 56551	2008-09-24 08:30:44 +00:00
Bill Wendling	456b33b615	Refactor the constant folding code into it's own function. And call it from both the SelectionDAG and DAGCombiner code. The only functionality change is that now the DAG combiner is performing the constant folding for these operations instead of being a no-op. This is not in response to a bug, so there isn't a testcase. llvm-svn: 56550	2008-09-24 07:11:26 +00:00
Evan Cheng	638ae1be58	Per review feedback: Only perform (srl x, (trunc (and y, c))) -> (srl x, (and (trunc y), c)) etc. when both "trunc" and "and" have single uses. llvm-svn: 56452	2008-09-22 18:19:24 +00:00
Dan Gohman	082879cfde	Change ConstantSDNode and ConstantFPSDNode to use ConstantInt* and ConstantFP* instead of APInt and APFloat directly. This reduces the amount of time to create ConstantSDNode and ConstantFPSDNode nodes when ConstantInt* and ConstantFP* respectively are already available, as is the case in SelectionDAGBuild.cpp. Also, it reduces the amount of time to legalize constants into constant pools, and the amount of time to add ConstantFP operands to MachineInstrs, due to eliminating ConstantInt::get and ConstantFP::get calls. It increases the amount of work needed to create new constants in cases where the client doesn't already have a ConstantInt* or ConstantFP*, such as legalize expanding 64-bit integer constants to 32-bit constants. And it adds a layer of indirection for the accessor methods. But these appear to be outweight by the benefits in most cases. It will also make it easier to make ConstantSDNode and ConstantFPNode more consistent with ConstantInt and ConstantFP. llvm-svn: 56162	2008-09-12 18:08:03 +00:00
Dan Gohman	89660301e3	Rename ConstantSDNode::getValue to getZExtValue, for consistency with ConstantInt. This led to fixing a bug in TargetLowering.cpp using getValue instead of getAPIntValue. llvm-svn: 56159	2008-09-12 16:56:44 +00:00
Dan Gohman	c9fdcfb189	In visitUREM, arrange for the temporary UDIV node to be revisited, consistent with the code in visitSREM. llvm-svn: 55923	2008-09-08 16:59:01 +00:00
Bill Wendling	2239de4290	Revert my previous change -- the subtraction of two constants was a no-op before. This is taken care of in the selection DAG pass. In my opinion, this should be in one place or the other. I.e., it should probably be removed from the DAG combiner (along with the other arithmetic transformations on constants that are essentially no-ops). llvm-svn: 55889	2008-09-08 01:56:32 +00:00
Bill Wendling	91e9abe370	Convert // fold (sub c1, c2) -> c1-c2 from a no-op into an actual transformation. llvm-svn: 55886	2008-09-07 11:34:47 +00:00
Dan Gohman	a3987ed4e2	Fix a search+replace-o. llvm-svn: 55824	2008-09-05 01:58:21 +00:00
Dan Gohman	7ee14837e6	Clean up uses of TargetLowering::getTargetMachine. llvm-svn: 55769	2008-09-04 15:39:15 +00:00
Bill Wendling	3f918b3603	Another situation where ROTR is cheaper than ROTL. llvm-svn: 55577	2008-08-31 01:13:31 +00:00
Bill Wendling	ef64d4333e	For this pattern, ROTR is the cheaper option. llvm-svn: 55576	2008-08-31 01:04:56 +00:00
Bill Wendling	08690f06b2	- Fix comment so that it describes how the code really works: // fold (or (shl x, (ext y)), (srl x, (ext (sub 32, y)))) -> // (rotl x, y) // fold (or (shl x, (ext y)), (srl x, (ext (sub 32, y)))) -> // (rotr x, (sub 32, y)) Example: (x == 0xDEADBEEF and y == 4) (x << 4) \| (x >> 28) => 0xEADBEEF0 \| 0x0000000D => 0xEADBEEFD (rotl x, 4) => 0xEADBEEFD (rotr x, 28) => 0xEADBEEFD - Fix comment and code for second version. It wasn't using the rot* propertly. // fold (or (shl x, (ext (sub 32, y))), (srl x, (ext r))) -> // (rotr x, y) // fold (or (shl x, (ext (sub 32, y))), (srl x, (ext r))) -> // (rotl x, (sub 32, y)) (x << 28) \| (x >> 4) => 0xD0000000 \| 0x0DEADBEE => 0xDDEADBEE (rotl x, 4) => 0xEADBEEFD (rotr x, 28) => (0xEADBEEFD) llvm-svn: 55575	2008-08-31 00:37:27 +00:00
Gabor Greif	2aef1d5e4c	fix some 80-col violations llvm-svn: 55571	2008-08-30 19:29:20 +00:00
Evan Cheng	4bc8c9652e	Transform (x << (y&31)) -> (x << y). This takes advantage of the fact x86 shift instructions 2nd operand (shift count) is limited to 0 to 31 (or 63 in the x86-64 case). llvm-svn: 55558	2008-08-30 02:03:58 +00:00
Evan Cheng	9ee227f1df	Fix 80 col. violations. llvm-svn: 55551	2008-08-29 23:20:46 +00:00
Evan Cheng	2a3e05b519	Back out 55498. It broken Apple style bootstrapping. llvm-svn: 55549	2008-08-29 22:21:44 +00:00
Gabor Greif	86c795a8ca	erect abstraction boundaries for accessing SDValue members, rename Val -> Node to reflect semantics llvm-svn: 55504	2008-08-28 21:40:38 +00:00
Dan Gohman	35a69c106a	Optimize DAGCombiner's worklist processing. Previously it started its work by putting all nodes in the worklist, requiring a big dynamic allocation. Now, DAGCombiner just iterates over the AllNodes list and maintains a worklist for nodes that are newly created or need to be revisited. This allows the worklist to stay small in most cases, so it can be a SmallVector. This has the side effect of making DAGCombine not miss a folding opportunity in alloca-align-rounding.ll. llvm-svn: 55498	2008-08-28 21:01:56 +00:00
Gabor Greif	4b86114f92	disallow direct access to SDValue::ResNo, provide a getter instead llvm-svn: 55394	2008-08-26 22:36:50 +00:00
Dan Gohman	00ddda96c9	Disable DAGCombine's alignment inference in "fast" codegen mode. llvm-svn: 55059	2008-08-20 16:30:28 +00:00
Dan Gohman	b0f5e18201	Improve support for vector casts in LLVM IR and CodeGen. llvm-svn: 54784	2008-08-14 20:04:46 +00:00
Dan Gohman	a27ed39f05	Take the FrameOffset into account when computing the alignment of stack objects. This fixes PR2656. llvm-svn: 54646	2008-08-11 18:27:03 +00:00
Dan Gohman	f691fc703d	Improve dagcombining for sext-loads and sext-in-reg nodes. llvm-svn: 54239	2008-07-31 00:50:31 +00:00
Dan Gohman	9742f7772d	Rename SDOperand to SDValue. llvm-svn: 54128	2008-07-27 21:46:04 +00:00
Dan Gohman	47c5cdbc34	Tidy SDNode::use_iterator, and complete the transition to have it parallel its analogue, Value::value_use_iterator. The operator* method now returns the user, rather than the use. llvm-svn: 54127	2008-07-27 20:43:25 +00:00
Evan Cheng	1aa928a8e6	Fix pr2566: incorrect assumption about bit_convert. It doesn't not have to output a vector value. Patch by Nicolas Capens! llvm-svn: 53932	2008-07-22 20:42:56 +00:00
Dan Gohman	b91bef08a7	Add titles to the various SelectionDAG viewGraph calls that include useful information like the name of the block being viewed and the current phase of compilation. llvm-svn: 53872	2008-07-21 20:00:07 +00:00
Duncan Sands	6e31474e71	Add VerifyNode, a place to put sanity checks on generic SDNode's (nodes with their own constructors should do sanity checking in the constructor). Add sanity checks for BUILD_VECTOR and fix all the places that were producing bogus BUILD_VECTORs, as found by "make check". My favorite is the BUILD_VECTOR with only two operands that was being used to build a vector with four elements! llvm-svn: 53850	2008-07-21 10:20:31 +00:00
Duncan Sands	3d68e2ff9c	Revert 53729, after waking up in the middle of the night realising that it was wrong :) I think the reason the same type was being used for the shufflevec of indices as for the actual indices is so that if one of them needs splitting then so does the other. After my patch it might be that the indices need splitting but not the rest, yet there is no good way of handling that. I think the right solution is to not have the shufflevec be an operand at all: just have it be the list of numbers it actually is, stored as extra info in the node. llvm-svn: 53768	2008-07-18 20:12:05 +00:00
Duncan Sands	08ea7c0351	Use a legal type for elements of the vector_shuffle mask. These are just indices into the shuffled vector so their type is unrelated to the type of the shuffled elements (which is what was being used before). This fixes vec_shuffle-11.ll when using LegalizeTypes. What seems to have happened is that Dan's recent change r53687, which corrected the result type of the shuffle, somehow caused LegalizeTypes to notice that the mask operand was a BUILD_VECTOR with a legal type but elements of an illegal type (i64). LegalizeTypes legalized this by introducing a new BUILD_VECTOR of i32 and bitcasting it to the old type. But the mask operand is not supposed to be a bitcast but a straight BUILD_VECTOR of constants, causing a crash. llvm-svn: 53729	2008-07-17 19:28:41 +00:00
Dan Gohman	0025513482	Fix the result type of a VECTOR_SHUFFLE+BIT_CONVERT dagcombine. This was turned up by some new SelectionDAG assertion checks that I'm working on. llvm-svn: 53687	2008-07-16 16:13:58 +00:00
Dan Gohman	9997cc353f	Use reserve. SelectionDAG::allnodes_size is linear, but that doesn't appear to outweigh the benefit of reducing heap traffic. If it does become a problem, we should teach SelectionDAG to keep a count of how many nodes are live, because there are several other places where that information would be useful as well. llvm-svn: 52926	2008-06-30 21:04:06 +00:00
Dan Gohman	6b87f869e4	When folding a bitcast into a load or store, preserve the alignment information of the original load or store, which is checked to be at least as good, and possibly better. llvm-svn: 52849	2008-06-28 00:45:22 +00:00
Chris Lattner	85cf534e04	duncan points out that isOperationLegal includes a check for type legality. Thanks Duncan! llvm-svn: 52786	2008-06-26 17:16:00 +00:00
Chris Lattner	2b67ff8632	when we know the signbit of an input to uint_to_fp is zero, change it to sint_to_fp on targets where that is cheaper (and visaversa of course). This allows us to compile uint_to_fp to: _test: movl 4(%esp), %eax shrl $23, %eax cvtsi2ss %eax, %xmm0 movl 8(%esp), %eax movss %xmm0, (%eax) ret instead of: .align 3 LCPI1_0: ## double .long 0 ## double least significant word 4.5036e+15 .long 1127219200 ## double most significant word 4.5036e+15 .text .align 4,0x90 .globl _test _test: subl $12, %esp movl 16(%esp), %eax shrl $23, %eax movl %eax, (%esp) movl $1127219200, 4(%esp) movsd (%esp), %xmm0 subsd LCPI1_0, %xmm0 cvtsd2ss %xmm0, %xmm0 movl 20(%esp), %eax movss %xmm0, (%eax) addl $12, %esp ret llvm-svn: 52747	2008-06-26 00:16:49 +00:00
Dan Gohman	84cef04f76	Duncan pointed out this code could be tidied. llvm-svn: 52624	2008-06-23 15:29:14 +00:00
Dan Gohman	a41cf16a8f	Simplify some getNode calls. llvm-svn: 52604	2008-06-21 22:06:07 +00:00
Duncan Sands	78bdcc813e	Allow these transforms for types like i256 while still excluding types like i1 (not byte sized) and i120 (loading an i120 requires loading an i64, an i32, an i16 and an i8, which is expensive). llvm-svn: 52310	2008-06-16 08:14:38 +00:00
Duncan Sands	2dffe1cc15	The transforms in visitEXTRACT_VECTOR_ELT are not valid if the load is volatile. Hopefully all wrong DAG combiner transforms of volatile loads and stores have now been caught. llvm-svn: 52293	2008-06-15 20:12:31 +00:00
Duncan Sands	fa6e02c4dc	Remove a redundant AfterLegalize check. Turn on some code when !AfterLegalize - but since this whole code section is turned off by an "if (0)" it's not really turning anything on. llvm-svn: 52276	2008-06-14 17:48:34 +00:00
Duncan Sands	40c8db881a	Disable some DAG combiner optimizations that may be wrong for volatile loads and stores. In fact this is almost all of them! There are three types of problems: (1) it is wrong to change the width of a volatile memory access. These may be used to do memory mapped i/o, in which case a load can have an effect even if the result is not used. Consider loading an i32 but only using the lower 8 bits. It is wrong to change this into a load of an i8, because you are no longer tickling the other three bytes. It is also unwise to make a load/store wider. For example, changing an i16 load into an i32 load is wrong no matter how aligned things are, since the fact of loading an additional 2 bytes can have i/o side-effects. (2) it is wrong to change the number of volatile load/stores: they may be counted by the hardware. (3) it is wrong to change a volatile load/store that requires one memory access into one that requires several. For example on x86-32, you can store a double in one processor operation, but to store an i64 requires two (two i32 stores). In a multi-threaded program you may want to bitcast an i64 to a double and store as a double because that will occur atomically, and be indivisible to other threads. So it would be wrong to convert the store-of-double into a store of an i64, because this will become two i32 stores - no longer atomic. My policy here is to say that the number of processor operations for an illegal operation is undefined. So it is alright to change a store of an i64 (requires at least two stores; but could be validly lowered to memcpy for example) into a store of double (one processor op). In short, if the new store is legal and has the same size then I say that the transform is ok. It would also be possible to say that transforms are always ok if before they were illegal, whether after they are illegal or not, but that's more awkward to do and I doubt it buys us anything much. However this exposed an interesting thing - on x86-32 a store of i64 is considered legal! That is because operations are marked legal by default, regardless of whether the type is legal or not. In some ways this is clever: before type legalization this means that operations on illegal types are considered legal; after type legalization there are no illegal types so now operations are only legal if they really are. But I consider this to be too cunning for mere mortals. Better to do things explicitly by testing AfterLegalize. So I have changed things so that operations with illegal types are considered illegal - indeed they can never map to a machine operation. However this means that the DAG combiner is more conservative because before it was "accidentally" performing transforms where the type was illegal because the operation was nonetheless marked legal. So in a few such places I added a check on AfterLegalize, which I suppose was actually just forgotten before. This causes the DAG combiner to do slightly more than it used to, which resulted in the X86 backend blowing up because it got a slightly surprising node it wasn't expecting, so I tweaked it. llvm-svn: 52254	2008-06-13 19:07:40 +00:00
Duncan Sands	2a9c84481c	Sometimes (rarely) nodes held in LegalizeTypes maps can be deleted. This happens when RAUW replaces a node N with another equivalent node E, deleting the first node. Solve this by adding (N, E) to ReplacedNodes, which is already used to remap nodes to replacements. This means that deleted nodes are being allowed in maps, which can be delicate: the memory may be reused for a new node which might get confused with the old deleted node pointer hanging around in the maps, so detect this and flush out maps if it occurs (ExpungeNode). The expunging operation is expensive, however it never occurs during a llvm-gcc bootstrap or anywhere in the nightly testsuite. It occurs three times in "make check": Alpha/illegal-element-type.ll, PowerPC/illegal-element-type.ll and X86/mmx-shift.ll. If expunging proves to be too expensive then there are other more complicated ways of solving the problem. In the normal case this patch adds the overhead of a few more map lookups, which is hopefully negligable. llvm-svn: 52214	2008-06-11 11:42:12 +00:00
Duncan Sands	e46308480d	Various tweaks related to apint codegen. No functionality change for non-funky-sized integers. llvm-svn: 52151	2008-06-09 15:48:25 +00:00
Duncan Sands	a487df7710	Remove some DAG combiner assumptions about sizes of integer types. Fix the isMask APInt method to actually work (hopefully) rather than crashing because it adds apints of different bitwidths. It looks like isShiftedMask is also broken, but I'm leaving that one to the APInt people (it is not used anywhere). llvm-svn: 52142	2008-06-09 11:32:28 +00:00
Duncan Sands	fe2a970a5c	Remove comparison methods for MVT. The main cause of apint codegen failure is the DAG combiner doing the wrong thing because it was comparing MVT's using < rather than comparing the number of bits. Removing the < method makes this mistake impossible to commit. Instead, add helper methods for comparing bits and use them. llvm-svn: 52098	2008-06-08 20:54:56 +00:00
Duncan Sands	d634afe3aa	Wrap MVT::ValueType in a struct to get type safety and better control the abstraction. Rename the type to MVT. To update out-of-tree patches, the main thing to do is to rename MVT::ValueType to MVT, and rewrite expressions like MVT::getSizeInBits(VT) in the form VT.getSizeInBits(). Use VT.getSimpleVT() to extract a MVT::SimpleValueType for use in switch statements (you will get an assert failure if VT is an extended value type - these shouldn't exist after type legalization). This results in a small speedup of codegen and no new testsuite failures (x86-64 linux). llvm-svn: 52044	2008-06-06 12:08:01 +00:00
Dan Gohman	c877140168	Add #includes to make some dependencies explicit. llvm-svn: 51496	2008-05-23 20:40:06 +00:00
Dan Gohman	287e750e64	Code simplification. llvm-svn: 51345	2008-05-20 20:56:33 +00:00
Evan Cheng	9e15622879	Instead of a vector load, shuffle and then extract an element. Load the element from address with an offset. pshufd $1, (%rdi), %xmm0 movd %xmm0, %eax => movl 4(%rdi), %eax llvm-svn: 51026	2008-05-13 08:35:03 +00:00
Evan Cheng	fcbdc8bd6e	Xform bitconvert(build_pair(load a, load b)) to a single load if the load locations are at the right offset from each other. llvm-svn: 51008	2008-05-12 23:04:07 +00:00
Dan Gohman	6df962bf9a	Evan pointed out that folding sext to zext may not be correct if the zext is not legal. llvm-svn: 50368	2008-04-28 18:47:17 +00:00
Dan Gohman	389a3ff9c7	Teach DAGCombine to convert (sext x) to (zext x) when the sign-bit of x is known to be zero. llvm-svn: 50357	2008-04-28 16:58:24 +00:00
Roman Levenstein	728d59166f	Ongoing work on improving the instruction selection infrastructure: Rename SDOperandImpl back to SDOperand. Introduce the SDUse class that represents a use of the SDNode referred by an SDOperand. Now it is more similar to Use/Value classes. Patch is approved by Dan Gohman. llvm-svn: 49795	2008-04-16 16:15:27 +00:00
Roman Levenstein	b40d332929	Re-commit of the r48822, where the infinite looping problem discovered by Dan Gohman is fixed. llvm-svn: 49330	2008-04-07 10:06:32 +00:00
Evan Cheng	497c607fae	Backing out 48222 temporarily. llvm-svn: 49124	2008-04-03 03:13:16 +00:00
Dan Gohman	f223eaafcd	Fix a DAGCombiner optimization to respect volatile qualification. llvm-svn: 48994	2008-03-31 20:32:52 +00:00
Roman Levenstein	55b8822511	Use a linked data structure for the uses lists of an SDNode, just like LLVM Value/Use does and MachineRegisterInfo/MachineOperand does. This allows constant time for all uses list maintenance operations. The idea was suggested by Chris. Reviewed by Evan and Dan. Patch is tested and approved by Dan. On normal use-cases compilation speed is not affected. On very big basic blocks there are compilation speedups in the range of 15-20% or even better. llvm-svn: 48822	2008-03-26 12:39:26 +00:00
Evan Cheng	8cb64d8e8b	Handle a special case xor undef, undef -> 0. Technically this should be transformed to undef. But this is such a common idiom (misuse) we are going to handle it. llvm-svn: 48792	2008-03-25 20:08:07 +00:00
Evan Cheng	61bf4f6c40	Remove an unneeded test. llvm-svn: 48755	2008-03-24 23:55:16 +00:00
Evan Cheng	874aee2eec	Teach DAG combiner to commute commutable binary nodes in order to achieve sdisel CSE. llvm-svn: 48673	2008-03-22 01:55:50 +00:00
Christopher Lamb	110f66ca68	Check even more carefully before applying this DAGCombine transform. llvm-svn: 48580	2008-03-20 04:31:39 +00:00
Evan Cheng	8ecb189245	Fix this xform: (sra (shl X, m), result_size) -> (sign_extend (trunc (shl X, result_size - n - m))) llvm-svn: 48578	2008-03-20 02:18:41 +00:00
Christopher Lamb	958b0494c3	Fix X86's isTruncateFree to not claim that truncate to i1 is free. This fixes Bill's testcase that failed for r48491. llvm-svn: 48542	2008-03-19 08:30:06 +00:00
Bill Wendling	5ea2aec3ac	Temporarily revert r48491. It's breaking test/CodeGen/X86/xorl.ll. llvm-svn: 48510	2008-03-18 22:29:51 +00:00
Christopher Lamb	1d70509b55	Target independent DAG transform to use truncate for field extraction + sign extend on targets where this is profitable. Passes nightly on x86-64. llvm-svn: 48491	2008-03-18 16:46:39 +00:00
Dan Gohman	486f664806	More APInt-ification. llvm-svn: 48344	2008-03-13 22:13:53 +00:00
Evan Cheng	df92afe7d3	Clean up my own mess. X86 lowering normalize vector 0 to v4i32. However DAGCombine can fold (sub x, x) -> 0 after legalization. It can create a zero vector of a type that's not expected (e.g. v8i16). We don't want to disable the optimization since leaving a (sub x, x) is really bad. Add isel patterns for other types of vector 0 to ensure correctness. It's highly unlikely to happen other than in bugpoint reduced test cases. llvm-svn: 48279	2008-03-12 07:02:50 +00:00
Evan Cheng	ee092b2dfc	Total brain cramp. llvm-svn: 48274	2008-03-12 02:05:05 +00:00
Evan Cheng	8b8d58e1aa	Somewhat better solution. llvm-svn: 48170	2008-03-10 19:58:22 +00:00
Scott Michel	bb8e8fca47	Give TargetLowering::getSetCCResultType() a parameter so that ISD::SETCC's return ValueType can depend its operands' ValueType. This is a cosmetic change, no functionality impacted. llvm-svn: 48145	2008-03-10 15:42:14 +00:00
Evan Cheng	554d2c443b	Doh llvm-svn: 48140	2008-03-10 07:59:01 +00:00
Evan Cheng	3c0ddc999f	Avoid creating BUILD_VECTOR of all zero elements of "non-normalized" type (e.g. v8i16 on x86) after legalizer. Instruction selection does not expect to see them. In all likelihood this can only be an issue in a bugpoint reduced test case. llvm-svn: 48136	2008-03-10 07:19:13 +00:00
Evan Cheng	b5b16810ac	Rename isOperand() to isOperandOf() (and other similar methods). It always confuses me. llvm-svn: 47872	2008-03-04 00:41:45 +00:00
Dan Gohman	ef264d521d	Misc. APInt-ification in the DAGCombiner. llvm-svn: 47869	2008-03-03 23:51:38 +00:00
Dan Gohman	689d8cac04	Convert SimplifyDemandedMask and ShrinkDemandedConstant to use APInt. Change several cases in SimplifyDemandedMask that don't ever do any simplifying to reuse the logic in ComputeMaskedBits instead of duplicating it. llvm-svn: 47648	2008-02-27 00:25:32 +00:00
Chris Lattner	1a461075ef	Fix PR2096, a regression introduced with my patch last night. This also fixes cfrac, flops, and 175.vpr llvm-svn: 47605	2008-02-26 17:09:59 +00:00
Chris Lattner	5b4101cf68	Fix isNegatibleForFree to not return true for ConstantFP nodes after legalize. Just because a constant is legal (e.g. 0.0 in SSE) doesn't mean that its negated value is legal (-0.0). We could make this stronger by checking to see if the negated constant is actually legal post negation, but it doesn't seem like a big deal. llvm-svn: 47591	2008-02-26 07:04:54 +00:00
Dan Gohman	012abf0109	Convert MaskedValueIsZero and all its users to use APInt. Also add a SignBitIsZero function to simplify a common use case. llvm-svn: 47561	2008-02-25 21:11:39 +00:00
Dan Gohman	48d03d5a2d	Add explicit keywords. llvm-svn: 47382	2008-02-20 16:44:09 +00:00
Dan Gohman	e83e624526	Convert DAGCombiner to use the APInt form of ComputeMaskedBits. llvm-svn: 47381	2008-02-20 16:33:30 +00:00
Anton Korobeynikov	7dd00942cc	Update gcc 4.3 warnings fix patch with recent head changes llvm-svn: 47368	2008-02-20 11:10:28 +00:00
Evan Cheng	bb577266bf	- When DAG combiner is folding a bit convert into a BUILD_VECTOR, it should check if it's essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u> to <10, 0, u, u, u, u, u, u>. Instead, simply convert it to a SCALAR_TO_VECTOR of the proper type. - X86 now normalize SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC. llvm-svn: 47290	2008-02-18 23:04:32 +00:00
Chris Lattner	c798882db7	teach dag combiner how to eliminate MERGE_VALUES nodes. llvm-svn: 47052	2008-02-13 07:25:05 +00:00
Duncan Sands	204c89cafa	Add a isBigEndian method to complement isLittleEndian. llvm-svn: 46954	2008-02-11 10:37:04 +00:00
Bill Wendling	20621788d4	Return "(c1 + c2)" instead of yet another ADD node (which made this a no-op). llvm-svn: 46922	2008-02-10 08:10:24 +00:00
Chris Lattner	ce1bde2760	the world doesn't need my debugging code. llvm-svn: 46678	2008-02-03 07:01:05 +00:00
Chris Lattner	6ea106139d	Change the 'global modification' APIs in SelectionDAG to take a new DAGUpdateListener object pointer instead of just returning a vector of deleted nodes. This makes the interfaces more efficient (no more allocating a vector [at least a malloc], filling it in, then walking it) and more clean. This also allows the client to be notified of nodes that are changed but not deleted. llvm-svn: 46677	2008-02-03 06:49:24 +00:00
Dan Gohman	13d1327796	Factor the addressing mode and the load/store VT out of LoadSDNode and StoreSDNode into their common base class LSBaseSDNode. Member functions getLoadedVT and getStoredVT are replaced with the common getMemoryVT to simplify code that will handle both loads and stores. llvm-svn: 46538	2008-01-30 00:15:11 +00:00
Dan Gohman	aad233ea10	Use empty() instead of comparing size() with zero. llvm-svn: 46514	2008-01-29 13:02:09 +00:00
Chris Lattner	26ec133a4d	Fix PowerPC/./2007-10-18-PtrArithmetic.ll llvm-svn: 46424	2008-01-27 23:32:17 +00:00
Chris Lattner	3d4fa8f00f	fix a crash on CodeGen/X86/vector-rem.ll llvm-svn: 46422	2008-01-27 23:21:58 +00:00
Chris Lattner	2ab1fd3824	Implement some dag combines that allow doing fneg/fabs/fcopysign in integer registers if used by a bitconvert or using a bitconvert. This allows us to avoid constant pool loads and use cheaper integer instructions when the values come from or end up in integer regs anyway. For example, we now compile CodeGen/X86/fp-in-intregs.ll to: _test1: movl $2147483648, %eax xorl 4(%esp), %eax ret _test2: movl $1065353216, %eax orl 4(%esp), %eax andl $3212836864, %eax ret Instead of: _test1: movss 4(%esp), %xmm0 xorps LCPI2_0, %xmm0 movd %xmm0, %eax ret _test2: movss 4(%esp), %xmm0 andps LCPI3_0, %xmm0 movss LCPI3_1, %xmm1 andps LCPI3_2, %xmm1 orps %xmm0, %xmm1 movd %xmm1, %eax ret bitconverts can happen due to various calling conventions that require fp values to passed in integer regs in some cases, e.g. when returning a complex. llvm-svn: 46414	2008-01-27 17:42:27 +00:00
Chris Lattner	682346a7b0	Infer alignment of loads and increase their alignment when we can tell they are from the stack. This allows us to compile stack-align.ll to: _test: movsd LCPI1_0, %xmm0 movapd %xmm0, %xmm1 * andpd 4(%esp), %xmm1 andpd _G, %xmm0 addsd %xmm1, %xmm0 movl 20(%esp), %eax movsd %xmm0, (%eax) ret instead of: _test: movsd LCPI1_0, %xmm0 movsd 4(%esp), %xmm1 ** andpd %xmm0, %xmm1 andpd _G, %xmm0 addsd %xmm1, %xmm0 movl 20(%esp), %eax movsd %xmm0, (%eax) ret llvm-svn: 46401	2008-01-26 19:45:50 +00:00
Chris Lattner	53a98f46fd	Fix some bugs in SimplifyNodeWithTwoResults where it would call deletenode to delete a node even if it was not dead in some cases. Instead, just add it to the worklist. Also, make sure to use the CombineTo methods, as it was doing things that were unsafe: the top level combine loop could touch dangling memory. This fixes CodeGen/Generic/2008-01-25-dag-combine-mul.ll llvm-svn: 46384	2008-01-26 01:09:19 +00:00
Chris Lattner	6b56f4031f	reduce indentation llvm-svn: 46377	2008-01-25 23:34:24 +00:00
Chris Lattner	59dc439330	Add skeletal code to increase the alignment of loads and stores when we can infer it. This will eventually help stuff, though it doesn't do much right now because all fixed FI's have an alignment of 1. llvm-svn: 46349	2008-01-25 07:20:16 +00:00
Chris Lattner	81531125a8	clarify a comment, thanks Duncan. llvm-svn: 46313	2008-01-24 17:10:01 +00:00
Chris Lattner	866ce96e6b	Fix this buggy transformation. Two observations: 1. we already know the value is dead, so don't bother replacing it with undef. 2. The very case the comment describes actually makes the load live which asserts in deletenode. If we do the replacement and the node becomes live, just treat it as new. This fixes a failure on X86/2008-01-16-InvalidDAGCombineXform.ll with some local changes in my tree. llvm-svn: 46306	2008-01-24 07:57:06 +00:00
Chris Lattner	bf6550dd56	The dag combiner is missing revisiting nodes that it really should, and thus leaving dead stuff around. This gets fed into the isel pass and causes certain foldings from happening because nodes have extraneous uses floating around. For example, if we turned foo(bar(x)) -> baz(x), we sometimes left bar(x) around. llvm-svn: 46305	2008-01-24 07:18:21 +00:00
Chris Lattner	44e58d7816	fold fp_round(fp_round(x)) -> fp_round(x). llvm-svn: 46304	2008-01-24 06:45:35 +00:00
Chris Lattner	41717f6989	This commit changes: 1. Legalize now always promotes truncstore of i1 to i8. 2. Remove patterns and gunk related to truncstore i1 from targets. 3. Rename the StoreXAction stuff to TruncStoreAction in TLI. 4. Make the TLI TruncStoreAction table a 2d table to handle from/to conversions. 5. Mark a wide variety of invalid truncstores as such in various targets, e.g. X86 currently doesn't support truncstore of any of its integer types. 6. Add legalize support for truncstores with invalid value input types. 7. Add a dag combine transform to turn store(truncate) into truncstore when safe. The later allows us to compile CodeGen/X86/storetrunc-fp.ll to: _foo: fldt 20(%esp) fldt 4(%esp) faddp %st(1) movl 36(%esp), %eax fstps (%eax) ret instead of: _foo: subl $4, %esp fldt 24(%esp) fldt 8(%esp) faddp %st(1) fstps (%esp) movl 40(%esp), %eax movss (%esp), %xmm0 movss %xmm0, (%eax) addl $4, %esp ret llvm-svn: 46140	2008-01-17 19:59:44 +00:00
Chris Lattner	31081fc513	code cleanups, no functionality change. llvm-svn: 46126	2008-01-17 07:20:38 +00:00
Chris Lattner	d033200a8f	* Introduce a new SelectionDAG::getIntPtrConstant method and switch various codegen pieces and the X86 backend over to using it. * Add some comments to SelectionDAGNodes.h * Introduce a second argument to FP_ROUND, which indicates whether the FP_ROUND changes the value of its input. If not it is safe to xform things like fp_extend(fp_round(x)) -> x. llvm-svn: 46125	2008-01-17 07:00:52 +00:00
Evan Cheng	5be34d811c	Fixes a nasty dag combiner bug that causes a bunch of tests to fail at -O0. It's not safe to use the two value CombineTo variant to combine away a dead load. e.g. v1, chain2 = load chain1, loc v2, chain3 = load chain2, loc v3 = add v2, c Now we replace use of v1 with undef, use of chain2 with chain1. ReplaceAllUsesWith() will iterate through uses of the first load and update operands: v1, chain2 = load chain1, loc v2, chain3 = load chain1, loc v3 = add v2, c Now the second load is the same as the first load, SelectionDAG cse will ensure the use of second load is replaced with the first load. v1, chain2 = load chain1, loc v3 = add v1, c Then v1 is replaced with undef and bad things happen. llvm-svn: 46099	2008-01-16 23:11:54 +00:00
Chris Lattner	50f1b6f9a5	Factor the ReachesChainWithoutSideEffects out of dag combiner into a public SDOperand::reachesChainWithoutSideEffects method. No functionality change. llvm-svn: 46050	2008-01-16 05:49:24 +00:00
Chris Lattner	c93ad7d569	Make load->store deletion a bit smarter. This allows us to compile this: void test(long long P) { P ^= 1; } into just: _test: movl 4(%esp), %eax xorl $1, (%eax) ret instead of code like this: _test: movl 4(%esp), %ecx xorl $1, (%ecx) movl 4(%ecx), %edx movl %edx, 4(%ecx) ret llvm-svn: 45762	2008-01-08 23:08:06 +00:00
Chris Lattner	ad9a6ccb83	Remove attribution from file headers, per discussion on llvmdev. llvm-svn: 45418	2007-12-29 20:36:04 +00:00
Chris Lattner	38687aa2b2	make sure not to zap volatile stores, thanks a lot to Dale for noticing this! llvm-svn: 45402	2007-12-29 07:15:45 +00:00
Chris Lattner	78ae7ff876	don't fold fp_round(fp_extend(load)) -> fp_round(extload) llvm-svn: 45400	2007-12-29 06:55:23 +00:00
Chris Lattner	7cb2de8e48	Delete a store whose input is a load from the same pointer: x = load p store x -> p llvm-svn: 45398	2007-12-29 06:26:16 +00:00
Chris Lattner	a8f6fac7a3	Tell TargetLoweringOpt whether it is running before or after legalize. llvm-svn: 45321	2007-12-22 20:56:36 +00:00
Evan Cheng	694994ba7b	Don't leave newly created nodes around if it turns out they are not needed. llvm-svn: 45186	2007-12-19 01:34:38 +00:00
Dale Johannesen	80dd0c5141	Redo previous patch so optimization only done for i1. Simpler and safer. llvm-svn: 44663	2007-12-06 17:53:31 +00:00
Chris Lattner	64a1a9f502	third time around: instead of disabling this completely, only disable it if we don't know it will be obviously profitable. Still fixme, but less so. :) llvm-svn: 44658	2007-12-06 07:47:55 +00:00
Chris Lattner	bb5fb18af8	Actually, disable this code for now. More analysis and improvements to the X86 backend are needed before this should be enabled by default. llvm-svn: 44657	2007-12-06 07:44:31 +00:00
Chris Lattner	c467b49c96	implement a readme entry, compiling the code into: _foo: movl $12, %eax andl 4(%esp), %eax movl _array(%eax), %eax ret instead of: _foo: movl 4(%esp), %eax shrl $2, %eax andl $3, %eax movl _array(,%eax,4), %eax ret As it turns out, this triggers all the time, in a wide variety of situations, for example, I see diffs like this in various programs: - movl 8(%eax), %eax - shll $2, %eax - andl $1020, %eax - movl (%esi,%eax), %eax + movzbl 8(%eax), %eax + movl (%esi,%eax,4), %eax - shll $2, %edx - andl $1020, %edx - movl (%edi,%edx), %edx + andl $255, %edx + movl (%edi,%edx,4), %edx Unfortunately, I also see stuff like this, which can be fixed in the X86 backend: - andl $85, %ebx - addl _bit_count(,%ebx,4), %ebp + shll $2, %ebx + andl $340, %ebx + addl _bit_count(%ebx), %ebp llvm-svn: 44656	2007-12-06 07:33:36 +00:00
Dale Johannesen	8bc5d4be6a	Fix PR1842. llvm-svn: 44649	2007-12-06 01:43:46 +00:00
Dan Gohman	a9f8208852	Don't lower srem/urem X%C to X-X/C*C unless the division is actually optimized. This avoids creating illegal divisions when the combiner is running after legalize; this fixes PR1815. Also, it produces better code in the included testcase by avoiding the subtract and multiply when the division isn't optimized. llvm-svn: 44341	2007-11-26 23:46:11 +00:00
Duncan Sands	edf7e3b5f4	Move MinAlign to MathExtras.h. llvm-svn: 43944	2007-11-09 13:41:39 +00:00
Duncan Sands	7df7c7aed1	Fix some load/store logic that would be wrong for apints on big-endian machines if the bitwidth is not a multiple of 8. Introduce a new helper, MVT::getStoreSizeInBits, and use it. llvm-svn: 43934	2007-11-09 08:57:19 +00:00
Evan Cheng	d9bab93a44	If both parts of smul_lohi, etc. are used, don't simplify. If only one part is used, try simplify it. llvm-svn: 43888	2007-11-08 09:25:29 +00:00
Evan Cheng	1fef4e369a	Typo. llvm-svn: 43511	2007-10-30 20:11:21 +00:00
Dan Gohman	02b8beff5f	Fix a DAGCombiner abort on a bitcast from a scalar to a vector. llvm-svn: 43470	2007-10-29 20:44:42 +00:00
Evan Cheng	5fe81cf64e	Enable more fold (sext (load x)) -> (sext (truncate (sextload x))) transformation. Previously, it's restricted by ensuring the number of load uses is one. Now the restriction is loosened up by allowing setcc uses to be "extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq). llvm-svn: 43465	2007-10-29 19:58:20 +00:00
Duncan Sands	b494fb97a4	The guaranteed alignment of ptr+offset is only the minimum of of offset and the alignment of ptr if these are both powers of 2. While the ptr alignment is guaranteed to be a power of 2, there is no reason to think that offset is. For example, if offset is 12 (the size of a long double on x86-32 linux) and the alignment of ptr is 8, then the alignment of ptr+offset will in general be 4, not 8. Introduce a function MinAlign, lifted from gcc, for computing the minimum guaranteed alignment. I've tried to fix up everywhere under lib/CodeGen/SelectionDAG/. I also changed some places that weren't wrong (because both values were a power of 2), as a defensive change against people copying and pasting the code. Hopefully someone who cares about alignment will review the rest of LLVM and fix up the remaining places. Since I'm on x86 I'm not very motivated to do this myself... llvm-svn: 43421	2007-10-28 12:59:45 +00:00
Dale Johannesen	4ae755d15c	Redo "last ppc long double fix" as Chris wants. llvm-svn: 43189	2007-10-19 20:29:00 +00:00
Dale Johannesen	b23b0bfa8f	More ppcf128 issues (maybe the last)? llvm-svn: 43160	2007-10-19 00:59:18 +00:00
Dale Johannesen	fdb488d4b5	Disable attempts to constant fold PPC f128. Remove the assumption that this will happen from various places. llvm-svn: 43053	2007-10-16 23:38:29 +00:00
Chris Lattner	452ebc199e	One mundane change: Change ReplaceAllUsesOfValueWith to optionally take a deleted nodes vector, instead of requiring it. One more significant change: Implement the start of a legalizer that just works on types. This legalizer is designed to run before the operation legalizer and ensure just that the input dag is transformed into an output dag whose operand and result types are all legal, even if the operations on those types are not. This design/impl has the following advantages: 1. When finished, this will significantly reduce the amount of code in LegalizeDAG.cpp. It will remove all the code related to promotion and expansion as well as splitting and scalarizing vectors. 2. The new code is very simple, idiomatic, and modular: unlike LegalizeDAG.cpp, it has no 3000 line long functions. :) 3. The implementation is completely iterative instead of recursive, good for hacking on large dags without blowing out your stack. 4. The implementation updates nodes in place when possible instead of deallocating and reallocating the entire graph that points to some mutated node. 5. The code nicely separates out handling of operations with invalid results from operations with invalid operands, making some cases simpler and easier to understand. 6. The new -debug-only=legalize-types option is very very handy :), allowing you to easily understand what legalize types is doing. This is not yet done. Until the ifdef added to SelectionDAGISel.cpp is enabled, this does nothing. However, this code is sufficient to legalize all of the code in 186.crafty, olden and freebench on an x86 machine. The biggest issues are: 1. Vectors aren't implemented at all yet 2. SoftFP is a mess, I need to talk to Evan about it. 3. No lowering to libcalls is implemented yet. 4. Various operations are missing etc. 5. There are FIXME's for stuff I hax0r'd out, like softfp. Hey, at least it is a step in the right direction :). If you'd like to help, just enable the #ifdef in SelectionDAGISel.cpp and compile code with it. If this explodes it will tell you what needs to be implemented. Help is certainly appreciated. Once this goes in, we can do three things: 1. Add a new pass of dag combine between the "type legalizer" and "operation legalizer" passes. This will let us catch some long-standing isel issues that we miss because operation legalization often obfuscates the dag with target-specific nodes. 2. We can rip out all of the type legalization code from LegalizeDAG.cpp, making it much smaller and simpler. When that happens we can then reimplement the core functionality left in it in a much more efficient and non-recursive way. 3. Once the whole legalizer is non-recursive, we can implement whole-function selectiondags maybe... llvm-svn: 42981	2007-10-15 06:10:22 +00:00
Chris Lattner	c146f449b5	Enhance the truncstore optimization code to handle shifted values and propagate demanded bits through them in simple cases. This allows this code: void foo(char *P) { strcpy(P, "abc"); } to compile to: _foo: ldrb r3, [r1] ldrb r2, [r1, #+1] ldrb r12, [r1, #+2]! ldrb r1, [r1, #+1] strb r1, [r0, #+3] strb r2, [r0, #+1] strb r12, [r0, #+2] strb r3, [r0] bx lr instead of: _foo: ldrb r3, [r1, #+3] ldrb r2, [r1, #+2] orr r3, r2, r3, lsl #8 ldrb r2, [r1, #+1] ldrb r1, [r1] orr r2, r1, r2, lsl #8 orr r3, r2, r3, lsl #16 strb r3, [r0] mov r2, r3, lsr #24 strb r2, [r0, #+3] mov r2, r3, lsr #16 strb r2, [r0, #+2] mov r3, r3, lsr #8 strb r3, [r0, #+1] bx lr testcase here: test/CodeGen/ARM/truncstore-dag-combine.ll This also helps occasionally for X86 and other cases not involving unaligned load/stores. llvm-svn: 42954	2007-10-13 06:58:48 +00:00
Chris Lattner	133baf6012	Add a simple optimization to simplify the input to truncate and truncstore instructions, based on the knowledge that they don't demand the top bits. llvm-svn: 42952	2007-10-13 06:35:54 +00:00

1 2 3 4 5 ...

589 Commits