llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-28 14:32:51 +01:00

Author	SHA1	Message	Date
Dan Gohman	89660301e3	Rename ConstantSDNode::getValue to getZExtValue, for consistency with ConstantInt. This led to fixing a bug in TargetLowering.cpp using getValue instead of getAPIntValue. llvm-svn: 56159	2008-09-12 16:56:44 +00:00
Eli Friedman	fecea4b498	Fix for PR2687: Add patterns to match sint_to_fp and fp_to_sint for <2 x i32>. This is a little messy, but it works. We should really get rid of the intrinsics, though, since they map perfectly well to standard LLVM instructions. llvm-svn: 55864	2008-09-05 23:07:03 +00:00
Evan Cheng	419506a149	FsFLD0S{S|D} and V_SETALLONES are as cheap as moves. llvm-svn: 55466	2008-08-28 07:52:25 +00:00
Dan Gohman	ebba07cccf	Tablegen generated code already tests the opcode value, so it's not necessary to use dyn_cast in these predicates. llvm-svn: 55055	2008-08-20 15:24:22 +00:00
Dan Gohman	ac992cdc1c	Add an EXTRACTPSmr pattern to match the pattern that X86ISelLowering creates. llvm-svn: 54544	2008-08-08 18:30:21 +00:00
Evan Cheng	f4d1119fbd	Fix PR2620: Fix X86cmppd selection code so it expects operands to be v2f64. llvm-svn: 54376	2008-08-05 22:19:15 +00:00
Nate Begeman	61f6c21028	Fix a typo in last commit llvm-svn: 53720	2008-07-17 17:04:58 +00:00
Nate Begeman	af01bfff99	SSE codegen for vsetcc nodes llvm-svn: 53719	2008-07-17 16:51:19 +00:00
Evan Cheng	02a618dc56	Fix for PR2472. Use movss to set lower 32-bits of a zero XMM vector. llvm-svn: 53386	2008-07-10 01:08:23 +00:00
Evan Cheng	4e7b7b21a2	Horizontal-add instructions are not commutative. llvm-svn: 52363	2008-06-16 21:16:24 +00:00
Evan Cheng	acd614c262	mpsadbw is commutable. llvm-svn: 52352	2008-06-16 20:25:59 +00:00
Duncan Sands	40c8db881a	Disable some DAG combiner optimizations that may be wrong for volatile loads and stores. In fact this is almost all of them! There are three types of problems: (1) it is wrong to change the width of a volatile memory access. These may be used to do memory mapped i/o, in which case a load can have an effect even if the result is not used. Consider loading an i32 but only using the lower 8 bits. It is wrong to change this into a load of an i8, because you are no longer tickling the other three bytes. It is also unwise to make a load/store wider. For example, changing an i16 load into an i32 load is wrong no matter how aligned things are, since the fact of loading an additional 2 bytes can have i/o side-effects. (2) it is wrong to change the number of volatile load/stores: they may be counted by the hardware. (3) it is wrong to change a volatile load/store that requires one memory access into one that requires several. For example on x86-32, you can store a double in one processor operation, but to store an i64 requires two (two i32 stores). In a multi-threaded program you may want to bitcast an i64 to a double and store as a double because that will occur atomically, and be indivisible to other threads. So it would be wrong to convert the store-of-double into a store of an i64, because this will become two i32 stores - no longer atomic. My policy here is to say that the number of processor operations for an illegal operation is undefined. So it is alright to change a store of an i64 (requires at least two stores; but could be validly lowered to memcpy for example) into a store of double (one processor op). In short, if the new store is legal and has the same size then I say that the transform is ok. It would also be possible to say that transforms are always ok if before they were illegal, whether after they are illegal or not, but that's more awkward to do and I doubt it buys us anything much. 
However this exposed an interesting thing - on x86-32 a store of i64 is considered legal! That is because operations are marked legal by default, regardless of whether the type is legal or not. In some ways this is clever: before type legalization this means that operations on illegal types are considered legal; after type legalization there are no illegal types so now operations are only legal if they really are. But I consider this to be too cunning for mere mortals. Better to do things explicitly by testing AfterLegalize. So I have changed things so that operations with illegal types are considered illegal - indeed they can never map to a machine operation. However this means that the DAG combiner is more conservative because before it was "accidentally" performing transforms where the type was illegal because the operation was nonetheless marked legal. So in a few such places I added a check on AfterLegalize, which I suppose was actually just forgotten before. This causes the DAG combiner to do slightly more than it used to, which resulted in the X86 backend blowing up because it got a slightly surprising node it wasn't expecting, so I tweaked it. llvm-svn: 52254	2008-06-13 19:07:40 +00:00
Evan Cheng	04c0915a2f	Implement vector shift up / down and insert zero with ps{rl}lq / ps{rl}ldq. llvm-svn: 51667	2008-05-29 08:22:04 +00:00
Dan Gohman	a5549a2f9c	Fix the encoding for two more "rm" instructions that were using MRMSrcReg. llvm-svn: 51630	2008-05-28 01:50:19 +00:00
Mon P Wang	8e37b2d13e	Fixed X86 encoding error in CVTPS2PD and CVTPD2PS when the source operand is a memory location. llvm-svn: 51626	2008-05-28 00:42:27 +00:00
Evan Cheng	e5e0b4660d	Eliminate x86.sse2.punpckh.qdq and x86.sse2.punpckl.qdq. llvm-svn: 51533	2008-05-24 02:56:30 +00:00
Evan Cheng	564238c841	Eliminate x86.sse2.movs.d, x86.sse2.shuf.pd, x86.sse2.unpckh.pd, and x86.sse2.unpckl.pd intrinsics. These will be lowered into shuffles. llvm-svn: 51531	2008-05-24 02:14:05 +00:00
Evan Cheng	98a292a302	Remove x86.sse2.loadh.pd and x86.sse2.loadl.pd. These will be lowered into load and shuffle instructions. llvm-svn: 51522	2008-05-24 00:07:29 +00:00
Evan Cheng	4f660778f0	Use movlps / movhps to modify low / high half of 16-byte memory location. llvm-svn: 51501	2008-05-23 21:23:16 +00:00
Evan Cheng	ec8bd19399	Fix a duplicated pattern. llvm-svn: 51490	2008-05-23 18:00:18 +00:00
Dan Gohman	6cc0b4f262	Use PMULDQ for v2i64 multiplies when SSE4.1 is available. And add load-folding table entries for PMULDQ and PMULLD. llvm-svn: 51489	2008-05-23 17:49:40 +00:00
Evan Cheng	097e95b1f7	Bug: rcpps can only fold a load if the address is 16-byte aligned. Fixed many 'ps' load folding patterns in X86InstrSSE.td which were missing the proper alignment checks. Also fixed some 80 col. violations. llvm-svn: 51462	2008-05-23 00:37:07 +00:00
Evan Cheng	d1373cd497	Add missing patterns. llvm-svn: 51435	2008-05-22 18:56:56 +00:00
Evan Cheng	d694e78e36	movsd and movq do not require 16-byte alignment. This fixes vec_set-5.ll on Linux. llvm-svn: 51327	2008-05-20 18:24:47 +00:00
Nate Begeman	c290daf581	Fix one more encoding bug. llvm-svn: 51057	2008-05-13 17:52:09 +00:00
Nate Begeman	b9a3d141aa	Fix an encoding error in the psrad xmm, imm8 instruction. llvm-svn: 51020	2008-05-13 01:47:52 +00:00
Nate Begeman	5d939498c3	Teach Legalize how to scalarize VSETCC Teach X86 a few more vsetcc patterns. Custom lowering for unsupported ones is next. llvm-svn: 51009	2008-05-12 23:09:43 +00:00
Nate Begeman	2ae55cecc6	Initial X86 codegen support for VSETCC. llvm-svn: 51000	2008-05-12 20:34:32 +00:00
Evan Cheng	6a3fa28b38	Some clean up. llvm-svn: 50929	2008-05-10 00:59:18 +00:00
Evan Cheng	2adea48f7e	Add a pattern to move the low element of a v4f32 and zero extend the rest. llvm-svn: 50922	2008-05-09 23:37:55 +00:00
Evan Cheng	3493e43afd	Handle a few more cases of folding load i64 into xmm and zero top bits. Note, some of the code will be moved into target independent part of DAG combiner in a subsequent patch. llvm-svn: 50918	2008-05-09 21:53:03 +00:00
Evan Cheng	f824b47188	Use movq to move low half of XMM register and zero-extend the rest. llvm-svn: 50874	2008-05-08 22:35:02 +00:00
Evan Cheng	f97e716511	Handle vector move / load which zero the destination register top bits (i.e. movd, movq, movss (addr), movsd (addr)) with X86 specific dag combine. llvm-svn: 50838	2008-05-08 00:57:18 +00:00
Evan Cheng	c1c2adbfc6	Add separate intrinsics for MMX / SSE shifts with i32 integer operands. This allows us to simplify the horribly complicated matching code. llvm-svn: 50601	2008-05-03 00:52:09 +00:00
Evan Cheng	583a346ec6	80 column violation. llvm-svn: 50575	2008-05-02 07:53:32 +00:00
Chris Lattner	2c5b96fbee	A better fix for my previous patch, MOVZQI2PQIrr just requires SSE2. llvm-svn: 49986	2008-04-20 05:52:46 +00:00
Dan Gohman	be8f2b452b	Add support for the form of the SSE41 extractps instruction that puts its result in a 32-bit GPR. llvm-svn: 49762	2008-04-16 02:32:24 +00:00
Chris Lattner	3b289289a7	Fix the x86-64 side of PR2108 by adding a v2f64 version of MOVZQI2PQIrr. This would be better handled as a dag combine (with the goal of eliminating the bitconvert) but I don't know how to do that safely. Thoughts welcome. llvm-svn: 49463	2008-04-10 05:13:43 +00:00
Evan Cheng	4d7b2ab16f	Favors pshufd over shufps when shuffling elements from one vector. pshufd is faster than shufps. llvm-svn: 49244	2008-04-05 00:30:36 +00:00
Evan Cheng	6323ea8467	Fix some SSE4.1 instruction encoding bugs. llvm-svn: 48815	2008-03-26 08:11:49 +00:00
Evan Cheng	dbdf48276a	- SSE4.1 extractps extracts an f32 into a gr32 register. Very useful! Not. Fix the instruction specification and teach lowering code to use it only when the only use is a store instruction. llvm-svn: 48746	2008-03-24 21:52:23 +00:00
Nate Begeman	f9691b8236	Add a couple missing SSE4 instructions llvm-svn: 48430	2008-03-16 21:14:46 +00:00
Evan Cheng	11d2c09adc	Replace all target specific implicit def instructions with a target independent one: TargetInstrInfo::IMPLICIT_DEF. llvm-svn: 48380	2008-03-15 00:03:38 +00:00
Evan Cheng	877c5ecabd	Fix some 80 col violations. llvm-svn: 48361	2008-03-14 07:46:48 +00:00
Evan Cheng	fc6645a382	Fix a number of encoding bugs. SSE 4.1 instructions MPSADBWrri, PINSRDrr, etc. have 8-bits immediate field (ImmT == Imm8). llvm-svn: 48360	2008-03-14 07:39:27 +00:00
Evan Cheng	df92afe7d3	Clean up my own mess. X86 lowering normalizes vector 0 to v4i32. However DAGCombine can fold (sub x, x) -> 0 after legalization. It can create a zero vector of a type that's not expected (e.g. v8i16). We don't want to disable the optimization since leaving a (sub x, x) is really bad. Add isel patterns for other types of vector 0 to ensure correctness. It's highly unlikely to happen other than in bugpoint reduced test cases. llvm-svn: 48279	2008-03-12 07:02:50 +00:00
Evan Cheng	dba1dfe962	Implement x86 support for @llvm.prefetch. It corresponds to prefetcht{0|1|2} and prefetchnta instructions. llvm-svn: 48042	2008-03-08 00:58:38 +00:00
Evan Cheng	a36562006a	isTwoAddress = 1 -> Constraints. llvm-svn: 47941	2008-03-05 08:19:16 +00:00
Evan Cheng	6c2bb7c67e	PSLLWri etc. are two-address instructions. llvm-svn: 47940	2008-03-05 08:11:27 +00:00
Evan Cheng	bb577266bf	- When DAG combiner is folding a bit convert into a BUILD_VECTOR, it should check if it's essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u> to <10, 0, u, u, u, u, u, u>. Instead, simply convert it to a SCALAR_TO_VECTOR of the proper type. - X86 now normalizes SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC. llvm-svn: 47290	2008-02-18 23:04:32 +00:00
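
The policy stated in commit 40c8db881a for volatile accesses ("if the new store is legal and has the same size then I say that the transform is ok") can be sketched roughly as below. This is only an illustrative model, not code from the DAG combiner: the `MemAccess` struct and the `mayReplaceAccess` function are hypothetical names invented for this sketch, and the `isLegal` flag stands in for whatever legality query the combiner actually performs.

```cpp
#include <cassert>

// Hypothetical model of a load or store; this is NOT an LLVM type.
struct MemAccess {
    unsigned sizeInBits; // width of the memory access
    bool isVolatile;     // volatile accesses may have i/o side effects
    bool isLegal;        // assumed: true if the access maps to a single machine operation
};

// Hypothetical check: may a combine replace oldAccess with newAccess?
// Non-volatile accesses are not restricted by this policy. A volatile
// access may only be replaced by a legal access of exactly the same size.
bool mayReplaceAccess(const MemAccess &oldAccess, const MemAccess &newAccess) {
    if (!oldAccess.isVolatile)
        return true;
    return newAccess.isLegal && newAccess.sizeInBits == oldAccess.sizeInBits;
}
```

Under these assumptions, turning a volatile i64 store (illegal on x86-32) into a store of double (legal, same size) is accepted, while narrowing a volatile i32 load to an i8 load, or turning a store of double into a store of i64, is rejected.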

291 Commits