llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00

Author	SHA1	Message	Date
Chris Lattner	2bd91746e1	pretty print node name llvm-svn: 27806	2006-04-18 18:05:58 +00:00
Chris Lattner	44ea12c5f8	Implement an important entry from README_ALTIVEC: If an altivec predicate compare is used immediately by a branch, don't use a (serializing) MFCR instruction to read the CR6 register, which requires a compare to get it back to CR's. Instead, just branch on CR6 directly. :) For example, for: void foo2(vector float *A, vector float *B) { if (!vec_any_eq(*A, *B)) *B = (vector float){0,0,0,0}; } We now generate: _foo2: mfspr r2, 256 oris r5, r2, 12288 mtspr 256, r5 lvx v2, 0, r4 lvx v3, 0, r3 vcmpeqfp. v2, v3, v2 bne cr6, LBB1_2 ; UnifiedReturnBlock LBB1_1: ; cond_true vxor v2, v2, v2 stvx v2, 0, r4 mtspr 256, r2 blr LBB1_2: ; UnifiedReturnBlock mtspr 256, r2 blr instead of: _foo2: mfspr r2, 256 oris r5, r2, 12288 mtspr 256, r5 lvx v2, 0, r4 lvx v3, 0, r3 vcmpeqfp. v2, v3, v2 mfcr r3, 2 rlwinm r3, r3, 27, 31, 31 cmpwi cr0, r3, 0 beq cr0, LBB1_2 ; UnifiedReturnBlock LBB1_1: ; cond_true vxor v2, v2, v2 stvx v2, 0, r4 mtspr 256, r2 blr LBB1_2: ; UnifiedReturnBlock mtspr 256, r2 blr This implements CodeGen/PowerPC/vec_br_cmp.ll. llvm-svn: 27804	2006-04-18 17:59:36 +00:00
Chris Lattner	519001b0ee	move some stuff around, clean things up llvm-svn: 27802	2006-04-18 17:52:36 +00:00
Chris Lattner	e90fdf3b98	Use vmladduhm to do v8i16 multiplies which is faster and simpler than doing even/odd halves. Thanks to Nate telling me what's what. llvm-svn: 27793	2006-04-18 04:28:57 +00:00
Chris Lattner	5951b60cb4	Implement v16i8 multiply with this code: vmuloub v5, v3, v2 vmuleub v2, v3, v2 vperm v2, v2, v5, v4 This implements CodeGen/PowerPC/vec_mul.ll. With this, v16i8 multiplies are 6.79x faster than before. Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with GCC. Remove the 'integer multiplies' todo from the README file. llvm-svn: 27792	2006-04-18 03:57:35 +00:00
Chris Lattner	4d84b56e64	Lower v8i16 multiply into this code: li r5, lo16(LCPI1_0) lis r6, ha16(LCPI1_0) lvx v4, r6, r5 vmulouh v5, v3, v2 vmuleuh v2, v3, v2 vperm v2, v2, v5, v4 where v4 is: LCPI1_0: ; <16 x ubyte> .byte 2 .byte 3 .byte 18 .byte 19 .byte 6 .byte 7 .byte 22 .byte 23 .byte 10 .byte 11 .byte 26 .byte 27 .byte 14 .byte 15 .byte 30 .byte 31 This is 5.07x faster on the G5 (measured) than lowering to scalar code + loads/stores. llvm-svn: 27789	2006-04-18 03:43:48 +00:00
Chris Lattner	613d7fda64	Custom lower v4i32 multiplies into a cute sequence, instead of having legalize scalarize the sequence into 4 mullw's and a bunch of load/store traffic. This speeds up v4i32 multiplies 4.1x (measured) on a G5. This implements PowerPC/vec_mul.ll llvm-svn: 27788	2006-04-18 03:24:30 +00:00
Chris Lattner	81938fa3db	remove done item llvm-svn: 27778	2006-04-17 21:52:03 +00:00
Chris Lattner	fdecddb741	Don't diddle VRSAVE if no registers need to be added/removed from it. This allows us to codegen functions as: _test_rol: vspltisw v2, -12 vrlw v2, v2, v2 blr instead of: _test_rol: mfvrsave r2, 256 mr r3, r2 mtvrsave r3 vspltisw v2, -12 vrlw v2, v2, v2 mtvrsave r2 blr Testcase here: CodeGen/PowerPC/vec_vrsave.ll llvm-svn: 27777	2006-04-17 21:48:13 +00:00
Chris Lattner	021f521a41	Vectors that are known live-in and live-out are clearly already marked in the vrsave register for the caller. This allows us to codegen a function as: _test_rol: mfspr r2, 256 mr r3, r2 mtspr 256, r3 vspltisw v2, -12 vrlw v2, v2, v2 mtspr 256, r2 blr instead of: _test_rol: mfspr r2, 256 oris r3, r2, 40960 mtspr 256, r3 vspltisw v0, -12 vrlw v2, v0, v0 mtspr 256, r2 blr llvm-svn: 27772	2006-04-17 21:22:06 +00:00
Chris Lattner	a717d4f53b	Prefer to allocate V2-V5 before V0,V1. This lets us generate code like this: vspltisw v2, -12 vrlw v2, v2, v2 instead of: vspltisw v0, -12 vrlw v2, v0, v0 when a function is returning a value. llvm-svn: 27771	2006-04-17 21:19:12 +00:00
Chris Lattner	6b76deffb5	Move some knowledge about registers out of the code emitter into the register info. llvm-svn: 27770	2006-04-17 21:07:20 +00:00
Chris Lattner	face261a94	Use a small table instead of macros to do this conversion. llvm-svn: 27769	2006-04-17 20:59:25 +00:00
Chris Lattner	f2347c31b4	Make sure to check splats of every constant we can, handle splat(31) by being a bit more clever, add support for odd splats from -31 to -17. llvm-svn: 27764	2006-04-17 18:09:22 +00:00
Chris Lattner	cc4222d95b	Teach the ppc backend to use rol and vsldoi to generate splatted constants. This implements vec_constants.ll:test_vsldoi and test_rol llvm-svn: 27760	2006-04-17 17:55:10 +00:00
Chris Lattner	7d66e5a118	add a note llvm-svn: 27758	2006-04-17 17:29:41 +00:00
Chris Lattner	2d8d6c9feb	Make some code more general, adding support for constant formation of several new patterns. llvm-svn: 27754	2006-04-17 06:58:41 +00:00
Chris Lattner	9dd4ebffca	Learn how to make odd splatted constants in range [17,29]. This implements PowerPC/vec_constants.ll:test_29. llvm-svn: 27752	2006-04-17 06:07:44 +00:00
Chris Lattner	72a67a5b1f	Pull some code out into a helper function. Efficiently codegen even splats in the range [-32,30]. This allows us to codegen <30,30,30,30> as: vspltisw v0, 15 vadduwm v2, v0, v0 instead of as a cp load. llvm-svn: 27750	2006-04-17 06:00:21 +00:00
Chris Lattner	5367a73dec	Implement a TODO: for any shuffle that can be viewed as a v4[if]32 shuffle, if it can be implemented in 3 or fewer discrete altivec instructions, codegen it as such. This implements Regression/CodeGen/PowerPC/vec_perf_shuffle.ll llvm-svn: 27748	2006-04-17 05:28:54 +00:00
Chris Lattner	34ec6432f6	Regenerate with adjusted costs llvm-svn: 27746	2006-04-17 05:26:20 +00:00
Chris Lattner	36ceea9e96	Regenerate with correct offset llvm-svn: 27744	2006-04-17 05:08:46 +00:00
Chris Lattner	671f50cf33	Increase the opcodes by one each to disambiguate COPY from VMRGHW. llvm-svn: 27742	2006-04-17 00:47:48 +00:00
Chris Lattner	99ee809cb6	Check in a table, generated by llvm-PerfectShuffle, of optimal shuffles of various 4-element vectors. llvm-svn: 27739	2006-04-17 00:37:02 +00:00
Chris Lattner	d86516991a	Implement a TODO: have the legalizer canonicalize a bunch of operations to one type (v4i32) so that we don't have to write patterns for each type, and so that more CSE opportunities are exposed. llvm-svn: 27731	2006-04-16 01:37:57 +00:00
Chris Lattner	f4126f0db7	Make the BUILD_VECTOR lowering code much more aggressive w.r.t constant vectors. Remove some done items from the todo list. llvm-svn: 27729	2006-04-16 01:01:29 +00:00
Chris Lattner	44245f11c3	Fix a crash when faced with a shuffle vector that has an undef in its mask. llvm-svn: 27726	2006-04-15 23:48:05 +00:00
Chris Lattner	2ede0fef98	Add patterns for matching vnots with bit converted inputs. Most of these will go away when I start using evan's binop type canonicalizer llvm-svn: 27725	2006-04-15 23:45:24 +00:00
Chris Lattner	5c9d357d7c	Allow undef in a shuffle mask llvm-svn: 27714	2006-04-14 23:19:08 +00:00
Chris Lattner	cf80e569f6	Move the rest of the PPCTargetLowering::LowerOperation cases out into separate functions, for simplicity and code clarity. llvm-svn: 27693	2006-04-14 06:01:58 +00:00
Chris Lattner	aacabea404	Pull the VECTOR_SHUFFLE and BUILD_VECTOR lowering code out into separate functions, which makes the code much cleaner :) llvm-svn: 27692	2006-04-14 05:19:18 +00:00
Chris Lattner	569ea9c6dd	Force non-darwin targets to use a static relo model. This fixes PR734, tested by CodeGen/Generic/vector.ll llvm-svn: 27657	2006-04-13 17:10:48 +00:00
Chris Lattner	cec07adf4d	add a note, move an altivec todo to the altivec list. llvm-svn: 27654	2006-04-13 16:48:00 +00:00
Reid Spencer	b08854af39	Add the README files to the distribution. llvm-svn: 27651	2006-04-13 06:39:24 +00:00
Chris Lattner	e087b8e321	Add a new way to match vector constants, which make it easier to bang bits of different types. Codegen spltw(0x7FFFFFFF) and spltw(0x80000000) without a constant pool load, implementing PowerPC/vec_constants.ll:test1. This compiles: typedef float vf __attribute__ ((vector_size (16))); typedef int vi __attribute__ ((vector_size (16))); void test(vi *P1, vi *P2, vf *P3) { *P1 &= (vi){0x80000000,0x80000000,0x80000000,0x80000000}; *P2 &= (vi){0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF}; *P3 = vec_abs((vector float)*P3); } to: _test: mfspr r2, 256 oris r6, r2, 49152 mtspr 256, r6 vspltisw v0, -1 vslw v0, v0, v0 lvx v1, 0, r3 vand v1, v1, v0 stvx v1, 0, r3 lvx v1, 0, r4 vandc v1, v1, v0 stvx v1, 0, r4 lvx v1, 0, r5 vandc v0, v1, v0 stvx v0, 0, r5 mtspr 256, r2 blr instead of (with two constant pool entries): _test: mfspr r2, 256 oris r6, r2, 49152 mtspr 256, r6 li r6, lo16(LCPI1_0) lis r7, ha16(LCPI1_0) li r8, lo16(LCPI1_1) lis r9, ha16(LCPI1_1) lvx v0, r7, r6 lvx v1, 0, r3 vand v0, v1, v0 stvx v0, 0, r3 lvx v0, r9, r8 lvx v1, 0, r4 vand v1, v1, v0 stvx v1, 0, r4 lvx v1, 0, r5 vand v0, v1, v0 stvx v0, 0, r5 mtspr 256, r2 blr GCC produces (with 2 cp entries): _test: mfspr r0,256 stw r0,-4(r1) oris r0,r0,0xc00c mtspr 256,r0 lis r2,ha16(LC0) lis r9,ha16(LC1) la r2,lo16(LC0)(r2) lvx v0,0,r3 lvx v1,0,r5 la r9,lo16(LC1)(r9) lwz r12,-4(r1) lvx v12,0,r2 lvx v13,0,r9 vand v0,v0,v12 stvx v0,0,r3 vspltisw v0,-1 vslw v12,v0,v0 vandc v1,v1,v12 stvx v1,0,r5 lvx v0,0,r4 vand v0,v0,v13 stvx v0,0,r4 mtspr 256,r12 blr llvm-svn: 27624	2006-04-12 19:07:14 +00:00
Chris Lattner	ce6e988fa6	Rename get_VSPLI_elt -> get_VSPLTI_elt Canonicalize BUILD_VECTOR's that match VSPLTI's into a single type for each form, eliminating a bunch of Pat patterns in the .td file and allowing us to CSE stuff more aggressively. This implements PowerPC/buildvec_canonicalize.ll:VSPLTI llvm-svn: 27614	2006-04-12 17:37:20 +00:00
Chris Lattner	602d86f7af	Ensure that zero vectors are always v4i32, which forces them to CSE with each other. This implements CodeGen/PowerPC/vxor-canonicalize.ll llvm-svn: 27609	2006-04-12 16:53:28 +00:00
Nate Begeman	ccd6ea1913	Fix SingleSource/UnitTests/Vector/sumarray-dbl llvm-svn: 27594	2006-04-11 19:44:43 +00:00
Nate Begeman	786d44f822	Fix PR727, correctly handling large stack alignments on ppc llvm-svn: 27593	2006-04-11 19:29:21 +00:00
Chris Lattner	0e63e916b3	we have a shuffle instr, add an example. llvm-svn: 27592	2006-04-11 18:47:03 +00:00
Jim Laskey	1e0cbe4158	Suppress debug label when not debug. llvm-svn: 27588	2006-04-11 08:11:53 +00:00
Chris Lattner	e12152a64b	Vector function results go into V2 according to GCC. The darwin ABI doc doesn't say where they go :-/ llvm-svn: 27579	2006-04-11 01:38:39 +00:00
Chris Lattner	5d1acb831a	Move some return-handling code from lowerarguments to the ISD::RET handling stuff. No functionality change. llvm-svn: 27577	2006-04-11 01:21:43 +00:00
Chris Lattner	3c6e4a1dc9	properly mark vector selects as expanded to select_cc llvm-svn: 27544	2006-04-08 22:59:15 +00:00
Chris Lattner	2ffa288a23	Add VRRC select support llvm-svn: 27543	2006-04-08 22:45:08 +00:00
Nate Begeman	6cdc599d05	Disable switch lowering for targets based on the selection dag isel, letting the code generator handle them directly. llvm-svn: 27539	2006-04-08 19:46:55 +00:00
Chris Lattner	8234bfe18e	Implement PowerPC/CodeGen/vec_splat.ll:spltish to use vspltish instead of a constant pool load. llvm-svn: 27538	2006-04-08 07:14:26 +00:00
Chris Lattner	e8defcff7d	Change the interface to the predicate that determines if vsplti* can be used. No functionality changes. llvm-svn: 27536	2006-04-08 06:46:53 +00:00
Jim Laskey	fabb0ba736	Make sure that debug labels are defined within the same section and after the entry point of a function. llvm-svn: 27494	2006-04-07 20:44:42 +00:00
Jim Laskey	b93bc75add	Foundation for call frame information. llvm-svn: 27491	2006-04-07 16:34:46 +00:00
Chris Lattner	db7dfe8c61	Add an item llvm-svn: 27470	2006-04-06 23:16:19 +00:00
Chris Lattner	a390188fd4	Make sure to return the result in the right type. llvm-svn: 27469	2006-04-06 23:12:19 +00:00
Chris Lattner	c0680ae07e	Match vpku[hw]um(x,x). Convert vsldoi(x,x) to work the same way other (x,x) cases work. llvm-svn: 27467	2006-04-06 22:28:36 +00:00
Chris Lattner	a52d88ee89	Add support for matching vmrg(x,x) patterns llvm-svn: 27463	2006-04-06 22:02:42 +00:00
Chris Lattner	300076cbd8	Pattern match vmrg* instructions, which are now lowered by the CFE into shuffles. llvm-svn: 27457	2006-04-06 21:11:54 +00:00
Chris Lattner	6cf87c1b01	remove two done items llvm-svn: 27453	2006-04-06 19:19:38 +00:00
Chris Lattner	2875bb116e	Support pattern matching vsldoi(x,y) and vsldoi(x,x), which allows the f.e. to lower it and LLVM to have one fewer intrinsic. This implements CodeGen/PowerPC/vec_shuffle.ll llvm-svn: 27450	2006-04-06 18:26:28 +00:00
Chris Lattner	10fa7be550	Compile the vpkuhum/vpkuwum intrinsics into vpkuhum/vpkuwum instead of into vperm with a perm mask lvx'd from the constant pool. llvm-svn: 27448	2006-04-06 17:23:16 +00:00
Chris Lattner	7f13e50435	Add all of the data stream intrinsics and instructions. woo llvm-svn: 27442	2006-04-05 22:27:14 +00:00
Chris Lattner	338945e669	Fix a typo llvm-svn: 27440	2006-04-05 20:15:25 +00:00
Chris Lattner	d1b47b18ed	Fix CodeGen/PowerPC/2006-04-05-splat-ish.ll llvm-svn: 27439	2006-04-05 17:39:25 +00:00
Evan Cheng	9e56e97205	Fallthrough to expand if a VECTOR_SHUFFLE cannot be custom lowered. llvm-svn: 27433	2006-04-05 06:09:26 +00:00
Chris Lattner	ee971bedf2	add vsl llvm-svn: 27425	2006-04-05 01:16:22 +00:00
Chris Lattner	993209029f	add vmladduhm llvm-svn: 27423	2006-04-05 00:49:48 +00:00
Chris Lattner	66c3b75644	Add m[tf]vscr instructions. llvm-svn: 27421	2006-04-05 00:03:57 +00:00
Chris Lattner	10394b1c42	add a note llvm-svn: 27419	2006-04-04 23:45:11 +00:00
Chris Lattner	e7a52b473f	Add missing byte merges. llvm-svn: 27418	2006-04-04 23:43:56 +00:00
Chris Lattner	ab137b431f	Add FP -> Int Conversions llvm-svn: 27417	2006-04-04 23:25:02 +00:00
Chris Lattner	6cf881590f	add average intrinsics llvm-svn: 27416	2006-04-04 23:14:00 +00:00
Chris Lattner	59c4add58a	add a note llvm-svn: 27414	2006-04-04 22:43:55 +00:00
Chris Lattner	d1483ca1ad	Fix some broken logic that would cause us to codegen {2147483647,2147483647,2147483647,2147483647} as 'vspltisb v0, -1'. llvm-svn: 27413	2006-04-04 22:28:35 +00:00
Chris Lattner	4e99e6dfdd	Ask legalize to promote all vector shuffles to be v16i8 instead of having to handle all 4 PPC vector types. This simplifies the matching code and allows us to eliminate a bunch of patterns. This also adds cases we were missing, such as CodeGen/PowerPC/vec_splat.ll:splat_h. llvm-svn: 27400	2006-04-04 17:25:31 +00:00
Chris Lattner	2bf9c8cc18	Plug in the byte and short splats llvm-svn: 27387	2006-04-04 00:05:13 +00:00
Chris Lattner	0128e4d335	Revert accidentally committed hunks. llvm-svn: 27386	2006-04-03 23:58:04 +00:00
Chris Lattner	57b9e01b3e	Make sure to mark unsupported SCALAR_TO_VECTOR operations as expand. llvm-svn: 27385	2006-04-03 23:55:43 +00:00
Chris Lattner	eb9684f6a4	Force use of a frame-pointer if there is anything on the stack that is aligned more than the OS keeps the stack aligned. llvm-svn: 27381	2006-04-03 22:03:29 +00:00
Chris Lattner	c65511b05c	Add the full set of min/max instructions llvm-svn: 27372	2006-04-03 15:58:28 +00:00
Chris Lattner	fa82c33ae7	add a note llvm-svn: 27360	2006-04-02 07:20:00 +00:00
Chris Lattner	8ba4723c74	Inform the dag combiner that the predicate compares only return a low bit. llvm-svn: 27359	2006-04-02 06:26:07 +00:00
Chris Lattner	8967316b8c	Remove done item llvm-svn: 27351	2006-04-02 05:28:54 +00:00
Chris Lattner	9c24ec6de5	add a note llvm-svn: 27348	2006-04-02 03:59:11 +00:00
Chris Lattner	da4217646a	Custom lower all BUILD_VECTOR's so that we can compile vec_splat_u8(8) into "vspltisb v0, 8" instead of a constant pool load. llvm-svn: 27335	2006-04-02 00:43:36 +00:00
Chris Lattner	38318b2706	Implement vnot using VNOR instead of using 'vspltisb v0, -1' and vxor llvm-svn: 27331	2006-04-01 22:41:47 +00:00
Chris Lattner	32bb17a5f3	Shrinkify some more intrinsic definitions. llvm-svn: 27322	2006-03-31 22:41:56 +00:00
Chris Lattner	12e9ce7104	Pull operand asm string into base class, shrinkifying intrinsic definitions. No functionality change. llvm-svn: 27320	2006-03-31 22:34:05 +00:00
Chris Lattner	3d6e5f8a05	Fix 80 column violations :) llvm-svn: 27315	2006-03-31 21:57:36 +00:00
Chris Lattner	d66dd2a4ee	fix a pasto llvm-svn: 27308	2006-03-31 21:19:06 +00:00
Chris Lattner	28219f34bc	Add vperm support for all datatypes llvm-svn: 27307	2006-03-31 20:00:35 +00:00
Chris Lattner	336d6646ab	Rearrange code a bit llvm-svn: 27306	2006-03-31 19:52:36 +00:00
Chris Lattner	786f782398	Add, sub and shuffle are legal for all vector types llvm-svn: 27305	2006-03-31 19:48:58 +00:00
Chris Lattner	d27ced882b	add a note llvm-svn: 27302	2006-03-31 19:00:22 +00:00
Chris Lattner	e3774da014	note to self: save file, then check it in llvm-svn: 27291	2006-03-31 06:04:53 +00:00
Chris Lattner	95d358dbdb	Implement an item from the readme, folding vcmp/vcmp. instructions with identical instructions into a single instruction. For example, for: void test(vector float *x, vector float *y, int *P) { int v = vec_any_out(*x, *y); *x = (vector float)vec_cmpb(*x, *y); *P = v; } we now generate: _test: mfspr r2, 256 oris r6, r2, 49152 mtspr 256, r6 lvx v0, 0, r4 lvx v1, 0, r3 vcmpbfp. v0, v1, v0 mfcr r4, 2 stvx v0, 0, r3 rlwinm r3, r4, 27, 31, 31 xori r3, r3, 1 stw r3, 0(r5) mtspr 256, r2 blr instead of: _test: mfspr r2, 256 oris r6, r2, 57344 mtspr 256, r6 lvx v0, 0, r4 lvx v1, 0, r3 vcmpbfp. v2, v1, v0 mfcr r4, 2 ** vcmpbfp v0, v1, v0 rlwinm r4, r4, 27, 31, 31 stvx v0, 0, r3 xori r3, r4, 1 stw r3, 0(r5) mtspr 256, r2 blr Testcase here: CodeGen/PowerPC/vcmp-fold.ll llvm-svn: 27290	2006-03-31 06:02:07 +00:00
Chris Lattner	560f734320	compactify some more instruction definitions llvm-svn: 27288	2006-03-31 05:38:32 +00:00
Chris Lattner	2c3d6bdb55	Compactify comparisons. llvm-svn: 27287	2006-03-31 05:32:57 +00:00
Chris Lattner	e330741a6c	Lower vector compares to VCMP nodes, just like we lower vector comparison predicates to VCMPo nodes. llvm-svn: 27285	2006-03-31 05:13:27 +00:00
Chris Lattner	a7a7c035b3	These are done llvm-svn: 27284	2006-03-31 04:53:21 +00:00
Chris Lattner	a31d719e0a	Mark INSERT_VECTOR_ELT as expand llvm-svn: 27276	2006-03-31 01:48:55 +00:00
Chris Lattner	87d3a2e045	Add the rest of the vmul instructions and the vmulsum* instructions. llvm-svn: 27268	2006-03-30 23:39:06 +00:00
Chris Lattner	22b7e551f1	Use a new tblgen feature to significantly shrinkify instruction definitions that directly correspond to intrinsics. llvm-svn: 27266	2006-03-30 23:21:27 +00:00
Chris Lattner	6aca6013d2	Add a bunch of new instructions for intrinsics. llvm-svn: 27265	2006-03-30 23:07:36 +00:00
Chris Lattner	1a773f8f18	add a note llvm-svn: 27243	2006-03-29 00:24:13 +00:00
Chris Lattner	93559450b8	add a note llvm-svn: 27227	2006-03-28 18:56:23 +00:00
Jim Laskey	eb38a3e83a	Expose base register for DwarfWriter. Refactor code accordingly. llvm-svn: 27225	2006-03-28 13:48:33 +00:00
Nate Begeman	d432d66cc8	Fix a couple typos llvm-svn: 27216	2006-03-28 04:18:18 +00:00
Nate Begeman	5a82c8ccbd	Add a few more altivec intrinsics llvm-svn: 27215	2006-03-28 04:15:58 +00:00
Chris Lattner	a570305421	implement a bunch more intrinsics. llvm-svn: 27209	2006-03-28 02:29:37 +00:00
Chris Lattner	ac98e20cc9	Use normal lvx for scalar_to_vector instead of lve*x. They do the exact same thing and we have a dag node for the former. llvm-svn: 27205	2006-03-28 01:43:22 +00:00
Chris Lattner	d5da541d42	Tblgen doesn't like multiple SDNode<> definitions that map to the same enum value. Split them into separate enums. llvm-svn: 27201	2006-03-28 00:40:33 +00:00
Jim Laskey	8688957c53	Translate llvm target registers to dwarf register numbers properly. llvm-svn: 27180	2006-03-27 20:18:45 +00:00
Chris Lattner	dab8425129	Add a bunch of notes from my journey thus far. llvm-svn: 27170	2006-03-27 07:41:00 +00:00
Chris Lattner	f1d6a9483f	Split out altivec notes into their own README llvm-svn: 27168	2006-03-27 07:04:16 +00:00
Chris Lattner	4b0fc38fe7	Fix the JIT encoding of VSEL llvm-svn: 27160	2006-03-27 03:34:17 +00:00
Chris Lattner	b5efa3e0f5	Fix the JIT encoding of VSPLTI* llvm-svn: 27159	2006-03-27 03:28:57 +00:00
Nate Begeman	3d518334b9	SelectionDAGISel can now natively handle Switch instructions, in the same manner that the LowerSwitch LLVM to LLVM pass does: emitting a binary search tree of basic blocks. The new approach has several advantages: it is faster, it generates significantly smaller code in many cases, and it paves the way for implementing dense switch tables as a jump table by handling switches directly in the instruction selector. This functionality is currently only enabled on x86, but should be safe for every target. In anticipation of making it the default, the cfg is now properly updated in the x86, ppc, and sparc select lowering code. llvm-svn: 27156	2006-03-27 01:32:24 +00:00
Chris Lattner	03ad35fd49	add vsel llvm-svn: 27153	2006-03-26 22:38:43 +00:00
Chris Lattner	65a455b060	Codegen vector predicate compares. llvm-svn: 27151	2006-03-26 10:06:40 +00:00
Evan Cheng	b17bbf8ccb	Remove PPC:isZeroVector, use ISD::isBuildVectorAllZeros instead llvm-svn: 27149	2006-03-26 09:52:32 +00:00
Chris Lattner	f0c36b99e6	Add all of the altivec comparison instructions. Add patterns for the non-predicate altivec compare intrinsics. llvm-svn: 27143	2006-03-26 04:57:17 +00:00
Chris Lattner	4e0a78ea30	Add and 8/16-bit adds, add all integer subtracts, add saturating subtract intrinsics. llvm-svn: 27142	2006-03-26 02:39:02 +00:00
Chris Lattner	d33ef7a1bc	implement the vsldoi intrinsic. llvm-svn: 27139	2006-03-26 00:41:48 +00:00
Chris Lattner	7d557e00f3	fix the pattern for vandc, it's NOT vnand llvm-svn: 27136	2006-03-25 23:10:40 +00:00
Chris Lattner	88a0c65463	add patterns for VANDC/VNOR, implementing CodeGen/PowerPC/eqv-andc-orc-nor.ll:VNOR/VANDC llvm-svn: 27135	2006-03-25 23:05:29 +00:00
Chris Lattner	f80b39f9b1	Add some logical operations llvm-svn: 27127	2006-03-25 22:16:05 +00:00
Chris Lattner	d2823658b4	implement a bunch of intrinsics llvm-svn: 27118	2006-03-25 08:01:02 +00:00
Chris Lattner	cb5f9269a9	Move all Altivec stuff out into a new PPCInstrAltivec.td file. Add a bunch of patterns for different datatypes, e.g. bit_convert, undef and zero vector support. llvm-svn: 27117	2006-03-25 07:51:43 +00:00
Chris Lattner	57064915a6	Add some basic patterns for other datatypes llvm-svn: 27116	2006-03-25 07:39:07 +00:00
Chris Lattner	7f5fba9c67	add all supported formats to the vector register file llvm-svn: 27115	2006-03-25 07:36:56 +00:00
Chris Lattner	2fa3a6c436	Add support for __builtin_altivec_vnmsubfp /vmaddfp llvm-svn: 27112	2006-03-25 07:05:55 +00:00
Chris Lattner	e199d55073	#include Intrinsics.h into all dag isels llvm-svn: 27109	2006-03-25 06:47:10 +00:00
Chris Lattner	0899b16b2d	Codegen things like: <int -1, int -1, int -1, int -1> and <int 65537, int 65537, int 65537, int 65537> Using things like: vspltisb v0, -1 and: vspltish v0, 1 instead of using constant pool loads. This implements CodeGen/PowerPC/vec_splat.ll:splat_imm_i{32\|16}. llvm-svn: 27106	2006-03-25 06:12:06 +00:00
Jim Laskey	1716e53341	Add dwarf register numbering to register data. llvm-svn: 27081	2006-03-24 21:15:58 +00:00
Chris Lattner	8840036091	add another note llvm-svn: 27077	2006-03-24 20:04:27 +00:00
Chris Lattner	21abff3712	Fix a bad JIT encoding of VPERM. Why is VPERM D,A,B,C but vfmadd is D,A,C,B ?? llvm-svn: 27069	2006-03-24 18:24:43 +00:00
Chris Lattner	3133dafd4b	Like the comment says, prefer to use the implicit add done by [r+r] addressing modes than emitting an explicit add and using a base of r0. This implements Regression/CodeGen/PowerPC/mem-rr-addr-mode.ll llvm-svn: 27068	2006-03-24 17:58:06 +00:00
Chris Lattner	303dc30593	Disable the i32->float G5 optimization. It is unsafe, as documented in the comment. This fixes 177.mesa, and McCat/09-vor with the td scheduler. llvm-svn: 27060	2006-03-24 07:53:47 +00:00
Chris Lattner	ba4966c16c	add support for using vxor to build zero vectors. This implements Regression/CodeGen/PowerPC/vec_zero.ll llvm-svn: 27059	2006-03-24 07:48:08 +00:00
Chris Lattner	ace2d0d227	Gabor points out that we can't spell. :) llvm-svn: 27049	2006-03-24 07:12:19 +00:00
Chris Lattner	2e5162fa6e	add a note llvm-svn: 27000	2006-03-23 21:28:44 +00:00
Chris Lattner	974982c89c	Add PPC vector bit-convert support llvm-svn: 26995	2006-03-23 19:54:27 +00:00
Jim Laskey	cec9c18c62	Add support to locate local variables in frames (early version.) llvm-svn: 26994	2006-03-23 18:12:57 +00:00
Jim Laskey	f3cc740d75	Change interface to DwarfWriter. llvm-svn: 26991	2006-03-23 18:09:44 +00:00
Chris Lattner	89e0790edb	Eliminate IntrinsicLowering from TargetMachine. Make the CBE and V9 backends create their own, since they're the only ones that use it. llvm-svn: 26974	2006-03-23 05:43:16 +00:00
Chris Lattner	5141ebb2c4	This has been implemented. Tweak it into another note llvm-svn: 26944	2006-03-22 05:33:23 +00:00
Chris Lattner	f84f3bf95b	When possible, custom lower 32-bit SINT_TO_FP to this: _foo2: extsw r2, r3 std r2, -8(r1) lfd f0, -8(r1) fcfid f0, f0 frsp f1, f0 blr instead of this: _foo2: lis r2, ha16(LCPI2_0) lis r4, 17200 xoris r3, r3, 32768 stw r3, -4(r1) stw r4, -8(r1) lfs f0, lo16(LCPI2_0)(r2) lfd f1, -8(r1) fsub f0, f1, f0 frsp f1, f0 blr This speeds up Misc/pi from 2.44s->2.09s with LLC and from 3.01->2.18s with llcbeta (16.7% and 38.1% respectively). llvm-svn: 26943	2006-03-22 05:30:33 +00:00
Chris Lattner	cfbce5186a	Add support for "ri" addressing modes where the immediate is a 14-bit field which is shifted left two bits before use. Instructions like STD use this addressing mode. llvm-svn: 26942	2006-03-22 05:26:03 +00:00
Chris Lattner	2e606dc60f	Fix the JIT encoding of the VAForm_1 instructions, including vmaddfp llvm-svn: 26935	2006-03-22 01:44:36 +00:00
Chris Lattner	31a93c7740	These targets don't support EXTRACT_VECTOR_ELT, though, in time, X86 will. llvm-svn: 26930	2006-03-21 20:51:05 +00:00
Chris Lattner	140be98ab8	Don't emit pseudo instructions! llvm-svn: 26926	2006-03-21 20:19:37 +00:00
Nate Begeman	4bd73df4bd	Update readme llvm-svn: 26924	2006-03-21 18:58:20 +00:00
Chris Lattner	414fed4108	Print absolute memory references like this: lwz r2, 8(0) instead of this: lwz r2, 8(r0) This fixes the llc/llc-beta failures on PPC last night. llvm-svn: 26922	2006-03-21 17:21:13 +00:00
Chris Lattner	6417236c41	With Evan's latest tblgen patch, this code is obsolete, thanks Evan! llvm-svn: 26917	2006-03-21 06:37:40 +00:00
Chris Lattner	acb2506622	When codegen'ing vector MUL using VFMADD, add the 0, don't mul the 0. llvm-svn: 26913	2006-03-21 00:51:38 +00:00
Chris Lattner	9e611a25c7	minor note llvm-svn: 26912	2006-03-21 00:47:09 +00:00
Chris Lattner	cdc4657988	Handle constant addresses more efficiently, folding the low bits into the disp field of the load/store if possible. This compiles CodeGen/PowerPC/load-constant-addr.ll to: _test: lis r2, 2838 lfs f1, 26848(r2) blr instead of: _test: lis r2, 2838 ori r2, r2, 26848 lfs f1, 0(r2) blr llvm-svn: 26908	2006-03-20 22:38:22 +00:00
Chris Lattner	a498dd25d9	remove dead variable llvm-svn: 26907	2006-03-20 22:37:23 +00:00
Chris Lattner	978628896b	Fix a couple of bugs in permute/splat generate, thanks to Nate for actually figuring these out! :) llvm-svn: 26904	2006-03-20 18:26:51 +00:00
Chris Lattner	5c994b8c63	reenable this hack, the tblgen version isn't quite ready llvm-svn: 26902	2006-03-20 17:54:43 +00:00
Chris Lattner	fb0e160aa5	Fix the pattern for VADDUWM, add i32 splat llvm-svn: 26901	2006-03-20 17:51:58 +00:00
Evan Cheng	57da1afbc8	Use tblgen'd VECTOR_SHUFFLE selection code. llvm-svn: 26900	2006-03-20 08:14:16 +00:00
Chris Lattner	dc3605efdb	Add support for generating vspltw, instead of a vperm instruction with a constant pool load. This generates significantly nicer code for splats. When tblgen gets bugfixed, we can remove the custom selection code. llvm-svn: 26898	2006-03-20 06:51:10 +00:00
Chris Lattner	8faa2cf693	Implement PPC::isSplatShuffleMask and PPC::getVSPLTImmediate. llvm-svn: 26897	2006-03-20 06:37:44 +00:00
Chris Lattner	4b7aa59bbc	fix duplicate definition errors llvm-svn: 26896	2006-03-20 06:33:01 +00:00
Chris Lattner	1cdeda1c5a	Check in some intermediate code that adds a skeleton for matching vsplt* instructions llvm-svn: 26894	2006-03-20 06:15:45 +00:00
Chris Lattner	c230af9810	fix typo llvm-svn: 26889	2006-03-20 05:05:55 +00:00
Chris Lattner	bea056ecf2	add vsplat instructions, fix sched description for vperm llvm-svn: 26888	2006-03-20 04:47:33 +00:00
Chris Lattner	0e56cf0d94	Custom lower arbitrary VECTOR_SHUFFLE's to VPERM. TODO: leave specific ones as VECTOR_SHUFFLE's and turn them into specialized operations like vsplt* llvm-svn: 26887	2006-03-20 01:53:53 +00:00
Chris Lattner	65e6e12dca	Claim to have v16i8 for perm masks llvm-svn: 26886	2006-03-20 01:53:02 +00:00
Chris Lattner	6f502da274	add the vperm instruction llvm-svn: 26883	2006-03-20 01:00:56 +00:00
Chris Lattner	ada41aad4d	Add a note about the MUL -> FMADD vector bug. llvm-svn: 26874	2006-03-19 22:08:08 +00:00
Chris Lattner	789570bafb	Custom lower SCALAR_TO_VECTOR into lve*x. llvm-svn: 26868	2006-03-19 06:55:52 +00:00
Chris Lattner	80f9f7138a	PPC doesn't have SCALAR_TO_VECTOR llvm-svn: 26865	2006-03-19 06:17:19 +00:00
Chris Lattner	89bc332152	add support for vector undef llvm-svn: 26863	2006-03-19 06:10:09 +00:00
Chris Lattner	a9b4a2ab99	minor fixes llvm-svn: 26857	2006-03-19 05:43:01 +00:00
Chris Lattner	d3910ca755	notes llvm-svn: 26856	2006-03-19 05:33:30 +00:00
Chris Lattner	b46a4c28ad	we don't use lmw/stmw. When we want them they are easy enough to add llvm-svn: 26853	2006-03-19 04:33:37 +00:00
Chris Lattner	1bd0aaf2b8	rename these nodes llvm-svn: 26848	2006-03-19 01:13:28 +00:00
Nate Begeman	793c8136ae	Fix subfic to match subc by default instead of sub so that it is correctly cost-modeled as producing a flag. This fixes the test I just added for neg llvm-svn: 26835	2006-03-17 22:41:37 +00:00
Nate Begeman	42736d46b2	Remove BRTWOWAY* Make the PPC backend not dependent on BRTWOWAY_CC and make the branch selector smarter about the code it generates, fixing a case in the readme. llvm-svn: 26814	2006-03-17 01:40:33 +00:00
Chris Lattner	87dbd49cbe	remove dead variable llvm-svn: 26813	2006-03-16 23:52:08 +00:00
Nate Begeman	63c4456867	Notes on how to kill the eeevil brtwoway, and make ppc branch selector more target independent, generate better code, and be less conservative. llvm-svn: 26809	2006-03-16 22:37:48 +00:00
Chris Lattner	f2008cb73b	Strangely, calls clobber call-clobbered vector regs. Whodathoughtit? llvm-svn: 26808	2006-03-16 22:35:59 +00:00
Chris Lattner	8a756c5171	add a note llvm-svn: 26807	2006-03-16 22:25:55 +00:00
Chris Lattner	57773fdac1	teach the ppc backend how to spill/reload vector regs llvm-svn: 26806	2006-03-16 22:24:02 +00:00
Chris Lattner	661ee5d3c1	add callee saved vector regs llvm-svn: 26805	2006-03-16 22:07:06 +00:00
Evan Cheng	cad75d9f0c	Added a way for TargetLowering to specify what values can be used as the scale component of the target addressing mode. llvm-svn: 26802	2006-03-16 21:47:42 +00:00
Chris Lattner	7f5361757b	in functions that use a lot of callee saved regs, this can be more than 5 instructions away. llvm-svn: 26801	2006-03-16 21:31:45 +00:00
Chris Lattner	bf153651b1	Add support for copying registers. still needed: spilling and reloading them llvm-svn: 26800	2006-03-16 20:03:58 +00:00
Nate Begeman	cbca1b3d14	Another case we could do better on. llvm-svn: 26795	2006-03-16 18:50:44 +00:00
Chris Lattner	b5d0896994	Save/restore VRSAVE once per function, not once per block. llvm-svn: 26793	2006-03-16 18:25:23 +00:00
Nate Begeman	e371cb595a	Update scheduling info for vrsave instruction llvm-svn: 26776	2006-03-15 05:25:05 +00:00
Chris Lattner	392087f5bd	Fix an off by one error that caused PPC LLC failures last night. llvm-svn: 26758	2006-03-14 17:56:49 +00:00
Evan Cheng	ae7469b2c5	PPC LSR pass should use target lowering hooks. llvm-svn: 26743	2006-03-13 23:56:51 +00:00
Evan Cheng	7ec94f2ff7	Added getTargetLowering() to TargetMachine. Refactored targets to support this. llvm-svn: 26742	2006-03-13 23:20:37 +00:00
Chris Lattner	d0505331d2	For functions that use vector registers, save VRSAVE, mark used registers, and update it on entry to each function, then restore it on exit. This compiles: void func(vfloat *a, vfloat *b, vfloat *c) { *a = *b * *c + *c; } to this: _func: mfspr r2, 256 oris r6, r2, 49152 mtspr 256, r6 lvx v0, 0, r5 lvx v1, 0, r4 vmaddfp v0, v1, v0, v0 stvx v0, 0, r3 mtspr 256, r2 blr GCC produces this (which has additional stack accesses): _func: mfspr r0,256 stw r0,-4(r1) oris r0,r0,0xc000 mtspr 256,r0 lvx v0,0,r5 lvx v1,0,r4 lwz r12,-4(r1) vmaddfp v0,v0,v1,v0 stvx v0,0,r3 mtspr 256,r12 blr llvm-svn: 26733	2006-03-13 21:52:10 +00:00
Chris Lattner	3aff8e6acf	Fix a couple of bugs that broke the alpha tester build llvm-svn: 26722	2006-03-13 05:23:59 +00:00
Chris Lattner	9898674f99	Handle cracked instructions in dispatch group formation. llvm-svn: 26721	2006-03-13 05:20:04 +00:00
Chris Lattner	ba10d4e4ab	Mark instructions that are cracked by the PPC970 decoder as such. llvm-svn: 26720	2006-03-13 05:15:10 +00:00
Chris Lattner	a278639f29	Several big changes: 1. Use flags on the instructions in the .td file to indicate the PPC970 unit type instead of a table in the .cpp file. Much cleaner. 2. Change the hazard recognizer to build d-groups according to the actual algorithm used, not my flawed understanding of it. 3. Model "must be in the first slot" and "must be the only instr in a group" accurately. llvm-svn: 26719	2006-03-12 09:13:49 +00:00
Chris Lattner	19b93158c1	blr is a branch too llvm-svn: 26710	2006-03-11 21:49:49 +00:00


1509 Commits