llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 14:33:02 +02:00

Author	SHA1	Message	Date
Dan Gohman	98ca33cb59	Fix the handling of va_copy on x86-64. As of llvm-gcc r49920 llvm-gcc is now lowering va_copy on x86-64, so this completes the fix for PR2230. llvm-svn: 49922	2008-04-18 20:55:41 +00:00
Roman Levenstein	728d59166f	Ongoing work on improving the instruction selection infrastructure: Rename SDOperandImpl back to SDOperand. Introduce the SDUse class that represents a use of the SDNode referred by an SDOperand. Now it is more similar to Use/Value classes. Patch is approved by Dan Gohman. llvm-svn: 49795	2008-04-16 16:15:27 +00:00
Dan Gohman	be8f2b452b	Add support for the form of the SSE41 extractps instruction that puts its result in a 32-bit GPR. llvm-svn: 49762	2008-04-16 02:32:24 +00:00
Dan Gohman	cf79877623	Recreate the size SDNode instead of reusing the old one in the x86 memcpy lowering code; this ensures that the size node has the desired result type. This fixes a regression from r49572 with @llvm.memcpy.i64 on x86-32. llvm-svn: 49761	2008-04-16 01:32:32 +00:00
Dan Gohman	8d46278998	Fix const-correctness issues with the SrcValue handling in the memory intrinsic expansion code. llvm-svn: 49666	2008-04-14 17:55:48 +00:00
Arnold Schwaighofer	82af0e6a43	This patch corrects the handling of byval arguments for tailcall optimized x86-64 (and x86) calls so that they work (... at least for my test cases). Should fix the following problems: Problem 1: When i introduced the optimized handling of arguments for tail called functions (using a sequence of copyto/copyfrom virtual registers instead of always lowering to top of the stack) i did not handle byval arguments correctly e.g they did not work at all :). Problem 2: On x86-64 after the arguments of the tail called function are moved to their registers (which include ESI/RSI etc), tail call optimization performs byval lowering which causes xSI,xDI, xCX registers to be overwritten. This is handled in this patch by moving the arguments to virtual registers first and after the byval lowering the arguments are moved from those virtual registers back to RSI/RDI/RCX. llvm-svn: 49584	2008-04-12 18:11:06 +00:00
Dan Gohman	15edbf989f	Drop ISD::MEMSET, ISD::MEMMOVE, and ISD::MEMCPY, which are not Legal on any current target and aren't optimized in DAGCombiner. Instead of using intermediate nodes, expand the operations, choosing between simple loads/stores, target-specific code, and library calls, immediately. Previously, the code to emit optimized code for these operations was only used at initial SelectionDAG construction time; now it is used at all times. This fixes some cases where rep;movs was being used for small copies where simple loads/stores would be better. This also cleans up code that checks for alignments less than 4; let the targets make that decision instead of doing it in target-independent code. This allows x86 to use rep;movs in low-alignment cases. Also, this fixes a bug that resulted in the use of rep;stos for memsets of 0 with non-constant memory size when the alignment was at least 4. It's better to use the library in this case, which can be significantly faster when the size is large. This also preserves more SourceValue information when memory intrinsics are lowered into simple loads/stores. llvm-svn: 49572	2008-04-12 04:36:06 +00:00
Dan Gohman	41f9d24d52	Fix a bug that prevented x86-64 from using rep.movsq for 8-byte-aligned data. llvm-svn: 49571	2008-04-12 02:35:39 +00:00
Dan Gohman	b3a511b236	Make isVectorClearMaskLegal's operand list const. llvm-svn: 49446	2008-04-09 20:09:42 +00:00
Roman Levenstein	b40d332929	Re-commit of the r48822, where the infinite looping problem discovered by Dan Gohman is fixed. llvm-svn: 49330	2008-04-07 10:06:32 +00:00
Evan Cheng	4d7b2ab16f	Favors pshufd over shufps when shuffling elements from one vector. pshufd is faster than shufps. llvm-svn: 49244	2008-04-05 00:30:36 +00:00
Evan Cheng	497c607fae	Backing out 48222 temporarily. llvm-svn: 49124	2008-04-03 03:13:16 +00:00
Dan Gohman	a3e01dc1ec	Don't use __bzero for memset if the second argument isn't zero. llvm-svn: 49050	2008-04-01 20:56:18 +00:00
Dan Gohman	168b2b1300	Speculatively micro-optimize memory-zeroing calls on Darwin 10. llvm-svn: 49048	2008-04-01 20:38:36 +00:00
Dale Johannesen	1336104c02	Accept 'y' constraint (MMX) in inline asm. llvm-svn: 49011	2008-04-01 00:57:48 +00:00
Dan Gohman	227e702cae	Fix a tokenfactor node to use the load chain rather than the load value. This fixes PR2177. llvm-svn: 48932	2008-03-28 23:45:16 +00:00
Roman Levenstein	55b8822511	Use a linked data structure for the uses lists of an SDNode, just like LLVM Value/Use does and MachineRegisterInfo/MachineOperand does. This allows constant time for all uses list maintenance operations. The idea was suggested by Chris. Reviewed by Evan and Dan. Patch is tested and approved by Dan. On normal use-cases compilation speed is not affected. On very big basic blocks there are compilation speedups in the range of 15-20% or even better. llvm-svn: 48822	2008-03-26 12:39:26 +00:00
Evan Cheng	dbdf48276a	- SSE4.1 extractfps extracts a f32 into a gr32 register. Very useful! Not. Fix the instruction specification and teaches lowering code to use it only when the only use is a store instruction. llvm-svn: 48746	2008-03-24 21:52:23 +00:00
Anton Korobeynikov	dad919f561	Add convenient helper for win64 check. Simplify things slightly. llvm-svn: 48691	2008-03-22 20:57:27 +00:00
Anton Korobeynikov	27c8ad4020	Initial support for Win64 calling conventions. Still in early state. llvm-svn: 48690	2008-03-22 20:37:30 +00:00
Duncan Sands	4153fc30c9	Introduce a new node for holding call argument flags. This is needed by the new legalize types infrastructure which wants to expand the 64 bit constants previously used to hold the flags on 32 bit machines. There are two functional changes: (1) in LowerArguments, if a parameter has the zext attribute set then that is marked in the flags; before it was being ignored; (2) PPC had some bogus code for handling two word arguments when using the ELF 32 ABI, which was hard to convert because of the bogusness. As suggested by the original author (Nicolas Geoffray), I've disabled it for the moment. Tested with "make check" and the Ada ACATS testsuite. llvm-svn: 48640	2008-03-21 09:14:45 +00:00
Chris Lattner	edfc239ced	remove Evan's "ugly hack" that sorta attempted to get x86-64 return conventions correct, but was never enabled. We can now do the "right thing" with multiple return values. llvm-svn: 48635	2008-03-21 06:50:21 +00:00
Evan Cheng	8ecb189245	Fix this xform: (sra (shl X, m), result_size) -> (sign_extend (trunc (shl X, result_size - n - m))) llvm-svn: 48578	2008-03-20 02:18:41 +00:00
Christopher Lamb	958b0494c3	Fix X86's isTruncateFree to not claim that truncate to i1 is free. This fixes Bill's testcase that failed for r48491. llvm-svn: 48542	2008-03-19 08:30:06 +00:00
Evan Cheng	5ac87b837e	Fix a x86-64 isel lowering bug that's been around forever. A x86-64 varargs function implicitly reads X86::AL, don't clobber it! llvm-svn: 48515	2008-03-18 23:36:35 +00:00
Chris Lattner	7925cc72c0	Reimplement the parameter attributes support, phase #1 . hilights: 1. There is now a "PAListPtr" class, which is a smart pointer around the underlying uniqued parameter attribute list object, and manages its refcount. It is now impossible to mess up the refcount. 2. PAListPtr is now the main interface to the underlying object, and the underlying object is now completely opaque. 3. Implementation details like SmallVector and FoldingSet are now no longer part of the interface. 4. You can create a PAListPtr with an arbitrary sequence of ParamAttrsWithIndex's, no need to make a SmallVector of a specific size (you can just use an array or scalar or vector if you wish). 5. All the client code that had to check for a null pointer before dereferencing the pointer is simplified to just access the PAListPtr directly. 6. The interfaces for adding attrs to a list and removing them is a bit simpler. Phase #2 will rename some stuff (e.g. PAListPtr) and do other less invasive changes. llvm-svn: 48289	2008-03-12 17:45:29 +00:00
Chris Lattner	b3fefb1e5c	start handling the 'f' x87 constraint. llvm-svn: 48239	2008-03-11 19:06:29 +00:00
Chris Lattner	9826c9365e	Change the model for FP Stack return to use fp operands on the RET instruction instead of using FpSET_ST0_32. This also generalizes the code to handling returning of multiple FP results. llvm-svn: 48209	2008-03-11 03:23:40 +00:00
Chris Lattner	d393772580	Eliminate the FP_GET_ST0/FP_SET_ST0 target-specific dag nodes, just lower to copyfromreg/copytoreg instead. llvm-svn: 48174	2008-03-10 21:08:41 +00:00
Evan Cheng	7d9e5a7680	Default ISD::PREFETCH to expand. llvm-svn: 48169	2008-03-10 19:38:10 +00:00
Scott Michel	bb8e8fca47	Give TargetLowering::getSetCCResultType() a parameter so that ISD::SETCC's return ValueType can depend its operands' ValueType. This is a cosmetic change, no functionality impacted. llvm-svn: 48145	2008-03-10 15:42:14 +00:00
Dale Johannesen	e6b0009792	Increase ISD::ParamFlags to 64 bits. Increase the ByValSize field to 32 bits, thus enabling correct handling of ByVal structs bigger than 0x1ffff. Abstract interface a bit. Fixes gcc.c-torture/execute/pr23135.c and gcc.c-torture/execute/pr28982b.c in gcc testsuite (were ICE'ing on ppc32, quietly producing wrong code on x86-32.) llvm-svn: 48122	2008-03-10 02:17:22 +00:00
Chris Lattner	2e7537b60b	rename FP_SETRESULT -> FP_SET_ST0 llvm-svn: 48094	2008-03-09 07:08:44 +00:00
Chris Lattner	826402e365	rename FpGETRESULT32 -> FpGET_ST0_32 etc. Add support for isel'ing value preserving FP roundings from one fp stack reg to another into a noop, instead of stack traffic. llvm-svn: 48093	2008-03-09 07:05:32 +00:00
Chris Lattner	b628208161	Finish implementing a readme entry: when inserting an i64 variable into a vector of zeros or undef, and when the top part is obviously zero, we can just use movd + shuffle. This allows us to compile vec_set-B.ll into: _test3: movl $1234567, %eax andl 4(%esp), %eax movd %eax, %xmm0 ret instead of: _test3: subl $28, %esp movl $1234567, %eax andl 32(%esp), %eax movl %eax, (%esp) movl $0, 4(%esp) movq (%esp), %xmm0 addl $28, %esp ret llvm-svn: 48090	2008-03-09 05:42:06 +00:00
Chris Lattner	17f68a3075	Implement a readme entry, compiling #include <xmmintrin.h> __m128i doload64(short x) {return _mm_set_epi16(0,0,0,0,0,0,0,1);} into: movl $1, %eax movd %eax, %xmm0 ret instead of a constant pool load. llvm-svn: 48063	2008-03-09 01:05:04 +00:00
Chris Lattner	81deb3bc9c	1) Improve comments. 2) Don't try to insert an i64 value into the low part of a vector with movq on an x86-32 target. This allows us to compile: __m128i doload64(short x) {return _mm_set_epi16(0,0,0,0,0,0,0,1);} into: _doload64: movaps LCPI1_0, %xmm0 ret instead of: _doload64: subl $28, %esp movl $0, 4(%esp) movl $1, (%esp) movq (%esp), %xmm0 addl $28, %esp ret llvm-svn: 48057	2008-03-08 22:59:52 +00:00
Chris Lattner	405f2c6356	minor simplifications to this code, don't create a dead SCALAR_TO_VECTOR on paths that end up not using it. llvm-svn: 48056	2008-03-08 22:48:29 +00:00
Evan Cheng	dba1dfe962	Implement x86 support for @llvm.prefetch. It corresponds to prefetcht{0\|1\|2} and prefetchnta instructions. llvm-svn: 48042	2008-03-08 00:58:38 +00:00
Chris Lattner	aa81dc7d21	mark frem as expand for all legal fp types on x86, regardless of whether we're using SSE or not. This fixes PR2122. llvm-svn: 48006	2008-03-07 06:36:32 +00:00
Andrew Lenharth	95c88272c6	64bit CAS on 32bit x86. llvm-svn: 47929	2008-03-05 01:15:49 +00:00
Andrew Lenharth	f5674915c5	x86-64 atomics llvm-svn: 47903	2008-03-04 21:13:33 +00:00
Dan Gohman	ccc0bc5878	Add support for lowering i64 SRA_PARTS and friends on x86-64. llvm-svn: 47865	2008-03-03 22:22:09 +00:00
Andrew Lenharth	f6c220738c	make CAS work llvm-svn: 47799	2008-03-01 22:27:48 +00:00
Andrew Lenharth	b91c664226	all but CAS working on x86 llvm-svn: 47798	2008-03-01 21:52:34 +00:00
Evan Cheng	f8b1257d2e	Add a quick and dirty "loop aligner pass". x86 uses it to align its loops to 16-byte boundaries. llvm-svn: 47703	2008-02-28 00:43:03 +00:00
Chris Lattner	1f46cc2345	Make X86TargetLowering::LowerSINT_TO_FP return without creating a dead stack slot and store if the SINT_TO_FP is actually legal. This allows us to compile: double a(double b) {return (unsigned)b;} to: _a: cvttsd2siq %xmm0, %rax movl %eax, %eax cvtsi2sdq %rax, %xmm0 ret instead of: _a: subq $8, %rsp cvttsd2siq %xmm0, %rax movl %eax, %eax cvtsi2sdq %rax, %xmm0 addq $8, %rsp ret crazy. llvm-svn: 47660	2008-02-27 05:57:41 +00:00
Arnold Schwaighofer	642dc28734	Refactor according to Evan's and Anton's suggestions. llvm-svn: 47635	2008-02-26 22:21:54 +00:00
Arnold Schwaighofer	7466041819	Correct function comments. llvm-svn: 47606	2008-02-26 17:50:59 +00:00
Arnold Schwaighofer	ddebd886dc	Add support for intermodule tail calls on x86/32bit with GOT-style position independent code. Before only tail calls to protected/hidden functions within the same module were optimized. Now all function calls are tail call optimized. llvm-svn: 47594	2008-02-26 10:21:54 +00:00

1 2 3 4 5 ...

634 Commits