llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-28 14:32:51 +01:00

Author	SHA1	Message	Date
Arnold Schwaighofer	f58a35e2ec	Tail call optimization improvements: Move platform independent code (lowering of possibly overwritten arguments, check for tail call optimization eligibility) from target X86ISelectionLowering.cpp to TargetLowering.h and SelectionDAGISel.cpp. Initial PowerPC tail call implementation: Support ppc32 implemented and tested (passes my tests and test-suite llvm-test). Support ppc64 implemented and half tested (passes my tests). On ppc tail call optimization is performed if caller and callee are fastcc call is a tail call (in tail call position, call followed by ret) no variable argument lists or byval arguments option -tailcallopt is enabled Supported: * non pic tail calls on linux/darwin * module-local tail calls on linux(PIC/GOT)/darwin(PIC) * inter-module tail calls on darwin(PIC) If constraints are not met a normal call will be emitted. A test checking the argument lowering behaviour on x86-64 was added. llvm-svn: 50477	2008-04-30 09:16:33 +00:00
Dan Gohman	0285c1e9bb	Fix the SVOffset values for loads and stores produced by memcpy/memset expansion. It was a bug for the SVOffset value to be used in the actual address calculations. llvm-svn: 50359	2008-04-28 17:15:20 +00:00
Chris Lattner	b5bd654163	A few inline asm cleanups: - Make targetlowering.h fit in 80 cols. - Make LowerAsmOperandForConstraint const. - Make lowerXConstraint -> LowerXConstraint - Make LowerXConstraint return a const char* instead of taking a string byref. llvm-svn: 50312	2008-04-26 23:02:14 +00:00
Dan Gohman	6f9b55bc7c	Remove X86_64SRet; it isn't used anymore. llvm-svn: 49759	2008-04-16 00:24:30 +00:00
Dan Gohman	8d46278998	Fix const-correctness issues with the SrcValue handling in the memory intrinsic expansion code. llvm-svn: 49666	2008-04-14 17:55:48 +00:00
Arnold Schwaighofer	82af0e6a43	This patch corrects the handling of byval arguments for tailcall optimized x86-64 (and x86) calls so that they work (... at least for my test cases). Should fix the following problems: Problem 1: When i introduced the optimized handling of arguments for tail called functions (using a sequence of copyto/copyfrom virtual registers instead of always lowering to top of the stack) i did not handle byval arguments correctly e.g they did not work at all :). Problem 2: On x86-64 after the arguments of the tail called function are moved to their registers (which include ESI/RSI etc), tail call optimization performs byval lowering which causes xSI,xDI, xCX registers to be overwritten. This is handled in this patch by moving the arguments to virtual registers first and after the byval lowering the arguments are moved from those virtual registers back to RSI/RDI/RCX. llvm-svn: 49584	2008-04-12 18:11:06 +00:00
Dan Gohman	15edbf989f	Drop ISD::MEMSET, ISD::MEMMOVE, and ISD::MEMCPY, which are not Legal on any current target and aren't optimized in DAGCombiner. Instead of using intermediate nodes, expand the operations, choosing between simple loads/stores, target-specific code, and library calls, immediately. Previously, the code to emit optimized code for these operations was only used at initial SelectionDAG construction time; now it is used at all times. This fixes some cases where rep;movs was being used for small copies where simple loads/stores would be better. This also cleans up code that checks for alignments less than 4; let the targets make that decision instead of doing it in target-independent code. This allows x86 to use rep;movs in low-alignment cases. Also, this fixes a bug that resulted in the use of rep;stos for memsets of 0 with non-constant memory size when the alignment was at least 4. It's better to use the library in this case, which can be significantly faster when the size is large. This also preserves more SourceValue information when memory intrinsics are lowered into simple loads/stores. llvm-svn: 49572	2008-04-12 04:36:06 +00:00
Dan Gohman	b3a511b236	Make isVectorClearMaskLegal's operand list const. llvm-svn: 49446	2008-04-09 20:09:42 +00:00
Chris Lattner	edfc239ced	remove Evan's "ugly hack" that sorta attempted to get x86-64 return conventions correct, but was never enabled. We can now do the "right thing" with multiple return values. llvm-svn: 48635	2008-03-21 06:50:21 +00:00
Arnold Schwaighofer	19a78545d9	Don't loose incoming argument registers. Fix documentation style. llvm-svn: 48545	2008-03-19 16:39:45 +00:00
Chris Lattner	d393772580	Eliminate the FP_GET_ST0/FP_SET_ST0 target-specific dag nodes, just lower to copyfromreg/copytoreg instead. llvm-svn: 48174	2008-03-10 21:08:41 +00:00
Scott Michel	bb8e8fca47	Give TargetLowering::getSetCCResultType() a parameter so that ISD::SETCC's return ValueType can depend its operands' ValueType. This is a cosmetic change, no functionality impacted. llvm-svn: 48145	2008-03-10 15:42:14 +00:00
Chris Lattner	2e7537b60b	rename FP_SETRESULT -> FP_SET_ST0 llvm-svn: 48094	2008-03-09 07:08:44 +00:00
Chris Lattner	826402e365	rename FpGETRESULT32 -> FpGET_ST0_32 etc. Add support for isel'ing value preserving FP roundings from one fp stack reg to another into a noop, instead of stack traffic. llvm-svn: 48093	2008-03-09 07:05:32 +00:00
Evan Cheng	e0b3c221ab	Add a target lowering hook to control whether it's worthwhile to compress fp constant. For x86, if sse2 is available, it's not a good idea since cvtss2sd is slower than a movsd load and it prevents load folding. On x87, it's important to shrink fp constant since fldt is very expensive. llvm-svn: 47931	2008-03-05 01:30:59 +00:00
Andrew Lenharth	95c88272c6	64bit CAS on 32bit x86. llvm-svn: 47929	2008-03-05 01:15:49 +00:00
Andrew Lenharth	b91c664226	all but CAS working on x86 llvm-svn: 47798	2008-03-01 21:52:34 +00:00
Arnold Schwaighofer	642dc28734	Refactor according to Evan's and Anton's suggestions. llvm-svn: 47635	2008-02-26 22:21:54 +00:00
Arnold Schwaighofer	6383666085	Change the lowering of arguments for tail call optimized calls. Before arguments that could overwrite each other were explicitly lowered to a stack slot, not giving the register allocator a chance to optimize. Now a sequence of copyto/copyfrom virtual registers ensures that arguments are loaded in (virtual) registers before they are lowered to the stack slot (and might overwrite each other). Also parameter stack slots are marked mutable for (potentially) tail calling functions. llvm-svn: 47593	2008-02-26 09:19:59 +00:00
Evan Cheng	bb577266bf	- When DAG combiner is folding a bit convert into a BUILD_VECTOR, it should check if it's essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u> to <10, 0, u, u, u, u, u, u>. Instead, simply convert it to a SCALAR_TO_VECTOR of the proper type. - X86 now normalize SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC. llvm-svn: 47290	2008-02-18 23:04:32 +00:00
Dan Gohman	99b38405e3	Simplify some logic in ComputeMaskedBits. And change ComputeMaskedBits to pass the mask APInt by value, not by reference. llvm-svn: 47096	2008-02-13 22:28:48 +00:00
Dan Gohman	09023887f8	Convert SelectionDAG::ComputeMaskedBits to use APInt instead of uint64_t. Add an overload that supports the uint64_t interface for use by clients that haven't been updated yet. llvm-svn: 47039	2008-02-13 00:35:47 +00:00
Nate Begeman	5a4e290b70	Enable SSE4 codegen and pattern matching. Add some notes to the README. llvm-svn: 46949	2008-02-11 04:19:36 +00:00
Dan Gohman	cabaec582f	Rename MRegisterInfo to TargetRegisterInfo. llvm-svn: 46930	2008-02-10 18:45:23 +00:00
Dan Gohman	3993809a0c	Rename ISD::FLT_ROUNDS to ISD::FLT_ROUNDS_ to avoid conflicting with the real FLT_ROUNDS (defined in <float.h>). llvm-svn: 46587	2008-01-31 00:41:03 +00:00
Evan Cheng	918b9c9335	Even though InsertAtEndOfBasicBlock is an ugly hack it still deserves a proper name. Rename it to EmitInstrWithCustomInserter since it does not necessarily insert instruction at the end. llvm-svn: 46562	2008-01-30 18:18:23 +00:00
Evan Cheng	618761903d	Work in progress. This patch fixes x86-64 calls which are modelled as StructRet but really should be return in registers, e.g. _Complex long double, some 128-bit aggregates. This is a short term solution that is necessary only because llvm, for now, cannot model i128 nor call's with multiple results. Status: This only works for direct calls, and only the caller side is done. Disabled for now. llvm-svn: 46527	2008-01-29 19:34:22 +00:00
Dale Johannesen	f12104ce4b	Handle 'X' constraint in asm's better. llvm-svn: 46485	2008-01-29 02:21:21 +00:00
Evan Cheng	91089e6d66	Let each target decide byval alignment. For X86, it's 4-byte unless the aggregare contains SSE vector(s). For x86-64, it's max of 8 or alignment of the type. llvm-svn: 46286	2008-01-23 23:17:41 +00:00
Chris Lattner	eb7d65d073	make a method public llvm-svn: 46159	2008-01-18 06:52:41 +00:00
Chris Lattner	2825f8fff0	make it more clear that this predicate only applies to scalar FP types. llvm-svn: 46058	2008-01-16 06:24:21 +00:00
Chris Lattner	fbb75278b2	introduce a isTypeInSSEReg predicate, which allows us to simplify some code. No functionality change. llvm-svn: 46055	2008-01-16 06:19:45 +00:00
Chris Lattner	0072ce1ff4	no need to expand ISD::TRAP to X86ISD::TRAP, just match ISD::TRAP. llvm-svn: 46015	2008-01-15 21:58:22 +00:00
Anton Korobeynikov	08ea121968	For PR1839: add initial support for __builtin_trap. llvm-gcc part is missed as well as PPC codegen llvm-svn: 46001	2008-01-15 07:02:33 +00:00
Gordon Henriksen	f4e137838b	Refactoring the x86 and x86-64 calling convention implementations, unifying the copied algorithms and saving over 500 LOC. There should be no functionality change, but please test on your favorite x86 target. llvm-svn: 45627	2008-01-05 16:56:59 +00:00
Chris Lattner	ad9a6ccb83	Remove attribution from file headers, per discussion on llvmdev. llvm-svn: 45418	2007-12-29 20:36:04 +00:00
Evan Cheng	51cf86ded0	Implement ctlz and cttz with bsr and bsf. llvm-svn: 45024	2007-12-14 02:13:44 +00:00
Chris Lattner	28262fbaf2	Several changes: 1) Change the interface to TargetLowering::ExpandOperationResult to take and return entire NODES that need a result expanded, not just the value. This allows us to handle things like READCYCLECOUNTER, which returns two values. 2) Implement (extremely limited) support in LegalizeDAG::ExpandOp for MERGE_VALUES. 3) Reimplement custom lowering in LegalizeDAGTypes in terms of the new ExpandOperationResult. This makes the result simpler and fully general. 4) Implement (fully general) expand support for MERGE_VALUES in LegalizeDAGTypes. 5) Implement ExpandOperationResult support for ARM f64->i64 bitconvert and ARM i64 shifts, allowing them to work with LegalizeDAGTypes. 6) Implement ExpandOperationResult support for X86 READCYCLECOUNTER and FP_TO_SINT, allowing them to work with LegalizeDAGTypes. LegalizeDAGTypes now passes several more X86 codegen tests when enabled and when type legalization in LegalizeDAG is ifdef'd out. llvm-svn: 44300	2007-11-24 07:07:01 +00:00
Anton Korobeynikov	cd9b16df61	Implement codegen for flt_rounds on x86 llvm-svn: 44183	2007-11-16 01:31:51 +00:00
Evan Cheng	7d8deec92f	Much improved pic jumptable codegen: Then: call "L1$pb" "L1$pb": popl %eax ... LBB1_1: # entry imull $4, %ecx, %ecx leal LJTI1_0-"L1$pb"(%eax), %edx addl LJTI1_0-"L1$pb"(%ecx,%eax), %edx jmpl %edx .align 2 .set L1_0_set_3,LBB1_3-LJTI1_0 .set L1_0_set_2,LBB1_2-LJTI1_0 .set L1_0_set_5,LBB1_5-LJTI1_0 .set L1_0_set_4,LBB1_4-LJTI1_0 LJTI1_0: .long L1_0_set_3 .long L1_0_set_2 Now: call "L1$pb" "L1$pb": popl %eax ... LBB1_1: # entry addl LJTI1_0-"L1$pb"(%eax,%ecx,4), %eax jmpl %eax .align 2 .set L1_0_set_3,LBB1_3-"L1$pb" .set L1_0_set_2,LBB1_2-"L1$pb" .set L1_0_set_5,LBB1_5-"L1$pb" .set L1_0_set_4,LBB1_4-"L1$pb" LJTI1_0: .long L1_0_set_3 .long L1_0_set_2 llvm-svn: 43924	2007-11-09 01:32:10 +00:00
Rafael Espindola	ec025c3042	Move the LowerMEMCPY and LowerMEMCPYCall to a common place. Thanks for the suggestions Bill :-) llvm-svn: 43742	2007-11-05 23:12:20 +00:00
Evan Cheng	5fe81cf64e	Enable more fold (sext (load x)) -> (sext (truncate (sextload x))) transformation. Previously, it's restricted by ensuring the number of load uses is one. Now the restriction is loosened up by allowing setcc uses to be "extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq). llvm-svn: 43465	2007-10-29 19:58:20 +00:00
Evan Cheng	53696b7e9f	Loosen up iv reuse to allow reuse of the same stride but a larger type when truncating from the larger type to smaller type is free. e.g. Turns this loop: LBB1_1: # entry.bb_crit_edge xorl %ecx, %ecx xorw %dx, %dx movw %dx, %si LBB1_2: # bb movl L_X$non_lazy_ptr, %edi movw %si, (%edi) movl L_Y$non_lazy_ptr, %edi movw %dx, (%edi) addw $4, %dx incw %si incl %ecx cmpl %eax, %ecx jne LBB1_2 # bb into LBB1_1: # entry.bb_crit_edge xorl %ecx, %ecx xorw %dx, %dx LBB1_2: # bb movl L_X$non_lazy_ptr, %esi movw %cx, (%esi) movl L_Y$non_lazy_ptr, %esi movw %dx, (%esi) addw $4, %dx incl %ecx cmpl %eax, %ecx jne LBB1_2 # bb llvm-svn: 43375	2007-10-26 01:56:11 +00:00
Arnold Schwaighofer	d47210011e	Added tail call optimization to the x86 back end. It can be enabled by passing -tailcallopt to llc. The optimization is performed if the following conditions are satisfied: * caller/callee are fastcc * elf/pic is disabled OR elf/pic enabled + callee is in module + callee has visibility protected or hidden llvm-svn: 42870	2007-10-11 19:40:01 +00:00
Dan Gohman	6c3e0cdd36	LowerIntegerDivOrRem no longer exists. llvm-svn: 42787	2007-10-09 15:45:13 +00:00
Dan Gohman	6df332f0cb	Migrate X86 and ARM from using X86ISD::{,I}DIV and ARMISD::MULHILO{U,S} to use ISD::{S,U}DIVREM and ISD::{S,U}MUL_HIO. Move the lowering code associated with these operators into target-independent in LegalizeDAG.cpp and TargetLowering.cpp. llvm-svn: 42762	2007-10-08 18:33:35 +00:00
Evan Cheng	f3c130a8b6	Enabling new condition code modeling scheme. llvm-svn: 42459	2007-09-29 00:00:36 +00:00
Rafael Espindola	01b306e575	Refactor the memcpy lowering for the x86 target. The only generated code difference is that now we call memcpy when the size of the array is unknown. This matches GCC behavior and is better since the run time value can be arbitrarily large. llvm-svn: 42433	2007-09-28 12:53:01 +00:00
Dan Gohman	a01dd49472	Fix a typo in a comment. llvm-svn: 42313	2007-09-25 19:37:26 +00:00
Dan Gohman	1bb346f9f1	When both x/y and x%y are needed (x and y both scalar integer), compute both results with a single div or idiv instruction. This uses new X86ISD nodes for DIV and IDIV which are introduced during the legalize phase so that the SelectionDAG's CSE can automatically eliminate redundant computations. llvm-svn: 42308	2007-09-25 18:23:27 +00:00

1 2 3 4

165 Commits