llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 20:43:44 +02:00

Author	SHA1	Message	Date
Chris Lattner	e83ae1063f	Fold loads into sign/zero extends. instead of: mov %AL, BYTE PTR [%EDX + l18_length_code] movzx %EAX, %AL Emit: movzx %EAX, BYTE PTR [%EDX + l18_length_code] llvm-svn: 19489	2005-01-11 23:33:00 +00:00
Chris Lattner	87a38bd4a8	Comment out debug code :) Select [mem] += Val operations. For constants, we used to get: mov %ECX, -32768 add %ECX, DWORD PTR [l4_match_start] mov DWORD PTR [l4_match_start], %ECX Now we get: add DWORD PTR [l4_match_start], -32768 For other values we used to get: mov %EBP, %EDI ;; because the add destroys the value add %EBP, DWORD PTR [l4_input_len] mov DWORD PTR [l4_input_len], %EBP now we get: add DWORD PTR [l4_input_len], %EDI Both of these use less registers than the alternative, are faster and smaller. llvm-svn: 19488	2005-01-11 23:21:30 +00:00
Chris Lattner	282473a25d	Handle the global address case here, not just the offset case. llvm-svn: 19487	2005-01-11 22:58:43 +00:00
Chris Lattner	9eb2cc700b	Treat int constants as not requiring a register, since they are almost always folded into an instruction. llvm-svn: 19486	2005-01-11 22:29:12 +00:00
Chris Lattner	74fcfd5148	Print the value types in the nodes of the graph llvm-svn: 19485	2005-01-11 22:21:04 +00:00
Chris Lattner	f588cdd51e	add an assertion, avoid creating copyfromreg/copytoreg pairs that are the same for PHI nodes. llvm-svn: 19484	2005-01-11 22:03:46 +00:00
Chris Lattner	7cb2220907	* Factor a bunch of binary operator cases into shared code. * Fold loads into Add, sub, and, or, xor and mul when possible. * Codegen shl X, 1 as add X, X llvm-svn: 19483	2005-01-11 21:19:59 +00:00
Chris Lattner	b1a72cb39a	Clear the whole array, always. llvm-svn: 19482	2005-01-11 20:25:26 +00:00
Chris Lattner	b838c9748e	Fold multiplies by 3,5,9 into addressing modes when possible. llvm-svn: 19480	2005-01-11 19:37:02 +00:00
Chris Lattner	8de5a27681	Squelch optimized warning. llvm-svn: 19475	2005-01-11 17:46:49 +00:00
Chris Lattner	e7b1130b01	Instead of generating stuff like this: mov %ECX, %EAX add %ECX, 32768 mov %SI, WORD PTR [2%ECX + l13_prev] Generate this: mov %SI, WORD PTR [2%ECX + l13_prev + 65536] This occurs when you have a GEP instruction where an index is "something + imm". llvm-svn: 19472	2005-01-11 06:36:20 +00:00
Chris Lattner	bb63a09cd1	Implement MEMCPY natively in terms of rep movs* llvm-svn: 19468	2005-01-11 06:19:26 +00:00
Chris Lattner	b2b08a8bc1	Implement memset -> rep stos* llvm-svn: 19467	2005-01-11 06:14:36 +00:00
Chris Lattner	58816a9e81	Announce that we don't support mem ops yet. llvm-svn: 19466	2005-01-11 05:57:36 +00:00
Chris Lattner	963af6652b	Teach legalize to lower MEMSET/MEMCPY/MEMMOVE operations if the target does not support them. llvm-svn: 19465	2005-01-11 05:57:22 +00:00
Chris Lattner	6b9082114f	Print new operations. llvm-svn: 19464	2005-01-11 05:57:01 +00:00
Chris Lattner	7cde8a2658	Turn memset/memcpy/memmove into the corresponding operations. llvm-svn: 19463	2005-01-11 05:56:49 +00:00
Chris Lattner	f867443d7e	Teach the address selector to make 'reg+reg' addressing modes. llvm-svn: 19457	2005-01-11 04:40:19 +00:00
Reid Spencer	7e9642515c	Add the LOADABLE_MODULE=1 directive to indicate that this shared library is intended to be a dlopenable module and not a "plain" shared library. llvm-svn: 19456	2005-01-11 04:33:32 +00:00
Chris Lattner	edf06be50e	Emit NOT instructions. llvm-svn: 19455	2005-01-11 04:31:30 +00:00
Chris Lattner	2eacd11a86	shift X, 0 -> X llvm-svn: 19453	2005-01-11 04:25:13 +00:00
Chris Lattner	4e4bef2d6c	Fix a bug emitting branches that broke a lot of programs. llvm-svn: 19452	2005-01-11 04:06:27 +00:00
Chris Lattner	4b51297a94	Be more careful where we set ContainsFPCode. We were missing a set in the int -> FP casting code. Note that we don't have to set it for FP operations that take FP values as operands: whatever produces the FP value will set the flag. llvm-svn: 19451	2005-01-11 03:50:45 +00:00
Chris Lattner	0c4c4094e3	Fix a major bug in setcc/cmov folding, where we accidentally inverted the sense of the comparison. llvm-svn: 19450	2005-01-11 03:37:59 +00:00
Chris Lattner	d188e03011	Take register pressure into account when we have to decide whether to evaluate the LHS or the RHS of an operation first. This causes good things to happen. For example, instead of compiling a loop to this: .LBBstrength_result7_1: # loopentry movl 16(%esp), %edi movl (%edi), %edi ;;; LOAD movl (%ecx), %ebx movl $2, (%eax,%ebx,4) movl (%edx), %ebx movl %esi, %ebp addl $21, %ebp addl $42, %esi cmpl $0, %edi ;;; USE cmovne %esi, %ebp cmpl %ebp, %ebx movl %ebp, %esi jg .LBBstrength_result7_1 We now compile it to this: .LBBstrength_result7_1: # loopentry movl %edi, %ebx addl $42, %ebx addl $21, %edi movl (%ecx), %ebp ;; LOAD cmpl $0, %ebp ;; USE cmovne %ebx, %edi movl (%edx), %ebx movl $2, (%eax,%ebx,4) movl (%esi), %ebx cmpl %edi, %ebx jg .LBBstrength_result7_1 Which reduces register pressure enough (in this case) to avoid spilling in the loop. As another example, consider the CodeGen/X86/regpressure.ll testcase. We used to generate this code for both cases: regpressure1: subl $32, %esp movl %esi, 12(%esp) movl %edi, 8(%esp) movl %ebx, 4(%esp) movl %ebp, (%esp) movl 36(%esp), %ecx movl (%ecx), %eax movl 4(%ecx), %edx movl %edx, 24(%esp) movl 8(%ecx), %edx movl %edx, 16(%esp) movl 12(%ecx), %edx movl 16(%ecx), %esi movl 20(%ecx), %edi movl 24(%ecx), %ebx movl %ebx, 28(%esp) movl 28(%ecx), %ebx movl 32(%ecx), %ebp movl %ebp, 20(%esp) movl 36(%ecx), %ecx imull 24(%esp), %eax imull 16(%esp), %eax imull %edx, %eax imull %esi, %eax imull %edi, %eax imull 28(%esp), %eax imull %ebx, %eax imull 20(%esp), %eax imull %ecx, %eax movl (%esp), %ebp movl 4(%esp), %ebx movl 8(%esp), %edi movl 12(%esp), %esi addl $32, %esp ret This code is basically trying to do all of the loads first, then execute all of the multiplies. Because we run out of registers, lots of spill code happens. We now generate this code for both cases: regpressure1: movl 4(%esp), %ecx movl (%ecx), %eax movl 4(%ecx), %edx imull %edx, %eax movl 8(%ecx), %edx imull %edx, %eax movl 12(%ecx), %edx imull %edx, %eax movl 16(%ecx), %edx imull %edx, %eax movl 20(%ecx), %edx imull %edx, %eax movl 24(%ecx), %edx imull %edx, %eax movl 28(%ecx), %edx imull %edx, %eax movl 32(%ecx), %edx imull %edx, %eax movl 36(%ecx), %ecx imull %ecx, %eax ret which is much nicer (when we fold loads into the muls it will be even better). The old instruction selector used to produce the good code for regpressure1 but not for regpressure2, as it depended on the order of operations in the LLVM code. llvm-svn: 19449	2005-01-11 03:11:44 +00:00
Chris Lattner	07a3ade230	Print SelectionDAGs bottom up, include extra info in the node labels llvm-svn: 19447	2005-01-11 00:34:33 +00:00
Chris Lattner	1c273d3a14	Add a marker for the graph root. llvm-svn: 19445	2005-01-10 23:52:04 +00:00
Chris Lattner	daa052a97e	Put the operation name in each node, put the function name on the graph. llvm-svn: 19444	2005-01-10 23:26:00 +00:00
Chris Lattner	0307506841	Split out SDNode::getOperationName into its own method. llvm-svn: 19443	2005-01-10 23:25:25 +00:00
Chris Lattner	8c13447254	Implement initial selectiondag printing support. This gets us a nice graph with no labels! :) llvm-svn: 19441	2005-01-10 23:08:40 +00:00
Chris Lattner	497e24c885	Fold setcc instructions into selects. llvm-svn: 19438	2005-01-10 22:10:13 +00:00
Chris Lattner	65d007ab62	Add conditional moves for the parity flag. llvm-svn: 19437	2005-01-10 22:09:33 +00:00
Chris Lattner	5433d8de29	Lower to the correct functions. This fixes FreeBench/fourinarow llvm-svn: 19436	2005-01-10 21:02:37 +00:00
Chris Lattner	d61491dea2	Implement 8-bit multiply for X86. llvm-svn: 19435	2005-01-10 20:55:48 +00:00
Chris Lattner	b35b30c283	Rework constant pool handling so that function constant pools are no longer leaked to the system. Now they are destroyed with the JITMemoryManager is destroyed. llvm-svn: 19434	2005-01-10 18:23:22 +00:00
Jeff Cohen	8b03a55724	Apply feedback from Chris. llvm-svn: 19432	2005-01-10 04:23:32 +00:00
Jeff Cohen	a7f1ae5dc0	Apply feed back from Chris: 1. Rename createLoaderPass to CreateProfileLoaderPass 2. Opt shouldn't use the pass registered in CodeGen. llvm-svn: 19431	2005-01-10 03:56:27 +00:00
Chris Lattner	02236df007	Implement a couple of more simplifications. This lets us codegen: int test2(int * P, int* Q, int A, int B) { return P+A == P; } into: test2: movl 4(%esp), %eax movl 12(%esp), %eax shll $2, %eax cmpl $0, %eax sete %al movzbl %al, %eax ret instead of: test2: movl 4(%esp), %eax movl 12(%esp), %ecx leal (%eax,%ecx,4), %ecx cmpl %eax, %ecx sete %al movzbl %al, %eax ret ICC is producing worse code: test2: movl 4(%esp), %eax #8.5 movl 12(%esp), %edx #8.5 lea (%edx,%edx), %ecx #9.9 addl %ecx, %ecx #9.9 addl %eax, %ecx #9.9 cmpl %eax, %ecx #9.16 movl $0, %eax #9.16 sete %al #9.16 ret #9.16 as is GCC (looks like our old code): test2: movl 4(%esp), %edx movl 12(%esp), %eax leal (%edx,%eax,4), %ecx cmpl %edx, %ecx sete %al movzbl %al, %eax ret llvm-svn: 19430	2005-01-10 02:03:02 +00:00
Chris Lattner	8d09b03ed1	Fix incorrect constant folds, fixing Stepanov after the SHR patch. llvm-svn: 19429	2005-01-10 01:16:03 +00:00
Chris Lattner	9d479d4a34	Constant fold shifts, turning this loop: .LBB_Z5test0PdS__3: # no_exit.1 fldl data(,%eax,8) fldl 24(%esp) faddp %st(1) fstl 24(%esp) incl %eax movl $16000, %ecx sarl $3, %ecx cmpl %eax, %ecx fstpl 16(%esp) #FP_REG_KILL jg .LBB_Z5test0PdS__3 # no_exit.1 into: .LBB_Z5test0PdS__3: # no_exit.1 fldl data(,%eax,8) fldl 24(%esp) faddp %st(1) fstl 24(%esp) incl %eax cmpl $2000, %eax fstpl 16(%esp) #FP_REG_KILL jl .LBB_Z5test0PdS__3 # no_exit.1 llvm-svn: 19427	2005-01-10 00:07:15 +00:00
Reid Spencer	283688b80d	Rename Unix/.cpp and Win32/.cpp to have a *.inc suffix so that the silly gdb debugger doesn't get confused on which file it is reading (the one in lib/System or the one in lib/System/{Win32,Unix}) llvm-svn: 19426	2005-01-09 23:29:00 +00:00
Chris Lattner	59d7066da8	Add some folds for == and != comparisons. This allows us to codegen this loop in stepanov: no_exit.i: ; preds = %entry, %no_exit.i, %then.i, %_Z5checkd.exit %i.0.0 = phi int [ 0, %entry ], [ %i.0.0, %no_exit.i ], [ %inc.0, %_Z5checkd.exit ], [ %inc.012, %then.i ] ; <int> [#uses=3] %indvar = phi uint [ %indvar.next, %no_exit.i ], [ 0, %entry ], [ 0, %then.i ], [ 0, %_Z5checkd.exit ] ; <uint> [#uses=3] %result_addr.i.0 = phi double [ %tmp.4.i.i, %no_exit.i ], [ 0.000000e+00, %entry ], [ 0.000000e+00, %then.i ], [ 0.000000e+00, %_Z5checkd.exit ] ; <double> [#uses=1] %first_addr.0.i.2.rec = cast uint %indvar to int ; <int> [#uses=1] %first_addr.0.i.2 = getelementptr [2000 x double]* %data, int 0, uint %indvar ; <double> [#uses=1] %inc.i.rec = add int %first_addr.0.i.2.rec, 1 ; <int> [#uses=1] %inc.i = getelementptr [2000 x double] %data, int 0, int %inc.i.rec ; <double> [#uses=1] %tmp.3.i.i = load double %first_addr.0.i.2 ; <double> [#uses=1] %tmp.4.i.i = add double %result_addr.i.0, %tmp.3.i.i ; <double> [#uses=2] %tmp.2.i = seteq double* %inc.i, getelementptr ([2000 x double]* %data, int 0, int 2000) ; <bool> [#uses=1] %indvar.next = add uint %indvar, 1 ; <uint> [#uses=1] br bool %tmp.2.i, label %_Z10accumulateIPddET0_T_S2_S1_.exit, label %no_exit.i To this: .LBB_Z4testIPddEvT_S1_T0__1: # no_exit.i fldl data(,%eax,8) fldl 16(%esp) faddp %st(1) fstpl 16(%esp) incl %eax movl %eax, %ecx shll $3, %ecx cmpl $16000, %ecx #FP_REG_KILL jne .LBB_Z4testIPddEvT_S1_T0__1 # no_exit.i instead of this: .LBB_Z4testIPddEvT_S1_T0__1: # no_exit.i fldl data(,%eax,8) fldl 16(%esp) faddp %st(1) fstpl 16(%esp) incl %eax leal data(,%eax,8), %ecx leal data+16000, %edx cmpl %edx, %ecx #FP_REG_KILL jne .LBB_Z4testIPddEvT_S1_T0__1 # no_exit.i llvm-svn: 19425	2005-01-09 20:52:51 +00:00
Jeff Cohen	f692cd303d	Add last four createXxxPass functions llvm-svn: 19424	2005-01-09 20:42:52 +00:00
Jeff Cohen	91dd6d2d20	Fix VC++ compilation error llvm-svn: 19423	2005-01-09 20:41:56 +00:00
Chris Lattner	fa06762d0e	Print the DAG out more like a DAG in nested format. llvm-svn: 19422	2005-01-09 20:38:33 +00:00
Chris Lattner	e3b9f22967	Print out nodes sorted by their address to make it easier to find them in a list. llvm-svn: 19421	2005-01-09 20:26:36 +00:00
Chris Lattner	fcab5f75c0	Codegen (Reg\|imm)+&GV as an LEA, because we cannot put it into the immediate field of an ADDri (due to current restrictions on MachineOperand :( ). This allows us to generate: leal Data+16000, %edx instead of: movl $Data, %edx addl $16000, %edx llvm-svn: 19420	2005-01-09 20:20:29 +00:00
Chris Lattner	82caa0dc2e	Add a simple transformation. This allows us to compile one of the inner loops in stepanov to this: .LBB_Z5test0PdS__2: # no_exit.1 fldl data(,%eax,8) fldl 24(%esp) faddp %st(1) fstl 24(%esp) incl %eax cmpl $2000, %eax fstpl 16(%esp) #FP_REG_KILL jl .LBB_Z5test0PdS__2 instead of this: .LBB_Z5test0PdS__2: # no_exit.1 fldl data(,%eax,8) fldl 24(%esp) faddp %st(1) fstl 24(%esp) incl %eax movl $data, %ecx movl %ecx, %edx addl $16000, %edx subl %ecx, %edx movl %edx, %ecx sarl $2, %ecx shrl $29, %ecx addl %ecx, %edx sarl $3, %edx cmpl %edx, %eax fstpl 16(%esp) #FP_REG_KILL jl .LBB_Z5test0PdS__2 The old instruction selector produced: .LBB_Z5test0PdS__2: # no_exit.1 fldl 24(%esp) faddl data(,%eax,8) fstl 24(%esp) movl %eax, %ecx incl %ecx incl %eax leal data+16000, %edx movl $data, %edi subl %edi, %edx movl %edx, %edi sarl $2, %edi shrl $29, %edi addl %edi, %edx sarl $3, %edx cmpl %edx, %ecx fstpl 16(%esp) #FP_REG_KILL jl .LBB_Z5test0PdS__2 # no_exit.1 Which is even worse! llvm-svn: 19419	2005-01-09 20:09:57 +00:00
Chris Lattner	35375c11bf	Fix copy and pasto's for FP -> Int. This fixes fldry llvm-svn: 19418	2005-01-09 19:49:59 +00:00
Chris Lattner	d674d08230	Fix a bug legalizing call instructions (make sure to remember all result values), and eliminate some switch statements. llvm-svn: 19417	2005-01-09 19:43:23 +00:00

1 2 3 4 5 ...

8887 Commits