llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-29 23:12:55 +01:00

Author	SHA1	Message	Date
Chris Lattner	79ba9d58fd	Don't emit two comparisons when comparing a FP value against zero! llvm-svn: 20651	2005-03-17 16:29:26 +00:00
Chris Lattner	c9a3ea81bf	Fix the missing symbols problem Bill was hitting. Patch contributed by Bill Wendling!! llvm-svn: 20649	2005-03-17 15:38:16 +00:00
Chris Lattner	4b688a1c70	This mega patch converts us from using Function::a{iterator\|begin\|end} to using Function::arg_{iterator\|begin\|end}. Likewise Module::g* -> Module::global_*. This patch is contributed by Gabor Greif, thanks! llvm-svn: 20597	2005-03-15 04:54:21 +00:00
Reid Spencer	b0ca4aa8cd	Patch to make assembly output compatible with mingw compilation (identical to cygwin) llvm-svn: 20520	2005-03-08 17:02:05 +00:00
Chris Lattner	a024984017	Fix spelling, patch contributed by Gabor Greif! llvm-svn: 20343	2005-02-27 06:18:25 +00:00
Chris Lattner	9838ab1271	Silence some uninit variable warnings. llvm-svn: 20284	2005-02-23 05:57:21 +00:00
Chris Lattner	ab92b92bc5	We can fold promoted and non-promoted loads into divs also! llvm-svn: 19835	2005-01-25 20:35:10 +00:00
Chris Lattner	a9a0369879	Fold promoted loads into binary ops for FP, allowing us to generate m32 forms of FP ops. llvm-svn: 19834	2005-01-25 20:03:11 +00:00
Chris Lattner	6ff85c3152	Silence a warning. llvm-svn: 19798	2005-01-23 23:20:06 +00:00
Chris Lattner	94952e0947	Allow the FP stackifier to completely ignore functions that do not use FP at all. This should speed up the X86 backend fairly significantly on integer codes. Now if only we didn't have to compute livevar still... ;-) llvm-svn: 19796	2005-01-23 23:13:59 +00:00
Reid Spencer	5c7b6e83f0	Support Cygwin assembly generation. The Cygwin version of the GNU Assembler doesn't support certain directives, and symbols on Cygwin are prefixed with an underscore. This patch makes the necessary adjustments to the output. llvm-svn: 19775	2005-01-23 03:52:14 +00:00
Chris Lattner	b4cf4ffb04	Speed up folding operations into loads. llvm-svn: 19733	2005-01-21 21:43:02 +00:00
Chris Lattner	fd4d7f71ae	The ever-important vanity pass name :) llvm-svn: 19731	2005-01-21 21:35:14 +00:00
Chris Lattner	5f2fbeaa69	Fix a FIXME: realize that argument stores are all independent (don't alias) llvm-svn: 19728	2005-01-21 19:46:38 +00:00
Chris Lattner	febeb380ae	Implement ADD_PARTS/SUB_PARTS so that 64-bit integer add/sub work. This fixes most of the remaining llc-beta failures. llvm-svn: 19716	2005-01-20 18:53:00 +00:00
Chris Lattner	8b0a2a3251	Fix a crash compiling 134.perl. llvm-svn: 19711	2005-01-20 16:50:16 +00:00
Chris Lattner	6534e1ede3	Fix a problem where we were literally selecting for INCREASED register pressure, not decreased register pressure. Fix a problem where we accidentally swapped the operands of SHLD, which caused fourinarow to fail. This fixes fourinarow. llvm-svn: 19697	2005-01-19 17:24:34 +00:00
Chris Lattner	b75589131d	When commuting these instructions, make sure to actually swap the operands too. llvm-svn: 19694	2005-01-19 16:55:52 +00:00
Chris Lattner	fde1a5688b	Implement Regression/CodeGen/X86/rotate.ll: emit rotate instructions (which typically cost 1 cycle) instead of shld/shrd instruction (which are typically 6 or more cycles). This also saves code space. For example, instead of emitting: rotr: mov %EAX, DWORD PTR [%ESP + 4] mov %CL, BYTE PTR [%ESP + 8] shrd %EAX, %EAX, %CL ret rotli: mov %EAX, DWORD PTR [%ESP + 4] shrd %EAX, %EAX, 27 ret Emit: rotr32: mov %CL, BYTE PTR [%ESP + 8] mov %EAX, DWORD PTR [%ESP + 4] ror %EAX, %CL ret rotli32: mov %EAX, DWORD PTR [%ESP + 4] ror %EAX, 27 ret We also emit byte rotate instructions which do not have a sh[lr]d counterpart at all. llvm-svn: 19692	2005-01-19 08:07:05 +00:00
Chris Lattner	34757ff939	Add rotate instructions. llvm-svn: 19690	2005-01-19 07:50:03 +00:00
Chris Lattner	e539ce8223	Match 16-bit shld/shrd instructions as well, implementing shift-double.llx:test5 llvm-svn: 19689	2005-01-19 07:37:26 +00:00
Chris Lattner	9d5ee289d7	Improve coverage of the X86 instruction set by adding 16-bit shift doubles. llvm-svn: 19687	2005-01-19 07:31:24 +00:00
Chris Lattner	c03f360215	Teach the code generator that shrd/shld is commutable if it has an immediate. This allows us to generate this: foo: mov %EAX, DWORD PTR [%ESP + 4] mov %EDX, DWORD PTR [%ESP + 8] shld %EDX, %EDX, 2 shl %EAX, 2 ret instead of this: foo: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, DWORD PTR [%ESP + 8] mov %EDX, %EAX shrd %EDX, %ECX, 30 shl %EAX, 2 ret Note the magically transmogrifying immediate. llvm-svn: 19686	2005-01-19 07:11:01 +00:00
Chris Lattner	575e912fcf	Codegen long >> 2 to this: foo: mov %EAX, DWORD PTR [%ESP + 4] mov %EDX, DWORD PTR [%ESP + 8] shrd %EAX, %EDX, 2 sar %EDX, 2 ret instead of this: test1: mov %ECX, DWORD PTR [%ESP + 4] shr %ECX, 2 mov %EDX, DWORD PTR [%ESP + 8] mov %EAX, %EDX shl %EAX, 30 or %EAX, %ECX sar %EDX, 2 ret and long << 2 to this: foo: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, DWORD PTR [%ESP + 8] * mov %EDX, %EAX shrd %EDX, %ECX, 30 shl %EAX, 2 ret instead of this: foo: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, %EAX shr %ECX, 30 mov %EDX, DWORD PTR [%ESP + 8] shl %EDX, 2 or %EDX, %ECX shl %EAX, 2 ret The extra copy (marked *) can be eliminated when I teach the code generator that shrd32rri8 is really commutative. llvm-svn: 19681	2005-01-19 06:18:43 +00:00
Chris Lattner	419a5d213b	X86 shifts mask the amount. llvm-svn: 19678	2005-01-19 03:36:30 +00:00
Chris Lattner	6dec8cb829	Code to handle FP_EXTEND is dead now. X86 doesn't support any data types to FP_EXTEND from! llvm-svn: 19674	2005-01-18 20:05:56 +00:00
Chris Lattner	798e9c85d6	Remove more dead code. llvm-svn: 19673	2005-01-18 19:50:08 +00:00
Chris Lattner	401814508f	The selection dag code handles the promotions from F32 to F64 for us, so we don't need to even think about F32 in the X86 code anymore. llvm-svn: 19672	2005-01-18 19:46:54 +00:00
Chris Lattner	dc09e52b3e	Fix 124.m88ksim. llvm-svn: 19667	2005-01-18 17:35:28 +00:00
Chris Lattner	a04b1ee7a8	Do not emit loads multiple times, potentially in the wrong places. llvm-svn: 19661	2005-01-18 04:18:32 +00:00
Chris Lattner	722ddeb86e	Eliminate bad assertions. llvm-svn: 19659	2005-01-18 04:00:54 +00:00
Chris Lattner	8f3a8d96e2	* Eliminate the TokenSet and just use the ExprMap for both tokens and values. * Insert some really pedantic assertions that will notice when we emit the same loads more than one time, exposing bugs. This turns a miscompilation in bzip2 into a compile-fail. yaay. llvm-svn: 19658	2005-01-18 03:51:59 +00:00
Chris Lattner	b3edb09ede	Rely on the code in MatchAddress to do this work. Otherwise we fail to match (X+Y)+(Z << 1), because we match the X+Y first, consuming the index register, then there is no place to put the Z. llvm-svn: 19652	2005-01-18 02:25:52 +00:00
Chris Lattner	ce2e0125dc	Fix a problem where probing for addressing modes caused expressions to be emitted too early. In particular, this fixes Regression/CodeGen/X86/regpressure.ll:regpressure3. This also improves the 2nd basic block in 164.gzip:flush_block, which went from .LBBflush_block_1: # loopentry.1.i movzx %EAX, WORD PTR [dyn_ltree + 20] movzx %ECX, WORD PTR [dyn_ltree + 16] mov DWORD PTR [%ESP + 32], %ECX movzx %ECX, WORD PTR [dyn_ltree + 12] movzx %EDX, WORD PTR [dyn_ltree + 8] movzx %EBX, WORD PTR [dyn_ltree + 4] mov DWORD PTR [%ESP + 36], %EBX movzx %EBX, WORD PTR [dyn_ltree] add DWORD PTR [%ESP + 36], %EBX add %EDX, DWORD PTR [%ESP + 36] add %ECX, %EDX add DWORD PTR [%ESP + 32], %ECX add %EAX, DWORD PTR [%ESP + 32] movzx %ECX, WORD PTR [dyn_ltree + 24] add %EAX, %ECX mov %ECX, 0 mov %EDX, %ECX to .LBBflush_block_1: # loopentry.1.i movzx %EAX, WORD PTR [dyn_ltree] movzx %ECX, WORD PTR [dyn_ltree + 4] add %ECX, %EAX movzx %EAX, WORD PTR [dyn_ltree + 8] add %EAX, %ECX movzx %ECX, WORD PTR [dyn_ltree + 12] add %ECX, %EAX movzx %EAX, WORD PTR [dyn_ltree + 16] add %EAX, %ECX movzx %ECX, WORD PTR [dyn_ltree + 20] add %ECX, %EAX movzx %EAX, WORD PTR [dyn_ltree + 24] add %ECX, %EAX mov %EAX, 0 mov %EDX, %EAX ... which results in less spilling in the function. This change alone speeds up 164.gzip from 37.23s to 36.24s on apoc. The default isel takes 37.31s. llvm-svn: 19650	2005-01-18 01:06:26 +00:00
Chris Lattner	a78f9ced61	Fix indentation. llvm-svn: 19649	2005-01-17 23:25:45 +00:00
Chris Lattner	dff1e3e86f	Don't bother using max here. llvm-svn: 19647	2005-01-17 23:02:13 +00:00
Chris Lattner	2d86b43318	Do not give token factor nodes outrageous weights llvm-svn: 19645	2005-01-17 22:56:09 +00:00
Chris Lattner	f2878ce8ba	Two changes: 1. Fold [mem] += (1\|-1) into inc [mem]/dec [mem] to save some icache space. 2. Do not let token factor nodes prevent forming '[mem] op= val' folds. llvm-svn: 19643	2005-01-17 22:10:42 +00:00
Chris Lattner	40c0fca632	Refactor load/op/store folding into its own method, no functionality changes. llvm-svn: 19641	2005-01-17 19:25:26 +00:00
Chris Lattner	2348abc421	Fix a major regression last night that prevented us from producing [mem] op= reg operations. The body of the if is less indented but unmodified in this patch. llvm-svn: 19638	2005-01-17 17:49:14 +00:00
Chris Lattner	adb669ab1f	Codegen this: int %foo(int %X) { %T = add int %X, 13 %S = mul int %T, 3 ret int %S } as this: mov %ECX, DWORD PTR [%ESP + 4] lea %EAX, DWORD PTR [%ECX + 2*%ECX + 39] ret instead of this: mov %ECX, DWORD PTR [%ESP + 4] mov %EAX, %ECX add %EAX, 13 imul %EAX, %EAX, 3 ret llvm-svn: 19633	2005-01-17 06:48:02 +00:00
Chris Lattner	51590b615c	Fix test/Regression/CodeGen/X86/2005-01-17-CycleInDAG.ll and 132.ijpeg. Do not fold a load into an operation if it will induce a cycle in the DAG. Repeat after me: dAg. llvm-svn: 19631	2005-01-17 06:26:58 +00:00
Chris Lattner	f1e85bec5a	Do not fold a load into a comparison that is used by more than one place. The comparison will probably be folded, so this is not ok to do. This fixed 197.parser. llvm-svn: 19624	2005-01-17 01:34:14 +00:00
Chris Lattner	1b8c8fe020	Do not codegen 'xor bool, true' as 'not reg'. not reg inverts the upper bits of the bytereg. This fixes yacr2, 300.twolf and probably others. llvm-svn: 19622	2005-01-17 00:23:16 +00:00
Chris Lattner	46dac4394c	Set up the shift and setcc types. If we emit a load because we followed a token chain to get to it, try to fold it into its single user if possible. llvm-svn: 19620	2005-01-17 00:00:33 +00:00
Chris Lattner	9ffc59287e	* Adjust to changes in TargetLowering interfaces. * Remove custom promotion for bool and byte select ops. Legalize now promotes them for us. * Allow folding ConstantPoolIndexes into EXTLOAD's, useful for float immediates. * Declare which operations are not supported better. * Add some hacky code for TRUNCSTORE to pretend that we have truncstore for i16 types. This is useful for testing promotion code because I can just remove 16-bit registers all together and verify that programs work. llvm-svn: 19614	2005-01-16 07:34:08 +00:00
Chris Lattner	f3d950e816	Add support for truncstore and *extload. llvm-svn: 19566	2005-01-15 05:22:24 +00:00
Chris Lattner	27c91fac94	Adjust to CopyFromREg changes. llvm-svn: 19561	2005-01-14 22:37:41 +00:00
Chris Lattner	7a8788c9ac	Add new ImplicitDef node, rename CopyRegSDNode class to RegSDNode. llvm-svn: 19535	2005-01-13 20:50:02 +00:00
Chris Lattner	fce6a5439d	Codegen factor nodes more intelligently according to perceived register pressure. llvm-svn: 19532	2005-01-13 19:56:00 +00:00
Chris Lattner	cb4359465a	Initial trivial (but stupid) codegen for this node. llvm-svn: 19529	2005-01-13 18:01:36 +00:00
Chris Lattner	9a70166615	Add some really pedantic assertions to the load folding code. Fix a bunch of cases where we accidentally emitted a load folded once and unfolded elsewhere. llvm-svn: 19522	2005-01-13 05:53:16 +00:00
Chris Lattner	2ab70aafe0	We can only fold a load into an op if there is exactly one use of the value. Checking to see if the load has two uses is not equivalent, as the chain value may have zero uses. llvm-svn: 19518	2005-01-12 18:38:26 +00:00
Chris Lattner	4b03f0f99e	Try both ways to fold an add together. This allows us to generate this code imul %EAX, %EAX, 400 add %ECX, %EAX add %ESI, DWORD PTR [%ECX + 4*%EDX] inc %EDX cmp %EDX, 100 instead of this: imul %EAX, %EAX, 400 add %ECX, %EAX mov %EAX, %EDX shl %EAX, 2 add %ECX, %EAX add %ESI, DWORD PTR [%ECX] inc %EDX cmp %EDX, 100 llvm-svn: 19513	2005-01-12 18:08:53 +00:00
Chris Lattner	61c572eb7f	Fix a major miscompilation where we were overwriting the scale reg. llvm-svn: 19511	2005-01-12 07:33:20 +00:00
Chris Lattner	5816f1a302	Do not use the type of the RHS constant to determine the type of the operation. This fails for shifts because the constant is always 8 bits. llvm-svn: 19508	2005-01-12 05:22:07 +00:00
Chris Lattner	89d6b21ae6	Do not lose the offset from the global when peephole optimizing instructions. This fixes FreeBench/pcompress llvm-svn: 19507	2005-01-12 05:17:28 +00:00
Jeff Cohen	614a5ec22a	Fix more C++ compilation errors llvm-svn: 19504	2005-01-12 04:29:05 +00:00
Chris Lattner	5ef92f3a40	Fix a compile error with VC++, which thinks that static const arrays need to be dynamically initialized. :( llvm-svn: 19503	2005-01-12 04:23:22 +00:00
Chris Lattner	627c64e5e5	Fix a bug that caused us to crash on povray. We weren't emitting an FP_REG_KILL into a block that had a successor with a FP PHI node. llvm-svn: 19502	2005-01-12 04:21:28 +00:00
Chris Lattner	a5f0ba59a0	Print a load of a null pointer (in intel mode) like this: mov %AX, WORD PTR [0] instead of like this: mov %AX, WORD PTR [] llvm-svn: 19501	2005-01-12 04:07:11 +00:00
Chris Lattner	360988bae2	Print a load of a null pointer like this: movw 0, %ax instead of like this: movw , %ax llvm-svn: 19500	2005-01-12 04:05:19 +00:00
Chris Lattner	3c85c67c97	Fix a crash compiling povray on UINT_TO_FP from i16. llvm-svn: 19499	2005-01-12 04:00:00 +00:00
Chris Lattner	4e72a2a000	There are no [mem] op= reg instructions for FP, so remove their entries. llvm-svn: 19496	2005-01-12 03:16:09 +00:00
Chris Lattner	00cb0ace9b	Fix a bug where we didn't insert FP_REG_KILL instructions into MBB's that contain FP PHI nodes but no other FP defining instructions. This fixes 183.equake llvm-svn: 19495	2005-01-12 02:57:10 +00:00
Chris Lattner	92166ed1df	Fold TRUNCATE (LOAD P) into a smaller load from P. llvm-svn: 19494	2005-01-12 02:19:06 +00:00
Chris Lattner	258b23bd9d	Be more careful about order of arg evaluation for CopyToReg nodes. This shrinks 256.bzip2 from 7142 to 7103 lines of .s file. Second, add initial support for folding loads into compares, though this code is dynamically dead for now. :( llvm-svn: 19493	2005-01-12 02:02:48 +00:00
Chris Lattner	604416e8f4	Fold some more [mem] op= val operators. This allows us to things like this several times in 256.bzip2: mov %EAX, DWORD PTR [%ESP + 204] - mov %EAX, DWORD PTR [%EAX] - or %EAX, 2097152 - mov %ECX, DWORD PTR [%ESP + 204] - mov DWORD PTR [%ECX], %EAX + or DWORD PTR [%EAX], 2097152 llvm-svn: 19492	2005-01-12 01:28:00 +00:00
Chris Lattner	e83ae1063f	Fold loads into sign/zero extends. instead of: mov %AL, BYTE PTR [%EDX + l18_length_code] movzx %EAX, %AL Emit: movzx %EAX, BYTE PTR [%EDX + l18_length_code] llvm-svn: 19489	2005-01-11 23:33:00 +00:00
Chris Lattner	87a38bd4a8	Comment out debug code :) Select [mem] += Val operations. For constants, we used to get: mov %ECX, -32768 add %ECX, DWORD PTR [l4_match_start] mov DWORD PTR [l4_match_start], %ECX Now we get: add DWORD PTR [l4_match_start], -32768 For other values we used to get: mov %EBP, %EDI ;; because the add destroys the value add %EBP, DWORD PTR [l4_input_len] mov DWORD PTR [l4_input_len], %EBP now we get: add DWORD PTR [l4_input_len], %EDI Both of these use less registers than the alternative, are faster and smaller. llvm-svn: 19488	2005-01-11 23:21:30 +00:00
Chris Lattner	282473a25d	Handle the global address case here, not just the offset case. llvm-svn: 19487	2005-01-11 22:58:43 +00:00
Chris Lattner	9eb2cc700b	Treat int constants as not requiring a register, since they are almost always folded into an instruction. llvm-svn: 19486	2005-01-11 22:29:12 +00:00
Chris Lattner	7cb2220907	* Factor a bunch of binary operator cases into shared code. * Fold loads into Add, sub, and, or, xor and mul when possible. * Codegen shl X, 1 as add X, X llvm-svn: 19483	2005-01-11 21:19:59 +00:00
Chris Lattner	b838c9748e	Fold multiplies by 3,5,9 into addressing modes when possible. llvm-svn: 19480	2005-01-11 19:37:02 +00:00
Chris Lattner	e7b1130b01	Instead of generating stuff like this: mov %ECX, %EAX add %ECX, 32768 mov %SI, WORD PTR [2%ECX + l13_prev] Generate this: mov %SI, WORD PTR [2%ECX + l13_prev + 65536] This occurs when you have a GEP instruction where an index is "something + imm". llvm-svn: 19472	2005-01-11 06:36:20 +00:00
Chris Lattner	bb63a09cd1	Implement MEMCPY natively in terms of rep movs* llvm-svn: 19468	2005-01-11 06:19:26 +00:00
Chris Lattner	b2b08a8bc1	Implement memset -> rep stos* llvm-svn: 19467	2005-01-11 06:14:36 +00:00
Chris Lattner	58816a9e81	Announce that we don't support mem ops yet. llvm-svn: 19466	2005-01-11 05:57:36 +00:00
Chris Lattner	f867443d7e	Teach the address selector to make 'reg+reg' addressing modes. llvm-svn: 19457	2005-01-11 04:40:19 +00:00
Chris Lattner	edf06be50e	Emit NOT instructions. llvm-svn: 19455	2005-01-11 04:31:30 +00:00
Chris Lattner	4e4bef2d6c	Fix a bug emitting branches that broke a lot of programs. llvm-svn: 19452	2005-01-11 04:06:27 +00:00
Chris Lattner	4b51297a94	Be more careful where we set ContainsFPCode. We were missing a set in the int -> FP casting code. Note that we don't have to set it for FP operations that take FP values as operands: whatever produces the FP value will set the flag. llvm-svn: 19451	2005-01-11 03:50:45 +00:00
Chris Lattner	0c4c4094e3	Fix a major bug in setcc/cmov folding, where we accidentally inverted the sense of the comparison. llvm-svn: 19450	2005-01-11 03:37:59 +00:00
Chris Lattner	d188e03011	Take register pressure into account when we have to decide whether to evaluate the LHS or the RHS of an operation first. This causes good things to happen. For example, instead of compiling a loop to this: .LBBstrength_result7_1: # loopentry movl 16(%esp), %edi movl (%edi), %edi ;;; LOAD movl (%ecx), %ebx movl $2, (%eax,%ebx,4) movl (%edx), %ebx movl %esi, %ebp addl $21, %ebp addl $42, %esi cmpl $0, %edi ;;; USE cmovne %esi, %ebp cmpl %ebp, %ebx movl %ebp, %esi jg .LBBstrength_result7_1 We now compile it to this: .LBBstrength_result7_1: # loopentry movl %edi, %ebx addl $42, %ebx addl $21, %edi movl (%ecx), %ebp ;; LOAD cmpl $0, %ebp ;; USE cmovne %ebx, %edi movl (%edx), %ebx movl $2, (%eax,%ebx,4) movl (%esi), %ebx cmpl %edi, %ebx jg .LBBstrength_result7_1 Which reduces register pressure enough (in this case) to avoid spilling in the loop. As another example, consider the CodeGen/X86/regpressure.ll testcase. We used to generate this code for both cases: regpressure1: subl $32, %esp movl %esi, 12(%esp) movl %edi, 8(%esp) movl %ebx, 4(%esp) movl %ebp, (%esp) movl 36(%esp), %ecx movl (%ecx), %eax movl 4(%ecx), %edx movl %edx, 24(%esp) movl 8(%ecx), %edx movl %edx, 16(%esp) movl 12(%ecx), %edx movl 16(%ecx), %esi movl 20(%ecx), %edi movl 24(%ecx), %ebx movl %ebx, 28(%esp) movl 28(%ecx), %ebx movl 32(%ecx), %ebp movl %ebp, 20(%esp) movl 36(%ecx), %ecx imull 24(%esp), %eax imull 16(%esp), %eax imull %edx, %eax imull %esi, %eax imull %edi, %eax imull 28(%esp), %eax imull %ebx, %eax imull 20(%esp), %eax imull %ecx, %eax movl (%esp), %ebp movl 4(%esp), %ebx movl 8(%esp), %edi movl 12(%esp), %esi addl $32, %esp ret This code is basically trying to do all of the loads first, then execute all of the multiplies. Because we run out of registers, lots of spill code happens. 
We now generate this code for both cases: regpressure1: movl 4(%esp), %ecx movl (%ecx), %eax movl 4(%ecx), %edx imull %edx, %eax movl 8(%ecx), %edx imull %edx, %eax movl 12(%ecx), %edx imull %edx, %eax movl 16(%ecx), %edx imull %edx, %eax movl 20(%ecx), %edx imull %edx, %eax movl 24(%ecx), %edx imull %edx, %eax movl 28(%ecx), %edx imull %edx, %eax movl 32(%ecx), %edx imull %edx, %eax movl 36(%ecx), %ecx imull %ecx, %eax ret which is much nicer (when we fold loads into the muls it will be even better). The old instruction selector used to produce the good code for regpressure1 but not for regpressure2, as it depended on the order of operations in the LLVM code. llvm-svn: 19449	2005-01-11 03:11:44 +00:00
Chris Lattner	497e24c885	Fold setcc instructions into selects. llvm-svn: 19438	2005-01-10 22:10:13 +00:00
Chris Lattner	65d007ab62	Add conditional moves for the parity flag. llvm-svn: 19437	2005-01-10 22:09:33 +00:00
Chris Lattner	d61491dea2	Implement 8-bit multiply for X86. llvm-svn: 19435	2005-01-10 20:55:48 +00:00
Chris Lattner	fcab5f75c0	Codegen (Reg\|imm)+&GV as an LEA, because we cannot put it into the immediate field of an ADDri (due to current restrictions on MachineOperand :( ). This allows us to generate: leal Data+16000, %edx instead of: movl $Data, %edx addl $16000, %edx llvm-svn: 19420	2005-01-09 20:20:29 +00:00
Chris Lattner	35375c11bf	Fix copy and pasto's for FP -> Int. This fixes fldry llvm-svn: 19418	2005-01-09 19:49:59 +00:00
Chris Lattner	45155a3dee	Initial implementation of FP->INT and INT->FP casts Also, fix zero_extend from bool to i8, which fixes Shootout/objinst. llvm-svn: 19414	2005-01-09 18:52:44 +00:00
Chris Lattner	9ca9b20447	Fix a subtle bug involving constant expr casts from int to fp llvm-svn: 19410	2005-01-09 01:49:29 +00:00
Chris Lattner	c5e53c07fd	Implement varargs and returnaddress/frameaddress intrinsics. With this patch, all of SingleSource/UnitTests passes. llvm-svn: 19408	2005-01-09 00:01:27 +00:00
Chris Lattner	ca81756527	Okay 15th time is the charm. Looking at the vector size is useless as it gets clobbered by a previous statement. This fixes all calls finally. llvm-svn: 19399	2005-01-08 20:51:36 +00:00
Chris Lattner	85816cff9a	Okay, my off by one was actually off by two. This fixes Generic/2003-07-07-BadLongConst.ll llvm-svn: 19398	2005-01-08 20:39:31 +00:00
Chris Lattner	2d68cb6cf4	Fix off by one error llvm-svn: 19396	2005-01-08 20:31:34 +00:00
Chris Lattner	c4d075cfa3	Adjust to changes in LowerCallTo interface Minor bugfixes llvm-svn: 19376	2005-01-08 19:28:19 +00:00
Chris Lattner	6c7d3bd8ea	Wrap long line. llvm-svn: 19367	2005-01-08 06:59:50 +00:00
Chris Lattner	473ec492f7	The X86 instruction selector already handles codegen of: store float 123.45, float* %P as an integer store. This adds handling of float immediate stores as integers for arguments passed to function calls. This is now tested by CodeGen/X86/store-fp-constant.ll llvm-svn: 19364	2005-01-08 05:45:24 +00:00
Chris Lattner	2c398fc8f6	Allow the selection-dag based selector to be disabled with -disable-pattern-isel. For now, this is the default, as the current selector is missing some big pieces. To enable the new selector, pass -disable-pattern-isel=false to llc or lli. llvm-svn: 19335	2005-01-07 07:50:50 +00:00
Chris Lattner	216198574d	Reimplementation of the X86 pattern isel. This is still missing many large pieces, but can already do amazing things in some cases. llvm-svn: 19334	2005-01-07 07:49:41 +00:00
Chris Lattner	74019f517a	This file is now dead. llvm-svn: 19333	2005-01-07 07:49:05 +00:00
Chris Lattner	079b497982	Add a new prototype llvm-svn: 19332	2005-01-07 07:48:33 +00:00
Chris Lattner	608dd77d6b	Codegen -1 and -0.0 more efficiently. This implements CodeGen/X86/negatize_zero.ll llvm-svn: 19313	2005-01-06 21:19:16 +00:00
Chris Lattner	6d651234d6	1. If a double FP constant must be put into a constant pool, but it can be precisely represented as a float, put it into the constant pool as a float. 2. Use the cbw/cwd/cdq instructions instead of an explicit SAR for signed division. llvm-svn: 19291	2005-01-05 16:30:14 +00:00
Chris Lattner	b438f5251f	Minor optimization to allocate R8 registers in a better order. llvm-svn: 19289	2005-01-05 16:09:16 +00:00
Jeff Cohen	36968ed8c1	Revert elimination of global variable hack... still needed. llvm-svn: 19273	2005-01-03 16:34:19 +00:00
Chris Lattner	1aaf8cccb2	ADC and IMUL are also commutable. llvm-svn: 19264	2005-01-03 01:27:59 +00:00
Jeff Cohen	1087b72875	Eliminate the use of the global variable hack in the X86 target that was used to get Visual Studio to link in X86.lib to the executables that need it. There is another way of doing it. llvm-svn: 19252	2005-01-02 04:23:12 +00:00
Chris Lattner	a78fd4726e	Disable 2->3 address promotion of add and inc instructions to LEA's. In addition to being three address, LEA's don't set the flags. This fixes 186.crafty. llvm-svn: 19251	2005-01-02 04:18:17 +00:00
Chris Lattner	3ef32da6c3	Add a new method. llvm-svn: 19249	2005-01-02 02:38:18 +00:00
Chris Lattner	95f1e628ed	Add support for SETNPr to lower to memory form. llvm-svn: 19248	2005-01-02 02:37:46 +00:00
Chris Lattner	d6bc921fa8	Implement the convertToThreeAddress method, add support for inverting JP/JNP branches. llvm-svn: 19247	2005-01-02 02:37:07 +00:00
Chris Lattner	0d6f03e52b	Two changes here: 1. Add new instructions for checking parity flags: JP, JNP, SETP, SETNP. 2. Set the isCommutable and isPromotableTo3Address bits on several instructions. llvm-svn: 19246	2005-01-02 02:35:46 +00:00
Chris Lattner	3b78513843	Remove unused enum value llvm-svn: 19024	2004-12-17 22:41:46 +00:00
Chris Lattner	d11ba51208	Change the sentinel llvm-svn: 19007	2004-12-17 00:46:51 +00:00
Chris Lattner	59d0c02d2b	Create a stack slot for the return address lazily instead of eagerly. This saves small amounts of time for functions that don't call llvm.returnaddress or llvm.frameaddress (which is almost all functions). llvm-svn: 19006	2004-12-17 00:07:46 +00:00
Chris Lattner	4b1d58bf4b	Adjust to changes in asmwriter filenames llvm-svn: 18987	2004-12-16 17:33:24 +00:00
Chris Lattner	4136428410	Set the rounding mode for the X86 FPU to 64-bits instead of 80-bits. We don't support long double anyway, and this gives us FP results closer to other targets. This also speeds up 179.art from 41.4s to 18.32s, by eliminating a problem with extra precision that causes an FP == comparison to fail (leading to extra loop iterations). llvm-svn: 18895	2004-12-13 17:23:11 +00:00
Chris Lattner	6131b06f73	Use the target triple to pick this target. llvm-svn: 18830	2004-12-12 17:40:28 +00:00
Chris Lattner	99c8cf8ef8	Fix a regression caused by the previous patch llvm-svn: 18449	2004-12-03 05:13:15 +00:00
Chris Lattner	316f923a9c	Spill/restore X86 floating point stack registers with 64-bits of precision instead of 80-bits of precision. This fixes PR467. This change speeds up fldry on X86 with LLC from 7.32s on apoc to 4.68s. llvm-svn: 18433	2004-12-02 18:17:31 +00:00
Chris Lattner	dfdd49b7af	Consider 64-bit registers to be FP as well. llvm-svn: 18432	2004-12-02 17:57:21 +00:00
Tanya Lattner	893f987574	Reverting this patch: http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20041122/021428.html It broke MultiSource/Applications/obsequi llvm-svn: 18407	2004-12-01 18:27:03 +00:00
Chris Lattner	9c400f3b28	Revamp long/ulong comparisons to use a much more efficient sequence (thanks to Brian and the Sun compiler for pointing out that the obvious works :) This also enables folding all long comparisons into setcc and branch instructions: before we could only do == and != For example, for: void test(unsigned long long A, unsigned long long B) { if (A < B) foo(); } We now generate: test: subl $4, %esp movl %esi, (%esp) movl 8(%esp), %eax movl 12(%esp), %ecx movl 16(%esp), %edx movl 20(%esp), %esi subl %edx, %eax sbbl %esi, %ecx jae .LBBtest_2 # UnifiedReturnBlock .LBBtest_1: # then call foo movl (%esp), %esi addl $4, %esp ret .LBBtest_2: # UnifiedReturnBlock movl (%esp), %esi addl $4, %esp ret Instead of: test: subl $12, %esp movl %esi, 8(%esp) movl %ebx, 4(%esp) movl 16(%esp), %eax movl 20(%esp), %ecx movl 24(%esp), %edx movl 28(%esp), %esi cmpl %edx, %eax setb %al cmpl %esi, %ecx setb %bl cmove %ax, %bx testb %bl, %bl je .LBBtest_2 # UnifiedReturnBlock .LBBtest_1: # then call foo movl 4(%esp), %ebx movl 8(%esp), %esi addl $12, %esp ret .LBBtest_2: # UnifiedReturnBlock movl 4(%esp), %ebx movl 8(%esp), %esi addl $12, %esp ret llvm-svn: 18330	2004-11-29 05:55:24 +00:00
Chris Lattner	7a34cbf266	Do not push two return addresses on the stack when we call external functions who have their addresses taken. This fixes test-call.ll llvm-svn: 18134	2004-11-22 22:25:30 +00:00
Chris Lattner	f4c8575535	There is no reason to emit function stubs for direct calls. llvm-svn: 18082	2004-11-21 03:46:06 +00:00
Chris Lattner	6d1fb33657	ignore generated files llvm-svn: 18073	2004-11-21 00:01:54 +00:00
Chris Lattner	4a340e281e	Remove all JIT specific code and switch the code generator over to emitting relocations for global references. llvm-svn: 18068	2004-11-20 23:55:15 +00:00
Chris Lattner	b9a44893e9	Implement the X86 JIT interfaces llvm-svn: 18067	2004-11-20 23:54:33 +00:00
Chris Lattner	8e33311566	Describe the X86 target-specific relocations. llvm-svn: 18066	2004-11-20 23:54:19 +00:00
Chris Lattner	3c20464ad7	We implement these interfaces llvm-svn: 18065	2004-11-20 23:53:56 +00:00
Chris Lattner	0c79788bc4	Don't forget to switch back to decimal output llvm-svn: 18010	2004-11-19 20:57:07 +00:00
Chris Lattner	a7eec14b04	Fix a major bug in the signed shr code, which apparently only breaks 134.perl! llvm-svn: 17902	2004-11-16 18:40:52 +00:00
Chris Lattner	41d31d7461	Remove a dead function, which died when we got GAS emission working (phwew, hold your nose!) llvm-svn: 17869	2004-11-16 04:34:29 +00:00
Chris Lattner	b378786c97	Implement a simple FIXME: if we are emitting a basic block address that has already been emitted, we don't have to remember it and deal with it later, just emit it directly. llvm-svn: 17868	2004-11-16 04:30:51 +00:00
Chris Lattner	3f73c77ace	* Merge some win32 ifdefs together * Get rid of "emitMaybePCRelativeValue", either we want to emit a PC relative value or not: drop the maybe BS. As it turns out, the only places where the bool was a variable coming in, the bool was a dynamic constant. llvm-svn: 17867	2004-11-16 04:21:18 +00:00
Chris Lattner	3ed3e8669f	Add debug-only=jit printout, so we see when lazily resolved symbols are set up. llvm-svn: 17862	2004-11-15 23:16:55 +00:00
Chris Lattner	9ef34d44e1	Simplify and rearrange long shift code llvm-svn: 17861	2004-11-15 23:16:34 +00:00
Misha Brukman	8c1b4a5b9d	GhostLinkage should not reach asm printing stage llvm-svn: 17750	2004-11-14 21:03:49 +00:00
Chris Lattner	09b7f968e0	Don't print unneeded labels llvm-svn: 17714	2004-11-13 23:27:11 +00:00
Chris Lattner	1cde11aa95	shld is a very high latency operation. Instead of emitting it for shifts of two or three, open code the equivalent operation which is faster on athlon and P4 (by a substantial margin). For example, instead of compiling this: long long X2(long long Y) { return Y << 2; } to: X3_2: movl 4(%esp), %eax movl 8(%esp), %edx shldl $2, %eax, %edx shll $2, %eax ret Compile it to: X2: movl 4(%esp), %eax movl 8(%esp), %ecx movl %eax, %edx shrl $30, %edx leal (%edx,%ecx,4), %edx shll $2, %eax ret Likewise, for << 3, compile to: X3: movl 4(%esp), %eax movl 8(%esp), %ecx movl %eax, %edx shrl $29, %edx leal (%edx,%ecx,8), %edx shll $3, %eax ret This matches icc, except that icc open codes the shifts as adds on the P4. llvm-svn: 17707	2004-11-13 20:48:57 +00:00
Chris Lattner	c531e090db	Add missing check llvm-svn: 17706	2004-11-13 20:04:38 +00:00
Chris Lattner	d1381380ae	Compile: long long X3_2(long long Y) { return Y+Y; } int X(int Y) { return Y+Y; } into: X3_2: movl 4(%esp), %eax movl 8(%esp), %edx addl %eax, %eax adcl %edx, %edx ret X: movl 4(%esp), %eax addl %eax, %eax ret instead of: X3_2: movl 4(%esp), %eax movl 8(%esp), %edx shldl $1, %eax, %edx shll $1, %eax ret X: movl 4(%esp), %eax shll $1, %eax ret llvm-svn: 17705	2004-11-13 20:03:48 +00:00
John Criswell	402e338f11	Correct the name of stosd for the AT&T syntax: It's stosl (l for long == 32 bit). llvm-svn: 17658	2004-11-10 04:48:15 +00:00
John Criswell	97da76178c	Fix compilation problem; make the cast and the LHS be the same type. llvm-svn: 17488	2004-11-05 16:17:06 +00:00
Chris Lattner	499e1b16a7	Quiet VC++ warnings llvm-svn: 17484	2004-11-05 04:50:59 +00:00
Chris Lattner	d9696aa7b8	Fix a warning llvm-svn: 17431	2004-11-02 15:27:57 +00:00
Chris Lattner	10de12fd46	Add placeholder variable to make Win32 work, applied for Morten Ofstad llvm-svn: 17406	2004-11-01 20:10:20 +00:00
Reid Spencer	d3f7233495	Change Library Names Not To Conflict With Others When Installed llvm-svn: 17286	2004-10-27 23:18:45 +00:00
Reid Spencer	019621a1ea	Adjust to changes in Makefile.rules llvm-svn: 17167	2004-10-22 21:02:08 +00:00
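
Several of the commit messages above describe instruction-selection rewrites that rest on simple 32-bit arithmetic identities. The following is a minimal Python sketch, not code from the repository, that checks three of them: the LEA form of (X + 13) * 3 from adb669ab1f, the open-coded 64-bit left shift from 1cde11aa95, and the sub/sbb unsigned 64-bit comparison from 9c400f3b28. All function names are hypothetical and chosen only for illustration; registers are modeled as Python integers masked to 32 bits.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def mul_add_naive(x):
    """add 13, then imul by 3, truncated to 32 bits."""
    return ((x + 13) * 3) & MASK32


def mul_add_lea(x):
    """Same value as one address computation: x + 2*x + 39."""
    return (x + 2 * x + 39) & MASK32


def shl64_reference(lo, hi, n):
    """Shift the 64-bit value hi:lo left by n and split it again."""
    value = (((hi << 32) | lo) << n) & MASK64
    return value & MASK32, value >> 32


def shl64_open_coded(lo, hi, n):
    """Open-coded form for 0 < n < 32: shr, lea-style add, shl.

    The top n bits of lo move into the low n bits of the new high
    half; hi << n has those bits clear, so an add acts like an or.
    """
    carry_bits = lo >> (32 - n)
    new_hi = (carry_bits + ((hi << n) & MASK32)) & MASK32
    new_lo = (lo << n) & MASK32
    return new_lo, new_hi


def ult64_sub_sbb(a_lo, a_hi, b_lo, b_hi):
    """Unsigned A < B on 32-bit halves, modeled on sub followed by sbb.

    sub produces a borrow from the low halves; sbb subtracts the high
    halves plus that borrow. A < B exactly when the final borrow is set.
    """
    borrow = 1 if a_lo < b_lo else 0
    return (a_hi - b_hi - borrow) < 0


def ult64_reference(a_lo, a_hi, b_lo, b_hi):
    """Direct comparison of the reassembled 64-bit values."""
    return ((a_hi << 32) | a_lo) < ((b_hi << 32) | b_lo)
```

The sketch only demonstrates that the rewritten sequences compute the same values; it says nothing about latency or code size, which are the reasons the commit messages give for preferring one form over another.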


1076 Commits