llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 20:43:44 +02:00

Author	SHA1	Message	Date
Chris Lattner	1ac1e54bf9	Fix bug: 2004-10-08-SelectSetCCFold.llx. Normally this is hidden by the instcombine xform, which is why we didn't notice it before. llvm-svn: 16840	2004-10-08 16:34:13 +00:00
Chris Lattner	5839d93b51	Instcombine (X & FF00) + xx00 -> (X+xx00) & FF00, implementing and.ll:test27 This comes up when doing adds to bitfield elements. llvm-svn: 16836	2004-10-08 05:07:56 +00:00
Chris Lattner	87259c3ce9	Little patch to turn (shl (add X, 123), 4) -> (add (shl X, 4), 123 << 4) This triggers in cases of bitfield additions, opening opportunities for future improvements. llvm-svn: 16834	2004-10-08 03:46:20 +00:00
Nate Begeman	dfefd2f3fc	Implement logical and with an immediate that consists of a contiguous block of one or more 1 bits (may wrap from least significant bit to most significant bit) as the rlwinm rather than andi., andis., or some longer instructions sequence. int andn4(int z) { return z & -4; } int clearhi(int z) { return z & 0x0000FFFF; } int clearlo(int z) { return z & 0xFFFF0000; } int clearmid(int z) { return z & 0x00FFFF00; } int clearwrap(int z) { return z & 0xFF0000FF; } _andn4: rlwinm r3, r3, 0, 0, 29 blr _clearhi: rlwinm r3, r3, 0, 16, 31 blr _clearlo: rlwinm r3, r3, 0, 0, 15 blr _clearmid: rlwinm r3, r3, 0, 8, 23 blr _clearwrap: rlwinm r3, r3, 0, 24, 7 blr llvm-svn: 16832	2004-10-08 02:49:24 +00:00
Nate Begeman	370b1b7a9a	Several fixes and enhancements to the PPC32 backend. 1. Fix an illegal argument to getClassB when deciding whether or not to sign extend a byte load. 2. Initial addition of isLoad and isStore flags to the instruction .td file for eventual use in a scheduler. 3. Rewrite of how constants are handled in emitSimpleBinaryOperation so that we can emit the PowerPC shifted immediate instructions far more often. This allows us to emit the following code: int foo(int x) { return x | 0x00F0000; } _foo: .LBB_foo_0: ; entry ; IMPLICIT_DEF oris r3, r3, 15 blr llvm-svn: 16826	2004-10-07 22:30:03 +00:00
Nate Begeman	76d2a77998	Add ori reg, reg, 0 as a move instruction. This can be generated from loading a 32bit constant into a register whose low halfword is all zeroes. We now omit the ori after the lis for the following C code: int bar(int y) { return y * 0x00F0000; } _bar: .LBB_bar_0: ; entry ; IMPLICIT_DEF lis r2, 15 mullw r3, r3, r2 blr llvm-svn: 16825	2004-10-07 22:26:12 +00:00
Nate Begeman	f60feea650	Remove unnecessary header include llvm-svn: 16824	2004-10-07 22:24:32 +00:00
Chris Lattner	7882b54197	Improve comments, no functionality changes llvm-svn: 16814	2004-10-07 21:30:30 +00:00
Chris Lattner	d15e144241	Fix a nasty dangling pointer problem, due to a free'd pointer being left in a map. This caused problems if a later object happened to be allocated at the free'd object's address. llvm-svn: 16813	2004-10-07 20:01:31 +00:00
Chris Lattner	50e55bcdb0	Unfortunately the fix for the previous bug introduced the previous exponential behavior (bork!). This patch processes stuff with an explicit SCC finder, allowing the algorithm to be more clear, efficient, and also (as a bonus) correct! This gets us back to taking 0.6s to disassemble my horrible .bc file that previously took something > 30 mins. llvm-svn: 16811	2004-10-07 19:20:48 +00:00
Chris Lattner	dfdbd62d37	Fix a bug in my previous change. Unfortunately this reverts most of the speedup, but has the advantage of not breaking a bunch of programs! llvm-svn: 16806	2004-10-07 16:19:40 +00:00
Chris Lattner	e1d5d599bd	Fix a bug in the safety analysis routine llvm-svn: 16804	2004-10-07 06:01:25 +00:00
Chris Lattner	e7ec24c63e	Comment cleanups llvm-svn: 16803	2004-10-07 06:00:24 +00:00
Chris Lattner	ad9fe72e72	* Rename pass to globalopt, since we do more than just constify * Instead of handling dead functions specially, just nuke them. * Be more aggressive about cleaning up after constification, in particular, handle getelementptr instructions and constantexprs. * Be a little bit more structured about how we process globals. *** Delete globals that are only stored to, and never read. These are clearly not useful, so they should go. This implements deadglobal.llx This last one triggers quite a few times. In particular, 2208 in the external tests, 1865 of which are in 252.eon. This shrinks eon from 1995094 to 1732341 bytes of bytecode. llvm-svn: 16802	2004-10-07 04:16:33 +00:00
Chris Lattner	4a19983f2d	Implement GlobalConstifier/trivialstore.llx, and also do some simplifications of the resultant program to avoid making later passes do it all. This allows us to constify globals that just have the same constant that they are initialized stored into them. Surprisingly this comes up ALL of the freaking time, dozens of times in SPEC, 30 times in vortex alone. For example, on 256.bzip2, it allows us to constify these two globals: %smallMode = internal global ubyte 0 ; <ubyte> [#uses=8] %verbosity = internal global int 0 ; <int> [#uses=49] Which (with later optimizations) results in the bytecode file shrinking from 82286 to 69686 bytes! Let's hear it for IPO :) For the record, it's nuking lots of "if (verbosity > 2) { do lots of stuff }" code. llvm-svn: 16793	2004-10-06 20:57:02 +00:00
Chris Lattner	e412d10cc0	Don't let null nodes sneak past cast instructions llvm-svn: 16779	2004-10-06 19:29:13 +00:00
Chris Lattner	c2563bf614	Change Type::isAbstract to have better comments, a more correct name (PromoteAbstractToConcrete), and to use a set to avoid recomputation. In particular, this set eliminates the potentially exponential cases from this little recursive algorithm. On a particularly nasty testcase, llvm-dis on the .bc file went from 34 minutes (which is when I killed it, it still hadn't finished) to 0.57s. Remember kids, exponential algorithms are bad. llvm-svn: 16772	2004-10-06 16:36:46 +00:00
Chris Lattner	38fbf09104	Correct some typos llvm-svn: 16770	2004-10-06 16:28:24 +00:00
Chris Lattner	ff8cbd01e7	Instcombine: -(X sdiv C) -> (X sdiv -C), tested by sub.ll:test16 llvm-svn: 16769	2004-10-06 15:08:25 +00:00
Chris Lattner	82aa8544a5	Remove debugging code, fix encoding problem. This fixes the problems the JIT had last night. llvm-svn: 16766	2004-10-06 14:31:50 +00:00
Nate Begeman	79d42a185e	Turning on fsel code gen now that we can do so would be good. llvm-svn: 16765	2004-10-06 11:03:30 +00:00
Nate Begeman	7b4fe83ba8	Implement floating point select for lt, gt, le, ge using the powerpc fsel instruction. Now, rather than emitting the following loop out of bisect: .LBB_main_19: ; no_exit.0.i rlwinm r3, r2, 3, 0, 28 lfdx f1, r3, r27 addis r3, r30, ha16(.CPI_main_1-"L00000$pb") lfd f2, lo16(.CPI_main_1-"L00000$pb")(r3) fsub f2, f2, f1 addis r3, r30, ha16(.CPI_main_1-"L00000$pb") lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3) fcmpu cr0, f1, f4 bge .LBB_main_64 ; no_exit.0.i .LBB_main_63: ; no_exit.0.i b .LBB_main_65 ; no_exit.0.i .LBB_main_64: ; no_exit.0.i fmr f2, f1 .LBB_main_65: ; no_exit.0.i addi r3, r2, 1 rlwinm r3, r3, 3, 0, 28 lfdx f1, r3, r27 addis r3, r30, ha16(.CPI_main_1-"L00000$pb") lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3) fsub f4, f4, f1 addis r3, r30, ha16(.CPI_main_1-"L00000$pb") lfd f5, lo16(.CPI_main_1-"L00000$pb")(r3) fcmpu cr0, f1, f5 bge .LBB_main_67 ; no_exit.0.i .LBB_main_66: ; no_exit.0.i b .LBB_main_68 ; no_exit.0.i .LBB_main_67: ; no_exit.0.i fmr f4, f1 .LBB_main_68: ; no_exit.0.i fadd f1, f2, f4 addis r3, r30, ha16(.CPI_main_2-"L00000$pb") lfd f2, lo16(.CPI_main_2-"L00000$pb")(r3) fmul f1, f1, f2 rlwinm r3, r2, 3, 0, 28 lfdx f2, r3, r28 fadd f4, f2, f1 fcmpu cr0, f4, f0 bgt .LBB_main_70 ; no_exit.0.i .LBB_main_69: ; no_exit.0.i b .LBB_main_71 ; no_exit.0.i .LBB_main_70: ; no_exit.0.i fmr f0, f4 .LBB_main_71: ; no_exit.0.i fsub f1, f2, f1 addi r2, r2, -1 fcmpu cr0, f1, f3 blt .LBB_main_73 ; no_exit.0.i .LBB_main_72: ; no_exit.0.i b .LBB_main_74 ; no_exit.0.i .LBB_main_73: ; no_exit.0.i fmr f3, f1 .LBB_main_74: ; no_exit.0.i cmpwi cr0, r2, -1 fmr f16, f0 fmr f17, f3 bgt .LBB_main_19 ; no_exit.0.i We emit this instead: .LBB_main_19: ; no_exit.0.i rlwinm r3, r2, 3, 0, 28 lfdx f1, r3, r27 addis r3, r30, ha16(.CPI_main_1-"L00000$pb") lfd f2, lo16(.CPI_main_1-"L00000$pb")(r3) fsub f2, f2, f1 fsel f1, f1, f1, f2 addi r3, r2, 1 rlwinm r3, r3, 3, 0, 28 lfdx f2, r3, r27 addis r3, r30, ha16(.CPI_main_1-"L00000$pb") lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3) fsub f4, f4, f2 fsel f2, f2, f2, f4 fadd f1, f1, f2 addis r3, r30, ha16(.CPI_main_2-"L00000$pb") lfd f2, lo16(.CPI_main_2-"L00000$pb")(r3) fmul f1, f1, f2 rlwinm r3, r2, 3, 0, 28 lfdx f2, r3, r28 fadd f4, f2, f1 fsub f5, f0, f4 fsel f0, f5, f0, f4 fsub f1, f2, f1 addi r2, r2, -1 fsub f2, f1, f3 fsel f3, f2, f3, f1 cmpwi cr0, r2, -1 fmr f16, f0 fmr f17, f3 bgt .LBB_main_19 ; no_exit.0.i llvm-svn: 16764	2004-10-06 09:53:04 +00:00
Chris Lattner	b0e465f0cb	Codegen signed mod by 2 or -2 more efficiently. Instead of generating: t: mov %EDX, DWORD PTR [%ESP + 4] mov %ECX, 2 mov %EAX, %EDX sar %EDX, 31 idiv %ECX mov %EAX, %EDX ret Generate: t: mov %ECX, DWORD PTR [%ESP + 4] * mov %EAX, %ECX cdq and %ECX, 1 xor %ECX, %EDX sub %ECX, %EDX * mov %EAX, %ECX ret Note that the two marked moves are redundant, and should be eliminated by the register allocator, but aren't. Compare this to GCC, which generates: t: mov %eax, DWORD PTR [%esp+4] mov %edx, %eax shr %edx, 31 lea %ecx, [%edx+%eax] and %ecx, -2 sub %eax, %ecx ret or ICC 8.0, which generates: t: movl 4(%esp), %ecx #3.5 movl $-2147483647, %eax #3.25 imull %ecx #3.25 movl %ecx, %eax #3.25 sarl $31, %eax #3.25 addl %ecx, %edx #3.25 subl %edx, %eax #3.25 addl %eax, %eax #3.25 negl %eax #3.25 subl %eax, %ecx #3.25 movl %ecx, %eax #3.25 ret #3.25 We would be in great shape if not for the moves. llvm-svn: 16763	2004-10-06 05:01:07 +00:00
Chris Lattner	c959314701	Really fix FreeBSD, which apparently doesn't tolerate the extern. Thanks to Jeff Cohen for pointing out my goof. llvm-svn: 16762	2004-10-06 04:21:52 +00:00
Chris Lattner	09b6b3f514	Fix a scary bug with signed division by a power of two. We used to generate: s: ;; X / 4 mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, %EAX sar %ECX, 1 shr %ECX, 30 mov %EDX, %EAX add %EDX, %ECX sar %EAX, 2 ret When we really meant: s: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, %EAX sar %ECX, 1 shr %ECX, 30 add %EAX, %ECX sar %EAX, 2 ret Hey, this also reduces register pressure too :) llvm-svn: 16761	2004-10-06 04:19:43 +00:00
Chris Lattner	9258948b08	Codegen signed divides by 2 and -2 more efficiently. In particular instead of: s: ;; X / 2 movl 4(%esp), %eax movl %eax, %ecx shrl $31, %ecx movl %eax, %edx addl %ecx, %edx sarl $1, %eax ret t: ;; X / -2 movl 4(%esp), %eax movl %eax, %ecx shrl $31, %ecx movl %eax, %edx addl %ecx, %edx sarl $1, %eax negl %eax ret Emit: s: movl 4(%esp), %eax cmpl $-2147483648, %eax sbbl $-1, %eax sarl $1, %eax ret t: movl 4(%esp), %eax cmpl $-2147483648, %eax sbbl $-1, %eax sarl $1, %eax negl %eax ret llvm-svn: 16760	2004-10-06 04:02:39 +00:00
Chris Lattner	acd213fba3	Add some new instructions. Fix the asm string for sbb32rr llvm-svn: 16759	2004-10-06 04:01:02 +00:00
Chris Lattner	5f0c904ec0	Reduce code growth implied by the tail duplication pass by not duplicating an instruction if it can be hoisted to a common dominator of the block. This implements: test/Regression/Transforms/TailDup/MergeTest.ll llvm-svn: 16758	2004-10-06 03:27:37 +00:00
Chris Lattner	b2e8fdc431	FreeBSD uses GCC. Patch contributed by Jeff Cohen! llvm-svn: 16756	2004-10-06 03:15:44 +00:00
Brian Gaeke	38641114a3	Must include sys/stat.h before declaring a 'struct stat' llvm-svn: 16728	2004-10-05 18:46:59 +00:00
Chris Lattner	6023408a6e	Make sure the const bit gets inherited correctly when linking declarations of disagreeing constness. This fixes test/Regression/Linker/ConstantGlobals[123].ll llvm-svn: 16692	2004-10-05 02:28:11 +00:00
Reid Spencer	bba26329ab	Adjust sys/stat.h inclusion so it's only for SunOS. llvm-svn: 16686	2004-10-05 00:56:46 +00:00
Tanya Lattner	7198953962	Added a couple of includes to get this to compile on Sparc. llvm-svn: 16685	2004-10-05 00:51:26 +00:00
Chris Lattner	6547451f32	Solaris doesn't have MAP_FILE. llvm-svn: 16682	2004-10-05 00:46:21 +00:00
Reid Spencer	079b225788	Excise the ill-advised RLCOMP compression algorithm and simply leave the previously temporary NULLCOMP implementation that merely copies the data verbatim without compression. Also, don't warn if there's no compression library as that is taken care of during configuration time. llvm-svn: 16654	2004-10-04 17:45:44 +00:00
Reid Spencer	49089d64c2	Add a context for the callback so different compression scenarios can be distinguished. Tidy up documentation. Thanks, Chris. llvm-svn: 16652	2004-10-04 17:29:25 +00:00
Chris Lattner	9f6c72d660	Fix build if not HAVE_BZIP2 llvm-svn: 16650	2004-10-04 16:33:25 +00:00
Reid Spencer	da2e8b9943	First version of the MappedFile abstraction for operating system independent mapping of files. This first version uses mmap where it's available. The class needs to implement an alternate mechanism based on malloc'd memory and file reading/writing for platforms without virtual memory. llvm-svn: 16649	2004-10-04 11:08:32 +00:00
Reid Spencer	d2bedc512d	First version of a support utility to provide generalized compression in LLVM that handles availability and unavailability of bzip2 and zlib. llvm-svn: 16648	2004-10-04 10:49:41 +00:00
Chris Lattner	0228f228df	* Prune #includes * Update comments * Rearrange code a bit * Finally ELIMINATE the GAS workaround emitter for Intel mode. woot! llvm-svn: 16647	2004-10-04 07:31:08 +00:00
Chris Lattner	581948c8f6	Add support for emitting AT&T style .s files, and make it the default. Users may now choose their output format with the -x86-asm-syntax={intel|att} flag. llvm-svn: 16646	2004-10-04 07:24:48 +00:00
Chris Lattner	5959f4a108	Convert some missed patterns to support AT&T style llvm-svn: 16645	2004-10-04 07:23:07 +00:00
Chris Lattner	a05d9f53bb	Apparently the GNU assembler has a HUGE hack to be compatible with really old and broken AT&T syntax assemblers. The problem with this hack is that SOME forms of the fdiv and fsub instructions have the 'r' bit inverted. This was a real pain to figure out, but is trivially easy to support: thus we are now bug compatible with gas and gcc. llvm-svn: 16644	2004-10-04 07:08:46 +00:00
Chris Lattner	08098895db	Fix incorrect suffix llvm-svn: 16642	2004-10-04 05:20:16 +00:00
Chris Lattner	c2fc9597bd	Fix some more missed suffixes and swapped operands llvm-svn: 16641	2004-10-04 01:38:10 +00:00
Chris Lattner	7b15a84728	Add missing suffixes to FP instructions for AT&T mode llvm-svn: 16640	2004-10-04 00:43:31 +00:00
Chris Lattner	8d44dcca97	Add support for the -x86-asm-syntax flag, which can be used to choose between Intel and AT&T style assembly language. The ultimate goal of this is to eliminate the GasBugWorkaroundEmitter class, but for now AT&T style emission is not fully operational. llvm-svn: 16639	2004-10-03 20:36:57 +00:00
Chris Lattner	94780713a8	Add support to the instruction patterns for AT&T style output, which will hopefully lead to the death of the 'GasBugWorkaroundEmitter'. This also includes changes to wrap the whole file to 80 columns! Woot! :) Note that the AT&T style output has not been tested at all. llvm-svn: 16638	2004-10-03 20:35:00 +00:00
Chris Lattner	30b5b79aa0	Add initial support for variants llvm-svn: 16635	2004-10-03 19:34:18 +00:00
Chris Lattner	815b635639	Do not repeat the map lookup llvm-svn: 16633	2004-10-01 23:16:43 +00:00
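
Several of the messages above describe arithmetic rewrites only through their assembly output. The C below is a minimal sketch of the same identities, written from those descriptions alone. It is not code from the LLVM tree, and every function name in it (sar32, srem2, sdiv2, shl_add_before, shl_add_after, is_run, is_rlwinm_mask) is hypothetical. It covers the signed X % 2 sequence from svn 16763, the signed X / 2 sequence from svn 16760, the shl-of-add rewrite from svn 16834, and one possible test for the "contiguous block of 1 bits, possibly wrapping" masks from svn 16832.

```c
#include <stdint.h>

/* Portable arithmetic right shift (what x86 `sar` does). Shifting a
   negative int with >> is implementation-defined in C, so negative
   inputs go through their complement, which is non-negative. */
int32_t sar32(int32_t x, int n) {
    return x < 0 ? ~(~x >> n) : x >> n;
}

/* Signed X % 2 without idiv, following the cdq / and / xor / sub
   sequence quoted in svn 16763. */
int32_t srem2(int32_t x) {
    int32_t sign = -(x < 0);        /* what cdq leaves in EDX: 0 or -1 */
    return ((x & 1) ^ sign) - sign; /* and, xor, sub */
}

/* Signed X / 2, following the cmpl / sbbl / sarl sequence quoted in
   svn 16760: the cmpl/sbbl pair adds 1 to negative inputs only, so the
   arithmetic shift rounds toward zero like C division. */
int32_t sdiv2(int32_t x) {
    return sar32(x + (x < 0), 1);
}

/* (shl (add X, 123), 4) and its rewritten form from svn 16834. Unsigned
   arithmetic wraps modulo 2^32, so the two always agree. */
uint32_t shl_add_before(uint32_t x) { return (x + 123u) << 4; }
uint32_t shl_add_after(uint32_t x)  { return (x << 4) + (123u << 4); }

/* Nonzero if m is a single run of 1 bits that does not wrap. */
int is_run(uint32_t m) {
    uint32_t filled = m | (m - 1);  /* set every bit below the run */
    return m != 0 && (filled & (filled + 1)) == 0;
}

/* Nonzero if m is a contiguous block of one or more 1 bits that may wrap
   from bit 0 around to bit 31. A wrapped run is the complement of an
   unwrapped run. This is an assumed check, not the backend's own. */
int is_rlwinm_mask(uint32_t m) {
    return m != 0 && (is_run(m) || is_run(~m));
}
```

All five masks from the svn 16832 message (-4, 0x0000FFFF, 0xFFFF0000, 0x00FFFF00, 0xFF0000FF) pass is_rlwinm_mask, while a two-run mask such as 0x00FF00FF does not. srem2 and sdiv2 agree with the C operators % and / for every int32_t input, including INT32_MIN.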

7804 Commits