llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 05:23:45 +02:00

Author	SHA1	Message	Date
Chris Lattner	4f504b0751	If we found a dead global, we should at least delete it... llvm-svn: 16858	2004-10-08 22:05:31 +00:00
Chris Lattner	7fc483bf28	* Pull out the meat of runOnModule into another function for clarity. * Do not lead dangling dead constants prevent optimization * Iterate global optimization while we're making progress. These changes allow us to be more aggressive, handling cases like GlobalOpt/iterate.llx without a problem (turning it into 'ret int 0'). llvm-svn: 16857	2004-10-08 20:59:28 +00:00
Chris Lattner	d777571d0c	We might as well delete the known-dead global sooner rather than later since we know it is dead. llvm-svn: 16855	2004-10-08 20:25:55 +00:00
Misha Brukman	af84f00600	Hyphenate target-(in)dependent for more tasty grammar goodness (tm) llvm-svn: 16854	2004-10-08 19:43:31 +00:00
Chris Lattner	816a8a5e1e	Temporarily disable a buggy transformation until it can be fixed. This fixes 254.gap. llvm-svn: 16853	2004-10-08 19:15:44 +00:00
Misha Brukman	d858079005	Adjust paths due to moving InstrSched to lib/Target/SparcV9 llvm-svn: 16852	2004-10-08 18:30:22 +00:00
Misha Brukman	a01e5cd2ea	InstrSched has been moved to lib/Target/SparcV9 llvm-svn: 16850	2004-10-08 18:12:53 +00:00
Misha Brukman	049f559995	InstrSched is SparcV9-specific and so has been moved to lib/Target/SparcV9/ llvm-svn: 16849	2004-10-08 18:12:14 +00:00
Misha Brukman	b1851ac70c	Single-space instead of double-spacing in the Makefile llvm-svn: 16848	2004-10-08 18:11:14 +00:00
Misha Brukman	cfe86b257b	Build InstrSched as well, and all three subdirs can be built independently llvm-svn: 16847	2004-10-08 18:10:48 +00:00
Misha Brukman	5ea7613a13	Single-space instead of double-spacing in the Makefile llvm-svn: 16845	2004-10-08 18:05:25 +00:00
Chris Lattner	71aecc5006	Implement SRA for global variables. This allows the other global variable optimizations to trigger much more often. This allows the elimination of several dozen more global variables in Programs/External. Note that we only do this for non-constant globals: constant globals will already be optimized out if the accesses to them permit it. This implements Transforms/GlobalOpt/globalsra.llx llvm-svn: 16842	2004-10-08 17:32:09 +00:00
Chris Lattner	1ac1e54bf9	Fix bug: 2004-10-08-SelectSetCCFold.llx. Normally this is hidden by the instcombine xform, which is why we didn't notice it before. llvm-svn: 16840	2004-10-08 16:34:13 +00:00
Chris Lattner	5839d93b51	Instcombine (X & FF00) + xx00 -> (X+xx00) & FF00, implementing and.ll:test27 This comes up when doing adds to bitfield elements. llvm-svn: 16836	2004-10-08 05:07:56 +00:00
Chris Lattner	87259c3ce9	Little patch to turn (shl (add X, 123), 4) -> (add (shl X, 4), 123 << 4) This triggers in cases of bitfield additions, opening opportunities for future improvements. llvm-svn: 16834	2004-10-08 03:46:20 +00:00
Nate Begeman	dfefd2f3fc	Implement logical and with an immediate that consists of a contiguous block of one or more 1 bits (may wrap from least significant bit to most significant bit) as the rlwinm rather than andi., andis., or some longer instructons sequence. int andn4(int z) { return z & -4; } int clearhi(int z) { return z & 0x0000FFFF; } int clearlo(int z) { return z & 0xFFFF0000; } int clearmid(int z) { return z & 0x00FFFF00; } int clearwrap(int z) { return z & 0xFF0000FF; } _andn4: rlwinm r3, r3, 0, 0, 29 blr _clearhi: rlwinm r3, r3, 0, 16, 31 blr _clearlo: rlwinm r3, r3, 0, 0, 15 blr _clearmid: rlwinm r3, r3, 0, 8, 23 blr _clearwrap: rlwinm r3, r3, 0, 24, 7 blr llvm-svn: 16832	2004-10-08 02:49:24 +00:00
Nate Begeman	370b1b7a9a	Several fixes and enhancements to the PPC32 backend. 1. Fix an illegal argument to getClassB when deciding whether or not to sign extend a byte load. 2. Initial addition of isLoad and isStore flags to the instruction .td file for eventual use in a scheduler. 3. Rewrite of how constants are handled in emitSimpleBinaryOperation so that we can emit the PowerPC shifted immediate instructions far more often. This allows us to emit the following code: int foo(int x) { return x \| 0x00F0000; } _foo: .LBB_foo_0: ; entry ; IMPLICIT_DEF oris r3, r3, 15 blr llvm-svn: 16826	2004-10-07 22:30:03 +00:00
Nate Begeman	76d2a77998	Add ori reg, reg, 0 as a move instruction. This can be generated from loading a 32bit constant into a register whose low halfword is all zeroes. We now omit the ori after the lis for the following C code: int bar(int y) { return y * 0x00F0000; } _bar: .LBB_bar_0: ; entry ; IMPLICIT_DEF lis r2, 15 mullw r3, r3, r2 blr llvm-svn: 16825	2004-10-07 22:26:12 +00:00
Nate Begeman	f60feea650	Remove unnecessary header include llvm-svn: 16824	2004-10-07 22:24:32 +00:00
Chris Lattner	7882b54197	Improve comments, no functionality changes llvm-svn: 16814	2004-10-07 21:30:30 +00:00
Chris Lattner	d15e144241	Fix a nasty dangling pointer problem, due to a free'd pointer being left in a map. This caused problems if a later object happened to be allocated at the free'd object's address. llvm-svn: 16813	2004-10-07 20:01:31 +00:00
Chris Lattner	50e55bcdb0	Unfortunately the fix for the previous bug introduced the previous exponential behavior (bork!). This patch processes stuff with an explicit SCC finder, allowing the algorithm to be more clear, efficient, and also (as a bonus) correct! This gets us back to taking 0.6s to disassemble my horrible .bc file that previously took something > 30 mins. llvm-svn: 16811	2004-10-07 19:20:48 +00:00
Chris Lattner	dfdbd62d37	Fix a bug in my previous change. Unfortunately this reverts most of the speedup, but has the advantage of not breaking a bunch of programs! llvm-svn: 16806	2004-10-07 16:19:40 +00:00
Chris Lattner	e1d5d599bd	Fix a bug in the safety analysis routine llvm-svn: 16804	2004-10-07 06:01:25 +00:00
Chris Lattner	e7ec24c63e	Comment cleanups llvm-svn: 16803	2004-10-07 06:00:24 +00:00
Chris Lattner	ad9fe72e72	* Rename pass to globalopt, since we do more than just constify * Instead of handling dead functions specially, just nuke them. * Be more aggressive about cleaning up after constification, in particular, handle getelementptr instructions and constantexprs. * Be a little bit more structured about how we process globals. *** Delete globals that are only stored to, and never read. These are clearly not useful, so they should go. This implements deadglobal.llx This last one triggers quite a few times. In particular, 2208 in the external tests, 1865 of which are in 252.eon. This shrinks eon from 1995094 to 1732341 bytes of bytecode. llvm-svn: 16802	2004-10-07 04:16:33 +00:00
Chris Lattner	4a19983f2d	Implement GlobalConstifier/trivialstore.llx, and also do some simplifications of the resultant program to avoid making later passes do it all. This allows us to constify globals that just have the same constant that they are initialized stored into them. Suprisingly this comes up ALL of the freaking time, dozens of times in SPEC, 30 times in vortex alone. For example, on 256.bzip2, it allows us to constify these two globals: %smallMode = internal global ubyte 0 ; <ubyte> [#uses=8] %verbosity = internal global int 0 ; <int> [#uses=49] Which (with later optimizations) results in the bytecode file shrinking from 82286 to 69686 bytes! Lets hear it for IPO :) For the record, it's nuking lots of "if (verbosity > 2) { do lots of stuff }" code. llvm-svn: 16793	2004-10-06 20:57:02 +00:00
Chris Lattner	e412d10cc0	Dont' let null nodes sneak past cast instructions llvm-svn: 16779	2004-10-06 19:29:13 +00:00
Chris Lattner	c2563bf614	Change Type::isAbstract to have better comments, a more correct name (PromoteAbstractToConcrete), and to use a set to avoid recomputation. In particular, this set eliminates the potentially exponential cases from this little recursive algorithm. On a particularly nasty testcase, llvm-dis on the .bc file went from 34 minutes (which is when I killed it, it still hadn't finished) to 0.57s. Remember kids, exponential algorithms are bad. llvm-svn: 16772	2004-10-06 16:36:46 +00:00
Chris Lattner	38fbf09104	Correct some typeos llvm-svn: 16770	2004-10-06 16:28:24 +00:00
Chris Lattner	ff8cbd01e7	Instcombine: -(X sdiv C) -> (X sdiv -C), tested by sub.ll:test16 llvm-svn: 16769	2004-10-06 15:08:25 +00:00
Chris Lattner	82aa8544a5	Remove debugging code, fix encoding problem. This fixes the problems the JIT had last night. llvm-svn: 16766	2004-10-06 14:31:50 +00:00
Nate Begeman	79d42a185e	Turning on fsel code gen now that we can do so would be good. llvm-svn: 16765	2004-10-06 11:03:30 +00:00
Nate Begeman	7b4fe83ba8	Implement floating point select for lt, gt, le, ge using the powerpc fsel instruction. Now, rather than emitting the following loop out of bisect: .LBB_main_19: ; no_exit.0.i rlwinm r3, r2, 3, 0, 28 lfdx f1, r3, r27 addis r3, r30, ha16(.CPI_main_1-"L00000$pb") lfd f2, lo16(.CPI_main_1-"L00000$pb")(r3) fsub f2, f2, f1 addis r3, r30, ha16(.CPI_main_1-"L00000$pb") lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3) fcmpu cr0, f1, f4 bge .LBB_main_64 ; no_exit.0.i .LBB_main_63: ; no_exit.0.i b .LBB_main_65 ; no_exit.0.i .LBB_main_64: ; no_exit.0.i fmr f2, f1 .LBB_main_65: ; no_exit.0.i addi r3, r2, 1 rlwinm r3, r3, 3, 0, 28 lfdx f1, r3, r27 addis r3, r30, ha16(.CPI_main_1-"L00000$pb") lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3) fsub f4, f4, f1 addis r3, r30, ha16(.CPI_main_1-"L00000$pb") lfd f5, lo16(.CPI_main_1-"L00000$pb")(r3) fcmpu cr0, f1, f5 bge .LBB_main_67 ; no_exit.0.i .LBB_main_66: ; no_exit.0.i b .LBB_main_68 ; no_exit.0.i .LBB_main_67: ; no_exit.0.i fmr f4, f1 .LBB_main_68: ; no_exit.0.i fadd f1, f2, f4 addis r3, r30, ha16(.CPI_main_2-"L00000$pb") lfd f2, lo16(.CPI_main_2-"L00000$pb")(r3) fmul f1, f1, f2 rlwinm r3, r2, 3, 0, 28 lfdx f2, r3, r28 fadd f4, f2, f1 fcmpu cr0, f4, f0 bgt .LBB_main_70 ; no_exit.0.i .LBB_main_69: ; no_exit.0.i b .LBB_main_71 ; no_exit.0.i .LBB_main_70: ; no_exit.0.i fmr f0, f4 .LBB_main_71: ; no_exit.0.i fsub f1, f2, f1 addi r2, r2, -1 fcmpu cr0, f1, f3 blt .LBB_main_73 ; no_exit.0.i .LBB_main_72: ; no_exit.0.i b .LBB_main_74 ; no_exit.0.i .LBB_main_73: ; no_exit.0.i fmr f3, f1 .LBB_main_74: ; no_exit.0.i cmpwi cr0, r2, -1 fmr f16, f0 fmr f17, f3 bgt .LBB_main_19 ; no_exit.0.i We emit this instead: .LBB_main_19: ; no_exit.0.i rlwinm r3, r2, 3, 0, 28 lfdx f1, r3, r27 addis r3, r30, ha16(.CPI_main_1-"L00000$pb") lfd f2, lo16(.CPI_main_1-"L00000$pb")(r3) fsub f2, f2, f1 fsel f1, f1, f1, f2 addi r3, r2, 1 rlwinm r3, r3, 3, 0, 28 lfdx f2, r3, r27 addis r3, r30, ha16(.CPI_main_1-"L00000$pb") lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3) fsub f4, f4, f2 fsel f2, f2, f2, f4 fadd f1, f1, f2 addis r3, r30, ha16(.CPI_main_2-"L00000$pb") lfd f2, lo16(.CPI_main_2-"L00000$pb")(r3) fmul f1, f1, f2 rlwinm r3, r2, 3, 0, 28 lfdx f2, r3, r28 fadd f4, f2, f1 fsub f5, f0, f4 fsel f0, f5, f0, f4 fsub f1, f2, f1 addi r2, r2, -1 fsub f2, f1, f3 fsel f3, f2, f3, f1 cmpwi cr0, r2, -1 fmr f16, f0 fmr f17, f3 bgt .LBB_main_19 ; no_exit.0.i llvm-svn: 16764	2004-10-06 09:53:04 +00:00
Chris Lattner	b0e465f0cb	Codegen signed mod by 2 or -2 more efficiently. Instead of generating: t: mov %EDX, DWORD PTR [%ESP + 4] mov %ECX, 2 mov %EAX, %EDX sar %EDX, 31 idiv %ECX mov %EAX, %EDX ret Generate: t: mov %ECX, DWORD PTR [%ESP + 4] * mov %EAX, %ECX cdq and %ECX, 1 xor %ECX, %EDX sub %ECX, %EDX * mov %EAX, %ECX ret Note that the two marked moves are redundant, and should be eliminated by the register allocator, but aren't. Compare this to GCC, which generates: t: mov %eax, DWORD PTR [%esp+4] mov %edx, %eax shr %edx, 31 lea %ecx, [%edx+%eax] and %ecx, -2 sub %eax, %ecx ret or ICC 8.0, which generates: t: movl 4(%esp), %ecx #3.5 movl $-2147483647, %eax #3.25 imull %ecx #3.25 movl %ecx, %eax #3.25 sarl $31, %eax #3.25 addl %ecx, %edx #3.25 subl %edx, %eax #3.25 addl %eax, %eax #3.25 negl %eax #3.25 subl %eax, %ecx #3.25 movl %ecx, %eax #3.25 ret #3.25 We would be in great shape if not for the moves. llvm-svn: 16763	2004-10-06 05:01:07 +00:00
Chris Lattner	c959314701	Really fix FreeBSD, which apparently doesn't tolerate the extern. Thanks to Jeff Cohen for pointing out my goof. llvm-svn: 16762	2004-10-06 04:21:52 +00:00
Chris Lattner	09b6b3f514	Fix a scary bug with signed division by a power of two. We used to generate: s: ;; X / 4 mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, %EAX sar %ECX, 1 shr %ECX, 30 mov %EDX, %EAX add %EDX, %ECX sar %EAX, 2 ret When we really meant: s: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, %EAX sar %ECX, 1 shr %ECX, 30 add %EAX, %ECX sar %EAX, 2 ret Hey, this also reduces register pressure too :) llvm-svn: 16761	2004-10-06 04:19:43 +00:00
Chris Lattner	9258948b08	Codegen signed divides by 2 and -2 more efficiently. In particular instead of: s: ;; X / 2 movl 4(%esp), %eax movl %eax, %ecx shrl $31, %ecx movl %eax, %edx addl %ecx, %edx sarl $1, %eax ret t: ;; X / -2 movl 4(%esp), %eax movl %eax, %ecx shrl $31, %ecx movl %eax, %edx addl %ecx, %edx sarl $1, %eax negl %eax ret Emit: s: movl 4(%esp), %eax cmpl $-2147483648, %eax sbbl $-1, %eax sarl $1, %eax ret t: movl 4(%esp), %eax cmpl $-2147483648, %eax sbbl $-1, %eax sarl $1, %eax negl %eax ret llvm-svn: 16760	2004-10-06 04:02:39 +00:00
Chris Lattner	acd213fba3	Add some new instructions. Fix the asm string for sbb32rr llvm-svn: 16759	2004-10-06 04:01:02 +00:00
Chris Lattner	5f0c904ec0	Reduce code growth implied by the tail duplication pass by not duplicating an instruction if it can be hoisted to a common dominator of the block. This implements: test/Regression/Transforms/TailDup/MergeTest.ll llvm-svn: 16758	2004-10-06 03:27:37 +00:00
Chris Lattner	b2e8fdc431	FreeBSD uses GCC. Patch contributed by Jeff Cohen! llvm-svn: 16756	2004-10-06 03:15:44 +00:00
Brian Gaeke	38641114a3	Must include sys/stat.h before declaring a 'struct stat' llvm-svn: 16728	2004-10-05 18:46:59 +00:00
Chris Lattner	6023408a6e	Make sure the const bit gets inherited correctly when linking declarations of disagreeing constness. This fixes test/Regression/Linker/ConstantGlobals[123].ll llvm-svn: 16692	2004-10-05 02:28:11 +00:00
Reid Spencer	bba26329ab	Adjust sys/stat.h inclusion so its only for SunOS. llvm-svn: 16686	2004-10-05 00:56:46 +00:00
Tanya Lattner	7198953962	Added a couple of includes to get this to compile on Sparc. llvm-svn: 16685	2004-10-05 00:51:26 +00:00
Chris Lattner	6547451f32	Solaris doesn't have MAP_FILE. llvm-svn: 16682	2004-10-05 00:46:21 +00:00
Reid Spencer	079b225788	Excise the ill-advised RLCOMP compression algorithm and simply leave the previously temporary NULLCOMP implementation that merely copies the data verbatim without compression. Also, don't warn if there's no compression library as that is taken care of during configuration time. llvm-svn: 16654	2004-10-04 17:45:44 +00:00
Reid Spencer	49089d64c2	Add a context for the callback so different compression scenarios can be distinguished. Tidy up documentation. Thanks, Chris. llvm-svn: 16652	2004-10-04 17:29:25 +00:00
Chris Lattner	9f6c72d660	Fix build if not HAVE_BZIP2 llvm-svn: 16650	2004-10-04 16:33:25 +00:00
Reid Spencer	da2e8b9943	First version of the MappedFile abstraction for operating system idependent mapping of files. This first version uses mmap where its available. The class needs to implement an alternate mechanism based on malloc'd memory and file reading/writing for platforms without virtual memory. llvm-svn: 16649	2004-10-04 11:08:32 +00:00

1 2 3 4 5 ...

7816 Commits