llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00

Author	SHA1	Message	Date
Alkis Evlogimenos	2caa729f02	Remove assert since it is breaking cases that it shouldn't. llvm-svn: 11841	2004-02-25 22:01:06 +00:00
Alkis Evlogimenos	f1516015af	Add DenseMap template and actually use it for mapping virtual regs to objects. llvm-svn: 11840	2004-02-25 21:55:45 +00:00
Chris Lattner	24c4dd56b5	Add a new pass, run internalize first llvm-svn: 11839	2004-02-25 21:35:13 +00:00
Chris Lattner	0dfe43a53a	Add a new pass llvm-svn: 11838	2004-02-25 21:35:02 +00:00
Chris Lattner	64cbbad38e	Add prototype llvm-svn: 11837	2004-02-25 21:34:51 +00:00
Chris Lattner	2a13dd5706	My faith in programmers has been found to be totally misplaced. One would assume that if they don't intend to write to a global variable, that they would mark it as constant. However, there are people that don't understand that the compiler can do nice things for them if they give it the information it needs. This pass looks for blatantly obvious globals that are only ever read from. Though it uses a trivially simple "alias analysis" of sorts, it is still able to do amazing things to important benchmarks. 253.perlbmk, for example, contains several *GIANT* function pointer tables that are not marked constant and should be. Marking them constant allows the optimizer to turn a whole bunch of indirect calls into direct calls. Note that only a link-time optimizer can do this transformation, but perlbmk does have several strings and other minor globals that can be marked constant by this pass when run from GCCAS. 176.gcc has a ton of strings and large tables that are marked constant, both at compile time (38 of them) and at link time (48 more). Other benchmarks give similar results, though it seems like big ones have disproportionally more than small ones. This pass is extremely quick and does good things. I'm going to enable it in gccas & gccld. Not bad for 50 SLOC. llvm-svn: 11836	2004-02-25 21:34:36 +00:00
Misha Brukman	6a13621948	SparcV8 regs are really 32-bit, not 64! Thanks, Chris. llvm-svn: 11835	2004-02-25 21:03:02 +00:00
Misha Brukman	f12c1e5a55	Clean up the tablegen descriptions for SparcV8. llvm-svn: 11834	2004-02-25 21:02:21 +00:00
Misha Brukman	c8801eb5be	Fix the SparcV8 register definitions that were imported from PPC template. llvm-svn: 11833	2004-02-25 21:00:05 +00:00
Misha Brukman	a4b3e0f01b	SparcV8 has different types of instructions, but F1 is only used for CALL. llvm-svn: 11832	2004-02-25 20:52:20 +00:00
Brian Gaeke	0f7bfb8ef5	Note that this test is currently expected to fail. llvm-svn: 11831	2004-02-25 20:34:02 +00:00
Chris Lattner	ccae3f6d60	Add an assertion llvm-svn: 11830	2004-02-25 19:37:44 +00:00
Chris Lattner	7c05e5d4d8	Fix failures in 099.go due to the cfgsimplify pass creating switch instructions where there were none before llvm-svn: 11829	2004-02-25 19:30:19 +00:00
Brian Gaeke	5166390fd2	SparcV8 skeleton llvm-svn: 11828	2004-02-25 19:28:19 +00:00
Brian Gaeke	c6de948cd1	Great renaming part II: Sparc --> SparcV9 (also includes command-line options and Makefiles) llvm-svn: 11827	2004-02-25 19:08:12 +00:00
Brian Gaeke	965df0b91b	Great renaming: Sparc --> SparcV9 llvm-svn: 11826	2004-02-25 18:44:15 +00:00
Chris Lattner	4f09004dff	Add a bunch more functions used by perlbmk llvm-svn: 11824	2004-02-25 17:43:20 +00:00
John Criswell	cf4474a67b	Updated to use llc to generate CBE code. llvm-svn: 11823	2004-02-25 17:15:02 +00:00
Chris Lattner	1dfc1b9629	Substantial improvements and cleanups for the release notes. We were missing a bunch of stuff! :) llvm-svn: 11822	2004-02-25 16:36:51 +00:00
Chris Lattner	04f116953d	Fix incorrect debug code llvm-svn: 11821	2004-02-25 15:15:04 +00:00
Chris Lattner	ab9628ad18	Teach the instruction selector how to transform 'array' GEP computations into X86 scaled indexes. This allows us to compile GEP's like this: int* %test([10 x { int, { int } }]* %X, int %Idx) { %Idx = cast int %Idx to long %X = getelementptr [10 x { int, { int } }]* %X, long 0, long %Idx, ubyte 1, ubyte 0 ret int* %X } Into a single address computation: test: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, DWORD PTR [%ESP + 8] lea %EAX, DWORD PTR [%EAX + 8*%ECX + 4] ret Before it generated: test: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, DWORD PTR [%ESP + 8] shl %ECX, 3 add %EAX, %ECX lea %EAX, DWORD PTR [%EAX + 4] ret This is useful for things like int/float/double arrays, as the indexing can be folded into the loads&stores, reducing register pressure and decreasing the pressure on the decode unit. With these changes, I expect our performance on 256.bzip2 and gzip to improve a lot. On bzip2 for example, we go from this: 10665 asm-printer - Number of machine instrs printed 40 ra-local - Number of loads/stores folded into instructions 1708 ra-local - Number of loads added 1532 ra-local - Number of stores added 1354 twoaddressinstruction - Number of instructions added 1354 twoaddressinstruction - Number of two-address instructions 2794 x86-peephole - Number of peephole optimization performed to this: 9873 asm-printer - Number of machine instrs printed 41 ra-local - Number of loads/stores folded into instructions 1710 ra-local - Number of loads added 1521 ra-local - Number of stores added 789 twoaddressinstruction - Number of instructions added 789 twoaddressinstruction - Number of two-address instructions 2142 x86-peephole - Number of peephole optimization performed ... and these types of instructions are often in tight loops. Linear scan is also helped, but not as much. It goes from: 8787 asm-printer - Number of machine instrs printed 2389 liveintervals - Number of identity moves eliminated after coalescing 2288 liveintervals - Number of interval joins performed 3522 liveintervals - Number of intervals after coalescing 5810 liveintervals - Number of original intervals 700 spiller - Number of loads added 487 spiller - Number of stores added 303 spiller - Number of register spills 1354 twoaddressinstruction - Number of instructions added 1354 twoaddressinstruction - Number of two-address instructions 363 x86-peephole - Number of peephole optimization performed to: 7982 asm-printer - Number of machine instrs printed 1759 liveintervals - Number of identity moves eliminated after coalescing 1658 liveintervals - Number of interval joins performed 3282 liveintervals - Number of intervals after coalescing 4940 liveintervals - Number of original intervals 635 spiller - Number of loads added 452 spiller - Number of stores added 288 spiller - Number of register spills 789 twoaddressinstruction - Number of instructions added 789 twoaddressinstruction - Number of two-address instructions 258 x86-peephole - Number of peephole optimization performed Though I'm not complaining about the drop in the number of intervals. :) llvm-svn: 11820	2004-02-25 07:00:55 +00:00
Chris Lattner	dccf14825c	* Make the previous patch more efficient by not allocating a temporary MachineInstr to do analysis. * FOLD getelementptr instructions into loads and stores when possible, making use of some of the crazy X86 addressing modes. For example, the following C++ program fragment: struct complex { double re, im; complex(double r, double i) : re(r), im(i) {} }; inline complex operator+(const complex& a, const complex& b) { return complex(a.re+b.re, a.im+b.im); } complex addone(const complex& arg) { return arg + complex(1,0); } Used to be compiled to: _Z6addoneRK7complex: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, DWORD PTR [%ESP + 8] * mov %EDX, %ECX fld QWORD PTR [%EDX] fld1 faddp %ST(1) * add %ECX, 8 fld QWORD PTR [%ECX] fldz faddp %ST(1) * mov %ECX, %EAX fxch %ST(1) fstp QWORD PTR [%ECX] *** add %EAX, 8 fstp QWORD PTR [%EAX] ret Now it is compiled to: _Z6addoneRK7complex: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, DWORD PTR [%ESP + 8] fld QWORD PTR [%ECX] fld1 faddp %ST(1) fld QWORD PTR [%ECX + 8] fldz faddp %ST(1) fxch %ST(1) fstp QWORD PTR [%EAX] fstp QWORD PTR [%EAX + 8] ret Other programs should see similar improvements, across the board. Note that in addition to reducing instruction count, this also reduces register pressure a lot, always a good thing on X86. :) llvm-svn: 11819	2004-02-25 06:13:04 +00:00
Chris Lattner	10d08a2955	Add a helper to create an addressing mode given all of the pieces. llvm-svn: 11818	2004-02-25 06:01:07 +00:00
Chris Lattner	c0e2bc0250	add an inefficient way of folding structure and constant array indexes together into a single LEA instruction. This should improve the code generated for things like X->A.B.C[12].D. The bigger benefit is still coming though. Note that this uses an LEA instruction instead of an add, giving the register allocator more freedom. We should probably never generate ADDri32's. llvm-svn: 11817	2004-02-25 03:45:50 +00:00
Chris Lattner	969f90db77	Implement special case for storing an immediate into memory so that we don't need an intermediate register. llvm-svn: 11816	2004-02-25 02:56:58 +00:00
Brian Gaeke	5d29845f19	Cygwin defines log2 as a macro. Undef it here IFF it has already been defined, so that we always get the inline function instead. Remember, kids, like it says in the GCC manual, "An Inline Function is As Fast As a Macro." llvm-svn: 11815	2004-02-25 01:53:45 +00:00
Brian Gaeke	8bf1c4c026	small portability fix. llvm-svn: 11814	2004-02-24 22:58:31 +00:00
Chris Lattner	9036c86b14	Add support for 'rename' llvm-svn: 11813	2004-02-24 22:17:00 +00:00
Chris Lattner	57ee51ae0b	Make the verifier a little more explicit about this problem. llvm-svn: 11811	2004-02-24 22:06:07 +00:00
Chris Lattner	d9652be664	Add support for remove, fwrite, and fread. Also fix problem where we didn't check to see if a node pointer was null. Though fclose(null) doesn't make a lot of sense, 300.twolf does it. llvm-svn: 11810	2004-02-24 22:02:48 +00:00
John Criswell	0158db84e0	Added the VTune tests. llvm-svn: 11809	2004-02-24 21:43:38 +00:00
Brian Gaeke	eae0364189	FunctionLiveVarInfo.h moved: include/llvm/CodeGen -> lib/Target/Sparc/LiveVar llvm-svn: 11804	2004-02-24 19:46:00 +00:00
Chris Lattner	9da41150e8	Fix some unexpected fallout from the config.h changes. Because the CBE no longer was getting this #include, it always fell back on the less precise floating point initializer values, causing some testsuite failures. llvm-svn: 11803	2004-02-24 18:34:10 +00:00
Chris Lattner	fc15346b60	Fix a faulty optimization on FP values llvm-svn: 11801	2004-02-24 18:10:14 +00:00
John Criswell	b086ac893e	Fixed minor typos. llvm-svn: 11800	2004-02-24 16:13:56 +00:00
Chris Lattner	7845e4f7f0	If a block is made dead, make sure to promptly remove it. llvm-svn: 11799	2004-02-24 16:09:21 +00:00
Alkis Evlogimenos	6d7150e9bb	Move machine code rewriter and spiller outside the register allocator. The implementation is completely rewritten and now employs several optimizations not exercised before. For example, for 164.gzip we have 997 loads and 699 stores vs the 1221 loads and 880 stores we had before. llvm-svn: 11798	2004-02-24 08:58:30 +00:00
Chris Lattner	d678669018	Implement SimplifyCFG/switch_switch_fold.ll This case occurs many times in various benchmarks, especially when combined with the previous patch. This allows it to get stuff like: if (X == 4 || X == 3) if (X == 5 || X == 8) and switch (X) { case 4: case 5: case 6: if (X == 4 || X == 5) llvm-svn: 11797	2004-02-24 07:23:58 +00:00
Chris Lattner	d75c0eef9f	New testcase. Switch instructions that go to switch instructions should be merged. llvm-svn: 11796	2004-02-24 07:21:09 +00:00
Alkis Evlogimenos	042f01039b	Add predicates for checking if a virtual register has a physical register mapping or a stack slot mapping. llvm-svn: 11795	2004-02-24 06:30:36 +00:00
Chris Lattner	9f2c8c7ea5	Add some helpful methods for dealing with switch instructions llvm-svn: 11794	2004-02-24 06:26:00 +00:00
Chris Lattner	1293e1d00c	Rearrange code a bit llvm-svn: 11793	2004-02-24 05:54:22 +00:00
Chris Lattner	e5db7dc4c6	Implement: test/Regression/Transforms/SimplifyCFG/switch_create.ll This turns code like this: if (X == 4 | X == 7) and if (X != 4 & X != 7) into switch instructions. llvm-svn: 11792	2004-02-24 05:38:11 +00:00
Chris Lattner	90da2d674f	The simplifycfg pass should be able to turn stuff like: if (X == 4 || X == 7) and if (X != 4 && X != 7) into switch instructions. llvm-svn: 11791	2004-02-24 05:34:44 +00:00
Chris Lattner	2b7ba0ef67	Wow, the description of the 'switch' instruction was out of date. llvm-svn: 11790	2004-02-24 04:54:45 +00:00
Chris Lattner	dc65f68a98	we no longer include boost llvm-svn: 11789	2004-02-24 04:02:20 +00:00
Chris Lattner	da760e3e77	Hrm, my find must have been faulty. It didn't remove these as well. llvm-svn: 11788	2004-02-24 03:54:22 +00:00
Chris Lattner	8138c7e4a2	Boost is now unneeded, thanks to the fix for PR253, contributed by Reid Spencer! llvm-svn: 11787	2004-02-24 03:53:00 +00:00
Chris Lattner	4c3548fcb9	Now that's a new feature! llvm-svn: 11786	2004-02-24 03:50:24 +00:00
Chris Lattner	e532a181c7	Use the new LLVM is_class template instead of the boost one, allowing us to remove our dependency on boost! Thanks to Reid Spencer for making this possible! llvm-svn: 11785	2004-02-24 03:50:05 +00:00
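
The scaled-index folding described in the llvm-svn 11820 entry above can be checked with a little arithmetic. The sketch below is not LLVM code. It only models the two instruction sequences quoted in that commit message, assuming 32-bit `int` fields so that each `{ int, { int } }` array element is 8 bytes and field 1 sits at byte offset 4. The function names and constants are hypothetical, chosen for illustration.

```python
# Assumption: 32-bit ints, as in the X86 example from the commit message.
INT_SIZE = 4                  # bytes per int
ELEM_SIZE = 2 * INT_SIZE      # size of one { int, { int } } element: 8 bytes
FIELD1_OFFSET = INT_SIZE      # byte offset of field 1 (the nested { int })


def address_unfolded(base, idx):
    """Model of the older sequence: shl, add, then lea with a constant."""
    ecx = idx << 3            # shl %ECX, 3   (multiply the index by 8)
    eax = base + ecx          # add %EAX, %ECX
    eax = eax + 4             # lea %EAX, DWORD PTR [%EAX + 4]
    return eax


def address_folded(base, idx):
    """Model of the single lea: [%EAX + 8*%ECX + 4]."""
    return base + ELEM_SIZE * idx + FIELD1_OFFSET


if __name__ == "__main__":
    base, idx = 1000, 3
    print(address_unfolded(base, idx), address_folded(base, idx))
```

Both functions compute `base + 8*idx + 4`, which is why the three-instruction sequence can be replaced by one address computation using the X86 base + scale*index + displacement addressing form.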

10783 Commits