llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00

Author	SHA1	Message	Date
Chris Lattner	28645a15bd	Take advantage of the recent improvements to the liveintervals set (tracking instructions which define each value#) to simplify and improve the coalescer. In particular, this patch: 1. Implements iterative coalescing. 2. Reverts an unsafe hack from handlePhysRegDef, superseding it with a better solution. 3. Implements PR865, "coalescing" away the second copy in code like: A = B ... B = A This also includes changes to symbolically print registers in intervals when possible. llvm-svn: 29862	2006-08-24 22:43:55 +00:00
Bill Wendling	33d04dd115	Added a check so that if we have two machine instructions in this form MOV R0, R1 MOV R1, R0 the second machine instruction is removed. Added a regression test. llvm-svn: 29792	2006-08-21 07:33:33 +00:00
Jim Laskey	085a8477a7	Eliminate data relocations by using NULL instead of global empty list. llvm-svn: 29250	2006-07-21 21:15:20 +00:00
Andrew Lenharth	c1074954fb	Reduce number of exported symbols llvm-svn: 29220	2006-07-20 17:28:38 +00:00
Chris Lattner	de706b3e3e	Shave another 27K off libllvmgcc.dylib with visibility hidden llvm-svn: 28973	2006-06-28 22:17:39 +00:00
Chris Lattner	685568510a	Move some methods out of MachineInstr into MachineOperand llvm-svn: 28102	2006-05-04 17:52:23 +00:00
Chris Lattner	c2914c8ac5	Fix a latent bug that my spiller patch last week exposed: we were leaving instructions in the virtregfolded map that were deleted. Because they were deleted, newly allocated instructions could end up at the same address, magically finding themselves in the map. The solution is to remove entries from the map when we delete the instructions. llvm-svn: 28041	2006-05-01 22:03:24 +00:00
Chris Lattner	a9f3c7c50a	When promoting a load to a reg-reg copy, where the load was a previous instruction folded with spill code, make sure to remove the load from the virt reg folded map. llvm-svn: 28040	2006-05-01 21:17:10 +00:00
Chris Lattner	befcd1e76d	Remove previous patch, which wasn't quite right. llvm-svn: 28039	2006-05-01 21:16:03 +00:00
Evan Cheng	ef706b77c4	Remove temp. option -spiller-check-liveout, it didn't cause any failure nor performance regressions. llvm-svn: 28029	2006-05-01 08:54:57 +00:00
Evan Cheng	02e72f8f55	Local spiller kills a store if the folded restore is turned into a copy. But this is incorrect if the spilled value live range extends beyond the current BB. It is currently controlled by a temporary option -spiller-check-liveout. llvm-svn: 28024	2006-04-30 08:41:47 +00:00
Chris Lattner	4e783a3101	Mapping of physregs can make it so that the designated and input physregs are the same. In this case, don't emit a noop copy. llvm-svn: 28008	2006-04-28 04:43:18 +00:00
Chris Lattner	85a24e23a2	When we have a two-address instruction where the input cannot be clobbered and is already available, instead of falling back to emitting a load, fall back to emitting a reg-reg copy. This generates significantly better code for some SSE testcases, as SSE has lots of two-address instructions and none of them are read/modify/write. As one example, this change does: pshufd %XMM5, XMMWORD PTR [%ESP + 84], 255 xorps %XMM2, %XMM5 cmpltps %XMM1, %XMM0 - movaps XMMWORD PTR [%ESP + 52], %XMM0 - movapd %XMM6, XMMWORD PTR [%ESP + 52] + movaps %XMM6, %XMM0 cmpltps %XMM6, XMMWORD PTR [%ESP + 68] movapd XMMWORD PTR [%ESP + 52], %XMM6 movaps %XMM6, %XMM0 cmpltps %XMM6, XMMWORD PTR [%ESP + 36] cmpltps %XMM3, %XMM0 - movaps XMMWORD PTR [%ESP + 20], %XMM0 - movapd %XMM7, XMMWORD PTR [%ESP + 20] + movaps %XMM7, %XMM0 cmpltps %XMM7, XMMWORD PTR [%ESP + 4] movapd XMMWORD PTR [%ESP + 20], %XMM7 cmpltps %XMM4, %XMM0 ... which is far better than a store followed by a load! llvm-svn: 28001	2006-04-28 01:46:50 +00:00
Chris Lattner	4f083f08ca	Fix a bug that Evan exposed with some changes he's making, and that was exposed with a fastcc problem (breaking pcompress2 on x86 with -enable-x86-fastcc). When reloading a reused reg, make sure to invalidate the reloaded reg, and check to see if there are any other pending uses of the same register. llvm-svn: 26369	2006-02-25 02:17:31 +00:00
Chris Lattner	b06aa63852	Remove debugging printout :) Add a minor compile time win, no codegen change. llvm-svn: 26368	2006-02-25 02:03:40 +00:00
Chris Lattner	a342126820	Refactor some code from being inline to being out in a new class with methods. This gets rid of two gotos, which is always nice, and also adds some comments. No functionality change, this is just a refactor. llvm-svn: 26367	2006-02-25 01:51:33 +00:00
Jeff Cohen	291586095d	Fix VC++ warning. llvm-svn: 25957	2006-02-04 03:27:39 +00:00
Chris Lattner	08dba8d9f8	Handle another case exposed on X86. llvm-svn: 25949	2006-02-03 23:50:46 +00:00
Chris Lattner	f0ef4b4391	Fix a nasty problem on two-address machines in the following situation: store EAX -> [ss#0] [ss#0] += 1 ... use(EAX) In this case, it is not valid to rewrite this as: store EAX -> [ss#0] EAX += 1 store EAX -> [ss#0] ;;; this would also delete the store above ... use(EAX) ... because EAX is not dead at that point. Keep track of which registers we are allowed to clobber, and which ones we aren't, and don't clobber the ones we're not supposed to. :) This should resolve the issues on X86 last night. llvm-svn: 25948	2006-02-03 23:28:46 +00:00
Chris Lattner	03b42d7724	significantly simplify the VirtRegMap code by pulling the SpillSlotsAvailable and PhysRegsAvailable maps out into a new AvailableSpills struct. No functionality change. This paves the way for a bugfix, coming up next. llvm-svn: 25947	2006-02-03 23:13:58 +00:00
Jeff Cohen	e2f56a56f6	Fix VC++ compilation error caused by using a std::map iterator variable to receive a std::multimap iterator value. For some reason, GCC doesn't have a problem with this. llvm-svn: 25927	2006-02-03 03:48:54 +00:00
Chris Lattner	9ca1b98733	Remove more copies and dead stuff by not clobbering the result reg of a noop copy. llvm-svn: 25926	2006-02-03 03:16:14 +00:00
Chris Lattner	8e4e3207fe	Simplify some code llvm-svn: 25924	2006-02-03 03:06:49 +00:00
Chris Lattner	42c3d5124f	Add code that checks for noop copies, which triggers when either: 1. a target doesn't know how to fold load/stores into copies, or 2. the spiller rewrites the input to a copy to the same register as the dest instead of to the reloaded reg. This will be moved/improved in the near future, but allows elimination of some ancient x86 hacks. This eliminates 92 copies from SMG2000 on X86 and 163 copies from 252.eon. llvm-svn: 25922	2006-02-03 02:02:59 +00:00
Chris Lattner	e5344b1169	Physregs may hold multiple stack slot values at the same time. Keep track of this, and use it to our advantage (bwahahah). This allows us to eliminate another 60 instructions from smg2000 on PPC (probably significantly more on X86). A common old-new diff looks like this: stw r2, 3304(r1) - lwz r2, 3192(r1) stw r2, 3300(r1) - lwz r2, 3192(r1) stw r2, 3296(r1) - lwz r2, 3192(r1) stw r2, 3200(r1) - lwz r2, 3192(r1) stw r2, 3196(r1) - lwz r2, 3192(r1) + or r2, r2, r2 stw r2, 3188(r1) and - lwz r31, 604(r1) - lwz r13, 604(r1) - lwz r14, 604(r1) - lwz r15, 604(r1) - lwz r16, 604(r1) - lwz r30, 604(r1) + or r31, r30, r30 + or r13, r30, r30 + or r14, r30, r30 + or r15, r30, r30 + or r16, r30, r30 + or r30, r30, r30 Removal of the R = R copies is coming next... llvm-svn: 25919	2006-02-03 00:36:31 +00:00
Chris Lattner	935255c984	Fix a deficiency in the spiller that Evan noticed. In particular, consider this code: store [stack slot #0], R10 = add R14, [stack slot #0] The spiller didn't know that the store made the value of [stackslot#0] available in R10 IF the store came from a copy instruction with the store folded into it. This patch teaches VirtRegMap to look at these stores and recognize the values they make available. In one case Evan provided, this code: divsd %XMM0, %XMM1 movsd %XMM1, QWORD PTR [%ESP + 40] 1) movsd QWORD PTR [%ESP + 48], %XMM1 2) movsd %XMM1, QWORD PTR [%ESP + 48] addsd %XMM1, %XMM0 3) movsd QWORD PTR [%ESP + 48], %XMM1 movsd QWORD PTR [%ESP + 4], %XMM0 turns into: divsd %XMM0, %XMM1 movsd %XMM1, QWORD PTR [%ESP + 40] addsd %XMM1, %XMM0 3) movsd QWORD PTR [%ESP + 48], %XMM1 movsd QWORD PTR [%ESP + 4], %XMM0 In this case, instruction #2 was removed because of the value made available by #1, and inst #1 was later deleted because it is now never used before the stack slot is redefined by #3. This occurs here and there in a lot of code with high spilling, on PPC most of the removed loads/stores are LSU-reject-causing loads, which is nice. On X86, things are much better (because it spills more), where we nuke about 1% of the instructions from SMG2000 and several hundred from eon. More improvements to come... llvm-svn: 25917	2006-02-02 23:29:36 +00:00
Chris Lattner	15cb732cd7	Move isLoadFrom/StoreToStackSlot from MRegisterInfo to TargetInstrInfo, a far more logical place. Other methods should also be moved if anyone is interested. :) llvm-svn: 25913	2006-02-02 20:12:32 +00:00
Chris Lattner	aafc339b4e	Add explicit #includes of <iostream> llvm-svn: 25515	2006-01-22 23:41:00 +00:00
Chris Lattner	c3ff71cc3a	Add an assertion, update DefInst even though no one uses it (dangling pointers don't help anyone) llvm-svn: 25081	2006-01-04 06:47:48 +00:00
Chris Lattner	3b848038ab	Fix the LLC regressions on X86 last night. In particular, when undoing previous copy elisions and we discover we need to reload a register, make sure to use the regclass of the original register for the reload, not the class of the current register. This avoids using 16-bit loads to reload 32-bit values. llvm-svn: 23645	2005-10-06 17:19:06 +00:00
Chris Lattner	5afc88fe07	Fix a bug in the local spiller, where we could take code like this: store r12 -> [ss#2] R3 = load [ss#1] use R3 R3 = load [ss#2] R4 = load [ss#1] and turn it into this code: store R12 -> [ss#2] R3 = load [ss#1] use R3 R3 = R12 R4 = R3 <- oops! The problem was that promoting R3 = load[ss#2] to a copy missed the fact that the instruction invalidated R3 at that point. llvm-svn: 23638	2005-10-05 18:30:19 +00:00
Chris Lattner	a9cd99bbc1	Change this code to pass register classes into the stack slot spiller/reloader code. PrologEpilogInserter hasn't been updated yet though, so targets cannot use this info. llvm-svn: 23536	2005-09-30 01:29:00 +00:00
Chris Lattner	59dd979162	Teach the local spiller to turn stack slot loads into register-register copies when possible, avoiding the load (and avoiding the copy if the value is already in the right register). This patch came about when I noticed code like the following being generated: store R17 -> [SS1] ...blah... R4 = load [SS1] This was causing an LSU reject on the G5. This problem was due to the register allocator folding spill code into a reg-reg copy (producing the load), which prevented the spiller from being able to rewrite the load into a copy, despite the fact that the value was already available in a register. In the case above, we now rip out the R4 load and replace it with a R4 = R17 copy. This speeds up several programs on X86 (which spills a lot :) ), e.g. smg2k from 22.39->20.60s, povray from 12.93->12.66s, 168.wupwise from 68.54->53.83s (!), 197.parser from 7.33->6.62s (!), etc. This may have a larger impact in some cases on the G5 (by avoiding LSU rejects), though it probably won't trigger as often (less spilling in general). Targets that implement folding of loads/stores into copies should implement the isLoadFromStackSlot hook to get this. llvm-svn: 23388	2005-09-19 06:56:21 +00:00
Chris Lattner	e7610bc599	Use continue in the use-processing loop to make it clear what the early exits are, simplify logic, and cause things to not be nested as deeply. This also uses MRI->areAliases instead of an explicit loop. No functionality change, just code cleanup. llvm-svn: 23296	2005-09-09 20:29:51 +00:00
Misha Brukman	774e55c446	Remove trailing whitespace llvm-svn: 21420	2005-04-21 22:36:52 +00:00
Chris Lattner	f81edb57b6	Make sure to notice that explicit physregs are used in the function llvm-svn: 21084	2005-04-04 21:35:34 +00:00
Chris Lattner	964297fc32	Update these register allocators to set the PhysRegUsed info in MachineFunction. llvm-svn: 19791	2005-01-23 22:45:13 +00:00
Chris Lattner	2087f3c8e9	Improve compatibility with acc llvm-svn: 19549	2005-01-14 15:54:24 +00:00
Chris Lattner	a361504a90	Clean up the MachineBasicBlock.h file, percolating #includes into this file. Patch contributed by Morten Ofstad llvm-svn: 17251	2004-10-26 15:35:58 +00:00
Chris Lattner	34acee9dbd	This patch fixes the nasty bug that caused 175.vpr to fail for X86 last night. The problem occurred when trying to reload this instruction: MOV32mr %reg2326, 8, %reg2297, 4, %reg2295 The value of reg2326 was available in EBX, so it was reused from there, instead of reloading it into EDX. The value of reg2297 was available in EDX, so it was reused from there, instead of reloading it into EDI. The value of reg2295 was not available, so we tried reloading it into EBX, its assigned register. However, we checked and saw that we already reloaded something into EBX, so we chose what reg2326 was assigned to (EDX) and reloaded into that register instead. Unfortunately EDX had already been used by reg2297, so reloading into EDX clobbered the value used by the reg2326 operand, breaking the program. The fix for this is to check that the newly picked register is ok. In this case we now find that EDX is already used and try using EDI, which succeeds. llvm-svn: 17006	2004-10-15 03:19:31 +00:00
Chris Lattner	2c87b68231	This patch adds and improves debugging output. No functionality changes. llvm-svn: 17005	2004-10-15 03:16:29 +00:00
Chris Lattner	815b635639	Do not repeat the map lookup llvm-svn: 16633	2004-10-01 23:16:43 +00:00
Chris Lattner	c521544a32	When a virtual register is folded into an instruction, keep track of whether it was a use, def, or both. This allows us to be less pessimistic in our analysis of them. In practice, this doesn't make a big difference, but it doesn't hurt either. llvm-svn: 16632	2004-10-01 23:15:36 +00:00
Chris Lattner	38467b8a66	Add a simple little improvement to the local spiller to keep track of stores and delete them if they turn out to be dead. This is a useful little hack that even speeds up some programs. For example, it speeds up Ptrdist/ks from 17.53s to 15.59s, and 188.ammp from 149s to 146s. This also speeds up llc :) llvm-svn: 16630	2004-10-01 19:47:12 +00:00
Chris Lattner	8a5b40154f	Substantially revamp the local spiller, causing it to actually improve the generated code over the simple spiller. The new local spiller generates substantially better code than the simple one in some cases, by reusing values that are loaded out of stack slots and kept available in registers. This primarily helps programs that are spilling a lot, and there is still stuff that can be done to improve it. This patch makes the local spiller the default, as it's only a tiny bit slower than the simple spiller (it increases the runtime of llc by < 1%). Here are some numbers with speedups. Program #reuse old(s) new(s) Speedup Povray: 3452, 16.87 -> 15.93 (5.5%) 177.mesa: 2176, 2.77 -> 2.76 (0%) 179.art: 35, 28.43 -> 28.01 (1.5%) 183.equake: 55, 61.44 -> 61.41 (0%) 188.ammp: 869, 174 -> 149 (15%) 164.gzip: 43, 40.73 -> 40.71 (0%) 175.vpr: 351, 18.54 -> 17.34 (6.5%) 176.gcc: 2471, 5.01 -> 4.92 (1.8%) 181.mcf 42, 79.30 -> 75.20 (5.2%) 186.crafty: 484, 29.73 -> 30.04 (-1%) 197.parser: 251, 10.47 -> 10.67 (-1%) 252.eon: 1501, 1.98 -> 1.75 (12%) 253.perlbm: 1183, 14.83 -> 14.42 (2.8%) 254.gap: 825, 7.46 -> 7.29 (2.3%) 255.vortex: 285, 10.51 -> 10.27 (2.3%) 256.bzip2: 63, 55.70 -> 55.20 (0.9%) 300.twolf: 830, 21.63 -> 22.00 (-1%) PtrDist/ks 14, 32.75 -> 17.53 (46.5%) Olden/tsp 46, 8.71 -> 8.24 (5.4%) Free/distray 70, 1.09 -> 0.99 (9.2%) llvm-svn: 16629	2004-10-01 19:04:51 +00:00
Chris Lattner	db2a0987cc	Use more efficient map operations. Fix a bug that would affect hypothetical targets that supported multiple memory operands. llvm-svn: 16614	2004-09-30 16:35:08 +00:00
Chris Lattner	8d8b8b05bd	There is no need to call MachineInstr::print directly, just send the MI& to an ostream. llvm-svn: 16613	2004-09-30 16:10:45 +00:00
Chris Lattner	168a4380d9	Simplify the logic in the simple spiller and capitalize some variables llvm-svn: 16609	2004-09-30 02:59:33 +00:00
Chris Lattner	1f61bfb971	Switch from defaulting to the 'local' spiller to the 'simple' spiller. The two spillers produce perfectly identical code (at least on povray and eon), but the simple spiller is substantially faster than the local spiller. Once the local spiller is improved, we can switch back. Switching cuts 5.2% off of the llc time for povray (about 1.3s). llvm-svn: 16608	2004-09-30 02:40:06 +00:00
Chris Lattner	d716b5739c	Don't use a densemap for keeping track of which vregs are already loaded, just use a simple vector. This speeds up -spiller=simple from taking 22s to taking .1s on povray (debug build). This change does not modify the generated code. llvm-svn: 16607	2004-09-30 02:33:48 +00:00
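
Several of the spiller commits above (llvm-svn 23388, 23638, 16629) describe one idea: remember which register still holds the value of a stack slot, turn a later reload of that slot into a register copy, and forget the association as soon as the register is overwritten. The following is a minimal Python sketch of that bookkeeping on a toy instruction list. It is not LLVM's VirtRegMap code; the tuple encoding and the name `rewrite_reloads` are assumptions made purely for illustration.

```python
def rewrite_reloads(block):
    """Toy model: replace stack-slot reloads with reg-reg copies when possible.

    Each instruction is a hypothetical 3-tuple:
      ("store", reg, slot)  write reg to a stack slot
      ("load",  reg, slot)  reload a stack slot into reg
      ("def",   reg, None)  any other instruction that overwrites reg
      ("use",   reg, None)  any instruction that only reads reg
    """
    slot_in_reg = {}  # stack slot -> register known to hold that slot's value
    out = []

    def clobber(reg):
        # reg is being overwritten, so it no longer mirrors any stack slot.
        for slot in [s for s, r in slot_in_reg.items() if r == reg]:
            del slot_in_reg[slot]

    for op, reg, arg in block:
        if op == "store":
            slot_in_reg[arg] = reg          # the slot's value is now live in reg
            out.append((op, reg, arg))
        elif op == "load":
            src = slot_in_reg.get(arg)
            if src == reg:
                continue                    # value already in the right register
            clobber(reg)                    # the destination is invalidated either way
            if src is None:
                out.append(("load", reg, arg))
                slot_in_reg[arg] = reg
            else:
                out.append(("copy", reg, src))
        elif op == "def":
            clobber(reg)
            out.append((op, reg, arg))
        else:
            out.append((op, reg, arg))
    return out


# The sequence from llvm-svn 23638: the last reload must stay a load,
# because R3 was overwritten by the copy from R12.
example = [
    ("store", "R12", "ss2"),
    ("load", "R3", "ss1"),
    ("use", "R3", None),
    ("load", "R3", "ss2"),
    ("load", "R4", "ss1"),
]
result = rewrite_reloads(example)
```

The call to `clobber` before a load is rewritten is the step the 23638 message says was missing: promoting `R3 = load [ss#2]` to a copy still invalidates whatever R3 held before.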


71 Commits
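
Two of the commits listed above remove redundant copies: llvm-svn 29792 drops the second instruction of a `MOV R0, R1` / `MOV R1, R0` pair, and llvm-svn 25922 checks for noop copies whose source and destination are the same register. A small sketch of both checks on a toy instruction list could look like the following. The tuple encoding `("mov", dst, src)` and the function name are hypothetical, and the check only looks at directly adjacent instructions, which is a simplification rather than a description of what LLVM does.

```python
def remove_redundant_copies(insts):
    """Toy peephole: drop noop copies and immediate copy-backs.

    A copy is encoded as the hypothetical tuple ("mov", dst, src);
    every other instruction is passed through unchanged.
    """
    out = []
    for inst in insts:
        if inst[0] == "mov":
            _, dst, src = inst
            if dst == src:
                continue  # noop copy: R = R
            if out and out[-1] == ("mov", src, dst):
                continue  # previous instruction already made dst and src equal
        out.append(inst)
    return out


cleaned = remove_redundant_copies([
    ("mov", "R0", "R1"),
    ("mov", "R1", "R0"),   # copy-back, removed
    ("mov", "R2", "R2"),   # noop copy, removed
    ("add", "R2", "R0"),
])
```

Restricting the copy-back check to the instruction emitted immediately before keeps the sketch obviously safe: nothing can have redefined either register in between.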