mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00
Commit Graph

213 Commits

Author SHA1 Message Date
Eric Christopher
bc6a51d63f Rewrite stack callee-saved spills and restores to use push/pop instructions.
Remove movePastCSLoadStoreOps and associated code for simple pointer
increments. Update routines that depended upon other opcodes for save/restore.

Adjust all testcases accordingly.
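
As a rough sketch of the intended effect (registers and offsets here are
illustrative, not taken from the patch), a spill/restore sequence that used
explicit stores plus pointer adjustment:

sub  sp, sp, #8
str  r4, [sp, #0]    @ spill callee-saved r4
str  r5, [sp, #4]    @ spill callee-saved r5
...
ldr  r5, [sp, #4]
ldr  r4, [sp, #0]
add  sp, sp, #8

=>

push {r4, r5}        @ one push covers the spills and the sp adjustment
...
pop  {r4, r5}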

llvm-svn: 119725
2010-11-18 19:40:05 +00:00
Evan Cheng
6b2be51f7e Silence compiler warnings.
llvm-svn: 119610
2010-11-18 01:43:23 +00:00
Evan Cheng
ce610bd6b3 Remove ARM isel hacks that fold large immediates into a pair of add, sub, and,
and xor. The 32-bit move immediates can be hoisted out of loops by machine
LICM but the isel hacks were preventing them.

Instead, let peephole optimization pass recognize registers that are defined by
immediates and the ARM target hook will fold the immediates in.

Other changes include 1) do not fold and / xor into cmp to isel TST / TEQ
instructions if there are multiple uses. This happens when the 'and' is live
out; machine sink would have sunk the computation, and that ends up pessimizing
code. The peephole pass would recognize situations where the 'and' can be
toggled to define CPSR and eliminate the comparison anyway.

2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking
important optimizations.

rdar://8663787, rdar://8241368
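
A rough sketch of the first point (constants and registers are illustrative,
not from the patch): a large immediate that used to be split into a pair of
adds inside the loop can now stay as a movw/movt pair and be hoisted by
machine LICM:

@ with the old isel hack: the immediate is split into two adds in the loop
loop:
add  r0, r0, #0xAB0000
add  r0, r0, #0xCD
subs r2, r2, #1
bne  loop

=>

@ now: the immediate is a movw/movt pair, which machine LICM hoists
movw r1, #0xCD
movt r1, #0xAB
loop:
add  r0, r0, r1
subs r2, r2, #1
bne  loop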

llvm-svn: 119548
2010-11-17 20:13:28 +00:00
Evan Cheng
eed919e2fb Simplify code that toggle optional operand to ARM::CPSR.
llvm-svn: 119484
2010-11-17 08:06:50 +00:00
Bill Wendling
b450d320ec Encode the multi-load/store instructions with their respective modes ('ia',
'db', 'ib', 'da') instead of having that mode as a separate field in the
instruction. It's more convenient for the asm parser and much more readable for
humans.
<rdar://problem/8654088>
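
Illustratively (standard ARM spelling, not text from the patch), the mode is
now part of the opcode itself rather than a separate operand:

ldmia r0!, {r1, r2, r3}   @ increment after
ldmdb r0!, {r1, r2, r3}   @ decrement before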

llvm-svn: 119310
2010-11-16 01:16:36 +00:00
Evan Cheng
4afa3a6b1f Code clean up. The peephole pass should be the one updating the instruction
iterator, not TII->OptimizeCompareInstr.

llvm-svn: 119186
2010-11-15 21:20:45 +00:00
Eric Christopher
e7f27cf66a Revert this temporarily.
llvm-svn: 118827
2010-11-11 19:47:02 +00:00
Eric Christopher
beb7a50acb Change the prologue and epilogue to use push/pop for the low ARM registers.
llvm-svn: 118823
2010-11-11 19:26:03 +00:00
Evan Cheng
67db408634 Two sets of changes. Sorry they are intermingled.
1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to
   "optimize for latency". Call instructions don't have the right latency and
   this is more likely to introduce spills.
2. Fix if-converter cost function. For ARM, it should use instruction latencies,
   not # of micro-ops, since multi-latency instructions are completely executed
   even when the predicate is false. Also, some instructions will be "slower"
   when they are predicated due to the register def becoming an implicit input.
   rdar://8598427

llvm-svn: 118135
2010-11-03 00:45:17 +00:00
Bill Wendling
4340c9449a When we look at instructions to convert to setting the 's' flag, we need to look
at more than those which define CPSR. You can have this situation:

(1)  subs  ...
(2)  sub   r6, r5, r4
(3)  movge ...
(4)  cmp   r6, 0
(5)  movge ...

We cannot convert (2) to "subs" because (3) is using the CPSR set by
(1). There's an analogous situation here:

(1)  sub   r1, r2, r3
(2)  sub   r4, r5, r6
(3)  cmp   r4, ...
(4)  movge ...
(5)  cmp   r1, ...
(6)  movge ...

We cannot convert (1) to "subs" because of the intervening use of CPSR.

llvm-svn: 117950
2010-11-01 20:41:43 +00:00
Evan Cheng
7695213793 Fix fpscr <-> GPR latency info.
llvm-svn: 117737
2010-10-29 23:16:55 +00:00
Evan Cheng
392d2cbdcc Avoid overly aggressive latency scheduling. If two nodes share an
operand and one of them has a single use that is a live-out copy, favor the
one that is live out. Otherwise it will be difficult to eliminate the copy
if the instruction is a loop induction variable update. e.g.

BB:
sub r1, r3, #1
str r0, [r2, r3]
mov r3, r1
cmp
bne BB

=>

BB:
str r0, [r2, r3]
sub r3, r3, #1
cmp
bne BB

This fixed the recent 256.bzip2 regression.

llvm-svn: 117675
2010-10-29 18:09:28 +00:00
Evan Cheng
bc4588c439 Re-commit 117518 and 117519 now that ARM MC test failures are out of the way.
llvm-svn: 117531
2010-10-28 06:47:08 +00:00
Evan Cheng
fdc80a0316 Revert 117518 and 117519 for now. They changed scheduling and cause MC tests to fail. Ugh.
llvm-svn: 117520
2010-10-28 02:00:25 +00:00
Evan Cheng
5c358e02ea - Assign load / store instructions with shifter-op address modes the right
  itinerary classes.
- For now, loads with the [r, r] addressing mode are treated the same as the
  [r, r lsl/lsr/asr #] variants. ARMBaseInstrInfo::getOperandLatency() should
  identify the former case and reduce the output latency by 1.
- Also identify the [r, r << 2] case. This special form of shifter addressing
  mode is "free".

llvm-svn: 117519
2010-10-28 01:49:06 +00:00
Jim Grosbach
86ecfda983 Refactor ARM STR/STRB instruction patterns into STR{B}i12 and STR{B}rs, like
the LDR instructions have. This makes the literal/register forms of the
instructions explicit and allows us to assign scheduling itineraries
appropriately. rdar://8477752
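
Roughly (operands illustrative), the two forms that become distinct
instructions:

str r0, [r1, #16]          @ STRi12: literal (immediate) offset
str r0, [r1, r2, lsl #2]   @ STRrs: register/shifter offset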

llvm-svn: 117505
2010-10-27 23:12:14 +00:00
Jim Grosbach
09eab01a37 The immediate operand of an LDRi12 instruction doesn't need the addrmode2
encoding tricks. Handle the 'imm doesn't fit in the insn' case.

llvm-svn: 117454
2010-10-27 16:50:31 +00:00
Jim Grosbach
5ccda16fe2 LDRi12 machine instructions handle negative offset operands normally (simple
integer values), not with the addrmode2 encoding.
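
E.g. (offset illustrative), for

ldr r0, [r1, #-8]

the offset machine operand is now simply -8 rather than an addrmode2-encoded
value.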

llvm-svn: 117429
2010-10-27 01:19:41 +00:00
Jim Grosbach
4d4caf1384 Split ARM::LDRB into LDRBi12 and LDRBrs. Adjust accordingly. Continuing on
rdar://8477752.

llvm-svn: 117419
2010-10-27 00:19:44 +00:00
Jim Grosbach
30f6744f05 First part of refactoring ARM addrmode2 (load/store) instructions to be more
explicit about the operands. Split out the different variants into separate
instructions. This gives us the ability to, among other things, assign
different scheduling itineraries to the variants. rdar://8477752.

llvm-svn: 117409
2010-10-26 22:37:02 +00:00
Evan Cheng
324e678bb7 Use instruction itinerary to determine what instructions are 'cheap'.
llvm-svn: 117348
2010-10-26 02:08:50 +00:00
Chandler Carruth
7dd652736f Move the remaining attribute macros to systematic names based on the attribute
name and prefixed with 'LLVM_'.

llvm-svn: 117203
2010-10-23 08:40:19 +00:00
Evan Cheng
132906a2d3 Latency between CPSR def and branch is zero.
llvm-svn: 117192
2010-10-23 02:04:38 +00:00
Evan Cheng
1c8dafd12a Re-enable register pressure aware machine licm with fixes. Hoist() may have
erased the instruction during LICM so UpdateRegPressureAfter() should not
reference it afterwards.

llvm-svn: 116845
2010-10-19 18:58:51 +00:00
Daniel Dunbar
6ff550c84d Revert r116781 "- Add a hook for target to determine whether an instruction def
is", which breaks some nightly tests.

llvm-svn: 116816
2010-10-19 17:14:24 +00:00
Evan Cheng
9c3f6f486e - Add a hook for the target to determine whether an instruction def is
  "long latency" enough to hoist even if it may increase spilling. Reloading
  a value from a spill slot is often cheaper than performing an expensive
  computation in the loop. For X86, that means machine LICM will hoist
  SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON
  instructions.
- Enable register pressure aware machine LICM by default.
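
A rough sketch of the idea behind the hook (instructions and the spill slot
are illustrative):

@ before: the expensive operation is recomputed on every iteration
loop:
vsqrt.f64 d0, d1
...
bne loop

=>

@ after: hoisted even though it forces a spill; the reload is cheaper
vsqrt.f64 d0, d1
vstr d0, [sp, #8]     @ spilled due to register pressure
loop:
vldr d0, [sp, #8]     @ reloading beats recomputing the sqrt
...
bne loop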

llvm-svn: 116781
2010-10-19 00:55:07 +00:00
Bill Wendling
3b3d9aaa86 Don't recompute MachineRegisterInfo in the Optimize* method.
llvm-svn: 116750
2010-10-18 21:22:31 +00:00
Bill Wendling
ba58fb0f42 Check to make sure that the iterator isn't at the beginning of the basic block
before decrementing. <rdar://problem/8529919>

llvm-svn: 116126
2010-10-09 00:03:48 +00:00
Evan Cheng
7fa1e3a474 Code refactoring.
llvm-svn: 116002
2010-10-07 23:12:15 +00:00
Evan Cheng
1ce29574c2 Model operand cycles of vldm / vstm; also fixes scheduling itineraries of vldr / vstr, etc.
llvm-svn: 115898
2010-10-07 01:50:48 +00:00
Jim Grosbach
de2bd8cd3f Clean up MOVi32imm and t2MOVi32imm pseudo instruction definitions.
llvm-svn: 115853
2010-10-06 22:01:26 +00:00
Evan Cheng
6fbb6dea7c - Add TargetInstrInfo::getOperandLatency() to compute operand latencies. This
  allows targets to correctly compute latency for cases where static scheduling
  itineraries aren't sufficient, e.g. variable_ops instructions such as
  ARM::ldm.
  This also allows targets without scheduling itineraries to compute operand
  latencies, e.g. X86 can return (approximated) latencies for high-latency
  instructions such as division.
- Compute operand latencies for those defined by load multiple instructions,
  e.g. ldm, and those used by store multiple instructions, e.g. stm.
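
For example (registers illustrative), the defs of a load-multiple become
available on different cycles, so a single static latency per instruction
isn't enough:

ldm r0, {r1, r2, r3}   @ r1, r2, r3 are written on successive cycles
add r4, r3, #1         @ latency to this use depends on which def it reads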

llvm-svn: 115755
2010-10-06 06:27:31 +00:00
Michael J. Spencer
d26ae30ed9 Fix MSVC 2010 build.
llvm-svn: 115594
2010-10-05 06:00:43 +00:00
Michael J. Spencer
12a13def14 Clean up whitespace.
llvm-svn: 115593
2010-10-05 06:00:33 +00:00
Owen Anderson
95581657a4 Thread the determination of branch prediction hit rates back through the if-conversion heuristic APIs. For now,
stick with a constant estimate of 90% (branch predictors are good!), but we might find that we want to provide
more nuanced estimates in the future.

llvm-svn: 115364
2010-10-01 22:45:50 +00:00
Owen Anderson
e93e24ee5d Make the spelling of the flags for old-style if-conversion heuristics consistent between ARM and Thumb2.
llvm-svn: 115341
2010-10-01 20:33:47 +00:00
Owen Anderson
918c558636 Temporarily add a flag to make it easier to compare the new-style ARM if
conversion heuristics to the old-style ones.

llvm-svn: 115239
2010-09-30 23:48:38 +00:00
Gabor Greif
e1de402213 Improve heuristics to find the 'and' corresponding to 'tst' to also catch opportunities on Thumb2.
Added some doxygen along the way.

llvm-svn: 115033
2010-09-29 10:12:08 +00:00
Owen Anderson
c34e1296b8 Add a subtarget hook for reporting the misprediction penalty. Use this to provide more precise
cost modeling for if-conversion.  Now if only we had a way to estimate the misprediction probability.

Adjust CodeGen/ARM/ifcvt10.ll.  The pipeline on Cortex-A8 is long enough that it is still profitable
to predicate an ldm, but the shorter pipeline on Cortex-A9 makes it unprofitable.

llvm-svn: 114995
2010-09-28 21:57:50 +00:00
Owen Anderson
c0e1200323 Part one of switching to a more sane heuristic for determining if-conversion profitability.
Rather than having arbitrary cutoffs, actually try to cost model the conversion.

For now, the constants are tuned to more or less match our existing behavior, but these will be
changed to reflect realistic values as this work proceeds.

llvm-svn: 114973
2010-09-28 18:32:13 +00:00
Eric Christopher
bff0f1805f 80-col fixups.
llvm-svn: 114943
2010-09-28 04:18:29 +00:00
Evan Cheng
57ed0f6439 Fix r114632. Return if the only terminator is an unconditional branch after the redundant ones are deleted.
llvm-svn: 114688
2010-09-23 19:42:03 +00:00
Evan Cheng
ef7b4a9bd4 If there are multiple unconditional branches terminating a block, eliminate all
but the first one. Those will never be executed. There was logic to do this
but it was faulty.
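
E.g. (labels illustrative):

b .LBB0_3
b .LBB0_5    @ never executed; only the first unconditional branch is kept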

llvm-svn: 114632
2010-09-23 06:54:40 +00:00
Evan Cheng
1d58965067 OptimizeCompareInstr should avoid iterating past the beginning of the MBB when the 'and' instruction is after the comparison.
llvm-svn: 114506
2010-09-21 23:49:07 +00:00
Gabor Greif
324a43436f Fix a buglet when the TST instruction directly uses the AND result.
I am unable to write a test for this case; help is solicited, though...
What I did was to tickle the code in the debugger and verify that we do the right thing.

llvm-svn: 114430
2010-09-21 13:30:57 +00:00
Gabor Greif
99c07b1d95 Move the search for the appropriate AND instruction
into OptimizeCompareInstr.
This necessitates the passing of CmpValue around,
so widen the virtual functions to accommodate.

No functionality changes.

llvm-svn: 114428
2010-09-21 12:01:15 +00:00
Chris Lattner
2edbad8a3d convert targets to the new MF.getMachineMemOperand interface.
llvm-svn: 114391
2010-09-21 04:39:43 +00:00
Jakob Stoklund Olesen
1a38bba871 Remember VLDMQ.
llvm-svn: 114026
2010-09-15 21:40:11 +00:00
Jakob Stoklund Olesen
3896e52f08 Add missing break.
llvm-svn: 114025
2010-09-15 21:40:09 +00:00
Jakob Stoklund Olesen
36aeeb67c3 Recognize VST1q64Pseudo and VSTMQ as stack slot stores.
Recognize VLD1q64Pseudo as a stack slot load.

Reject these if they are loading or storing a subregister. The API (and
VirtRegRewriter) doesn't know how to deal with that.

llvm-svn: 113985
2010-09-15 17:27:09 +00:00