mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00
Commit Graph

213 Commits

Author SHA1 Message Date
Eric Christopher
bc6a51d63f Rewrite stack callee-saved spills and restores to use push/pop instructions.
Remove movePastCSLoadStoreOps and associated code for simple pointer
increments. Update routines that depended upon other opcodes for save/restore.

Adjust all testcases accordingly.
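
As a rough sketch of the intended effect (registers and offsets here are
illustrative, not taken from the patch), a spill/restore sequence that used
explicit stores plus pointer adjustment:

sub  sp, sp, #8
str  r4, [sp, #0]    @ spill callee-saved r4
str  r5, [sp, #4]    @ spill callee-saved r5
...
ldr  r5, [sp, #4]
ldr  r4, [sp, #0]
add  sp, sp, #8

=>

push {r4, r5}        @ one push covers the spills and the sp adjustment
...
pop  {r4, r5}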

llvm-svn: 119725
2010-11-18 19:40:05 +00:00
Evan Cheng
6b2be51f7e Silence compiler warnings.
llvm-svn: 119610
2010-11-18 01:43:23 +00:00
Evan Cheng
ce610bd6b3 Remove ARM isel hacks that fold large immediates into a pair of add, sub, and,
and xor. The 32-bit move immediates can be hoisted out of loops by machine
LICM but the isel hacks were preventing them.

Instead, let peephole optimization pass recognize registers that are defined by
immediates and the ARM target hook will fold the immediates in.

Other changes include 1) do not fold and / xor into cmp to isel TST / TEQ
instructions if there are multiple uses. This happens when the 'and' is live
out; machine sink would have sunk the computation, and that ends up pessimizing
code. The peephole pass would recognize situations where the 'and' can be
toggled to define CPSR and eliminate the comparison anyway.

2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking
important optimizations.

rdar://8663787, rdar://8241368
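
A rough sketch of the first point (constants and registers are illustrative,
not from the patch): a large immediate that used to be split into a pair of
adds inside the loop can now stay as a movw/movt pair and be hoisted by
machine LICM:

@ with the old isel hack: the immediate is split into two adds in the loop
loop:
add  r0, r0, #0xAB0000
add  r0, r0, #0xCD
subs r2, r2, #1
bne  loop

=>

@ now: the immediate is a movw/movt pair, which machine LICM hoists
movw r1, #0xCD
movt r1, #0xAB
loop:
add  r0, r0, r1
subs r2, r2, #1
bne  loop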

llvm-svn: 119548
2010-11-17 20:13:28 +00:00
Evan Cheng
eed919e2fb Simplify code that toggle optional operand to ARM::CPSR.
llvm-svn: 119484
2010-11-17 08:06:50 +00:00
Bill Wendling
b450d320ec Encode the multi-load/store instructions with their respective modes ('ia',
'db', 'ib', 'da') instead of having that mode as a separate field in the
instruction. It's more convenient for the asm parser and much more readable for
humans.
<rdar://problem/8654088>
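
Illustratively (standard ARM spelling, not text from the patch), the mode is
now part of the opcode itself rather than a separate operand:

ldmia r0!, {r1, r2, r3}   @ increment after
ldmdb r0!, {r1, r2, r3}   @ decrement before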

llvm-svn: 119310
2010-11-16 01:16:36 +00:00
Evan Cheng
4afa3a6b1f Code clean up. The peephole pass should be the one updating the instruction
iterator, not TII->OptimizeCompareInstr.

llvm-svn: 119186
2010-11-15 21:20:45 +00:00
Eric Christopher
e7f27cf66a Revert this temporarily.
llvm-svn: 118827
2010-11-11 19:47:02 +00:00
Eric Christopher
beb7a50acb Change the prologue and epilogue to use push/pop for the low ARM registers.
llvm-svn: 118823
2010-11-11 19:26:03 +00:00
Evan Cheng
67db408634 Two sets of changes. Sorry they are intermingled.
1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to
   "optimize for latency". Call instructions don't have the right latency and
   this is more likely to introduce spills.
2. Fix if-converter cost function. For ARM, it should use instruction latencies,
   not # of micro-ops, since multi-latency instructions are completely executed
   even when the predicate is false. Also, some instructions will be "slower"
   when they are predicated due to the register def becoming an implicit input.
   rdar://8598427

llvm-svn: 118135
2010-11-03 00:45:17 +00:00
Bill Wendling
4340c9449a When we look at instructions to convert to setting the 's' flag, we need to look
at more than those which define CPSR. You can have this situation:

(1)  subs  ...
(2)  sub   r6, r5, r4
(3)  movge ...
(4)  cmp   r6, 0
(5)  movge ...

We cannot convert (2) to "subs" because (3) is using the CPSR set by
(1). There's an analogous situation here:

(1)  sub   r1, r2, r3
(2)  sub   r4, r5, r6
(3)  cmp   r4, ...
(4)  movge ...
(5)  cmp   r1, ...
(6)  movge ...

We cannot convert (1) to "subs" because of the intervening use of CPSR.

llvm-svn: 117950
2010-11-01 20:41:43 +00:00
Evan Cheng
7695213793 Fix fpscr <-> GPR latency info.
llvm-svn: 117737
2010-10-29 23:16:55 +00:00
Evan Cheng
392d2cbdcc Avoid overly aggressive latency scheduling. If two nodes share an
operand and one of them has a single use that is a live-out copy, favor the
one that is live out. Otherwise it will be difficult to eliminate the copy
if the instruction is a loop induction variable update. e.g.

BB:
sub r1, r3, #1
str r0, [r2, r3]
mov r3, r1
cmp
bne BB

=>

BB:
str r0, [r2, r3]
sub r3, r3, #1
cmp
bne BB

This fixed the recent 256.bzip2 regression.

llvm-svn: 117675
2010-10-29 18:09:28 +00:00
Evan Cheng
bc4588c439 Re-commit 117518 and 117519 now that ARM MC test failures are out of the way.
llvm-svn: 117531
2010-10-28 06:47:08 +00:00
Evan Cheng
fdc80a0316 Revert 117518 and 117519 for now. They changed scheduling and cause MC tests to fail. Ugh.
llvm-svn: 117520
2010-10-28 02:00:25 +00:00
Evan Cheng
5c358e02ea - Assign load / store instructions with shifter-op address modes the right
  itinerary classes.
- For now, loads with the [r, r] addressing mode are treated the same as the
  [r, r lsl/lsr/asr #] variants. ARMBaseInstrInfo::getOperandLatency() should
  identify the former case and reduce the output latency by 1.
- Also identify the [r, r << 2] case. This special form of shifter addressing
  mode is "free".

llvm-svn: 117519
2010-10-28 01:49:06 +00:00
Jim Grosbach
86ecfda983 Refactor ARM STR/STRB instruction patterns into STR{B}i12 and STR{B}rs, like
the LDR instructions have. This makes the literal/register forms of the
instructions explicit and allows us to assign scheduling itineraries
appropriately. rdar://8477752
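
Roughly (operands illustrative), the two forms that become distinct
instructions:

str r0, [r1, #16]          @ STRi12: literal (immediate) offset
str r0, [r1, r2, lsl #2]   @ STRrs: register/shifter offset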

llvm-svn: 117505
2010-10-27 23:12:14 +00:00
Jim Grosbach
09eab01a37 The immediate operand of an LDRi12 instruction doesn't need the addrmode2
encoding tricks. Handle the 'imm doesn't fit in the insn' case.

llvm-svn: 117454
2010-10-27 16:50:31 +00:00
Jim Grosbach
5ccda16fe2 LDRi12 machine instructions handle negative offset operands normally (simple
integer values), not with the addrmode2 encoding.
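
E.g. (offset illustrative), for

ldr r0, [r1, #-8]

the offset machine operand is now simply -8 rather than an addrmode2-encoded
value.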

llvm-svn: 117429
2010-10-27 01:19:41 +00:00
Jim Grosbach
4d4caf1384 Split ARM::LDRB into LDRBi12 and LDRBrs. Adjust accordingly. Continuing on
rdar://8477752.

llvm-svn: 117419
2010-10-27 00:19:44 +00:00
Jim Grosbach
30f6744f05 First part of refactoring ARM addrmode2 (load/store) instructions to be more
explicit about the operands. Split out the different variants into separate
instructions. This gives us the ability to, among other things, assign
different scheduling itineraries to the variants. rdar://8477752.

llvm-svn: 117409
2010-10-26 22:37:02 +00:00
Evan Cheng
324e678bb7 Use instruction itinerary to determine what instructions are 'cheap'.
llvm-svn: 117348
2010-10-26 02:08:50 +00:00
Chandler Carruth
7dd652736f Move the remaining attribute macros to systematic names based on the attribute
name and prefixed with 'LLVM_'.

llvm-svn: 117203
2010-10-23 08:40:19 +00:00
Evan Cheng
132906a2d3 Latency between CPSR def and branch is zero.
llvm-svn: 117192
2010-10-23 02:04:38 +00:00
Evan Cheng
1c8dafd12a Re-enable register pressure aware machine licm with fixes. Hoist() may have
erased the instruction during LICM so UpdateRegPressureAfter() should not
reference it afterwards.

llvm-svn: 116845
2010-10-19 18:58:51 +00:00
Daniel Dunbar
6ff550c84d Revert r116781 "- Add a hook for target to determine whether an instruction def
is", which breaks some nightly tests.

llvm-svn: 116816
2010-10-19 17:14:24 +00:00
Evan Cheng
9c3f6f486e - Add a hook for the target to determine whether an instruction def is
  "long latency" enough to hoist even if it may increase spilling. Reloading
  a value from a spill slot is often cheaper than performing an expensive
  computation in the loop. For X86, that means machine LICM will hoist
  SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON
  instructions.
- Enable register pressure aware machine LICM by default.
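
A rough sketch of the idea behind the hook (instructions and the spill slot
are illustrative):

@ before: the expensive operation is recomputed on every iteration
loop:
vsqrt.f64 d0, d1
...
bne loop

=>

@ after: hoisted even though it forces a spill; the reload is cheaper
vsqrt.f64 d0, d1
vstr d0, [sp, #8]     @ spilled due to register pressure
loop:
vldr d0, [sp, #8]     @ reloading beats recomputing the sqrt
...
bne loop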

llvm-svn: 116781
2010-10-19 00:55:07 +00:00
Bill Wendling
3b3d9aaa86 Don't recompute MachineRegisterInfo in the Optimize* method.
llvm-svn: 116750
2010-10-18 21:22:31 +00:00
Bill Wendling
ba58fb0f42 Check to make sure that the iterator isn't at the beginning of the basic block
before decrementing. <rdar://problem/8529919>

llvm-svn: 116126
2010-10-09 00:03:48 +00:00
Evan Cheng
7fa1e3a474 Code refactoring.
llvm-svn: 116002
2010-10-07 23:12:15 +00:00
Evan Cheng
1ce29574c2 Model operand cycles of vldm / vstm; also fixes scheduling itineraries of vldr / vstr, etc.
llvm-svn: 115898
2010-10-07 01:50:48 +00:00
Jim Grosbach
de2bd8cd3f Clean up MOVi32imm and t2MOVi32imm pseudo instruction definitions.
llvm-svn: 115853
2010-10-06 22:01:26 +00:00
Evan Cheng
6fbb6dea7c - Add TargetInstrInfo::getOperandLatency() to compute operand latencies. This
  allows targets to correctly compute latency for cases where static scheduling
  itineraries aren't sufficient, e.g. variable_ops instructions such as
  ARM::ldm.
  This also allows targets without scheduling itineraries to compute operand
  latencies, e.g. X86 can return (approximated) latencies for high-latency
  instructions such as division.
- Compute operand latencies for those defined by load multiple instructions,
  e.g. ldm, and those used by store multiple instructions, e.g. stm.
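
For example (registers illustrative), the defs of a load-multiple become
available on different cycles, so a single static latency per instruction
isn't enough:

ldm r0, {r1, r2, r3}   @ r1, r2, r3 are written on successive cycles
add r4, r3, #1         @ latency to this use depends on which def it reads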

llvm-svn: 115755
2010-10-06 06:27:31 +00:00
Michael J. Spencer
d26ae30ed9 Fix MSVC 2010 build.
llvm-svn: 115594
2010-10-05 06:00:43 +00:00
Michael J. Spencer
12a13def14 Clean up whitespace.
llvm-svn: 115593
2010-10-05 06:00:33 +00:00
Owen Anderson
95581657a4 Thread the determination of branch prediction hit rates back through the if-conversion heuristic APIs. For now,
stick with a constant estimate of 90% (branch predictors are good!), but we might find that we want to provide
more nuanced estimates in the future.

llvm-svn: 115364
2010-10-01 22:45:50 +00:00
Owen Anderson
e93e24ee5d Make the spelling of the flags for old-style if-conversion heuristics consistent between ARM and Thumb2.
llvm-svn: 115341
2010-10-01 20:33:47 +00:00
Owen Anderson
918c558636 Temporarily add a flag to make it easier to compare the new-style ARM if
conversion heuristics to the old-style ones.

llvm-svn: 115239
2010-09-30 23:48:38 +00:00
Gabor Greif
e1de402213 Improve heuristics to find the 'and' corresponding to 'tst' to also catch opportunities on Thumb2.
Added some doxygen along the way.

llvm-svn: 115033
2010-09-29 10:12:08 +00:00
Owen Anderson
c34e1296b8 Add a subtarget hook for reporting the misprediction penalty. Use this to provide more precise
cost modeling for if-conversion.  Now if only we had a way to estimate the misprediction probability.

Adjust CodeGen/ARM/ifcvt10.ll.  The pipeline on Cortex-A8 is long enough that it is still profitable
to predicate an ldm, but the shorter pipeline on Cortex-A9 makes it unprofitable.

llvm-svn: 114995
2010-09-28 21:57:50 +00:00
Owen Anderson
c0e1200323 Part one of switching to a more sane heuristic for determining if-conversion profitability.
Rather than having arbitrary cutoffs, actually try to cost model the conversion.

For now, the constants are tuned to more or less match our existing behavior, but these will be
changed to reflect realistic values as this work proceeds.

llvm-svn: 114973
2010-09-28 18:32:13 +00:00
Eric Christopher
bff0f1805f 80-col fixups.
llvm-svn: 114943
2010-09-28 04:18:29 +00:00
Evan Cheng
57ed0f6439 Fix r114632. Return if the only terminator is an unconditional branch after the redundant ones are deleted.
llvm-svn: 114688
2010-09-23 19:42:03 +00:00
Evan Cheng
ef7b4a9bd4 If there are multiple unconditional branches terminating a block, eliminate all
but the first one. Those will never be executed. There was logic to do this
but it was faulty.
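
E.g. (labels illustrative):

b .LBB0_3
b .LBB0_5    @ never executed; only the first unconditional branch is kept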

llvm-svn: 114632
2010-09-23 06:54:40 +00:00
Evan Cheng
1d58965067 OptimizeCompareInstr should avoid iterating past the beginning of the MBB when the 'and' instruction is after the comparison.
llvm-svn: 114506
2010-09-21 23:49:07 +00:00
Gabor Greif
324a43436f Fix a buglet when the TST instruction directly uses the AND result.
I am unable to write a test for this case; help is solicited, though...
What I did was to tickle the code in the debugger and verify that we do the right thing.

llvm-svn: 114430
2010-09-21 13:30:57 +00:00
Gabor Greif
99c07b1d95 Move the search for the appropriate AND instruction
into OptimizeCompareInstr.
This necessitates the passing of CmpValue around,
so widen the virtual functions to accommodate.

No functionality changes.

llvm-svn: 114428
2010-09-21 12:01:15 +00:00
Chris Lattner
2edbad8a3d convert targets to the new MF.getMachineMemOperand interface.
llvm-svn: 114391
2010-09-21 04:39:43 +00:00
Jakob Stoklund Olesen
1a38bba871 Remember VLDMQ.
llvm-svn: 114026
2010-09-15 21:40:11 +00:00
Jakob Stoklund Olesen
3896e52f08 Add missing break.
llvm-svn: 114025
2010-09-15 21:40:09 +00:00
Jakob Stoklund Olesen
36aeeb67c3 Recognize VST1q64Pseudo and VSTMQ as stack slot stores.
Recognize VLD1q64Pseudo as a stack slot load.

Reject these if they are loading or storing a subregister. The API (and
VirtRegRewriter) doesn't know how to deal with that.

llvm-svn: 113985
2010-09-15 17:27:09 +00:00