mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00
Commit Graph

31 Commits

Author SHA1 Message Date
Craig Topper
0534d071b7 Reorder includes to match coding standards. Fix an issue or two exposed by that.
llvm-svn: 152978
2012-03-17 07:33:42 +00:00
Jim Grosbach
cb853fbdc9 ARM implement TargetInstrInfo::getNoopForMachoTarget()
Without this hook, functions w/ a completely empty body (including no
epilogue) will cause an MCEmitter assertion failure.

For example,
define internal fastcc void @empty_function() {
  unreachable
}

rdar://10947471

llvm-svn: 151673
2012-02-28 23:53:30 +00:00
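For context, a minimal sketch of what such an override can look like, assuming the MCInst/MCOperand API of that era; the header paths and the choice of "mov r0, r0" as the NOP are illustrative, not the exact code from this commit:

#include "ARMInstrInfo.h"                 // assumed in-tree location
#include "MCTargetDesc/ARMBaseInfo.h"     // assumed location of ARMCC::AL
#include "llvm/MC/MCInst.h"
using namespace llvm;

// Sketch only: hand the MC emitter a "mov r0, r0" to use as the Mach-O NOP.
// Operand order follows the usual ARM MOVr pattern (Rd, Rm, predicate imm,
// predicate reg, optional cc_out reg); the real commit may differ in detail.
void ARMInstrInfo::getNoopForMachoTarget(MCInst &NopInst) const {
  NopInst.setOpcode(ARM::MOVr);
  NopInst.addOperand(MCOperand::CreateReg(ARM::R0));    // Rd
  NopInst.addOperand(MCOperand::CreateReg(ARM::R0));    // Rm
  NopInst.addOperand(MCOperand::CreateImm(ARMCC::AL));  // predicate
  NopInst.addOperand(MCOperand::CreateReg(0));          // predicate register
  NopInst.addOperand(MCOperand::CreateReg(0));          // no 's' bit
}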
Jia Liu
b077b6085d Emacs-tag and some comment fixes for all of ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, and XCore.
llvm-svn: 150878
2012-02-18 12:03:15 +00:00
Evan Cheng
fc78767730 Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first few cycles (4? on Cortex-A8)
   can cause an additional pipeline stall. So it's frequently better to simply
   codegen vmul + vadd.
2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction
   to stall for 4 cycles. We need to schedule them apart.
3. A vmla followed by a vmla is a special case. Obviously, issuing back-to-back
   RAW vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel has simply avoided codegen'ing fp vmla / vmls. This works well
enough, but it isn't the optimal solution. This patch attempts to make it
possible to use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.

llvm-svn: 120960
2010-12-05 22:04:16 +00:00
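A standalone toy model (not the LLVM code) of the expansion decision sketched in item E: a vmla/vmls that reads the result of an immediately preceding vmla/vmls is the back-to-back RAW case from point 3, so it is the one worth expanding into vmul + vadd/vsub. Opcode names and register numbering here are illustrative.

#include <cassert>
#include <string>
#include <vector>

struct MI {
  std::string opcode;      // e.g. "vmla", "vmul", "vadd"
  int def;                 // virtual register it defines
  std::vector<int> uses;   // virtual registers it reads
};

static bool isFpMLxOpcode(const std::string &op) {
  return op == "vmla" || op == "vmls";
}

// True if the instruction at `idx` is a multiply-accumulate that reads a value
// produced by an immediately preceding multiply-accumulate -- the back-to-back
// RAW hazard described in point 3 above.
static bool shouldExpandMLx(const std::vector<MI> &block, size_t idx) {
  if (!isFpMLxOpcode(block[idx].opcode) || idx == 0)
    return false;
  const MI &prev = block[idx - 1];
  if (!isFpMLxOpcode(prev.opcode))
    return false;
  for (int r : block[idx].uses)
    if (r == prev.def)
      return true;           // expand this one into vmul + vadd/vsub instead
  return false;
}

int main() {
  // vmla %2, %0, %1 (acc in %2) followed by vmla %3, %2, %1 (acc in %3):
  // the second reads %2, so it is the one to expand.
  std::vector<MI> block = {{"vmla", 2, {2, 0, 1}}, {"vmla", 3, {3, 2, 1}}};
  assert(!shouldExpandMLx(block, 0));
  assert(shouldExpandMLx(block, 1));
  return 0;
}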
Evan Cheng
67db408634 Two sets of changes. Sorry they are intermingled.
1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to
   "optimize for latency". Call instructions don't have the right latency and
   this is more likely to introduce spills.
2. Fix if-converter cost function. For ARM, it should use instruction latencies,
   not # of micro-ops, since a multi-latency instruction is completely executed
   even when its predicate is false. Also, some instructions will be "slower"
   when they are predicated due to the register def becoming an implicit input.
   rdar://8598427

llvm-svn: 118135
2010-11-03 00:45:17 +00:00
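A standalone sketch of the distinction drawn in point 2 (not the LLVM cost function): once a block is predicated, each instruction occupies the pipeline for its full latency even when its predicate is false, so a cost based on micro-op counts undercharges multi-cycle instructions. The instruction names and cycle numbers are illustrative.

#include <cstdio>
#include <vector>

struct Inst {
  const char *name;
  unsigned numMicroOps;        // decode/issue slots
  unsigned latency;            // cycles the instruction occupies when executed
  unsigned predicationPenalty; // e.g. the register def becoming an implicit input
};

int main() {
  // A hypothetical block that if-conversion would predicate.
  std::vector<Inst> block = {
      {"ldm r0, {r1-r4}", 4, 6, 0},  // still multi-cycle when predicated false
      {"add r5, r1, r2",  1, 1, 1},  // "slower" when predicated
      {"str r5, [r6]",    1, 2, 0},
  };

  unsigned uopCost = 0, latencyCost = 0;
  for (const Inst &I : block) {
    uopCost += I.numMicroOps;
    latencyCost += I.latency + I.predicationPenalty;
  }
  // The micro-op metric makes the block look cheaper than what a
  // predicated-false execution actually pays.
  std::printf("micro-op cost: %u, latency-based cost: %u\n",
              uopCost, latencyCost);
  return 0;
}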
Owen Anderson
95581657a4 Thread the determination of branch prediction hit rates back through the if-conversion heuristic APIs. For now,
stick with a constant estimate of 90% (branch predictors are good!), but we might find that we want to provide
more nuanced estimates in the future.

llvm-svn: 115364
2010-10-01 22:45:50 +00:00
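A toy version (not the actual heuristic API) of how a branch-prediction hit rate enters the comparison: if-converted code always pays for both sides, while branchy code pays for one side plus an occasional misprediction, and with a 90% hit rate that occasional penalty is small. All cycle counts below are made up.

#include <cstdio>

int main() {
  const double hitRate = 0.90;           // the constant estimate from the commit
  const double mispredictPenalty = 13.0; // illustrative pipeline-flush cost
  const double branchOverhead = 1.0;     // the branch instruction itself
  const double tCycles = 4.0, fCycles = 3.0;
  const double probTaken = 0.5;          // assume either side is equally likely

  // Expected cost of keeping the branch vs. predicating both sides.
  double branchy = branchOverhead + probTaken * tCycles +
                   (1.0 - probTaken) * fCycles +
                   (1.0 - hitRate) * mispredictPenalty;
  double ifConverted = tCycles + fCycles;

  std::printf("branchy: %.2f cycles, if-converted: %.2f cycles\n",
              branchy, ifConverted);
  std::printf("if-convert? %s\n", ifConverted <= branchy ? "yes" : "no");
  return 0;
}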
Owen Anderson
d6aa3da08e Provide an option to restore old-style if-conversion heuristics for Thumb2.
llvm-svn: 115339
2010-10-01 20:28:06 +00:00
Owen Anderson
c0e1200323 Part one of switching to a more sane heuristic for determining if-conversion profitability.
Rather than having arbitrary cutoffs, actually try to cost model the conversion.

For now, the constants are tuned to more or less match our existing behavior, but these will be
changed to reflect realistic values as this work proceeds.

llvm-svn: 114973
2010-09-28 18:32:13 +00:00
Evan Cheng
c9cb37516d Teach if-converter to be more careful with predicating instructions that would
take multiple cycles to decode.
For the current if-converter clients (actually only ARM), the instructions that
are predicated on false are not nops. They would still take machine cycles to
decode. Micro-coded instructions such as LDM / STM can potentially take multiple
cycles to decode. If-converter should take this into account rather than
treating them as simple, non-micro-coded instructions.

llvm-svn: 113570
2010-09-10 01:29:16 +00:00
Jakob Stoklund Olesen
938e41c1fa Replace copyRegToReg with copyPhysReg for ARM.
llvm-svn: 108078
2010-07-11 06:33:54 +00:00
Evan Cheng
346aecdb8b Change if-conversion block size limit checks to add some flexibility.
llvm-svn: 106901
2010-06-25 22:42:03 +00:00
Evan Cheng
a1ebf91a39 Tail merging pass shall not break up IT blocks. rdar://8115404
llvm-svn: 106517
2010-06-22 01:18:16 +00:00
Evan Cheng
b5fadc47e0 Allow ARM if-converter to be run after post allocation scheduling.
- This fixed a number of bugs in if-converter, tail merging, and post-allocation
  scheduler. If-converter now runs branch folding / tail merging first to
  maximize if-conversion opportunities.
- Also changed the t2IT instruction slightly. It now defines the ITSTATE
  register which is read by instructions in the IT block.
- Added a Thumb2-specific hazard recognizer to ensure the scheduler doesn't
  change the instruction ordering in the IT block (since the IT mask has been
  finalized). It also ensures that no other instructions can be scheduled
  between instructions in the IT block.

This is not yet enabled.

llvm-svn: 106344
2010-06-18 23:09:54 +00:00
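A standalone toy (not the Thumb2 hazard recognizer itself) of the constraint described in the last bullet: once a t2IT opens an IT block, only the block's own predicated instructions may issue until the block is complete; anything else is reported as a hazard. The opcode names are illustrative.

#include <cassert>
#include <string>

struct MI {
  std::string opcode;
  bool predicated;   // true if the instruction belongs to the IT block
};

// Returns true if `candidate` may issue now. `pendingITSlots` counts how many
// instructions of the currently open IT block are still expected.
static bool canIssueNow(const MI &candidate, unsigned &pendingITSlots) {
  if (pendingITSlots == 0)
    return true;                 // not inside an IT block
  if (!candidate.predicated)
    return false;                // would be scheduled into the IT block: hazard
  --pendingITSlots;              // fills the next slot of the IT block
  return true;
}

int main() {
  // A t2IT whose mask covers two instructions opens a 2-slot block.
  unsigned pending = 2;
  MI add = {"t2ADDri", true};
  MI ldr = {"t2LDRi12", false};
  MI sub = {"t2SUBri", true};
  assert(canIssueNow(add, pending));   // slot 1 of the IT block
  assert(!canIssueNow(ldr, pending));  // unpredicated instruction: rejected
  assert(canIssueNow(sub, pending));   // slot 2, block complete
  assert(pending == 0);
  return 0;
}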
Evan Cheng
501c37c9ca Allow target to place 2-address pass inserted copies in better spots. Thumb2 will use this to try to avoid breaking up IT blocks.
llvm-svn: 105745
2010-06-09 19:26:01 +00:00
Dan Gohman
497e752655 Add a DebugLoc argument to TargetInstrInfo::copyRegToReg, so that it
doesn't have to guess.

llvm-svn: 103194
2010-05-06 20:33:48 +00:00
Evan Cheng
80f3051bb7 Add argument TargetRegisterInfo to loadRegFromStackSlot and storeRegToStackSlot.
llvm-svn: 103193
2010-05-06 19:06:44 +00:00
Dan Gohman
f9654e9258 Remove the target hook TargetInstrInfo::BlockHasNoFallThrough in favor of
MachineBasicBlock::canFallThrough(), which is target-independent and more
thorough.

llvm-svn: 90634
2009-12-05 00:44:40 +00:00
Evan Cheng
069209cf6b Refactor code.
llvm-svn: 86423
2009-11-08 00:15:23 +00:00
Jim Grosbach
7152d0423b 80-column cleanup of file header comments
llvm-svn: 86408
2009-11-07 22:00:39 +00:00
Evan Cheng
899d8cb6a0 Refactor code. Fix a potential missing check. Teach isIdentical() about tLDRpci_pic.
llvm-svn: 86330
2009-11-07 04:04:34 +00:00
Evan Cheng
8eaaffb9da - Add TargetInstrInfo::isIdentical(). It's similar to MachineInstr::isIdentical
except it doesn't care if the definitions' virtual registers differ. This is
  used by machine LICM and other MI passes to perform CSE.
- Teach Thumb2InstrInfo::isIdentical() to check whether two t2LDRpci_pic are
  identical. Since pc-relative constantpool entries are always different, this
  requires it to check whether the values can actually be the same.

llvm-svn: 86328
2009-11-07 03:52:02 +00:00
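A standalone sketch (not the Thumb2InstrInfo code) of the check described above: two pc-relative loads always reference distinct constantpool entries, so comparing entry indices would never find a match; for CSE the comparison has to look through to the underlying value. The data layout here is a simplification.

#include <cassert>
#include <string>

struct CPEntry {
  int index;           // constantpool slot; always unique per pc-relative load
  std::string value;   // what the entry refers to, e.g. a global's name
};

struct Load {
  std::string opcode;
  CPEntry entry;
};

static bool isIdenticalForCSE(const Load &a, const Load &b) {
  if (a.opcode != b.opcode)
    return false;
  // Indices always differ for pc-relative entries; compare the payloads.
  return a.entry.value == b.entry.value;
}

int main() {
  Load l1 = {"t2LDRpci_pic", {0, "@some_global"}};
  Load l2 = {"t2LDRpci_pic", {1, "@some_global"}};  // new entry, same global
  assert(l1.entry.index != l2.entry.index);
  assert(isIdenticalForCSE(l1, l2));                // still identical for CSE
  return 0;
}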
Jim Grosbach
71c5411651 80-columns
llvm-svn: 86310
2009-11-07 00:13:30 +00:00
Evan Cheng
6e3e66375a - Add pseudo instructions tLDRpci_pic and t2LDRpci_pic which do a pc-relative
  load of a GV from the constantpool and then add pc. This makes the code
  sequence rematerializable so it can be hoisted by machine licm.
- Add a late pass to break these pseudo instructions into a number of real
  instructions. Also move the code in the Thumb2 IT pass that breaks up
  t2MOVi32imm to this pass. This is done before post-regalloc scheduling to
  allow the scheduler to properly schedule these instructions. It also allows
  them to be if-converted and shrunk by later passes.

llvm-svn: 86304
2009-11-06 23:52:48 +00:00
Evan Cheng
b740190d2e - More refactoring. This gets rid of all of the getOpcode calls.
- This change also makes it possible to switch between ARM / Thumb on a
  per-function basis.
- Fixed the thumb2 routine which expands reg + arbitrary immediate. It was
  using ARM so_imm logic.
- Use movw and movt to do reg + imm when profitable.
- Other code clean ups and minor optimizations.

llvm-svn: 77300
2009-07-28 05:48:47 +00:00
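A rough standalone illustration (not the LLVM code) of the movw/movt bullet: any 32-bit constant can be materialized in two instructions (movw for the low half, movt for the high half), while chaining adds of ARM so_imm operands (an 8-bit value rotated by an even amount) may need a chunk per byte. The chunk counting below is approximate.

#include <cstdint>
#include <cstdio>

// Roughly how many 8-bit rotated chunks an add-with-so_imm sequence would
// need to cover the constant (an approximation of so_imm decomposition).
static int soImmChunks(uint32_t imm) {
  int chunks = 0;
  while (imm) {
    unsigned lead = __builtin_ctz(imm) & ~1u;  // even rotation amount
    imm &= ~(0xFFu << lead);                   // peel off one 8-bit chunk
    ++chunks;
  }
  return chunks;
}

int main() {
  uint32_t imm = 0x12345678;
  int viaSoImm = soImmChunks(imm);  // 4 add/sub-style chunks for this value
  int viaMovwMovt = 2;              // movw low16 + movt high16
  std::printf("so_imm chunks: %d, movw/movt instructions: %d\n",
              viaSoImm, viaMovwMovt);
  return 0;
}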
Evan Cheng
674c4d47b9 Use t2LDRi12 and t2STRi12 to load / store to / from stack frames. Eliminate more getOpcode calls.
llvm-svn: 77181
2009-07-27 03:14:20 +00:00
David Goodwin
85bcdffa4f Correctly handle the Thumb-2 imm8 addrmode. Specialize frame index elimination more exactly for Thumb-2 to get better code gen.
llvm-svn: 76919
2009-07-24 00:16:18 +00:00
David Goodwin
f3c839e0b9 Fix frame index elimination to correctly handle Thumb-2 addressing modes that don't allow negative offsets. During frame elimination, convert a *i12 opcode to a *i8 opcode when necessary due to a negative offset.
llvm-svn: 76883
2009-07-23 17:06:46 +00:00
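A standalone sketch (not the LLVM code) of the addressing-mode choice behind these two frame-index commits, under the usual Thumb-2 encoding assumptions: the *i12 forms take an unsigned 12-bit offset, the *i8 forms take a small offset that may be negative, and anything else needs the offset materialized separately.

#include <cassert>
#include <string>

static std::string pickLoadOpcode(int offset) {
  if (offset >= 0 && offset <= 4095)
    return "t2LDRi12";               // unsigned imm12 form
  if (offset >= -255 && offset <= 255)
    return "t2LDRi8";                // imm8 form, allows negative offsets
  return "materialize offset";       // out of range for both immediate forms
}

int main() {
  assert(pickLoadOpcode(16) == "t2LDRi12");
  assert(pickLoadOpcode(-8) == "t2LDRi8");
  assert(pickLoadOpcode(-1024) == "materialize offset");
  return 0;
}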
Anton Korobeynikov
dc39f4fff8 Emit cross regclass register moves for thumb2.
Minor code duplication cleanup.

llvm-svn: 76124
2009-07-16 23:26:06 +00:00
David Goodwin
49fbd8d6b7 Use common code for both ARM and Thumb-2 instruction and register info.
llvm-svn: 75067
2009-07-08 23:10:31 +00:00
David Goodwin
5bdef4b3f7 Checkpoint Thumb2 Instr info work. Generalized base code so that it can be shared between ARM and Thumb2. Not yet activated because register information must be generalized first.
llvm-svn: 75010
2009-07-08 16:09:28 +00:00
David Goodwin
81075a5d30 Checkpoint refactoring of ThumbInstrInfo and ThumbRegisterInfo into Thumb1InstrInfo, Thumb2InstrInfo, Thumb1RegisterInfo and Thumb2RegisterInfo. Move methods from ARMInstrInfo to ARMBaseInstrInfo to prepare for sharing with Thumb2.
llvm-svn: 74731
2009-07-02 22:18:33 +00:00