mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2024-11-25 20:23:11 +01:00
fc78767730
difficult on current ARM implementations for a few reasons.

1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-A8) cycles can cause an additional pipeline stall. So it's frequently better to simply codegen vmul + vadd.
2. A vmla followed by a vmul, vadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart.
3. A vmla followed by a vmla is a special case. Obviously, issuing back-to-back RAW vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster.

Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute both an fmul and an fmla.
C. Add additional isel checks for vmla, avoiding cases where vmla is feeding into fp instructions (except for the #3 exceptional case).
D. Add an ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.

llvm-svn: 120960
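
The hazard in reason 2 can be illustrated with a small stand-alone sketch. This is not the hazard recognizer added by the patch. The FpOp enum, the helper names, and the one-instruction-per-slot issue model are all assumptions made for illustration, and the 4-cycle window is simply the figure quoted in the message above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy opcode set; these are not LLVM's real opcode enumerators.
enum class FpOp { VMUL, VADD, VSUB, VMLA, VMLS, Other };

// Stall window quoted in the commit message; assumed here, not measured.
constexpr std::size_t kVmlaStallCycles = 4;

// True for the multiply-accumulate / multiply-subtract forms.
inline bool isMulAcc(FpOp Op) { return Op == FpOp::VMLA || Op == FpOp::VMLS; }

// True for the ops that reason 2 says stall behind a vmla / vmls.
inline bool stallsBehindMulAcc(FpOp Op) {
  return Op == FpOp::VMUL || Op == FpOp::VADD || Op == FpOp::VSUB;
}

// Counts the vmul / vadd / vsub entries that have a vmla / vmls in one of the
// preceding kVmlaStallCycles slots. Simplifying assumption: one instruction
// issues per slot and register dependencies are ignored.
inline std::size_t countVmlaHazards(const std::vector<FpOp> &Ops) {
  std::size_t Hazards = 0;
  for (std::size_t I = 0; I < Ops.size(); ++I) {
    if (!stallsBehindMulAcc(Ops[I]))
      continue;
    // First slot of the look-back window, clamped at the start of the list.
    std::size_t Start = I >= kVmlaStallCycles ? I - kVmlaStallCycles : 0;
    assert(Start <= I && "window cannot start after the current slot");
    for (std::size_t J = Start; J < I; ++J) {
      if (isMulAcc(Ops[J])) {
        ++Hazards;
        break;
      }
    }
  }
  return Hazards;
}
```

Under these assumptions, a vmul issued directly after a vmla counts as one hazard, while the same vmul separated from the vmla by four unrelated instructions counts as none, which is the "schedule them apart" goal stated in reason 2.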
79 lines
3.0 KiB
C++
//===- Thumb2InstrInfo.h - Thumb-2 Instruction Information ------*- C++ -*-===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file contains the Thumb-2 implementation of the TargetInstrInfo class.
//
//===----------------------------------------------------------------------===//

#ifndef THUMB2INSTRUCTIONINFO_H
#define THUMB2INSTRUCTIONINFO_H

#include "llvm/Target/TargetInstrInfo.h"
#include "ARM.h"
#include "ARMInstrInfo.h"
#include "Thumb2RegisterInfo.h"

namespace llvm {
  class ARMSubtarget;
  class ScheduleHazardRecognizer;

class Thumb2InstrInfo : public ARMBaseInstrInfo {
  Thumb2RegisterInfo RI;
public:
  explicit Thumb2InstrInfo(const ARMSubtarget &STI);

  // Return the non-pre/post incrementing version of 'Opc'. Return 0
  // if there is no such opcode.
  unsigned getUnindexedOpcode(unsigned Opc) const;

  void ReplaceTailWithBranchTo(MachineBasicBlock::iterator Tail,
                               MachineBasicBlock *NewDest) const;

  bool isLegalToSplitMBBAt(MachineBasicBlock &MBB,
                           MachineBasicBlock::iterator MBBI) const;

  void copyPhysReg(MachineBasicBlock &MBB,
                   MachineBasicBlock::iterator I, DebugLoc DL,
                   unsigned DestReg, unsigned SrcReg,
                   bool KillSrc) const;

  void storeRegToStackSlot(MachineBasicBlock &MBB,
                           MachineBasicBlock::iterator MBBI,
                           unsigned SrcReg, bool isKill, int FrameIndex,
                           const TargetRegisterClass *RC,
                           const TargetRegisterInfo *TRI) const;

  void loadRegFromStackSlot(MachineBasicBlock &MBB,
                            MachineBasicBlock::iterator MBBI,
                            unsigned DestReg, int FrameIndex,
                            const TargetRegisterClass *RC,
                            const TargetRegisterInfo *TRI) const;

  /// scheduleTwoAddrSource - Schedule the copy / re-mat of the source of the
  /// two-address instruction inserted by the two-address pass.
  void scheduleTwoAddrSource(MachineInstr *SrcMI, MachineInstr *UseMI,
                             const TargetRegisterInfo &TRI) const;

  /// getRegisterInfo - TargetInstrInfo is a superset of MRegister info. As
  /// such, whenever a client has an instance of instruction info, it should
  /// always be able to get register info as well (through this method).
  ///
  const Thumb2RegisterInfo &getRegisterInfo() const { return RI; }
};

/// getITInstrPredicate - Valid only in Thumb2 mode. This function is identical
/// to llvm::getInstrPredicate except it returns AL for conditional branch
/// instructions which are "predicated", but are not in IT blocks.
ARMCC::CondCodes getITInstrPredicate(const MachineInstr *MI, unsigned &PredReg);

}

#endif // THUMB2INSTRUCTIONINFO_H