1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

59 Commits

Author SHA1 Message Date
Simon Pilgrim
575c3c5874 [X86] Add WriteDPPD/WriteDPPS dot product scheduler classes
llvm-svn: 331489
2018-05-03 22:31:19 +00:00
Simon Pilgrim
bbc813836e [X86] Split WriteVecShift/WriteVarVecShift into MMX, XMM and YMM/ZMM scheduler classes
This took a bit of extra work as on Intel targets the old (V)PSLLDrr/(V)PSLLDrm style instructions act differently - I ended up creating WriteVecShiftImm classes for XMM/YMM/ZMM vector shift by immediate and retaining WriteVecShift as the default (used only by MMX) plus WriteVecShiftX/WriteVecShiftY. X86SchedWriteWidths hides most of this thank goodness.

llvm-svn: 331472
2018-05-03 17:56:43 +00:00
Simon Pilgrim
c4c90c5eac [X86] Split WriteVecALU/WritePHAdd into XMM and YMM/ZMM scheduler classes
llvm-svn: 331453
2018-05-03 13:27:10 +00:00
Simon Pilgrim
b7289046cc [X86] Split WriteVecIMul/WriteVecPMULLD/WriteMPSAD/WritePSADBW into XMM and YMM/ZMM scheduler classes
Also retagged VDBPSADBW instructions as SchedWritePSADBW instead of SchedWriteVecIMul which matches the behaviour on SkylakeServer (the only thing that supports it...)

llvm-svn: 331445
2018-05-03 10:31:20 +00:00
Simon Pilgrim
92b8f5874f [X86] Split WriteShuffle/WriteVarShuffle + WriteBlend/WriteVarBlend into XMM and YMM/ZMM scheduler classes
llvm-svn: 331386
2018-05-02 18:48:23 +00:00
Simon Pilgrim
7300ea08cb [X86] Split WriteFMul/WriteFDiv into XMM and YMM/ZMM scheduler classes
llvm-svn: 331293
2018-05-01 18:22:53 +00:00
Simon Pilgrim
a2ad8c4bc8 [X86] Split WriteFRcp/WriteFRsqrt/WriteFSqrt into XMM and YMM/ZMM scheduler classes
llvm-svn: 331290
2018-05-01 18:06:07 +00:00
Simon Pilgrim
ae4df65aff [X86] Split WriteFCmp into XMM and YMM/ZMM scheduler classes
Removes more WriteFCmp InstRW overrides

llvm-svn: 331283
2018-05-01 16:50:16 +00:00
Simon Pilgrim
313feeddfd [X86] Split WriteFAdd into XMM and YMM/ZMM scheduler classes
Removes more WriteFAdd InstRW overrides

llvm-svn: 331276
2018-05-01 16:13:42 +00:00
Simon Pilgrim
b6a15879c3 [X86] Split WriteFShuffle into XMM and YMM/ZMM scheduler classes
Removes more WriteFShuffle InstRW overrides

llvm-svn: 331264
2018-05-01 14:25:01 +00:00
Simon Pilgrim
1fb2ac7d07 [X86] Split WriteVecLogic into XMM and YMM/ZMM scheduler classes
This removes all the WriteVecLogic InstRW overrides.

llvm-svn: 331258
2018-05-01 12:39:17 +00:00
Simon Pilgrim
5efacccddd [X86][Atom] Remove unnecessary x87 load/move instrw overrides.
llvm-svn: 331198
2018-04-30 16:51:13 +00:00
Simon Pilgrim
fd7e512d13 [X86] Split WriteFBlend/WriteFVarBlend/WriteFVarShuffle into XMM and YMM/ZMM scheduler classes
This removes all the WriteFBlend/WriteFVarBlend InstRW overrides - some WriteFVarShuffle remain to be fixed.

llvm-svn: 331065
2018-04-27 18:19:48 +00:00
Simon Pilgrim
bd309c497b [X86] Split WriteFHadd into XMM and YMM/ZMM scheduler classes
This removes all the HADD/HSUB PS/PD InstRW overrides.

llvm-svn: 331054
2018-04-27 16:11:57 +00:00
Simon Pilgrim
a73756d82d [X86][AVX] Split WriteFLogic into XMM and YMM/ZMM scheduler classes
This removes all the AND/ANDN/OR/XOR PS/PD InstRW overrides.

llvm-svn: 331051
2018-04-27 15:50:33 +00:00
Simon Pilgrim
c22b24167f [X86] Replace some system instruction instregex single matches with instrs entry. NFCI.
llvm-svn: 331034
2018-04-27 13:32:42 +00:00
Simon Pilgrim
8429d5f513 [X86] Split WriteFMA into XMM, Scalar and YMM/ZMM scheduler classes
This removes all the FMA InstRW overrides.

If we ever get PR36924, then we can remove many of these declarations from models.

llvm-svn: 330820
2018-04-25 13:07:58 +00:00
Simon Pilgrim
c3baeaefd6 [X86] Split off PHMINPOSUW to their own schedule class
This also fixes Jaguar's schedule which was treating it as the WriteVecIMul default. 

llvm-svn: 330756
2018-04-24 18:49:25 +00:00
Simon Pilgrim
9326647813 [X86][F16C] Add WriteCvtF2FSt scheduling class
Fixes the classification of VCVTPS2PHmr/VCVTPS2PHYmr which were tagged as WriteCvtF2FLd_WriteRMW (PR36887)

llvm-svn: 330737
2018-04-24 16:43:07 +00:00
Simon Pilgrim
3603cba798 [X86] Add vector element insertion/extraction scheduler classes
Split off pinsr/pextr and extractps instructions.

(Mostly) fixes PR36887.

Note: It might be worth adding a WriteFInsertLd class as well in the future.

Differential Revision: https://reviews.llvm.org/D45929

llvm-svn: 330714
2018-04-24 13:21:41 +00:00
Simon Pilgrim
a2c00eb146 [X86] Remove unnecessary MMX reg-mem InstRW scheduler overrides.
llvm-svn: 330581
2018-04-23 11:57:15 +00:00
Simon Pilgrim
0bd59db959 [X86][Atom] Remove unnecessary scalar/vector load/move instrw overrides.
llvm-svn: 330548
2018-04-22 16:49:35 +00:00
Craig Topper
f944131ace [X86] Add SchedWrites for LDMXCSR/STMXCSR.
llvm-svn: 330517
2018-04-21 18:07:36 +00:00
Simon Pilgrim
64c82de81a [X86] Strip unnecessary MMX instruction instrw overrides from scheduler models.
llvm-svn: 330503
2018-04-21 12:15:42 +00:00
Simon Pilgrim
e52ad608b0 [X86] Add WriteFSign/WriteFLogic scheduler classes
Split the fp and integer vector logical instruction scheduler classes - older CPUs especially often handled these on different pipes.

This unearthed a couple of things that are also handled in this patch:

(1) We were tagging avx512 fp logic ops as WriteFAdd, probably because of the lack of WriteFLogic
(2) SandyBridge had integer logic ops only using Port5, when afaict they can use Ports015.
(3) Cleaned up x86 FCHS/FABS scheduling as they are typically treated as fp logic ops.

Differential Revision: https://reviews.llvm.org/D45629

llvm-svn: 330480
2018-04-20 21:16:05 +00:00
Craig Topper
8cc3b6fad0 [X86] Add separate scheduling class for PSADBW instruction.
llvm-svn: 330204
2018-04-17 19:35:19 +00:00
Simon Pilgrim
8ae32b4f07 [X86] Add FP comparison scheduler classes
Split VCMP/VMAX/VMIN instructions off to WriteFCmp and VCOMIS instructions off to WriteFCom instead of assuming they match WriteFAdd

Differential Revision: https://reviews.llvm.org/D45656

llvm-svn: 330179
2018-04-17 07:22:44 +00:00
Simon Pilgrim
de39a5c881 Remove comment reference to itineraries. NFCI.
llvm-svn: 330025
2018-04-13 14:42:48 +00:00
Simon Pilgrim
c09c832a8e [X86][Atom] Convert Atom scheduler model to SchedRW (PR32431)
Atom is the only x86 target that still uses schedule itineraries, if we can remove this then we can begin the work on removing x86 itineraries. I've also found that it will help with PR36550.

I've focussed on matching the existing model as closely as possible (relying on the schedule tests), PR36895 indicated a lot of these were incorrect but we can just as easily fix these after this patch as before. Hopefully we can get llvm-exegesis to help here,

There are a few instructions that rely on itinerary scheduling (mainly push/pop/return) of multiple resource stages, but I don't think any of these are show stoppers.

There are also a few codegen changes that seem related to the post-ra scheduler acting a little differently, I haven't tracked these down but they don't seem critical.

NOTE: I don't have access to any Atom hardware, so this hasn't been tested in the wild.

Differential Revision: https://reviews.llvm.org/D45486

llvm-svn: 329837
2018-04-11 18:23:01 +00:00
Craig Topper
f9c44a5cf4 [X86] Merge itineraries for CLC, CMC, and STC.
These are very simple flag setting instructions that appear to only be a single uop. They're unlikely to need this separation.

llvm-svn: 329414
2018-04-06 16:16:43 +00:00
Craig Topper
317ad91d5e [X86] Make the multiply and divide itineraries more consistent.
Sometimes we used the same itinerary for MEM and REG forms, but that seems inconsistent with our usual usage.

We also used the MUL8 itinerary for MULX32/64 which was also weird.

The test changes are because we were using IIC_IMUL32_RR and IIC_IMUL64_RR instead of IIC_IMUL32_REG/IIC_IMUL64_REG for the 32 and 64 bit multiplies that produce double width result.

llvm-svn: 327866
2018-03-19 16:38:33 +00:00
Craig Topper
bbff2cbc52 [X6] Remove two unused InstrItinClass
llvm-svn: 327819
2018-03-19 02:07:32 +00:00
Simon Pilgrim
25f8f6268c [X86] Add RDMSR/WRMSR, RDPMC + RDTSC/RDTSCP schedule tests
Add missing RDTSCP itinerary

llvm-svn: 320581
2017-12-13 14:22:04 +00:00
Simon Pilgrim
513998f9e2 [X86][X87] Tag FCMOV instruction scheduler classes
llvm-svn: 319804
2017-12-05 18:01:26 +00:00
Simon Pilgrim
f02308ab75 [X86][X87] Tag FABS/FCHS/FSQRT/FSIN/FCOS x87 instruction scheduler classes
Atom's FABS/FCHS/FSQRT latencies taken from Agner.

Note: I just added FSIN and FCOS to the existing IIC_FSINCOS itinerary, which is actually a more costly instruction.
llvm-svn: 319175
2017-11-28 15:03:42 +00:00
Simon Pilgrim
745278cc76 [X86][MMX] Add IIC_MMX_MOVMSK instruction itinerary class
llvm-svn: 318999
2017-11-26 17:56:07 +00:00
Simon Pilgrim
b385b3e7e3 [X86][SSE] Added missing PACKSS/PACKUS intrinsic schedules
Improves atom scheduler test coverage (to make it easier to upgrade them for PR32431).

Checked on Agner that these actually match the UNPACK schedules, but better to include a separate class

llvm-svn: 309701
2017-08-01 16:47:48 +00:00
Matthias Braun
6958800987 TableGen: Check scheduling models for completeness
TableGen checks at compiletime that for scheduling models with
"CompleteModel = 1" one of the following holds:

- Is marked with the hasNoSchedulingInfo flag
- The instruction is a subclass of Sched
- There are InstRW definitions in the scheduling model

Typical steps necessary to complete a model:

- Ensure all pseudo instructions that are expanded before machine
  scheduling (usually everything handled with EmitYYY() functions in
  XXXTargetLowering).
- If a CPU does not support some instructions mark the corresponding
  resource unsupported: "WriteRes<WriteXXX, []> { let Unsupported = 1; }".
- Add missing scheduling information.

Differential Revision: http://reviews.llvm.org/D17747

llvm-svn: 262384
2016-03-01 20:03:21 +00:00
Andrea Di Biagio
c61458a223 [X86][SchedModel] SSE reciprocal square root instruction latencies.
The SSE rsqrt instruction (a fast reciprocal square root estimate) was
grouped in the same scheduling IIC_SSE_SQRT* class as the accurate (but very
slow) SSE sqrt instruction. For code which uses rsqrt (possibly with
newton-raphson iterations) this poor scheduling was affecting performances.

This patch splits off the rsqrt instruction from the sqrt instruction scheduling
classes and creates new IIC_SSE_RSQER* classes with latency values based on
Agner's table.

Differential Revision: http://reviews.llvm.org/D5370

Patch by Simon Pilgrim.

llvm-svn: 218517
2014-09-26 12:56:44 +00:00
Sanjay Patel
2f0f025b2b Move Post RA Scheduling flag bit into SchedMachineModel
Refactoring; no functional changes intended

    Removed PostRAScheduler bits from subtargets (X86, ARM).
    Added PostRAScheduler bit to MCSchedModel class.
    This bit is set by a CPU's scheduling model (if it exists).
    Removed enablePostRAScheduler() function from TargetSubtargetInfo and subclasses.
    Fixed the existing enablePostMachineScheduler() method to use the MCSchedModel (was just returning false!).
    Added methods to TargetSubtargetInfo to allow overrides for AntiDepBreakMode, CriticalPathRCs, and OptLevel for PostRAScheduling.
    Added enablePostRAScheduler() function to PostRAScheduler class which queries the subtarget for the above values.
    Preserved existing scheduler behavior for ARM, MIPS, PPC, and X86: 
       a. ARM overrides the CPU's postRA settings by enabling postRA for any non-Thumb or Thumb2 subtarget. 
       b. MIPS overrides the CPU's postRA settings by enabling postRA for everything. 
       c. PPC overrides the CPU's postRA settings by enabling postRA for everything. 
       d. X86 is the only target that actually has postRA specified via sched model info.

Differential Revision: http://reviews.llvm.org/D4217

llvm-svn: 213101
2014-07-15 22:39:58 +00:00
Hal Finkel
c52e65b830 Move late partial-unrolling thresholds into the processor definitions
The old method used by X86TTI to determine partial-unrolling thresholds was
messy (because it worked by testing target features), and also would not
correctly identify the target CPU if certain target features were disabled.
After some discussions on IRC with Chandler et al., it was decided that the
processor scheduling models were the right containers for this information
(because it is often tied to special uop dispatch-buffer sizes).

This does represent a small functionality change:
 - For generic x86-64 (which uses the SB model and, thus, will get some
   unrolling).
 - For AMD cores (because they still currently use the SB scheduling model)
 - For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump
   the default threshold to 50; we're working on a test case for this).
Otherwise, nothing has changed for any other targets. The logic, however, has
been moved into BasicTTI, so other targets may now also opt-in to this
functionality simply by setting LoopMicroOpBufferSize in their processor
model definitions.

llvm-svn: 208289
2014-05-08 09:14:44 +00:00
Preston Gurd
0411803c14 Adds support for Atom Silvermont (SLM) - -march=slm
Implements Instruction scheduler latencies for Silvermont,
using latencies from the Intel Silvermont Optimization Guide.

Auto detects SLM.

Turns on post RA scheduler when generating code for SLM.

llvm-svn: 190717
2013-09-13 19:23:28 +00:00
Andrew Trick
31eeff56c7 Update machine models. Specify buffer sizes for OOO processors.
llvm-svn: 184033
2013-06-15 04:50:02 +00:00
Andrew Trick
5d13fe97ed Machine Model: Add MicroOpBufferSize and resource BufferSize.
Replace the ill-defined MinLatency and ILPWindow properties with
with straightforward buffer sizes:
MCSchedMode::MicroOpBufferSize
MCProcResourceDesc::BufferSize

These can be used to more precisely model instruction execution if desired.

Disabled some misched tests temporarily. They'll be reenabled in a few commits.

llvm-svn: 184032
2013-06-15 04:49:57 +00:00
Preston Gurd
ed3b81e028 Corrected Atom latencies for SSE SQRT instructions.
llvm-svn: 181346
2013-05-07 19:57:34 +00:00
Jakob Stoklund Olesen
0ac9ff5688 Remove IIC_DEFAULT from X86Schedule.td
All the instructions tagged with IIC_DEFAULT had nothing in common, and
we already have a NoItineraries class to represent untagged
instructions.

llvm-svn: 177937
2013-03-25 23:12:41 +00:00
Andrew Trick
c15e94c204 MIsched: add an ILP window property to machine model.
This was an experimental option, but needs to be defined
per-target. e.g. PPC A2 needs to aggressively hide latency.

I converted some in-order scheduling tests to A2. Hal is working on
more test cases.

llvm-svn: 171946
2013-01-09 03:36:49 +00:00
Andrew Trick
b9c8074dcd I'm introducing a new machine model to simultaneously allow simple
subtarget CPU descriptions and support new features of
MachineScheduler.

MachineModel has three categories of data:
1) Basic properties for coarse grained instruction cost model.
2) Scheduler Read/Write resources for simple per-opcode and operand cost model (TBD).
3) Instruction itineraties for detailed per-cycle reservation tables.

These will all live side-by-side. Any subtarget can use any
combination of them. Instruction itineraries will not change in the
near term. In the long run, I expect them to only be relevant for
in-order VLIW machines that have complex contraints and require a
precise scheduling/bundling model. Once itineraries are only actively
used by VLIW-ish targets, they could be replaced by something more
appropriate for those targets.

This tablegen backend rewrite sets things up for introducing
MachineModel type #2: per opcode/operand cost model.

llvm-svn: 159891
2012-07-07 04:00:00 +00:00
Andrew Trick
6d6fa07808 X86 itinerary properties.
llvm-svn: 157981
2012-06-05 03:44:46 +00:00
Andrew Trick
8b333df134 whitespace
llvm-svn: 157976
2012-06-05 03:44:29 +00:00