1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

72 Commits

Author SHA1 Message Date
Florian Hahn
6c9eb841b2 Recommit [MachineCombiner] Update instruction depths incrementally for large BBs.
This version of the patch fixes an off-by-one error causing PR34596. We
do not need to use std::next(BlockIter) when calling updateDepths, as
BlockIter already points to the next element.

Original commit message:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and it's relevant successors/predecessors.

> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 313751
2017-09-20 11:54:37 +00:00
Hans Wennborg
634369c34f Revert r312719 "[MachineCombiner] Update instruction depths incrementally for large BBs."
This caused PR34596.

> [MachineCombiner] Update instruction depths incrementally for large BBs.
>
> Summary:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and it's relevant successors/predecessors.
>
> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 313213
2017-09-13 23:23:09 +00:00
Eugene Zelenko
3d80249809 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 312971
2017-09-11 23:00:48 +00:00
Florian Hahn
e0cb40c03e [MachineCombiner] Update instruction depths incrementally for large BBs.
Summary:
For large basic blocks with lots of combinable instructions, the
MachineTraceMetrics computations in MachineCombiner can dominate the compile
time, as computing the trace information is quadratic in the number of
instructions in a BB and it's relevant successors/predecessors.

In most cases, knowing the instruction depth should be enough to make
combination decisions. As we already iterate over all instructions in a basic
block, the instruction depth can be computed incrementally. This reduces the
cost of machine-combine drastically in cases where lots of instructions
are combined. The major drawback is that AFAIK, computing the critical path
length cannot be done incrementally. Therefore we only compute
instruction depths incrementally, for basic blocks with more
instructions than inc_threshold. The -machine-combiner-inc-threshold
option can be used to set the threshold and allows for easier
experimenting and checking if using incremental updates for all basic
blocks has any impact on the performance.

Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn

Reviewed By: fhahn

Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 312719
2017-09-07 12:49:39 +00:00
Florian Hahn
cc61b82fdc [MachineTraceMetrics] Add computeDepth function (NFCI).
Summary:
This function is used in D36619 to update the instruction depths
incrementally.

Reviewers: efriedma, Gerolf, MatzeB, fhahn

Reviewed By: fhahn

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36696

llvm-svn: 312714
2017-09-07 11:51:30 +00:00
Chandler Carruth
eb66b33867 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787
2017-06-06 11:49:48 +00:00
Matthias Braun
120c5b7053 CodeGen: Rename DEBUG_TYPE to match passnames
Rename the DEBUG_TYPE to match the names of corresponding passes where
it makes sense. Also establish the pattern of simply referencing
DEBUG_TYPE instead of repeating the passname where possible.

llvm-svn: 303921
2017-05-25 21:26:32 +00:00
Eugene Zelenko
6fa4aaca51 [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 295773
2017-02-21 22:07:52 +00:00
Tim Shen
e69d467a94 [ADT] Change PostOrderIterator to use NodeRef. NFC.
Reviewers: dblaikie

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D23522

llvm-svn: 278752
2016-08-15 21:52:54 +00:00
Duncan P. N. Exon Smith
9d359ab468 CodeGen: Use MachineInstr& more in MachineTraceMetrics, NFC
Push MachineInstr& through helper APIs for consistency.  This doesn't
remove any more implicit conversions, but it's a nice cleanup after
r274300.

llvm-svn: 274301
2016-07-01 00:05:40 +00:00
Duncan P. N. Exon Smith
0895fcb255 CodeGen: Use MachineInstr& in MachineTraceMetrics, NFC
This avoids an implicit conversion from iterator to pointer.

llvm-svn: 274300
2016-06-30 23:53:20 +00:00
Duncan P. N. Exon Smith
3b54098f86 Reapply "CodeGen: Use references in MachineTraceMetrics::Trace, NFC"
This reverts commit r261510, effectively reapplying r261509.  The
original commit missed a caller in AArch64ConditionalCompares.

Original commit message:

Pass non-null arguments by reference in MachineTraceMetrics::Trace,
simplifying future work to remove implicit iterator => pointer
conversions.

llvm-svn: 261511
2016-02-22 03:33:28 +00:00
Duncan P. N. Exon Smith
5d1b25325e Revert "CodeGen: Use references in MachineTraceMetrics::Trace, NFC"
This reverts commit r261509.  I'm not sure how this compiled locally,
but something was out of whack.

llvm-svn: 261510
2016-02-22 03:12:42 +00:00
Duncan P. N. Exon Smith
3cbb1cb653 CodeGen: Use references in MachineTraceMetrics::Trace, NFC
Pass non-null arguments by reference in MachineTraceMetrics::Trace,
simplifying future work to remove implicit iterator => pointer
conversions.

llvm-svn: 261509
2016-02-22 03:07:49 +00:00
Richard Trieu
5a759985de Remove uses of builtin comma operator.
Cleanup for upcoming Clang warning -Wcomma.  No functionality change intended.

llvm-svn: 261270
2016-02-18 22:09:30 +00:00
Sanjay Patel
cbf65ba44a use range-based for loops; NFCI
llvm-svn: 255171
2015-12-09 22:45:45 +00:00
Sanjay Patel
393560b776 fix crash in machine trace metrics due to processing dbg_value instructions (PR24199)
The test in PR24199 ( https://llvm.org/bugs/show_bug.cgi?id=24199 ) crashes because machine
trace metrics was not ignoring dbg_value instructions when calculating data dependencies.

The machine-combiner pass asks machine trace metrics to calculate an instruction trace, 
does some reassociations, and calls MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval()
along with MachineTraceMetrics::invalidate(). The dbg_value instructions have their operands
invalidated, but the instructions are not expected to be deleted.

On a subsequent loop iteration of the machine-combiner pass, machine trace metrics would be
called again and die while accessing the invalid debug instructions.

Differential Revision: http://reviews.llvm.org/D11423

llvm-svn: 243057
2015-07-23 22:56:53 +00:00
Sanjay Patel
a3f313c186 use range-based for loops; NFCI
llvm-svn: 241468
2015-07-06 16:27:35 +00:00
Sanjay Patel
f258757f92 use range-based for loops; NFCI
llvm-svn: 241463
2015-07-06 16:19:14 +00:00
Sanjay Patel
c0d734a9ee use valid bits to avoid unnecessary machine trace metric recomputations
Although this does cut the number of traces recomputed by ~10% for the
test case mentioned in http://reviews.llvm.org/D10460, it doesn't
make a dent in the overall performance. That example needs to be more
selective when invalidating traces.

llvm-svn: 241393
2015-07-04 15:00:28 +00:00
Sanjay Patel
b781314531 use range-based for loops; NFCI
llvm-svn: 241076
2015-06-30 16:30:22 +00:00
Alexander Kornienko
f993659b8f Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
Apparently, the style needs to be agreed upon first.

llvm-svn: 240390
2015-06-23 09:49:53 +00:00
Alexander Kornienko
40cb19d802 Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:

tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
  -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
  llvm/lib/


Thanks to Eugene Kosov for the original patch!

llvm-svn: 240137
2015-06-19 15:57:42 +00:00
Matthias Braun
4eae512569 CodeGen: Use mop_iterator instead of MIOperands/ConstMIOperands
MIOperands/ConstMIOperands are classes iterating over the MachineOperand
of a MachineInstr, however MachineInstr::mop_iterator does the same
thing.

I assume these two iterators exist to have a uniform interface to
iterate over the operands of a machine instruction bundle and a single
machine instruction. However in practice I find it more confusing to have 2
different iterator classes, so this patch transforms (nearly all) the
code to use mop_iterators.

The only exception being MIOperands::anlayzePhysReg() and
MIOperands::analyzeVirtReg() still needing an equivalent, I leave that
as an exercise for the next patch.

Differential Revision: http://reviews.llvm.org/D9932

This version is slightly modified from the proposed revision in that it
introduces MachineInstr::getOperandNo to avoid the extra counting
variable in the few loops that previously used MIOperands::getOperandNo.

llvm-svn: 238539
2015-05-29 02:56:46 +00:00
Sanjay Patel
0009a99bbf use range-based for-loops; NFCI
llvm-svn: 237917
2015-05-21 17:22:45 +00:00
Daniel Berlin
8880b4b1c6 Add range iterators for post order and inverse post order. Use them
llvm-svn: 235026
2015-04-15 17:41:42 +00:00
Eric Christopher
12ccd90a3b The subtarget is cached on the MachineFunction. Access it directly.
llvm-svn: 227173
2015-01-27 07:31:29 +00:00
David Blaikie
60e6c80905 Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>
This is to be consistent with StringSet and ultimately with the standard
library's associative container insert function.

This lead to updating SmallSet::insert to return pair<iterator, bool>,
and then to update SmallPtrSet::insert to return pair<iterator, bool>,
and then to update all the existing users of those functions...

llvm-svn: 222334
2014-11-19 07:49:26 +00:00
Pete Cooper
92fc86558d Change MCSchedModel to be a struct of statically initialized data.
This removes static initializers from the backends which generate this data, and also makes this struct match the other Tablegen generated structs in behaviour

Reviewed by Andy Trick and Chandler C

llvm-svn: 216919
2014-09-02 17:43:54 +00:00
Craig Topper
43cee2f5fc Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created.
llvm-svn: 216525
2014-08-27 05:25:25 +00:00
Eric Christopher
67c04e77e5 Have MachineFunction cache a pointer to the subtarget to make lookups
shorter/easier and have the DAG use that to do the same lookup. This
can be used in the future for TargetMachine based caching lookups from
the MachineFunction easily.

Update the MIPS subtarget switching machinery to update this pointer
at the same time it runs.

llvm-svn: 214838
2014-08-05 02:39:49 +00:00
Eric Christopher
99307e99a2 Remove the TargetMachine forwards for TargetSubtargetInfo based
information and update all callers. No functional change.

llvm-svn: 214781
2014-08-04 21:25:23 +00:00
Gerolf Hoflehner
b4a5e33ee0 MachineCombiner Pass for selecting faster instruction
sequence -  target independent framework

 When the DAGcombiner selects instruction sequences
 it could increase the critical path or resource len.

 For example, on arm64 there are multiply-accumulate instructions (madd,
 msub). If e.g. the equivalent  multiply-add sequence is not on the
 crictial path it makes sense to select it instead of  the combined,
 single accumulate instruction (madd/msub). The reason is that the
 conversion from add+mul to the madd could lengthen the critical path
 by the latency of the multiply.

 But the DAGCombiner would always combine and select the madd/msub
 instruction.

 This patch uses machine trace metrics to estimate critical path length
 and resource length of an original instruction sequence vs a combined
 instruction sequence and picks the faster code based on its estimates.

 This patch only commits the target independent framework that evaluates
 and selects code sequences. The machine instruction combiner is turned
 off for all targets and expected to evolve over time by gradually
 handling DAGCombiner pattern in the target specific code.

 This framework lays the groundwork for fixing
 rdar://16319955

llvm-svn: 214666
2014-08-03 21:35:39 +00:00
Alexey Samsonov
8f5245ce6b Convert more loops to range-based equivalents
llvm-svn: 207714
2014-04-30 22:17:38 +00:00
Chandler Carruth
2361db41db [Modules] Remove potential ODR violations by sinking the DEBUG_TYPE
define below all header includes in the lib/CodeGen/... tree. While the
current modules implementation doesn't check for this kind of ODR
violation yet, it is likely to grow support for it in the future. It
also removes one layer of macro pollution across all the included
headers.

Other sub-trees will follow.

llvm-svn: 206837
2014-04-22 02:02:50 +00:00
Craig Topper
30281a67fb [C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr.
llvm-svn: 206142
2014-04-14 00:51:57 +00:00
Benjamin Kramer
f6c0615b06 Retire llvm::array_endof in favor of non-member std::end.
While there make array_lengthof constexpr if we have support for it.

llvm-svn: 206112
2014-04-12 16:15:53 +00:00
Owen Anderson
e541764c5f Phase 2 of the great MachineRegisterInfo cleanup. This time, we're changing
operator* on the by-operand iterators to return a MachineOperand& rather than
a MachineInstr&.  At this point they almost behave like normal iterators!

Again, this requires making some existing loops more verbose, but should pave
the way for the big range-based for-loop cleanups in the future.

llvm-svn: 203865
2014-03-13 23:12:04 +00:00
Craig Topper
b3cfc7916b [C++11] Add 'override' keyword to virtual methods that override their base class.
llvm-svn: 203220
2014-03-07 09:26:03 +00:00
Benjamin Kramer
3ac154a395 [C++11] Replace llvm::tie with std::tie.
The old implementation is no longer needed in C++11.

llvm-svn: 202644
2014-03-02 13:30:33 +00:00
Andrew Trick
5d13fe97ed Machine Model: Add MicroOpBufferSize and resource BufferSize.
Replace the ill-defined MinLatency and ILPWindow properties with
with straightforward buffer sizes:
MCSchedMode::MicroOpBufferSize
MCProcResourceDesc::BufferSize

These can be used to more precisely model instruction execution if desired.

Disabled some misched tests temporarily. They'll be reenabled in a few commits.

llvm-svn: 184032
2013-06-15 04:49:57 +00:00
Andrew Trick
b90f16d802 Generalize the MachineTraceMetrics public API.
Naturally, we should be able to pass in extra instructions, not just
extra blocks.

llvm-svn: 180667
2013-04-27 03:54:20 +00:00
Jakob Stoklund Olesen
60ec106a22 Allow MachineTraceMetrics to be used when the model has no resources.
It it still possible to extract information from itineraries, for
example.

llvm-svn: 178582
2013-04-02 22:27:45 +00:00
Jakob Stoklund Olesen
1c9a12661b Count processor resources individually in MachineTraceMetrics.
The new instruction scheduling models provide information about the
number of cycles consumed on each processor resource. This makes it
possible to estimate ILP more accurately than simply counting
instructions / issue width.

The functions getResourceDepth() and getResourceLength() now identify
the limiting processor resource, and return a cycle count based on that.

This gives more precise resource information, particularly in traces
that use one resource a lot more than others.

llvm-svn: 178553
2013-04-02 17:49:51 +00:00
Jakob Stoklund Olesen
b83588cd0b Rename isEarlierInSameTrace to isUsefulDominator.
In very rare cases caused by irreducible control flow, the dominating
block can have the same trace head without actually being part of the
trace.

As long as such a dominator still has valid instruction depths, it is OK
to use it for computing instruction depths.

Rename the function to avoid lying, and add a check that instruction
depths are computed for the dominator.

llvm-svn: 176668
2013-03-07 23:55:49 +00:00
Jakob Stoklund Olesen
10c1e03917 Move MachineTraceMetrics.h into include/llvm/CodeGen.
Let targets use it.

llvm-svn: 172688
2013-01-17 01:06:04 +00:00
Chandler Carruth
a490793037 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Jakob Stoklund Olesen
a7a8e389d2 Pass an explicit operand number to addLiveIns.
Not all instructions define a virtual register in their first operand.
Specifically, INLINEASM has a different format.

<rdar://problem/12472811>

llvm-svn: 165721
2012-10-11 16:46:07 +00:00
Jakob Stoklund Olesen
28526a6248 Don't crash on extra evil irreducible control flow.
When the CFG contains a loop with multiple entry blocks, the traces
computed by MachineTraceMetrics don't always have the same nice
properties. Loop back-edges are normally excluded from traces, but
MachineLoopInfo doesn't recognize loops with multiple entry blocks, so
those back-edges may be included.

Avoid asserting when that happens by adding an isEarlierInSameTrace()
function that accurately determines if a dominating block is part of the
same trace AND is above the currrent block in the trace.

llvm-svn: 165434
2012-10-08 22:06:44 +00:00
Jakob Stoklund Olesen
6555de9442 Switch MachineTraceMetrics to the new TargetSchedModel interface.
llvm-svn: 165235
2012-10-04 17:30:40 +00:00