This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.
llvm-svn: 272512
Adds a MachineFunctionPass that scans the body to find calls, and
update the register mask with the one saved by the
RegUsageInfoCollector analysis in PhysicalRegisterUsageInfo.
Patch by Vivek Pandya <vivekvpandya@gmail.com>
Differential Revision: http://reviews.llvm.org/D21180
llvm-svn: 272414
The costs are somewhat hand-wavy, but should be much closer to the truth
than what we get from BasicTTI.
Differential Revision: http://reviews.llvm.org/D21156
llvm-svn: 272406
Add an option to enable the analysis of MachineFunction register
usage to extract the list of clobbered registers.
When enabled, the CodeGen order is changed to be bottom up on the Call
Graph.
The analysis is split in two parts, RegUsageInfoCollector is the
MachineFunction Pass that runs post-RA and collect the list of
clobbered registers to produce a register mask.
An immutable pass, RegisterUsageInfo, stores the RegMask produced by
RegUsageInfoCollector, and keep them available. A future tranformation
pass will use this information to update every call-sites after
instruction selection.
Patch by Vivek Pandya <vivekvpandya@gmail.com>
Differential Revision: http://reviews.llvm.org/D20769
llvm-svn: 272403
This reapplies commit r272385 with a fix. The build was failing when compiled
with gcc, but not with clang. With the fix, we now get the data layout from the
current TTI implementation, which will hopefully solve the issue.
llvm-svn: 272395
This patch refines the default cost for interleaved load groups having gaps. If
a load group has gaps, the legalized instructions corresponding to the unused
elements will be dead. Thus, we don't need to account for them in the cost
model. Instead, we only need to account for the fraction of legalized loads
that will actually be used.
Differential Revision: http://reviews.llvm.org/D20873
llvm-svn: 272385
Previously we could run only one machine pass with the run-pass option.
With that patch, we can now specify several passes with several run-pass
options (or just one option with a list of comma separated passes) and
llc will build the related pipeline.
This is great to test the interaction of two passes that are not
necessarily next to each other in the pipeline, or play with pass
ordering.
Now, we should be at parity with opt for the flexibility of running
passes.
Note: I also moved the run pass option from CommandFlags.h to llc.cpp
because, really, this is needed only there!
llvm-svn: 272356
Refactor the code so that we do not compute in two different places the
end iterator for the range of new virtual registers for a given operand.
Although this refactoring was intended as NFC, this is not the case
because it actually fixes a bug where we were returning a range off by 1
(too long). Right now, this could not result in an actual bug because we
were accessing this range via the BreakDown size of the related operand.
llvm-svn: 272208
Now, the target will be able to provide its how implementation to remap
an instruction. This open the way to crazier optimizations, but to
beginning with, we will be able to handle something else than the
default mapping.
llvm-svn: 272165
The generic implementation stated that all copies were free, which is
unlikely. Now, only the copies within the same register bank are free.
We assume they will get coalesced.
llvm-svn: 272085
The cost of a copy may be different based on how many bits we have to
copy around. E.g., a 8-bit copy may be different than a 32-bit copy.
llvm-svn: 272084
CALL_ONCE_... macro in the legacy pass manager with the new
llvm::call_once facility.
Nothing changed sicne the last attempt in r271781 which I reverted in
r271788. At least one of the failures I saw was spurious, and I want to
make sure the other failures are real before I work around them -- they
appeared to only effect ppc64le and ppc64be.
Original commit message of r271781:
----
[LPM] Reinstate r271652 to replace the CALL_ONCE_... macro in the legacy
pass manager with the new llvm::call_once facility.
This reverts commit r271657 and re-applies r271652 with a fix to
actually work with arguments. In the original version, we just ended up
directly calling std::call_once via ADL because of the std::once_flag
argument. The llvm::call_once never worked with arguments. Now,
llvm::call_once is a variadic template that perfectly forwards
everything. As a part of this it had to move to the header and we use
a generic functor rather than an explict function pointer. It would be
nice to use std::invoke here but we don't have it yet. That means
pointer to members won't work here, but that seems a tolerable
compromise.
I've also tested this by forcing the fallback path, so hopefully it
sticks this time.
----
Original commit message of r271652:
----
[LPM] Replace the CALL_ONCE_... macro in the legacy pass manager with
the new llvm::call_once facility.
This facility matches the standard APIs and when the platform supports
it actually directly uses the standard provided functionality. This is
both more efficient on some platforms and much more TSan friendly.
The only remaining user of the cas_flag and home-rolled atomics is the
fallback implementation of call_once. I have a patch that removes them
entirely, but it needs a Windows patch to land first.
This alone substantially cleans up the macros for the legacy pass
manager, and should subsume some of the work Mehdi was doing to clear
the path for TSan testing of ThinLTO, a really important step to have
reliable upstream testing of ThinLTO in all forms.
----
llvm-svn: 271800
There appears to be a strange exception thrown and crash using call_once
on a PPC build bot, and a *really* weird windows link error for
GCMetadata.obj. Still need to investigate the cause of both problems.
Original change summary:
[LPM] Reinstate r271652 to replace the CALL_ONCE_... macro in the legacy
pass manager with the new llvm::call_once facility.
llvm-svn: 271788
pass manager with the new llvm::call_once facility.
This reverts commit r271657 and re-applies r271652 with a fix to
actually work with arguments. In the original version, we just ended up
directly calling std::call_once via ADL because of the std::once_flag
argument. The llvm::call_once never worked with arguments. Now,
llvm::call_once is a variadic template that perfectly forwards
everything. As a part of this it had to move to the header and we use
a generic functor rather than an explict function pointer. It would be
nice to use std::invoke here but we don't have it yet. That means
pointer to members won't work here, but that seems a tolerable
compromise.
I've also tested this by forcing the fallback path, so hopefully it
sticks this time.
Original commit message:
----
[LPM] Replace the CALL_ONCE_... macro in the legacy pass manager with
the new llvm::call_once facility.
This facility matches the standard APIs and when the platform supports
it actually directly uses the standard provided functionality. This is
both more efficient on some platforms and much more TSan friendly.
The only remaining user of the cas_flag and home-rolled atomics is the
fallback implementation of call_once. I have a patch that removes them
entirely, but it needs a Windows patch to land first.
This alone substantially cleans up the macros for the legacy pass
manager, and should subsume some of the work Mehdi was doing to clear
the path for TSan testing of ThinLTO, a really important step to have
reliable upstream testing of ThinLTO in all forms.
llvm-svn: 271781
My first attempt at this had an overly aggressive assert - chain nodes
will only be removed, but we could hit the assert if a non-chain node
was CSE'd (NodeToMatch, for instance).
This reapplies r271706 by reverting r271713 and fixing an assert.
Original message:
Avoid relying on UB by looking into deleted nodes for a marker value.
Instead, update the list of chain nodes as we go.
llvm-svn: 271733
the new llvm::call_once facility.
This facility matches the standard APIs and when the platform supports
it actually directly uses the standard provided functionality. This is
both more efficient on some platforms and much more TSan friendly.
The only remaining user of the cas_flag and home-rolled atomics is the
fallback implementation of call_once. I have a patch that removes them
entirely, but it needs a Windows patch to land first.
This alone substantially cleans up the macros for the legacy pass
manager, and should subsume some of the work Mehdi was doing to clear
the path for TSan testing of ThinLTO, a really important step to have
reliable upstream testing of ThinLTO in all forms.
llvm-svn: 271652
Refactor LiveIntervals::renameDisconnectedComponents() to be a pass.
Also change the name to "RenameIndependentSubregs":
- renameDisconnectedComponents() worked on a MachineFunction at a time
so it is a natural candidate for a machine function pass.
- The algorithm is testable with a .mir test now.
- This also fixes a problem where the lazy renaming as part of the
MachineScheduler introduced IMPLICIT_DEF instructions after the number
of a nodes in a region were counted leading to a mismatch.
Differential Revision: http://reviews.llvm.org/D20507
llvm-svn: 271345
This adds support to the backed to actually support SjLj EH as an exception
model. This is *NOT* the default model, and requires explicitly opting into it
from the frontend. GCC supports this model and for MinGW can still be enabled
via the `--using-sjlj-exceptions` options.
Addresses PR27749!
llvm-svn: 271244
In r268693, we started requiring that SelectionDAGISel::Select return
void, but provided a default implementation that did just that by
calling into the old interface. Now that all targets have been
updated, we'll just remove the default implementation.
llvm-svn: 270454
Prior to this patch, we were using 1 for all the repairing costs.
Now, we use the information from the target to get this information.
llvm-svn: 270304
We now use LiveRangeCalc::extendToUses() instead of a specially designed
algorithm in constructMainRangeFromSubranges():
- The original motivation for constructMainRangeFromSubranges() were
differences between the main liverange and subranges because of hidden
dead definitions. This case however cannot happen anymore with the
DetectDeadLaneMasks pass in place.
- It simplifies the code.
- This fixes a longstanding bug where we did not properly create new SSA
values on merging control flow (the MachineVerifier missed most of
these cases).
- Move constructMainRangeFromSubranges() to LiveIntervalAnalysis and
LiveRangeCalc to better match the implementation/available helper
functions.
This re-applies r269016. The fixes from r270290 and r270259 should avoid
the machine verifier problems this time.
llvm-svn: 270291
Fix renameDisconnectedComponents() creating vreg uses that can be
reached from function begin withouthaving a definition (or explicit
live-in). Fix this by inserting IMPLICIT_DEF instruction before
control-flow joins as necessary.
Removes an assert from MachineScheduler because we may now get
additional IMPLICIT_DEF when preparing the scheduling policy.
This fixes the underlying problem of http://llvm.org/PR27705
llvm-svn: 270259
The Fast mode takes the first mapping, the greedy mode loops over all
the possible mapping for an instruction and choose the cheaper one.
Test case will come with target specific code, since we currently do not
have instructions that have several mappings.
llvm-svn: 270249
computeMapping.
Computing the cost of a mapping takes some time.
Since in Fast mode, the cost is irrelevant, just spare some cycles by not
computing it.
In Greedy mode, we need to choose the best cost, that means that when
the local cost gets more expensive than the best cost, we can stop
computing the repairing and cost for the current mapping.
llvm-svn: 270245
The previous choice of the insertion points for repairing was
straightfoward but may introduce some basic block or edge splitting. In
some situation this is something we can avoid.
For instance, when repairing a phi argument, instead of placing the
repairing on the related incoming edge, we may move it to the previous
block, before the terminators. This is only possible when the argument
is not defined by one of the terminator.
llvm-svn: 270232
an instruction.
Use the previously introduced RepairingPlacement class to split the code
computing the repairing placement from the code doing the actual
placement. That way, we will be able to consider different placement and
then, only apply the best one.
llvm-svn: 270168
When assigning the register banks we may have to insert repairing code
to move already assigned values accross register banks.
Introduce a few helper classes to keep track of what is involved in the
repairing of an operand:
- InsertPoint and its derived classes record the positions, in the CFG,
where repairing has to be inserted.
- RepairingPlacement holds all the insert points for the repairing of an
operand plus the kind of action that is required to do the repairing.
This is going to be used to keep track of how the repairing should be
done, while comparing different solutions for an instruction. Indeed, we
will need the repairing placement to capture the cost of a solution and
we do not want to compute it a second time when we do the actual
repairing.
llvm-svn: 270167
register bank twice.
Prior to this change, we were checking if the assignment for the current
machine operand was matching, then we would check if the mismatch
requires to insert repair code.
We actually already have this information from the first check, so just
pass it along.
NFCI.
llvm-svn: 270166
This helper class will be used to represent the cost of mapping an
instruction to a specific register bank.
The particularity of these costs is that they are mostly local, thus the
frequency of the basic block is irrelevant. However, for few
instructions (e.g., phis and terminators), the cost may be non-local and
then, we need to account for the frequency of the involved basic blocks.
This will be used by the greedy mode I am working on.
llvm-svn: 270163
There are at least 2 places (DAGCombiner, X86ISelLowering) where this could be used instead
of ad-hoc and watered down code that is trying to match a power-of-2 pattern.
Differential Revision: http://reviews.llvm.org/D20439
llvm-svn: 270073
Having an enum member named Default is quite confusing: Is it distinct
from the others?
This patch removes that member and instead uses Optional<Reloc> in
places where we have a user input that still hasn't been maped to the
default value, which is now clear has no be one of the remaining 3
options.
llvm-svn: 269988