1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00
Commit Graph

20508 Commits

Author SHA1 Message Date
Hal Finkel
c1f6823ee0 [SDAG] Add a fallback multiplication expansion
LegalizeIntegerTypes does not have a way to expand multiplications for large
integer types (i.e. larger than twice the native bit width). There's no
standard runtime call to use in that case, and so we'd just assert.

Unfortunately, as it turns out, it is possible to hit this case from
standard-ish C code in rare cases. A particular case a user ran into yesterday
involved an __int128 induction variable and a loop with a quadratic (not
linear) recurrence which triggered some backend logic using SCEVExpander. In
this case, the BinomialCoefficient code in SCEV generates some i129 variables,
which get widened to i256. At a high level, this is not actually good (i.e. the
underlying optimization, PPCLoopPreIncPrep, should not be transforming the loop
in question for performance reasons), but regardless, the backend shouldn't
crash because of cost-modeling issues in the optimizer.

This is a straightforward implementation of the multiplication expansion, based
on the algorithm in Hacker's Delight. I validated it against the code for the
mul256b function from http://locklessinc.com/articles/256bit_arithmetic/ using
random inputs. There should be no functional change for previously-working code
(the new expansion code only replaces an assert).

Fixes PR19797.

llvm-svn: 270720
2016-05-25 16:50:22 +00:00
Chad Rosier
8544c01533 Clarify that we match BSwap in InstCombine and BitReverse in CGP. NFC.
Also, rename recognizeBitReverseOrBSwapIdiom to recognizeBSwapOrBitReverseIdiom,
so the ordering of the MatchBSwaps and MatchBitReversals arguments are
consistent with the function name.

llvm-svn: 270715
2016-05-25 16:22:14 +00:00
Matthias Braun
3b9611aad5 ScheduleDAGInstrs: Fix memory corruption
We have to modify V2SU before inserting new elements into the
CurrentVRegDefs set because that may move V2SU in memory invalidating
the reference.

llvm-svn: 270644
2016-05-25 01:18:00 +00:00
Haicheng Wu
5a1aef2909 [MBP] Factor out the optimizations on branch conditions and unanalyzable branches. NFCI.
The benefits of this patch are

-- We call AnalyzeBranch() to optimize unanalyzable branches, but the result of
   AnalyzeBranch() is not used. Now the result is useful.

-- Before the layout of all the MBBs is set, the result of AnalyzeBranch() is
   not correct and needs to be fixed before using it to optimize the branch
   conditions. Now this optimization is called after the layout, the code used
   to fix the result of AnalyzeBranch() is not needed.

-- The branch condition of the last block is not optimized before. Now it is
   optimized.

Differential Revision: http://reviews.llvm.org/D20177

llvm-svn: 270623
2016-05-24 22:16:14 +00:00
Matthias Braun
2c464b94e5 LiveIntervalAnalysis: Fix handleMove() re-using the wrong value number
This fixes http://llvm.org/PR27856

llvm-svn: 270619
2016-05-24 21:54:01 +00:00
David Blaikie
cc40d4c91f DWARF: Omit DW_AT_APPLE attributes (except ObjC ones) when not targeting LLDB
These attributes aren't used by other debuggers (& may be confused with
other DWARF extensions) so they just waste space (about 1.5% on .dwo
file size on a random large program I tested).

We could remove the ObjC property ones too, but I figured they were
probably more necessary when trying to understand ObjC (I could be wrong
though) & so any debugger interested in working with ObjC would use
them, perhaps? (also, there are some legacy tests in Clang that test for
them - making it one of those annoying cross-project commits and/or
cleanup to refactor those tests)

llvm-svn: 270613
2016-05-24 21:19:28 +00:00
Than McIntosh
bb4b34bbb7 Rework/enhance stack coloring data flow analysis.
Replace bidirectional flow analysis to compute liveness with forward
analysis pass. Treat lifetimes as starting when there is a first
reference to the stack slot, as opposed to starting at the point of the
lifetime.start intrinsic, so as to increase the number of stack
variables we can overlap.

Reviewers: gbiv, qcolumbet, wmi
Differential Revision: http://reviews.llvm.org/D18827

Bug: 25776
llvm-svn: 270559
2016-05-24 13:23:44 +00:00
Justin Bogner
621fa28dd0 PrologEpilogInserter: Avoid an infinite loop when MinCSFrameIndex == 0
Before r269750 we did the comparisons in this loop in signed ints so
that it DTRT when MinCSFrameIndex was 0. This was changed because it's
now possible for MinCSFrameIndex to be UINT_MAX, but that introduced a
bug when we were comparing `>= 0` - this is tautological in unsigned.

Rework the comparisons here to avoid issues with unsigned wrapping.

No test. I couldn't find a way to get any of the StackGrowsUp in-tree
targets to reach the code that sets MinCSFrameIndex.

llvm-svn: 270492
2016-05-23 21:40:52 +00:00
Reid Kleckner
4a1b9faea8 Modify emitTypeInformation to use MemoryTypeTableBuilder, take 2
This effectively revers commit r270389 and re-lands r270106, but it's
almost a rewrite.

The behavior change in r270106 was that we could no longer assume that
each LF_FUNC_ID record got its own type index. This patch adds a map
from DINode* to TypeIndex, so we can stop making that assumption.

This change also emits padding bytes between type records similar to the
way MSVC does. The size of the type record includes the padding bytes.

llvm-svn: 270485
2016-05-23 20:23:46 +00:00
Wei Mi
8e0a89d063 InsertPointAnalysis: Move current live interval from being a class member
to query interfaces argument; NFC

Differential Revision: http://reviews.llvm.org/D20532

llvm-svn: 270481
2016-05-23 19:39:19 +00:00
Justin Lebar
b72817c244 Fix DEBUG logs in MachineLICM.
Summary:
MBBs don't necessarily have a name (in my experience, they almost never
do), in which case this logging is quite unhelpful.  The number seems to
work well.

Reviewers: iteratee

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20533

llvm-svn: 270477
2016-05-23 18:56:07 +00:00
Zachary Turner
13dc004006 [codeview] Refactor symbol records to use same pattern as types.
This will pave the way to introduce a full fledged symbol visitor
similar to how we have a type visitor, thus allowing the same
dumping code to be used in llvm-readobj and llvm-pdbdump.

Differential Revision: http://reviews.llvm.org/D20384
Reviewed By: rnk

llvm-svn: 270475
2016-05-23 18:49:06 +00:00
David Majnemer
39c1ab7efb Revert "Modify emitTypeInformation to use MemoryTypeTableBuilder"
This reverts commit r270106.  It results in certain function types
omitted in the output.

llvm-svn: 270389
2016-05-23 01:37:45 +00:00
Hal Finkel
be81a6f808 [LiveIntervalAnalysis] Don't dereference an end iterator in repairIntervalsInRange
This fixes a bug introduced in:

  r262115 - CodeGen: Take MachineInstr& in SlotIndexes and LiveIntervals, NFC

The iterator End here might == MBB->end(), and so we can't unconditionally
dereference it. This often goes unnoticed (I don't have a test case that always
crashes, and ASAN does not catch it either) because the function call arguments are
turned right back into iterators. MachineInstrBundleIterator's constructor,
however, does have an assert which might randomly fire.

llvm-svn: 270323
2016-05-21 16:03:50 +00:00
Quentin Colombet
9b5e08fa9b [RegBankSelect] Compute the repairing cost for copies.
Prior to this patch, we were using 1 for all the repairing costs.
Now, we use the information from the target to get this information.

llvm-svn: 270304
2016-05-21 01:43:25 +00:00
Matthias Braun
13037577f3 LiveIntervalAnalysis: Rework constructMainRangeFromSubranges()
We now use LiveRangeCalc::extendToUses() instead of a specially designed
algorithm in constructMainRangeFromSubranges():
- The original motivation for constructMainRangeFromSubranges() were
  differences between the main liverange and subranges because of hidden
  dead definitions. This case however cannot happen anymore with the
  DetectDeadLaneMasks pass in place.
- It simplifies the code.
- This fixes a longstanding bug where we did not properly create new SSA
  values on merging control flow (the MachineVerifier missed most of
  these cases).
- Move constructMainRangeFromSubranges() to LiveIntervalAnalysis and
  LiveRangeCalc to better match the implementation/available helper
  functions.

This re-applies r269016. The fixes from r270290 and r270259 should avoid
the machine verifier problems this time.

llvm-svn: 270291
2016-05-20 23:14:56 +00:00
Matthias Braun
327a7f0867 MachineVerifier: subregs so not require defs/valnos on every path
It is fine for subregister ranges to be undefined on some CFG paths as
we may have a "vregX:other_subreg<read-undef> =" def on that path. We
do not (and should not) have live segments for the subregister ranges.
The MachineVerifier should not complain about this.

This is a slight variant of http://llvm.org/PR27705

llvm-svn: 270290
2016-05-20 23:02:13 +00:00
Krzysztof Parzyszek
5b8abb9534 Use report_fatal_error after all
Depending on the compiler used to build LLVM, llvm_unreachable can either
expand to a call to abort(), or to a __builtin_unreachable. The latter
does not have a predictable behavior at runtime.

llvm-svn: 270260
2016-05-20 19:46:42 +00:00
Matthias Braun
03d346febe LiveIntervalAnalysis: Fix missing defs in renameDisconnectedComponents().
Fix renameDisconnectedComponents() creating vreg uses that can be
reached from function begin withouthaving a definition (or explicit
live-in). Fix this by inserting IMPLICIT_DEF instruction before
control-flow joins as necessary.

Removes an assert from MachineScheduler because we may now get
additional IMPLICIT_DEF when preparing the scheduling policy.

This fixes the underlying problem of http://llvm.org/PR27705

llvm-svn: 270259
2016-05-20 19:46:13 +00:00
Peter Collingbourne
2e1acaa19d CodeGen: Move the call to DwarfDebug::beginModule() out of the constructor.
This gives AsmPrinter a chance to initialize its DD field before
we call beginModule(), which is about to start using it.

Differential Revision: http://reviews.llvm.org/D20413

llvm-svn: 270258
2016-05-20 19:35:35 +00:00
Peter Collingbourne
162aaa4e2a CodeGen: Do not require a MachineFunction just to create a DIEDwarfExpression.
We are about to start using DIEDwarfExpression to create global variable
DIEs, which happens before we generate code for functions.

Differential Revision: http://reviews.llvm.org/D20412

llvm-svn: 270257
2016-05-20 19:35:17 +00:00
Quentin Colombet
1daaa38901 [RegBankSelect] Look for the best mapping in greedy mode.
The Fast mode takes the first mapping, the greedy mode loops over all
the possible mapping for an instruction and choose the cheaper one.
Test case will come with target specific code, since we currently do not
have instructions that have several mappings.

llvm-svn: 270249
2016-05-20 18:37:33 +00:00
Quentin Colombet
ea428b668a [RegBankSelect] Get rid of a now dead method: setSafeInsertPoint.
This is now encapsulated in the RepairingPlacement class.

llvm-svn: 270247
2016-05-20 18:17:16 +00:00
Quentin Colombet
a8b9dff8f7 [RegBankSelect] Take advantage of a potential best cost information in
computeMapping.

Computing the cost of a mapping takes some time.
Since in Fast mode, the cost is irrelevant, just spare some cycles by not
computing it.
In Greedy mode, we need to choose the best cost, that means that when
the local cost gets more expensive than the best cost, we can stop
computing the repairing and cost for the current mapping.

llvm-svn: 270245
2016-05-20 18:00:46 +00:00
Quentin Colombet
f5dd7f7b19 [RegBankSelect] Use frequency and probability information to compute
more precise cost in Greedy mode.

In Fast mode the cost is irrelevant so do not bother requiring that
those passes get scheduled.

llvm-svn: 270244
2016-05-20 17:54:09 +00:00
Quentin Colombet
bc0aff52a0 [RegBankSelect] Use the Fast mode for functions with the optnone attribute.
llvm-svn: 270242
2016-05-20 17:36:54 +00:00
Quentin Colombet
3d214a2bd4 [RegBankSelect] Specify different optimization mode for the pass.
The mode should be choose by the target when instantiating the pass.

llvm-svn: 270235
2016-05-20 16:55:35 +00:00
Krzysztof Parzyszek
e163deaf59 Fix error reporting in register scavenger (lack of emergency spill slot)
- Do not store Twine objects.
- Remove report_fatal_error, since llvm_unreachable does terminate the
  program in release mode.

llvm-svn: 270233
2016-05-20 16:38:34 +00:00
Quentin Colombet
e037aadc61 [RegBankSelect] Add a method to avoid splitting while repairing.
The previous choice of the insertion points for repairing was
straightfoward but may introduce some basic block or edge splitting. In
some situation this is something we can avoid.
For instance, when repairing a phi argument, instead of placing the
repairing on the related incoming edge, we may move it to the previous
block, before the terminators. This is only possible when the argument
is not defined by one of the terminator.

llvm-svn: 270232
2016-05-20 16:36:12 +00:00
Krzysztof Parzyszek
894bb33fce Correction to r270219: fix detection of invalid frame index
llvm-svn: 270220
2016-05-20 14:34:03 +00:00
Krzysztof Parzyszek
3535601298 Skip entries with invalid indexes in the search loop in register scavenger
llvm-svn: 270219
2016-05-20 14:18:54 +00:00
Diana Picus
2fe376f63f Fix some comment typos in SelectionDAGBuilder. NFC
llvm-svn: 270190
2016-05-20 08:06:31 +00:00
Quentin Colombet
831af7c5fc [RegBankSelect] Refactor the code to split the repairing and mapping of
an instruction.

Use the previously introduced RepairingPlacement class to split the code
computing the repairing placement from the code doing the actual
placement. That way, we will be able to consider different placement and
then, only apply the best one.

llvm-svn: 270168
2016-05-20 00:55:51 +00:00
Quentin Colombet
6648df1d84 [RegBankSelect] Add helper class for repairing code placement.
When assigning the register banks we may have to insert repairing code
to move already assigned values accross register banks.

Introduce a few helper classes to keep track of what is involved in the
repairing of an operand:
- InsertPoint and its derived classes record the positions, in the CFG,
  where repairing has to be inserted.
- RepairingPlacement holds all the insert points for the repairing of an
  operand plus the kind of action that is required to do the repairing.

This is going to be used to keep track of how the repairing should be
done, while comparing different solutions for an instruction. Indeed, we
will need the repairing placement to capture the cost of a solution and
we do not want to compute it a second time when we do the actual
repairing.

llvm-svn: 270167
2016-05-20 00:49:10 +00:00
Quentin Colombet
3b5a443ef3 [RegBankSelect] Refactor assignmentMatch to avoid testing the current
register bank twice.

Prior to this change, we were checking if the assignment for the current
machine operand was matching, then we would check if the mismatch
requires to insert repair code.
We actually already have this information from the first check, so just
pass it along.

NFCI.

llvm-svn: 270166
2016-05-20 00:42:57 +00:00
Rafael Espindola
2844c2359f Fix pr27728.
Sorry for the lack testcase. There is one in the pr, but it depends on
std::sort and the .ll version is 110 lines, so I don't think it is
wort it.

The bug was that we were sorting after adding a terminator, and the
sorting algorithm could end up putting the terminator in the middle of
the List vector.

With that we would create a Spans map entry keyed on nullptr which would
then be added to CUs and fail in that sorting.

llvm-svn: 270165
2016-05-20 00:38:28 +00:00
Quentin Colombet
b67f0beee7 [RegBankSelect] Introduce MappingCost helper class.
This helper class will be used to represent the cost of mapping an
instruction to a specific register bank.
The particularity of these costs is that they are mostly local, thus the
frequency of the basic block is irrelevant. However, for few
instructions (e.g., phis and terminators), the cost may be non-local and
then, we need to account for the frequency of the involved basic blocks.

This will be used by the greedy mode I am working on.

llvm-svn: 270163
2016-05-20 00:35:26 +00:00
Rafael Espindola
83561daf9a clang-format. NFC.
llvm-svn: 270156
2016-05-19 23:17:37 +00:00
Quentin Colombet
970400db38 Reapply r263460: [SpillPlacement] Fix a quadratic behavior in spill placement.
Using Chandler's words from r265331:
This commit was greatly exacerbating PR17409 and effectively regressed
build time for lot of (very large) code when compiled with ASan or MSan.

PR17409 is fixed by r269249, so this is fine to reapply r263460.

Original commit message:
The bad behavior happens when we have a function with a long linear
chain of basic blocks, and have a live range spanning most of this
chain, but with very few uses.

Let say we have only 2 uses.

The Hopfield network is only seeded with two active blocks where the
uses are, and each iteration of the outer loop in
`RAGreedy::growRegion()` only adds two new nodes to the network due to
the completely linear shape of the CFG.  Meanwhile,
`SpillPlacer->iterate()` visits the whole set of discovered nodes, which
adds up to a quadratic algorithm.

This is an historical accident effect from r129188.

When the Hopfield network is expanding, most of the action is happening
on the frontier where new nodes are being added. The internal nodes in
the network are not likely to be flip-flopping much, or they will at
least settle down very quickly. This means that while
`SpillPlacer->iterate()` is recomputing all the nodes in the network, it
is probably only the two frontier nodes that are changing their output.

Instead of recomputing the whole network on each iteration, we can
maintain a SparseSet of nodes that need to be updated:

- `SpillPlacement::activate()` adds the node to the todo list.
- When a node changes value (i.e., `update()` returns true), its
  neighbors are added to the todo list.
- `SpillPlacement::iterate()` only updates the nodes in the list.

The result of Hopfield iterations is not necessarily exact. It should
converge to a local minimum, but there is no guarantee that it will find
a global minimum. It is possible that updating nodes in a different
order will cause us to switch to a different local minimum. In other
words, this is not NFC, but although I saw a few runtime improvements
and regressions when I benchmarked this change, those were side effects
and actually the performance change is in the noise as expected.

Huge thanks to Jakob Stoklund Olesen <stoklund@2pi.dk> for his
feedbacks, guidance and time for the review.

llvm-svn: 270149
2016-05-19 22:40:37 +00:00
Matthew Simpson
f1715d1306 [ARM, AArch64] Match additional patterns to ldN instructions
When matching an interleaved load to an ldN pattern, the interleaved access
pass checks that all users of the load are shuffles. If the load is used by an
instruction other than a shuffle, the pass gives up and an ldN is not
generated. This patch considers users of the load that are extractelement
instructions. It attempts to modify the extracts to use one of the available
shuffles rather than the load. After the transformation, the load is only used
by shuffles and will then be matched with an ldN pattern.

Differential Revision: http://reviews.llvm.org/D20250

llvm-svn: 270142
2016-05-19 21:39:00 +00:00
Adrian McCarthy
f3a3c62ffb Modify emitTypeInformation to use MemoryTypeTableBuilder
A baby step toward translating DIType records to CodeView.

This does not (yet) combine the record length with the record data. I'm going back and forth trying to determine if that's a good idea.

llvm-svn: 270106
2016-05-19 20:12:56 +00:00
Matthew Simpson
84afa7c057 [ARM, AArch64] Properly initialize InterleavedAccessPass
InterleavedAccessPass is an IR-level pass, so this change will enable testing
it with opt. This is part of D20250.

llvm-svn: 270101
2016-05-19 20:08:32 +00:00
Mitch Bodart
c83bd382db CodeGen: Move check of EnablePostRAScheduler to avoid disabling antidependency breaker
Previously, specifying -post-RA-scheduler=true had the side effect of
disabling the antidependency breaker, yielding different behavior than
if the post-RA-scheduler was enabled via the scheduling model.

Differential Revision: http://reviews.llvm.org/D20186

llvm-svn: 270077
2016-05-19 16:40:49 +00:00
Sanjay Patel
0aebfa9d81 [SelectionDAG] rename/move isKnownToBeAPowerOfTwo() from TargetLowering (NFC)
There are at least 2 places (DAGCombiner, X86ISelLowering) where this could be used instead
of ad-hoc and watered down code that is trying to match a power-of-2 pattern.

Differential Revision: http://reviews.llvm.org/D20439

llvm-svn: 270073
2016-05-19 15:53:52 +00:00
Peter Collingbourne
1d69c09bab CodeGen: Make the global-merge pass independently testable, and add a test.
llvm-svn: 270023
2016-05-19 04:38:56 +00:00
Sanjay Patel
8833a2ff67 reduce indentation; NFCI
llvm-svn: 270007
2016-05-19 00:33:07 +00:00
Haicheng Wu
93eb34a12b [MBP] Remove a redundant skipFunction(). NFC.
skipFunction() is called twice.

Differential Revision: http://reviews.llvm.org/D20377

llvm-svn: 269994
2016-05-18 22:34:45 +00:00
Krzysztof Parzyszek
72f4c82ed6 When looking for a spill slot in reg scavenger, find one that matches RC
When looking for an available spill slot, the register scavenger would stop
after finding the first one with no register assigned to it. That slot may
have size and alignment that do not meet the requirements of the register
that is to be spilled. Instead, find an available slot that is the closest
in size and alignment to one that is needed to spill a register from RC.

Differential Revision: http://reviews.llvm.org/D20295

llvm-svn: 269969
2016-05-18 18:16:00 +00:00
Hans Wennborg
5b89989aa5 Re-commit r269828 "X86: Avoid using _chkstk when lowering WIN_ALLOCA instructions"
with an additional fix to make RegAllocFast ignore undef physreg uses. It would
previously get confused about the "push %eax" instruction's use of eax. That
method for adjusting the stack pointer is used in X86FrameLowering::emitSPUpdate
as well, but since that runs after register-allocation, we didn't run into the
RegAllocFast issue before.

llvm-svn: 269949
2016-05-18 16:10:17 +00:00
Zachary Turner
d4ae961cf6 [codeview] Some cleanup of Symbol Records.
* Reworks the CVSymbolTypes.def to work similarly to TypeRecords.def.
* Moves some enums from SymbolRecords.h to CodeView.h to maintain
  consistency with how we do type records.
* Generalize a few simple things like the record prefix
* Define the leaf enum and the kind enum similar to how we do with tyep
  records.

Differential Revision: http://reviews.llvm.org/D20342
Reviewed By: amccarth, rnk

llvm-svn: 269867
2016-05-17 23:50:21 +00:00