1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00
Commit Graph

19470 Commits

Author SHA1 Message Date
James Y Knight
1bc277004d Make the SelectionDAG graph printer use SDNode::PersistentId labels.
r248010 changed the -debug output to use short ids, but did not
similarly modify the graph printer. Change to be consistent, for ease of
cross-reference.

llvm-svn: 251465
2015-10-27 23:09:03 +00:00
Sanjay Patel
5399c1d335 Use the 'arcp' fast-math-flag when combining repeated FP divisors
This is a usage of the IR-level fast-math-flags now that they are propagated to SDNodes. 
This was originally part of D8900.

Removing the global 'enable-unsafe-fp-math' checks will require auto-upgrade and 
possibly other changes.

Differential Revision: http://reviews.llvm.org/D9708

llvm-svn: 251450
2015-10-27 20:27:25 +00:00
Cong Hou
7f73476be8 Create a new interface addSuccessorWithoutWeight(MBB*) in MBB to add successors when optimization is disabled.
When optimization is disabled, edge weights that are stored in MBB won't be used so that we don't have to store them. Currently, this is done by adding successors with default weight 0, and if all successors have default weights, the weight list will be empty. But that the weight list is empty doesn't mean disabled optimization (as is stated several times in MachineBasicBlock.cpp): it may also mean all successors just have default weights.

We should discourage using default weights when adding successors, because it is very easy for users to forget update the correct edge weights instead of using default ones (one exception is that the MBB only has one successor). In order to detect such usages, it is better to differentiate using default weights from the case when optimizations is disabled.

In this patch, a new interface addSuccessorWithoutWeight(MBB*) is created for when optimization is disabled. In this case, MBB will try to maintain an empty weight list, but it cannot guarantee this as for many uses of addSuccessor() whether optimization is disabled or not is not checked. But it can guarantee that if optimization is enabled, then the weight list always has the same size of the successor list.

Differential revision: http://reviews.llvm.org/D13963

llvm-svn: 251429
2015-10-27 17:59:36 +00:00
Mehdi Amini
5eb276336b Do not use "else" when both branches return (NFC)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 251398
2015-10-27 08:12:08 +00:00
Steve King
b9e2588ce5 Fix llc crash processing S/UREM for -Oz builds caused by rL250825.
When taking the remainder of a value divided by a constant, visitREM()
attempts to convert the REM to a longer but faster sequence of instructions.
This conversion calls combine() on a speculative DIV instruction. Commit
rL250825 may cause this combine() to return a DIVREM, corrupting nearby nodes.
Flow eventually hits unreachable().

This patch adds a test case and a check to prevent visitREM() from trying
to convert the REM instruction in cases where a DIVREM is possible.
See http://reviews.llvm.org/D14035

llvm-svn: 251373
2015-10-27 00:14:06 +00:00
Ivan Krasin
7169afcbee Fix indents. It's a follow up to r251353.
llvm-svn: 251364
2015-10-26 22:35:40 +00:00
Ivan Krasin
01683c7544 Move imported entities into DwarfCompilationUnit to speed up LTO linking.
Summary:
In particular, this CL speeds up the official Chrome linking with LTO by
1.8x.

See more details in https://crbug.com/542426

Reviewers: dblaikie

Subscribers: jevinskie

Differential Revision: http://reviews.llvm.org/D13918

llvm-svn: 251353
2015-10-26 21:36:35 +00:00
David Blaikie
a667b6ee3c Remove assert(false) in favor of asserting the if conditional it is contained within.
Also adjust the code to avoid 3 redundant map lookups.

llvm-svn: 251327
2015-10-26 18:41:13 +00:00
Evgeniy Stepanov
6d86bc6f79 [safestack] Fast access to the unsafe stack pointer on AArch64/Android.
Android libc provides a fixed TLS slot for the unsafe stack pointer,
and this change implements direct access to that slot on AArch64 via
__builtin_thread_pointer() + offset.

This change also moves more code into TargetLowering and its
target-specific subclasses to get rid of target-specific codegen
in SafeStackPass.

This change does not touch the ARM backend because ARM lowers
builting_thread_pointer as aeabi_read_tp, which is not available
on Android.

The previous iteration of this change was reverted in r250461. This
version leaves the generic, compiler-rt based implementation in
SafeStack.cpp instead of moving it to TargetLoweringBase in order to
allow testing without a TargetMachine.

llvm-svn: 251324
2015-10-26 18:28:25 +00:00
Elena Demikhovsky
b4a97e3e11 Scalarizer for masked.gather and masked.scatter intrinsics.
When the target does not support these intrinsics they should be converted to a chain of scalar load or store operations.
If the mask is not constant, the scalarizer will build a chain of conditional basic blocks.
I added isLegalMaskedGather() isLegalMaskedScatter() APIs.

Differential Revision: http://reviews.llvm.org/D13722

llvm-svn: 251237
2015-10-25 15:37:55 +00:00
Michael Kuperstein
b8a24f50b1 [X86] Use correct calling convention for MCU psABI libcalls
When using the MCU psABI, compiler-generated library calls should pass
some parameters in-register. However, since inreg marking for x86 is currently
done by the front end, it will not be applied to backend-generated calls.

This is a workaround for PR3997, which describes a similar issue for -mregparm.

Differential Revision: http://reviews.llvm.org/D13977

llvm-svn: 251223
2015-10-25 08:14:05 +00:00
Rafael Espindola
4f88a5b854 Refactor: Simplify boolean conditional return statements in lib/CodeGen.
Patch by Richard.

llvm-svn: 251213
2015-10-24 23:11:13 +00:00
Simon Pilgrim
08ba5466b2 [DAGCombiner] Tidy up ConstantFP commutation. NFCI
Move ConstantFP canonicalization of commutative instructions to start of 2-op node creation (matches integer) - simplifies constant folding code.

llvm-svn: 251203
2015-10-24 20:06:18 +00:00
Simon Pilgrim
bcf7364290 [DAGCombiner] Generalize masking of constant rotates.
We don't need a mask of a rotation result to be a constant splat - any constant scalar/vector can be usefully folded.

Followup to D13851.

llvm-svn: 251197
2015-10-24 18:44:52 +00:00
Simon Pilgrim
2da7bcfef6 [X86][XOP] Add support for lowering vector rotations
This patch adds support for lowering to the XOP VPROT / VPROTI vector bit rotation instructions.

This has required changes to the DAGCombiner rotation pattern matching to support vector types - so far I've only changed it to support splat vectors, but generalising this further is feasible in the future.

Differential Revision: http://reviews.llvm.org/D13851

llvm-svn: 251188
2015-10-24 13:17:26 +00:00
Joseph Tremoulet
d888298118 [CodeGen] Mark setjmp/catchret MBBs address-taken
Summary:
This ensures that BranchFolding (and similar) won't remove these blocks.

Also allow AsmPrinter::EmitBasicBlockStart to process MBBs which are
address-taken but do not have BBs that are address-taken, since otherwise
its call to getAddrLabelSymbolTableToEmit would fail an assertion on such
blocks.  I audited the other callers of getAddrLabelSymbolTableToEmit
(and getAddrLabelSymbol); they all have BBs known to be address-taken
except for the call through getAddrLabelSymbol from
WinException::create32bitRef; that call is actually now unreachable, so
I've removed it and updated the signature of create32bitRef.

This fixes PR25168.

Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: pgavlin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13774

llvm-svn: 251113
2015-10-23 15:06:05 +00:00
Davide Italiano
12490b7b2c [CodeGen] Remove usage of NDEBUG in header.
Moreover, this seems unused.

llvm-svn: 251081
2015-10-23 00:17:40 +00:00
Matthias Braun
b42deafbca MachineScheduler: Add a way to disable the 'ReduceLatency' heuristic
llvm-svn: 251037
2015-10-22 18:07:31 +00:00
Craig Topper
853fdeb009 Change makeLibCall to take an ArrayRef<SDValue> instead of pointer and size. This removes the need to pass a hardcoded size in many places. NFC
llvm-svn: 251032
2015-10-22 17:05:00 +00:00
Zia Ansari
4fd5110a5a [X86] - Catch extra combine opportunities for redundant imuls.
When we fold "mul ((add x, c1), c1)" -> "add ((mul x, c2), c1*c2)", we bail if (add x, c1) has multiple
users which would result in an extra add instruction.
In such cases, this patch adds a check to see if we can eliminate a multiply instruction in exchange for the extra add.

I also added the capability of doing the existing optimization with non-splatted vectors (splatted also works).

Differential Revision: http://reviews.llvm.org/D13740

llvm-svn: 251028
2015-10-22 16:14:45 +00:00
David Majnemer
0ac35c5845 [WinEH] Remove extraneous call to emitEHRegistrationOffsetLabel
It's a relic from the earlier implementation, let's remove it.

llvm-svn: 250964
2015-10-21 23:20:39 +00:00
Matt Arsenault
c838a10829 LegalizeDAG: Implement promote for build_vector
This will be used in future commits for AMDGPU to promote
operations on i64 vectors into operations on 32-bit vector
components.

This will be used / tested in future AMDGPU commits.

llvm-svn: 250945
2015-10-21 21:10:10 +00:00
Elena Demikhovsky
5392c128b3 Masked Load/Store optimization for scalar code
When we have to convert the masked.load, masked.store to scalar code, we generate a chain of conditional basic blocks.
I added optimization for constant mask vector.

Differential Revision: http://reviews.llvm.org/D13855

llvm-svn: 250893
2015-10-21 11:50:54 +00:00
Jonas Paulsson
e7a99295db Let MachineVerifier be aware of mem-to-mem instructions.
A mem-to-mem instruction (that both loads and stores), which store to an
FI, cannot pass the verifier since it thinks it is loading from the FI.

For the mem-to-mem instruction, do a looser check in visitMachineOperand()
and only check liveness at the reg-slot while analyzing a frame index operand.

Needed to make CodeGen/SystemZ/xor-01.ll pass with -verify-machineinstrs,
which now runs with this flag.

Reviewed by Evan Cheng and Quentin Colombet.

llvm-svn: 250885
2015-10-21 07:39:47 +00:00
Krzysztof Parzyszek
2460f90c19 Tail duplication can mix incompatible registers in phi nodes
Do not tail duplicate blocks where the successor has a phi node,
and the corresponding value in that phi node uses a subregister.

http://reviews.llvm.org/D13922

llvm-svn: 250877
2015-10-21 02:40:06 +00:00
Artyom Skrobov
c99c692d8e Two switch blocks in VectorLegalizer::LegalizeOp already have a
default: llvm_unreachable("This action is not supported yet!");

-- so I'm adding one to the third switch block, too.

This is a follow-up fix for http://reviews.llvm.org/D13862

llvm-svn: 250830
2015-10-20 15:06:37 +00:00
Artyom Skrobov
d5f3afc063 Adding support for TargetLoweringBase::LibCall
Summary:
TargetLoweringBase::Expand is defined as "Try to expand this to other ops,
otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between
the two possibilities was defined in a rather convoluted way:

- if DIVREM is legal, expand to DIVREM
- if DIVREM has a custom lowering, expand to DIVREM
- if DIVREM libcall is defined and a remainder from the same division is
  computed elsewhere, expand to a DIVREM libcall
- else, expand to a DIV libcall

This had the undesirable effect that if both DIV and DIVREM are implemented
as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM
libcall, even when the remainder isn't used.

The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that
backends can directly control whether they prefer an expansion or a conversion
to a libcall. This makes the generic lowering code even more generic,
allowing its reuse in a wider range of target-specific configurations.

The useful effect is that ARM backend will now generate a call
to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where
it doesn't need the remainder. There's no functional change outside
the ARM backend.

Reviewers: t.p.northover, rengolin

Subscribers: t.p.northover, llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D13862

llvm-svn: 250826
2015-10-20 13:14:52 +00:00
Artyom Skrobov
54a462ae0f Combining DIV+REM->DIVREM doesn't belong in LegalizeDAG; move it over into DAGCombiner.
Summary:
In addition to moving the code over, this patch amends the DIV,REM -> DIVREM
combining to run on all affected nodes at once: if the nodes are converted
to DIVREM one at a time, then the resulting DIVREM may get legalized by the
backend into something target-specific that we won't be able to recognize
and correlate with the remaining nodes.

The motivation is to "prepare terrain" for D13862: when we set DIV and REM
to be legalized to libcalls, instead of the DIVREM, we otherwise lose the
ability to combine them together. To prevent this, we need to take the
DIV,REM -> DIVREM combining out of the lowering stage.

Reviewers: RKSimon, eli.friedman, rengolin

Subscribers: john.brawn, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13733

llvm-svn: 250825
2015-10-20 13:06:02 +00:00
Duncan P. N. Exon Smith
d72f83fbed AsmPrinter: Remove implicit ilist iterator conversion, NFC
llvm-svn: 250776
2015-10-20 00:36:08 +00:00
Cong Hou
96c5c49122 Enhance loop rotation with existence of profile data in MachineBlockPlacement pass.
Currently, in MachineBlockPlacement pass the loop is rotated to let the best exit to be the last BB in the loop chain, to maximize the fall-through from the loop to outside. With profile data, we can determine the cost in terms of missed fall through opportunities when rotating a loop chain and select the best rotation. Basically, there are three kinds of cost to consider for each rotation:

1. The possibly missed fall through edge (if it exists) from BB out of the loop to the loop header.
2. The possibly missed fall through edges (if they exist) from the loop exits to BB out of the loop.
3. The missed fall through edge (if it exists) from the last BB to the first BB in the loop chain.

Therefore, the cost for a given rotation is the sum of costs listed above. We select the best rotation with the smallest cost. This is only for PGO mode when we have more precise edge frequencies.

Differential revision: http://reviews.llvm.org/D10717

llvm-svn: 250754
2015-10-19 23:16:40 +00:00
Sanjay Patel
659d8f6866 [CGP] transform select instructions into branches and sink expensive operands
This was originally checked in at r250527, but reverted at r250570 because of PR25222.
There were at least 2 problems: 
1. The cost check was checking for an instruction with an exact cost of TCC_Expensive;
that should have been >=.
2. The cause of the clang stage 1 failures was illegally sinking 'call' instructions;
we can't sink instructions that may have side effects / are not safe to execute speculatively.

Fixed those conditions in sinkSelectOperand() and added test cases.

Original commit message:
This is a follow-up to the discussion in D12882.

Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands
are expensive (as defined by the TTI cost model) because that may expose further optimizations.
However, we would then like a later pass like CodeGenPrepare to undo that transformation if the
target would likely benefit from not speculatively executing an expensive op (this patch).

Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its
select-formation behavior that changed with r248439.

Differential Revision: http://reviews.llvm.org/D13297

llvm-svn: 250743
2015-10-19 21:59:12 +00:00
Owen Anderson
c398b9a4d4 Restore the original behavior of SelectionDAG::getTargetIndex().
It looks like an extra negation snuck in as apart of restoring it.

llvm-svn: 250726
2015-10-19 19:27:40 +00:00
Benjamin Kramer
2339b6e6ad Put back SelectionDAG::getTargetIndex.
While technically this is untested dead code, it has out-of-tree users.
This reverts a part of r250434.

llvm-svn: 250717
2015-10-19 18:26:16 +00:00
Matthias Braun
c2e8632c14 Revert "RegisterPressure: allocatable physreg uses are always kills"
This reverts commit r250596.

Reverted for now as the commit triggers assert in the AMDGPU target
pending investigation.

llvm-svn: 250713
2015-10-19 17:44:22 +00:00
Elena Demikhovsky
2e0208e770 Removed parameter "Consecutive" from isLegalMaskedLoad() / isLegalMaskedStore().
Originally I planned to use the same interface for masked gather/scatter and set isConsecutive to "false" in this case.

Now I'm implementing masked gather/scatter and see that the interface is inconvenient. I want to add interfaces isLegalMaskedGather() / isLegalMaskedScatter() instead of using the "Consecutive" parameter in the existing interfaces.

Differential Revision: http://reviews.llvm.org/D13850

llvm-svn: 250686
2015-10-19 07:43:38 +00:00
Simon Pilgrim
023d75d398 Use SDValue bool check. NFCI.
llvm-svn: 250653
2015-10-18 12:33:54 +00:00
Simon Pilgrim
9766a65452 Move one-use variable inside test. NFC.
llvm-svn: 250651
2015-10-18 11:47:23 +00:00
Simon Pilgrim
05edb4cc0d [DAG] Ensure vector constant folding uses correct scalar undef types
Minor fix to D13665 found during post-commit review.

llvm-svn: 250616
2015-10-17 16:49:43 +00:00
Matthias Braun
3b9509b65c RegisterPressure: Unify the sparse sets in LiveRegsSet; NFC
Also do some cleanups comment improvements.

llvm-svn: 250598
2015-10-17 01:03:44 +00:00
Matthias Braun
894abcd9d3 RegisterPressure: allocatable physreg uses are always kills
This property was already used in the code path when no liveness
intervals are present. Unfortunately the code path that uses liveness
intervals tried to query a cached live interval for an allocatable
physreg, those are usually not computed so a conservative default was
used.

This doesn't affect any of the lit testcases. This is a foreclosure to
upcoming changes which should be NFC but without this patch this tidbit
wouldn't be NFC.

llvm-svn: 250596
2015-10-17 00:46:57 +00:00
Matthias Braun
24b9e3b760 RegisterPressure: Remove 0 entries from PressureChange
This should not change behaviour because as far as I can see all code
reading the pressure changes has no effect if the PressureInc is 0.
Removing these entries however does avoid unnecessary computation, and
results in a more stable debug output. I want the stable debug output to
check that some upcoming changes are indeed NFC and identical even at
the debug output level.

llvm-svn: 250595
2015-10-17 00:35:59 +00:00
Matthias Braun
cfaf06e08d RegisterPressure: Hide non-const iterators of PressureDiff
It is too easy to accidentally violate the ordering requirements when
modifying the PressureDiff entries through iterators.

llvm-svn: 250590
2015-10-17 00:08:48 +00:00
Joseph Tremoulet
f8fd84612f [WinEH] Fix eh.exceptionpointer intrinsic lowering
Summary:
Some shared code for handling eh.exceptionpointer and eh.exceptioncode
needs to not share the part that truncates to 32 bits, which is intended
just for exception codes.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13747

llvm-svn: 250588
2015-10-17 00:08:08 +00:00
Reid Kleckner
a26f7de3d1 [WinEH] Fix stack alignment in funclets and ParentFrameOffset calculation
Our previous value of "16 + 8 + MaxCallFrameSize" for ParentFrameOffset
is incorrect when CSRs are involved. We were supposed to have a test
case to catch this, but it wasn't very rigorous.

The main effect here is that calling _CxxThrowException inside a
catchpad doesn't immediately crash on MOVAPS when you have an odd number
of CSRs.

llvm-svn: 250583
2015-10-16 23:43:27 +00:00
Matthias Braun
77ae40c1ea RegisterPressure: Use range based for, cleanup
llvm-svn: 250579
2015-10-16 23:25:09 +00:00
Benjamin Kramer
4393ca2076 Revert "This is a follow-up to the discussion in D12882."
Breaks clang selfhost, see PR25222. This reverts commits r250527 and r250528.

llvm-svn: 250570
2015-10-16 23:00:29 +00:00
Joseph Tremoulet
a6fa6510d0 [WinEH] Fix CatchRetSuccessorColorMap accounting
Summary:
We now use the block for the catchpad itself, rather than its normal
successor, as the funclet entry.
Putting the normal successor in the map leads downstream funclet
membership computations to erroneous results.

Reviewers: majnemer, rnk

Subscribers: rnk, llvm-commits

Differential Revision: http://reviews.llvm.org/D13798

llvm-svn: 250552
2015-10-16 21:22:54 +00:00
David Majnemer
f195e5ed7d [WinEH] Remove dead code/includes from WinEHPrepare
No functionality change is intended.

llvm-svn: 250545
2015-10-16 19:59:52 +00:00
Joseph Tremoulet
16f1fe4cea [WinEH] Fix endpad coloring/numbering
Summary:
When a cleanup's cleanupendpad or cleanupret targets a catchendpad, stop
trying to propagate the cleanup's parent's color to the catchendpad, since
what's needed is the cleanup's grandparent's color and the catchendpad
will get that color from the catchpad linkage already.  We already had
this exclusion for invokes, but were missing it for
cleanupendpad/cleanupret.

Also add a missing line that tags cleanupendpads' states in the
EHPadStateMap, without with lowering invokes that target cleanupendpads
which unwind to other handlers (and so don't have the -1 state) will fail.

This fixes the reduced IR repro in PR25163.


Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13797

llvm-svn: 250534
2015-10-16 18:08:16 +00:00
Sanjay Patel
060480ec5a This is a follow-up to the discussion in D12882.
Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands
are expensive (as defined by the TTI cost model) because that may expose further optimizations. 
However, we would then like a later pass like CodeGenPrepare to undo that transformation if the
target would likely benefit from not speculatively executing an expensive op (this patch).

Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its 
select-formation behavior that changed with r248439.

Differential Revision: http://reviews.llvm.org/D13297

llvm-svn: 250527
2015-10-16 16:54:30 +00:00