1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
Commit Graph

3614 Commits

Author SHA1 Message Date
Volkan Keles
4b7dec5ac6 Add a TargetOption to enable/disable GlobalISel
Summary:
This patch adds a new target option in order to control GlobalISel.
This will allow the users to enable/disable GlobalISel prior to the
backend by calling `TargetMachine::setGlobalISel(bool Enable)`.

No test case as there is already a test to check GlobalISel
command line options.
See: CodeGen/AArch64/GlobalISel/gisel-commandline-option.ll.

Reviewers: qcolombet, aemerson, ab, dsanders

Reviewed By: qcolombet

Subscribers: rovka, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42137

llvm-svn: 322773
2018-01-17 22:34:21 +00:00
Volkan Keles
eea46f246c [GlobalISel][TableGen] Add support for SDNodeXForm
Summary:
This patch adds CustomRenderer which renders the matched
operands to the specified instruction.

Targets can enable the matching of SDNodeXForm by adding
a definition that inherits from GICustomOperandRenderer and
GISDNodeXFormEquiv as follows.

def gi_imm8 : GICustomOperandRenderer<"renderImm8”>,
                       GISDNodeXFormEquiv<imm8_xform>;

Custom renderer functions should be of the form:
void render(MachineInstrBuilder &MIB, const MachineInstr &I);

Reviewers: dsanders, ab, rovka

Reviewed By: dsanders

Subscribers: kristof.beyls, javed.absar, llvm-commits, mgrang, qcolombet

Differential Revision: https://reviews.llvm.org/D42012

llvm-svn: 322582
2018-01-16 18:44:05 +00:00
Sanjoy Das
df40ece177 (Re-landing) Expose a TargetMachine::getTargetTransformInfo function
Re-land r321234.  It had to be reverted because it broke the shared
library build.  The shared library build broke because there was a
missing LLVMBuild dependency from lib/Passes (which calls
TargetMachine::getTargetIRAnalysis) to lib/Target.  As far as I can
tell, this problem was always there but was somehow masked
before (perhaps because TargetMachine::getTargetIRAnalysis was a
virtual function).

Original commit message:

This makes the TargetMachine interface a bit simpler.  We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.

See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html

I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.

Reviewers: echristo, MatzeB, hfinkel

Reviewed By: hfinkel

Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D41464

llvm-svn: 321375
2017-12-22 18:21:59 +00:00
Sanjoy Das
259bcf37bc Revert "Expose a TargetMachine::getTargetTransformInfo function"
This reverts commit r321234.  It breaks the -DBUILD_SHARED_LIBS=ON build.

llvm-svn: 321243
2017-12-21 02:34:39 +00:00
Sanjoy Das
90359bc4d6 Expose a TargetMachine::getTargetTransformInfo function
Summary:
This makes the TargetMachine interface a bit simpler.  We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.

See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html

I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.

Reviewers: echristo, MatzeB, hfinkel

Reviewed By: hfinkel

Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D41464

llvm-svn: 321234
2017-12-21 01:06:58 +00:00
Matt Arsenault
422c27e8aa TableGen: Allow setting SDNodeProperties on intrinsics
Allows preserving MachineMemOperands on intrinsics
through selection. For reasons I don't understand, this
is a static property of the pattern and the selector
deliberately goes out of its way to drop if not present.

Intrinsics already inherit from SDPatternOperator allowing
them to be used directly in instruction patterns. SDPatternOperator
has a list of SDNodeProperty, but you currently can't set them on
the intrinsic. Without SDNPMemOperand, when the node is selected
any memory operands are always dropped. Allowing setting this
on the intrinsics avoids needing to introduce another equivalent
target node just to have SDNPMemOperand set.

llvm-svn: 321212
2017-12-20 19:36:28 +00:00
Alex Bradbury
5a992b5fc4 [TableGen] Give the option of tolerating duplicate register names
A number of architectures re-use the same register names (e.g. for both 32-bit 
FPRs and 64-bit FPRs). They are currently unable to use the tablegen'erated 
MatchRegisterName and MatchRegisterAltName, as tablegen (when built with 
asserts enabled) will fail.

When the AllowDuplicateRegisterNames in AsmParser is set, duplicated register 
names will be tolerated. A backend can then coerce registers to the desired 
register class by (for instance) implementing validateTargetOperandClass.

At least the in-tree Sparc backend could benefit from this, as does RISC-V 
(single and double precision floating point registers).

Differential Revision: https://reviews.llvm.org/D39845

llvm-svn: 320018
2017-12-07 09:51:55 +00:00
Daniel Sanders
f0a9960826 Revert r319691: [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
Some concerns were raised with the direction. Revert while we discuss it and look into an alternative

llvm-svn: 319739
2017-12-05 05:52:07 +00:00
Daniel Sanders
37451b86e7 Allow similar TargetOpcodes to use inheritance to factor out commonality. NFC.
Summary:
While implementing atomicrmw in https://reviews.llvm.org/D40092 I found that
inheritance is unusable for all the Generic Opcodes in GlobalISel. This is
because the whole header is included inside a 'let mayLoad = 0, mayStore = 0 ... in'
block. In TableGen, the order of precedence for field assignments is:
  1. Values from classes the record inherits from.
  2. Values from 'let Name=Value in { ... }'
  3. Values from 'let Name=Value;'
As such the 'let mayLoad = 0, mayStore = 0, ... in' surrounding the
'include "GenericOpcodes.td"' was overriding any values provided via inheritance.
We hadn't noticed this before because we were only using 'let Name=Value;' to
specialize opcodes.

Fix this by moving the default values to the lowest precedence. This is
accomplished by moving the values to a common base class
(StandardPseudoInstruction for most TargetOpcodes, and GenericOpcode for
GlobalISel specific TargetOpcodes)

Reviewers: qcolombet

Reviewed By: qcolombet

Subscribers: llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D40096

llvm-svn: 319701
2017-12-04 21:40:57 +00:00
Daniel Sanders
2a3d1acd34 [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
This patch splits atomics out of the generic G_LOAD/G_STORE and into their own
G_ATOMIC_LOAD/G_ATOMIC_STORE. This is a pragmatic decision rather than a
necessary one. Atomic load/store has little in implementation in common with
non-atomic load/store. They tend to be handled very differently throughout the
backend. It also has the nice side-effect of slightly improving the common-case
performance at ISel since there's no longer a need for an atomicity check in the
matcher table.

All targets have been updated to remove the atomic load/store check from the
G_LOAD/G_STORE path. AArch64 has also been updated to mark
G_ATOMIC_LOAD/G_ATOMIC_STORE legal.

There is one issue with this patch though which also affects the extending loads
and truncating stores. The rules only match when an appropriate G_ANYEXT is
present in the MIR. For example,
  (G_ATOMIC_STORE (G_TRUNC:s16 (G_ANYEXT:s32 (G_ATOMIC_LOAD:s16 X))))
will match but:
  (G_ATOMIC_STORE (G_ATOMIC_LOAD:s16 X))
will not. This shouldn't be a problem at the moment, but as we get better at
eliminating extends/truncates we'll likely start failing to match in some
cases. The current plan is to fix this in a patch that changes the
representation of extending-load/truncating-store to allow the MMO to describe
a different type to the operation.

llvm-svn: 319691
2017-12-04 20:39:32 +00:00
Volkan Keles
c660180f94 GlobalISel: Enable the legalization of G_MERGE_VALUES and G_UNMERGE_VALUES
Summary: LegalizerInfo assumes all G_MERGE_VALUES and G_UNMERGE_VALUES instructions are legal, so it is not possible to legalize vector operations on illegal vector types. This patch fixes the problem by removing the related check and adding default actions for G_MERGE_VALUES and G_UNMERGE_VALUES.

Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, t.p.northover, kristof.beyls

Reviewed By: dsanders

Subscribers: rovka, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39823

llvm-svn: 319524
2017-12-01 08:19:10 +00:00
Daniel Sanders
f43958d13f [globalisel][tablegen] Add support for relative AtomicOrderings
No test yet because the relevant rules are blocked on the atomic_load,
and atomic_store nodes.

llvm-svn: 319475
2017-11-30 21:05:59 +00:00
Daniel Sanders
fe50afca94 [aarch64][globalisel] Legalize G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMICRMW_*
G_ATOMICRMW_* is generally legal on AArch64. The exception is G_ATOMICRMW_NAND.

G_ATOMIC_CMPXCHG_WITH_SUCCESS needs to be lowered to G_ATOMIC_CMPXCHG with an
external comparison.

Note that IRTranslator doesn't generate these instructions yet.

llvm-svn: 319466
2017-11-30 20:11:42 +00:00
Sean Eveson
d3fdef109a [MC] Function stack size section.
Re applying after fixing issues in the diff, sorry for any painful conflicts/merges!

Original RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-August/117028.html

This change adds a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. The section contains pairs of function symbol references (8 byte) and stack sizes (unsigned LEB128).

The contents of this section can be used to measure changes to stack sizes between different versions of the compiler or a source base. The advantage of having a section is that we can extract this information when examining binaries that we didn't build, and it allows users and tools easy access to that information just by referencing the binary.

There is a follow up change to add an option to clang.

Thanks.

Reviewers: hfinkel, MatzeB

Reviewed By: MatzeB

Subscribers: thegameg, asb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39788

llvm-svn: 319430
2017-11-30 13:05:14 +00:00
Sean Eveson
b9a62958c9 Revert r319423: [MC] Function stack size section.
I messed up the diff.

llvm-svn: 319429
2017-11-30 12:43:25 +00:00
Sean Eveson
4b5d214bc5 [MC] Function stack size section.
Summary:
Original RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-August/117028.html

I wasn't sure who to put as reviewers, so please add/remove people as appropriate.

This change adds a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. The section contains pairs of function symbol references (8 byte) and stack sizes (unsigned LEB128).

The contents of this section can be used to measure changes to stack sizes between different versions of the compiler or a source base. The advantage of having a section is that we can extract this information when examining binaries that we didn't build, and it allows users and tools easy access to that information just by referencing the binary.

There is a follow up change to add an option to clang.

Thanks.

Reviewers: hfinkel, MatzeB

Reviewed By: MatzeB

Subscribers: thegameg, asb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39788

llvm-svn: 319423
2017-11-30 12:01:16 +00:00
Daniel Sanders
ce08391be8 [globalisel][tablegen] Add support for importing G_ATOMIC_CMPXCHG, G_ATOMICRMW_* rules from SelectionDAG.
GIM_CheckNonAtomic has been replaced by GIM_CheckAtomicOrdering to allow it to support a wider
range of orderings. This has then been used to import patterns using nodes such
as atomic_cmp_swap, atomic_swap, and atomic_load_*.

llvm-svn: 319232
2017-11-28 22:07:05 +00:00
Daniel Sanders
cd1fb2fd55 [aarch64][globalisel] Define G_ATOMIC_CMPXCHG and G_ATOMICRMW_* and make them legal
The IRTranslator cannot generate these instructions at the moment so there's no
issue with not having implemented ISel for them yet. D40092 will add
G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMICRMW_* to the IRTranslator and a
further patch will add support for lowering G_ATOMIC_CMPXCHG_WITH_SUCCESS into
G_ATOMIC_CMPXCHG with an external success check via the `Lower` action.

The separation of G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMIC_CMPXCHG is
to import SelectionDAG rules while still supporting targets that prefer to
custom lower the original LLVM-IR-like operation.

llvm-svn: 319216
2017-11-28 20:21:15 +00:00
Daniel Sanders
3676c002af [globalisel][tablegen] Generalize pointer-type inference by introducing ptypeN. NFC
ptypeN is functionally the same as typeN except that it informs the
SelectionDAG importer that an operand should be treated as a pointer even
if it was written as iN. This is important for patterns that use iN instead
of iPTR to represent pointers. E.g.:
  (set GPR64:$dst, (load GPR64:$addr))

Previously, this was handled as a hardcoded special case for the appropriate
operands to G_LOAD and G_STORE.

llvm-svn: 318574
2017-11-18 00:16:44 +00:00
David Blaikie
e01dc73ad2 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
Alex Bradbury
8f39d57648 Set hasSideEffects=0 for TargetOpcode::{CFI_INSTRUCTION,EH_LABEL,GC_LABEL,ANNOTATION_LABEL}
D37065 (committed as rL317674) explicitly set hasSideEffects for all 
TargetOpcode::* instructions where it was inferred previously. This is a 
follow-up to that patch, setting hasSideEffects=0 for CFI_INSTRUCTION, 
EH_LABEL, GC_LABEL and ANNOTATION_LABEL. All LLVM tests pass after this 
change.

This patch also modifies MachineInstr::isLabel returns true for a 
TargetOpcode::ANNOTATION_LABEL, which ensures that an annotation label won't 
be incorrectly considered safe to move.

Differential Revision: https://reviews.llvm.org/D39941

llvm-svn: 318174
2017-11-14 19:16:08 +00:00
Daniel Sanders
052a3d35ec [tablegen] Handle atomic predicates for ordering inside tablegen. NFC.
Similar to r315841, GlobalISel and SelectionDAG require different code for the
common atomic predicates due to differences in the representation.
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.

This patch moves the implementation of the common atomic predicates related to
ordering into tablegen so that it can handle these differences.

It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.

llvm-svn: 318102
2017-11-13 23:03:47 +00:00
Daniel Sanders
5d4ac8cff2 [tablegen] Handle atomic predicates for memory type inside tablegen. NFC.
Similar to r315841, GlobalISel and SelectionDAG require different code for the
common atomic predicates due to differences in the representation.
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.

This patch moves the implementation of the common atomic predicates related to
memory type into tablegen so that it can handle these differences.

It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.

llvm-svn: 318095
2017-11-13 22:26:13 +00:00
Jonas Paulsson
1b514619dd [RegAlloc, SystemZ] Increase number of LOCRs by passing "hard" regalloc hints.
* The method getRegAllocationHints() is now of bool type instead of void. If
true is returned, regalloc (AllocationOrder) will *only* try to allocate the
hints, as opposed to merely trying them before non-hinted registers.

* TargetRegisterInfo::getRegAllocationHints() is implemented for SystemZ with
an increase in number of LOCRs.

In this case, it is desired to force the hints even though there is a slight
increase in spilling, because if a non-hinted register would be allocated,
the LOCRMux pseudo would have to be expanded with a jump sequence. The LOCR
(Load On Condition) SystemZ instruction must have both operands in either the
low or high part of the 64 bit register.

Reviewers: Quentin Colombet and Ulrich Weigand
https://reviews.llvm.org/D36795

llvm-svn: 317879
2017-11-10 08:46:26 +00:00
Reid Kleckner
c37cd8d78d Revert "Correct dwarf unwind information in function epilogue for X86"
This reverts r317579, originally committed as r317100.

There is a design issue with marking CFI instructions duplicatable. Not
all targets support the CFIInstrInserter pass, and targets like Darwin
can't cope with duplicated prologue setup CFI instructions. The compact
unwind info emission fails.

When the following code is compiled for arm64 on Mac at -O3, the CFI
instructions end up getting tail duplicated, which causes compact unwind
info emission to fail:
  int a, c, d, e, f, g, h, i, j, k, l, m;
  void n(int o, int *b) {
    if (g)
      f = 0;
    for (; f < o; f++) {
      m = a;
      if (l > j * k > i)
        j = i = k = d;
      h = b[c] - e;
    }
  }

We get assembly that looks like this:
; BB#1:                                 ; %if.then
Lloh3:
	adrp	x9, _f@GOTPAGE
Lloh4:
	ldr	x9, [x9, _f@GOTPAGEOFF]
	mov	 w8, wzr
Lloh5:
	str		wzr, [x9]
	stp	x20, x19, [sp, #-16]!   ; 8-byte Folded Spill
	.cfi_def_cfa_offset 16
	.cfi_offset w19, -8
	.cfi_offset w20, -16
	cmp		w8, w0
	b.lt	LBB0_3
	b	LBB0_7
LBB0_2:                                 ; %entry.if.end_crit_edge
Lloh6:
	adrp	x8, _f@GOTPAGE
Lloh7:
	ldr	x8, [x8, _f@GOTPAGEOFF]
Lloh8:
	ldr		w8, [x8]
	stp	x20, x19, [sp, #-16]!   ; 8-byte Folded Spill
	.cfi_def_cfa_offset 16
	.cfi_offset w19, -8
	.cfi_offset w20, -16
	cmp		w8, w0
	b.ge	LBB0_7
LBB0_3:                                 ; %for.body.lr.ph

Note the multiple .cfi_def* directives. Compact unwind info emission
can't handle that.

llvm-svn: 317726
2017-11-08 21:31:14 +00:00
Alex Bradbury
1e5ce63ec5 Set hasSideEffects=0 for PHI and fix affected passes
Previously, hasSideEffects was ? for TargetOpcode::PHI and would be inferred 
as 1. D37065 sets the previously inferred properties explicitly. This patch sets 
hasSideEffects=0 for PHI, as it is for G_PHI. MachineInstr::isSafeToMove has 
been updated so it still returns false for PHI.

Additionally, HexagonBitSimplify relied on a PHI node having the 
hasUnmodeledSideEffects property. This patch fixes that assumption.

Differential Revision: https://reviews.llvm.org/D37097

llvm-svn: 317721
2017-11-08 20:19:16 +00:00
Alex Bradbury
1b9512e762 [NFCI] Ensure TargetOpcode::* are compatible with guessInstructionProperties=0
rL162640 introduced CodeGenTarget::guessInstructionProperties. If a target 
sets guessInstructionProperties=0 in its FooInstrInfo, tablegen will error if 
it has to guess properties from patterns. Unfortunately, 
guessInstructionProperties=0 can't be used with current upstream LLVM as 
instructions in the TargetOpcode namespace are always included and sometimes 
have inferred properties for mayLoad, mayStore, and hasSideEffects. This patch 
provides the simplest possible fix to this problem, setting default values for 
these fields in the TargetOpcode scope. There is no intended functional 
change, as the explicitly set properties should match what was previously 
inferred. A number of the instructions had hasSideEffects=1 inferred 
unintentionally. This patch makes it explicit, while future patches (such as 
D37097) correct the property.

Differential Revision: https://reviews.llvm.org/D37065

llvm-svn: 317674
2017-11-08 09:26:06 +00:00
Matt Arsenault
38099e08bc DAG: Add computeKnownBitsForFrameIndex
Some of the AMDGPU stack addressing modes require knowing the sign
bit is zero. We used to accomplish this by custom lowering
frame indexes, and then putting an AssertZext around a
TargetFrameIndex. This required specifically looking for
the AssextZext + frame index pattern which was moderately
disgusting. The same could probably be accomplished
with a target specific node, but would still
require special handling of frame indexes.

llvm-svn: 317671
2017-11-08 08:52:31 +00:00
David Blaikie
45b647d5eb Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering
This header includes CodeGen headers, and is not, itself, included by
any Target headers, so move it into CodeGen to match the layering of its
implementation.

llvm-svn: 317647
2017-11-08 01:01:31 +00:00
Petar Jovanovic
5b6a90db77 Reland "Correct dwarf unwind information in function epilogue for X86"
Reland r317100 with minor fix regarding ComputeCommonTailLength function in
BranchFolding.cpp. Skipping top CFI instructions block needs to executed on
several more return points in ComputeCommonTailLength().

Original r317100 message:

"Correct dwarf unwind information in function epilogue for X86"

This patch aims to provide correct dwarf unwind information in function
epilogue for X86.

It consists of two parts. The first part inserts CFI instructions that set
appropriate cfa offset and cfa register in emitEpilogue() in
X86FrameLowering. This part is X86 specific.

The second part is platform independent and ensures that:

- CFI instructions do not affect code generation
- Unwind information remains correct when a function is modified by
  different passes. This is done in a late pass by analyzing information
  about cfa offset and cfa register in BBs and inserting additional CFI
  directives where necessary.

Changed CFI instructions so that they:

- are duplicable
- are not counted as instructions when tail duplicating or tail merging
- can be compared as equal

Added CFIInstrInserter pass:

- analyzes each basic block to determine cfa offset and register valid at
  its entry and exit
- verifies that outgoing cfa offset and register of predecessor blocks match
  incoming values of their successors
- inserts additional CFI directives at basic block beginning to correct the
  rule for calculating CFA

Having CFI instructions in function epilogue can cause incorrect CFA
calculation rule for some basic blocks. This can happen if, due to basic
block reordering, or the existence of multiple epilogue blocks, some of the
blocks have wrong cfa offset and register values set by the epilogue block
above them.

CFIInstrInserter is currently run only on X86, but can be used by any target
that implements support for adding CFI instructions in epilogue.

Patch by Violeta Vukobrat.

llvm-svn: 317579
2017-11-07 14:40:27 +00:00
David Blaikie
10180bb2a1 Move TargetFrameLowering.h to CodeGen where it's implemented
This header already includes a CodeGen header and is implemented in
lib/CodeGen, so move the header there to match.

This fixes a link error with modular codegeneration builds - where a
header and its implementation are circularly dependent and so need to be
in the same library, not split between two like this.

llvm-svn: 317379
2017-11-03 22:32:11 +00:00
Petar Jovanovic
8c7ceb5bd9 Revert "Correct dwarf unwind information in function epilogue for X86"
This reverts r317100 as it introduced sanitizer-x86_64-linux-autoconf
buildbot failure (build #15606).

llvm-svn: 317136
2017-11-01 23:05:52 +00:00
Petar Jovanovic
d61746acde Correct dwarf unwind information in function epilogue for X86
This patch aims to provide correct dwarf unwind information in function
epilogue for X86.

It consists of two parts. The first part inserts CFI instructions that set
appropriate cfa offset and cfa register in emitEpilogue() in
X86FrameLowering. This part is X86 specific.

The second part is platform independent and ensures that:

- CFI instructions do not affect code generation
- Unwind information remains correct when a function is modified by
  different passes. This is done in a late pass by analyzing information
  about cfa offset and cfa register in BBs and inserting additional CFI
  directives where necessary.

Changed CFI instructions so that they:

- are duplicable
- are not counted as instructions when tail duplicating or tail merging
- can be compared as equal

Added CFIInstrInserter pass:

- analyzes each basic block to determine cfa offset and register valid at
  its entry and exit
- verifies that outgoing cfa offset and register of predecessor blocks match
  incoming values of their successors
- inserts additional CFI directives at basic block beginning to correct the
  rule for calculating CFA

Having CFI instructions in function epilogue can cause incorrect CFA
calculation rule for some basic blocks. This can happen if, due to basic
block reordering, or the existence of multiple epilogue blocks, some of the
blocks have wrong cfa offset and register values set by the epilogue block
above them.

CFIInstrInserter is currently run only on X86, but can be used by any target
that implements support for adding CFI instructions in epilogue.


Patch by Violeta Vukobrat.

Differential Revision: https://reviews.llvm.org/D35844

llvm-svn: 317100
2017-11-01 16:04:11 +00:00
David Blaikie
5974607431 TargetOpcodes.h: Don't mark header functions as file local
llvm-svn: 316514
2017-10-24 21:29:19 +00:00
Daniel Sanders
572038831c [globalisel][tablegen] Import stores and allow GISel to automatically substitute zero regs like WZR/XZR/$zero.
This patch enables the import of stores. Unfortunately, doing so by itself,
loses an optimization where storing 0 to memory makes use of WZR/XZR.

To mitigate this, this patch also introduces a new feature that allows register
operands to nominate a zero register. When this is done, GlobalISel will
substitute (G_CONSTANT 0) with the nominated register automatically. This
is currently configured to only apply to the stores.

Applying it to GPR32/GPR64 register classes in general will be done after
review see (https://reviews.llvm.org/D39150).

llvm-svn: 316360
2017-10-23 18:19:24 +00:00
Marina Yatsina
b5c80eef49 Add logic to greedy reg alloc to avoid bad eviction chains
This fixes bugzilla 26810
https://bugs.llvm.org/show_bug.cgi?id=26810

This is intended to prevent sequences like:
movl %ebp, 8(%esp) # 4-byte Spill
movl %ecx, %ebp
movl %ebx, %ecx
movl %edi, %ebx
movl %edx, %edi
cltd
idivl %esi
movl %edi, %edx
movl %ebx, %edi
movl %ecx, %ebx
movl %ebp, %ecx
movl 16(%esp), %ebp # 4 - byte Reload

Such sequences are created in 2 scenarios:

Scenario #1:
vreg0 is evicted from physreg0 by vreg1
Evictee vreg0 is intended for region splitting with split candidate physreg0 (the reg vreg0 was evicted from)
Region splitting creates a local interval because of interference with the evictor vreg1 (normally region spliiting creates 2 interval, the "by reg" and "by stack" intervals. Local interval created when interference occurs.)
one of the split intervals ends up evicting vreg2 from physreg1
Evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills

Scenario #2
vreg0 is evicted from physreg0 by vreg1
vreg2 is evicted from physreg2 by vreg3 etc
Evictee vreg0 is intended for region splitting with split candidate physreg1
Region splitting creates a local interval because of interference with the evictor vreg1
one of the split intervals ends up evicting back original evictor vreg1 from physreg0 (the reg vreg0 was evicted from)
Another evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills

As compile time was a concern, I've added a flag to control weather we do cost calculations for local intervals we expect to be created (it's on by default for X86 target, off for the rest).

Differential Revision: https://reviews.llvm.org/D35816

Change-Id: Id9411ff7bbb845463d289ba2ae97737a1ee7cc39
llvm-svn: 316295
2017-10-22 17:59:38 +00:00
Daniel Sanders
ca098ff70b [globalisel][tablegen] Map ld and st to G_LOAD and G_STORE. NFC
Summary:
There is an important mismatch between ISD::LOAD and G_LOAD (and likewise for
ISD::STORE and G_STORE). In SelectionDAG, ISD::LOAD is a non-atomic load
and atomic loads are handled by a separate node. However, this is not true of
GlobalISel's G_LOAD. For G_LOAD, the MachineMemOperand indicates the atomicity
of the operation. As a result, this mapping must also add a predicate that
checks for non-atomic MachineMemOperands.

This is NFC since these nodes always have predicates in practice and are
therefore always rejected at the moment.

Depends on D37443

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: kristof.beyls, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37445

llvm-svn: 315843
2017-10-15 02:41:12 +00:00
Daniel Sanders
fc690adc20 [tablegen] Handle common load/store predicates inside tablegen. NFC.
Summary:
GlobalISel and SelectionDAG require different code for the common
load/store predicates due to differences in the representation.
For example:
   SelectionDAG: (load<signext,i8>:i32 GPR32:$addr) // The <> denote properties of the SDNode that are not printed in the DAG
   GlobalISel: (G_SEXT:s32 (G_LOAD:s8 GPR32:$addr))
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.

This patch moves the implementation of the common load/store predicates
into tablegen so that it can handle these differences.

It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.

Depends on D36618

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Subscribers: llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37443

Includes a partial revert of r315826 since this patch makes it necessary for
getPredCode() to return a std::string and getImmCode() should have the same
interface as getPredCode().

llvm-svn: 315841
2017-10-15 02:06:44 +00:00
Daniel Sanders
eb33df745e [globalisel][tablegen] Add support for fpimm and import of APInt/APFloat based ImmLeaf.
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D36569

llvm-svn: 315761
2017-10-13 21:28:03 +00:00
Daniel Sanders
4c2aa5c732 [aarch64] Support APInt and APFloat in ImmLeaf subclasses and make AArch64 use them.
Summary:
The purpose of this patch is to expose more information about ImmLeaf-like
PatLeaf's so that GlobalISel can learn to import them. Previously, ImmLeaf
could only be used to test int64_t's produced by sign-extending an APInt.
Other tests on immediates had to use the generic PatLeaf and extract the
constant using C++.

With this patch, tablegen will know how to generate predicates for APInt,
and APFloat. This will allow it to 'do the right thing' for both SelectionDAG
and GlobalISel which require different methods of extracting the immediate
from the IR.

This is NFC for SelectionDAG since the new code is equivalent to the
previous code. It's also NFC for FastISel because FastIselShouldIgnore is 1
for the ImmLeaf subclasses. Enabling FastIselShouldIgnore == 0 for these new
subclasses will require a significant re-factor of FastISel.

For GlobalISel, it's currently NFC because the relevant code to import the
affected rules is not yet present. This will be added in a later patch.

Depends on D36086

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: bjope, aemerson, rengolin, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D36534

llvm-svn: 315747
2017-10-13 20:42:18 +00:00
Matt Arsenault
1086f79305 DAG: Add opcode and source type to isFPExtFree
This is only currently used for mad/fma transforms.
This is the only case where it should be used for AMDGPU,
so add an opcode to be sure.

llvm-svn: 315740
2017-10-13 19:55:45 +00:00
Craig Topper
bbfe102642 [SelectionDAG] Const-correct the DemandedMask argument to one of the overloads of SimplifyDemandedBits. NFC
llvm-svn: 315641
2017-10-12 23:46:05 +00:00
Matthias Braun
6cdc04a20c Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine"
Reverting to investigate layering effects of MCJIT not linking
libCodeGen but using TargetMachine::getNameWithPrefix() breaking the
lldb bots.

This reverts commit r315633.

llvm-svn: 315637
2017-10-12 22:57:28 +00:00
Matthias Braun
e1c491ab05 TargetMachine: Merge TargetMachine and LLVMTargetMachine
Merge LLVMTargetMachine into TargetMachine.

- There is no in-tree target anymore that just implements TargetMachine
  but not LLVMTargetMachine.
- It should still be possible to stub out all the various functions in
  case a target does not want to use lib/CodeGen
- This simplifies the code and avoids methods ending up in the wrong
  interface.

Differential Revision: https://reviews.llvm.org/D38489

llvm-svn: 315633
2017-10-12 22:28:54 +00:00
Wei Ding
725c541dbe Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.
Differential Revision: http://reviews.llvm.org/D37348

llvm-svn: 315610
2017-10-12 19:37:14 +00:00
Wei Mi
f44870d1d1 Revert r307036 because of PR34919.
llvm-svn: 315540
2017-10-12 00:24:52 +00:00
Alex Bradbury
d4ed8e7017 [TargetLowering] Correctly track NumFixedArgs field of CallLoweringInfo
The NumFixedArgs field of CallLoweringInfo is used by
TargetLowering::LowerCallTo to determine whether a given argument is passed
using the vararg calling convention or not (specifically, to set IsFixed for
each ISD::OutputArg).

Firstly, CallLoweringInfo::setLibCallee and CallLoweringInfo::setCallee both
incorrectly set NumFixedArgs based on the _previous_ args list. Secondly,
TargetLowering::LowerCallTo failed to increment NumFixedArgs when modifying
the argument list so a pointer is passed for the return value.

If your backend uses the IsFixed property or directly accesses NumFixedArgs, 
it is _possible_ this change could result in codegen changes (although the 
previous behaviour would have been incorrect). No such cases have been
identified during code review for any in-tree architecture.

Differential Revision: https://reviews.llvm.org/D37898

llvm-svn: 315457
2017-10-11 13:48:45 +00:00
Oliver Stannard
20a7d7760f [AsmParser] Add DiagnosticString to register classes in tablegen
This allows a DiagnosticType and/or DiagnosticString to be associated
with a RegisterClass in tablegen, so that we can emit diagnostics in the
assembler when a register operand is incorrect.

DiagnosticType creates a predictable enum value, which gets returned as
the error code when an operand does not match, and can be used by the
assembly parser to map to a user-facing diagnostic. DiagnosticString
creates an anonymous enum value (currently based on the tablegen class
name), and a function to map from enum values to strings will be
generated. Both of these work the same was as they do for AsmOperand.

This isn't used by any targets yet, but has one (positive) side-effect.
It improves the diagnostic codes returned by validateOperandClass - we
always want to emit the diagnostic that relates to the expected operand
class, but this wasn't always being done when the expected and actual
classes were completely different (token/register/custom). This causes a
few AArch64 diagnostics to be improved, as Match_InvalidOperand was
being returned instead of a specific diagnostic type.

Differential revision: https://reviews.llvm.org/D36691

llvm-svn: 315295
2017-10-10 11:00:40 +00:00
Jessica Paquette
5ca1dd864d [MachineOutliner] Disable outlining from LinkOnceODRs by default
Say you have two identical linkonceodr functions, one in M1 and one in M2.
Say that the outliner outlines A,B,C from one function, and D,E,F from another
function (where letters are instructions). Now those functions are not
identical, and cannot be deduped. Locally to M1 and M2, these outlining
choices would be good-- to the whole program, however, this might not be true!

To mitigate this, this commit makes it so that the outliner sees linkonceodr
functions as unsafe to outline from. It also adds a flag,
-enable-linkonceodr-outlining, which allows the user to specify that they
want to outline from such functions when they know what they're doing.

Changing this handles most code size regressions in the test suite caused by
competing with linker dedupe. It also doesn't have a huge impact on the code
size improvements from the outliner. There are 6 tests that regress > 5% from
outlining WITH linkonceodrs to outlining WITHOUT linkonceodrs. Overall, most
tests either improve or are not impacted.

Not outlined vs outlined without linkonceodrs:
https://hastebin.com/raw/qeguxavuda

Not outlined vs outlined with linkonceodrs:
https://hastebin.com/raw/edepoqoqic

Outlined with linkonceodrs vs outlined without linkonceodrs:
https://hastebin.com/raw/awiqifiheb

Numbers generated using compare.py with -m size.__text. Tests run for AArch64
with -Oz -mllvm -enable-machine-outliner -mno-red-zone.

llvm-svn: 315136
2017-10-07 00:16:34 +00:00
Oliver Stannard
a8cc0cc3a1 [AsmParser] Add DiagnosticString to AsmOperands in tablegen
This adds a DiagnosticString member to the AsmOperand tablegen class, so
that the diagnostic text to be used when an assembly operand is
incorrect can be stored in the tablegen description of the operand,
rather than in a separate switch statement in the AsmParser.

If DiagnosticString is used for any operands, tablegen will emit a
getMatchKindDiag function, to map from diagnostic enums to strings.

Differential revision: https://reviews.llvm.org/D31606

llvm-svn: 314803
2017-10-03 14:34:57 +00:00