1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

3819 Commits

Author SHA1 Message Date
Reid Kleckner
b3c566c278 Revert rr340111 "[GISel]: Add Legalization/lowering code for bit counting operations"
It causes LegalizerHelperTest.LowerBitCountingCTTZ1 to fail.

llvm-svn: 340186
2018-08-20 16:50:19 +00:00
Aditya Nandakumar
d12808310c [GISel]: Add Legalization/lowering code for bit counting operations
https://reviews.llvm.org/D48847#inline-448257

Ported legalization expansions for CTLZ/CTTZ from DAG to GISel.

Reviewed by rtereshin.

llvm-svn: 340111
2018-08-18 00:01:54 +00:00
Chandler Carruth
6edb18f7dd Revert r339977: [GISel]: Add Opcodes for a few LLVM Intrinsics
This is breaking ~all the bots.

llvm-svn: 339982
2018-08-17 04:47:16 +00:00
Aditya Nandakumar
5123be13d4 [GISel]: Add Opcodes for a few LLVM Intrinsics
https://reviews.llvm.org/D50401

Add opcodes for llvm.intrinsic.trunc, round, and update the IRTranslator
for the same.

Reviewed by: dsanders.

llvm-svn: 339977
2018-08-17 01:41:56 +00:00
Andrea Di Biagio
8add466d59 [Tablegen][MCInstPredicate] Removed redundant template argument from class TIIPredicate, and implemented verification rules for TIIPredicates.
This patch removes redundant template argument `TargetName` from TIIPredicate.
Tablegen can always infer the target name from the context. So we don't need to
force users of TIIPredicate to always specify it.

This allows us to better modularize the tablegen class hierarchy for the
so-called "function predicates". class FunctionPredicateBase has been added; it
is currently used as a building block for TIIPredicates. However, I plan to
reuse that class to model other function predicate classes too (i.e. not just
TIIPredicates). For example, this can be a first step towards implementing
proper support for dependency breaking instructions in tablegen.

This patch also adds a verification step on TIIPredicates in tablegen.
We cannot have multiple TIIPredicates with the same name. Otherwise, this will
cause build errors later on, when tablegen'd .inc files are included by cpp
files and then compiled.

Differential Revision: https://reviews.llvm.org/D50708

llvm-svn: 339706
2018-08-14 18:36:54 +00:00
Andrea Di Biagio
e84b02f47e [Tablegen][SubtargetEmitter] Improve expansion of predicates of a variant scheduling class.
This patch refactors the logic that expands predicates of a variant scheduling
class.

The idea is to improve the readability of the auto-generated code by removing
redundant parentheses around predicate expressions, and by removing redundant
if(true) statements.

This patch replaces the definition of NoSchedPred in TargetSchedule.td with an
instance of MCSchedPredicate. The new definition is sematically equivalent to
the previous one. The main difference is that now SubtargetEmitter knows that it
represents predicate "true".

Before this patch, we always generated an if (true) for the default transition
of a variant scheduling class.

Example (taken from AArch64GenSubtargetInfo.inc) :

```
if (SchedModel->getProcessorID() == 3) { // CycloneModel
  if ((TII->isScaledAddr(*MI)))
    return 927; // (WriteIS_WriteLD)_ReadBaseRS
  if ((true))
    return 928; // WriteLD_ReadDefault
}
```

Extra parentheses were also generated around the predicate expressions.

With this patch, we get the following auto-generated checks:

```
if (SchedModel->getProcessorID() == 3) { // CycloneModel
  if (TII->isScaledAddr(*MI))
    return 927; // (WriteIS_WriteLD)_ReadBaseRS
  return 928; // WriteLD_ReadDefault
}
```

The new auto-generated code behaves exactly the same as before. So, technically
this is a non functional change.

Differential revision: https://reviews.llvm.org/D50566

llvm-svn: 339552
2018-08-13 11:09:04 +00:00
Reid Kleckner
5f0f124d6a [MC] Move EH DWARF encodings from MC to CodeGen, NFC
Summary:
The TType encoding, LSDA encoding, and personality encoding are all
passed explicitly by CodeGen to the assembler through .cfi_* directives,
so only the AsmPrinter needs to know about them.

The FDE CFI encoding however, controls the encoding of the label
implicitly created by the .cfi_startproc directive. That directive seems
to be special in that it doesn't take an encoding, so the assembler just
has to know how to encode one DSO-local label reference from .eh_frame
to .text.

As a result, it looks like MC will continue to have to know when the
large code model is in use. Perhaps we could invent a '.cfi_startproc
[large]' flag so that this knowledge doesn't need to pollute the
assembler.

Reviewers: davide, lliu0, JDevlieghere

Subscribers: hiraditya, fedor.sergeev, llvm-commits

Differential Revision: https://reviews.llvm.org/D50533

llvm-svn: 339397
2018-08-09 22:24:04 +00:00
Andrea Di Biagio
6413a2dd47 [MC][PredicateExpander] Extend the grammar to support simple switch and return statements.
This patch introduces tablegen class MCStatement.

Currently, an MCStatement can be either a return statement, or a switch
statement.

```
MCStatement:
   MCReturnStatement
   MCOpcodeSwitchStatement
```

A MCReturnStatement expands to a return statement, and the boolean expression
associated with the return statement is described by a MCInstPredicate.

An MCOpcodeSwitchStatement is a switch statement where the condition is a check
on the machine opcode. It allows the definition of multiple checks, as well as a
default case. More details on the grammar implemented by these two new
constructs can be found in the diff for TargetInstrPredicates.td.

This patch makes it easier to read the body of auto-generated TargetInstrInfo
predicates.

In future, I plan to reuse/extend the MCStatement grammar to describe more
complex target hooks. For now, this is just a first step (mostly a minor
cosmetic change to polish the new predicates framework).

Differential Revision: https://reviews.llvm.org/D50457

llvm-svn: 339352
2018-08-09 15:32:48 +00:00
Craig Topper
bb1bde9246 [SelectionDAG][X86][SystemZ] Add a generic nonvolatile_store/nonvolatile_load pattern fragment in TargetSelectionDAG.td
Differential Revision: https://reviews.llvm.org/D50358

llvm-svn: 339156
2018-08-07 17:34:59 +00:00
Andrea Di Biagio
7444c6289e [Tablegen] In TargetSchedule.td: Remove unused argument pfmCounters from ProcResourceUnits.
PFM counters don't need to be passed in input to the definition of
ProcResourceUnits. class PfmIssueCounter (see r329675) is used to map resources
to PFM counter(s).

Differential Revision: https://reviews.llvm.org/D50333

llvm-svn: 339125
2018-08-07 10:33:46 +00:00
Aditya Nandakumar
6188b99859 [GISel]: Add Opcodes for CTLZ/CTTZ/CTPOP
https://reviews.llvm.org/D48600

Added IRTranslator support to translate these known intrinsics into GISel opcodes.

llvm-svn: 338944
2018-08-04 01:22:12 +00:00
Lei Liu
4779c3a4b6 [AArch64] DWARF: do not generate AT_location for thread local
AArch64 ELF ABI does not define a static relocation type for TLS offset within
a module, which makes it impossible for compiler to generate a valid
DW_AT_location content for thread local variables. Currently LLVM generates an
invalid R_AARCH64_ABS64 relocation at the DW_AT_location field for a TLS
variable. That causes trouble for linker because thread local variable does
not have an absolute address at link time. AArch64 GCC solves the problem by
not generating DW_AT_location for thread local variables. We should do the
same in LLVM.

Differential Revision: https://reviews.llvm.org/D43860

llvm-svn: 338655
2018-08-01 23:46:49 +00:00
Amara Emerson
836debbb53 [GlobalISel] Add a G_BLOCK_ADDR opcode to handle IR blockaddress constants.
Differential Revision: https://reviews.llvm.org/D49900

llvm-svn: 338335
2018-07-31 00:08:50 +00:00
Fangrui Song
121474a01b Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338293
2018-07-30 19:41:25 +00:00
Andrea Di Biagio
667da2dc91 [TargetInstPredicate] Add definition of CheckInvalidRegisterOperand.
This should have been part of r337378. I forgot to svn add it before committing
the change.

llvm-svn: 337380
2018-07-18 11:16:31 +00:00
Peter Collingbourne
52168cfcad CodeGen: Add a target option for emitting .addrsig directives for all address-significant symbols.
Differential Revision: https://reviews.llvm.org/D48143

llvm-svn: 337331
2018-07-17 22:40:08 +00:00
Fangrui Song
c76988a0c0 [CodeGen] Fix inconsistent declaration parameter name
llvm-svn: 337200
2018-07-16 18:51:40 +00:00
Andrea Di Biagio
c19db3b1d5 [llvm-mca][BtVer2] teach how to identify false dependencies on partially written
registers.

The goal of this patch is to improve the throughput analysis in llvm-mca for the
case where instructions perform partial register writes.

On x86, partial register writes are quite difficult to model, mainly because
different processors tend to implement different register merging schemes in
hardware.

When the code contains partial register writes, the IPC (instructions per
cycles) estimated by llvm-mca tends to diverge quite significantly from the
observed IPC (using perf).

Modern AMD processors (at least, from Bulldozer onwards) don't rename partial
registers. Quoting Agner Fog's microarchitecture.pdf:
" The processor always keeps the different parts of an integer register together.
For example, AL and AH are not treated as independent by the out-of-order
execution mechanism. An instruction that writes to part of a register will
therefore have a false dependence on any previous write to the same register or
any part of it."

This patch is a first important step towards improving the analysis of partial
register updates. It changes the semantic of RegisterFile descriptors in
tablegen, and teaches llvm-mca how to identify false dependences in the presence
of partial register writes (for more details: see the new code comments in
include/Target/TargetSchedule.h - class RegisterFile).

This patch doesn't address the case where a write to a part of a register is
followed by a read from the whole register.  On Intel chips, high8 registers
(AH/BH/CH/DH)) can be stored in separate physical registers. However, a later
(dirty) read of the full register (example: AX/EAX) triggers a merge uOp, which
adds extra latency (and potentially affects the pipe usage).
This is a very interesting article on the subject with a very informative answer
from Peter Cordes:
https://stackoverflow.com/questions/45660139/how-exactly-do-partial-registers-on-haswell-skylake-perform-writing-al-seems-to

In future, the definition of RegisterFile can be extended with extra information
that may be used to identify delays caused by merge opcodes triggered by a dirty
read of a partial write.

Differential Revision: https://reviews.llvm.org/D49196

llvm-svn: 337123
2018-07-15 11:01:38 +00:00
Joel Galenson
9249622410 [cfi-verify] Support AArch64.
This patch adds support for AArch64 to cfi-verify.

This required three changes to cfi-verify.  First, it generalizes checking if an instruction is a trap by adding a new isTrap flag to TableGen (and defining it for x86 and AArch64).  Second, the code that ensures that the operand register is not clobbered between the CFI check and the indirect call needs to allow a single dereference (in x86 this happens as part of the jump instruction).  Third, we needed to ensure that return instructions are not counted as indirect branches.  Technically, returns are indirect branches and can be covered by CFI, but LLVM's forward-edge CFI does not protect them, and x86 does not consider them, so we keep that behavior.

In addition, we had to improve AArch64's code to evaluate the branch target of a MCInst to handle calls where the destination is not the first operand (which it often is not).

Differential Revision: https://reviews.llvm.org/D48836

llvm-svn: 337007
2018-07-13 15:19:33 +00:00
Ulrich Weigand
535942804d [TableGen] Support multi-alternative pattern fragments
A TableGen instruction record usually contains a DAG pattern that will
describe the SelectionDAG operation that can be implemented by this
instruction. However, there will be cases where several different DAG
patterns can all be implemented by the same instruction. The way to
represent this today is to write additional patterns in the Pattern
(or usually Pat) class that map those extra DAG patterns to the
instruction. This usually also works fine.

However, I've noticed cases where the current setup seems to require
quite a bit of extra (and duplicated) text in the target .td files.
For example, in the SystemZ back-end, there are quite a number of
instructions that can implement an "add-with-overflow" operation.
The same instructions also need to be used to implement just plain
addition (simply ignoring the extra overflow output). The current
solution requires creating extra Pat pattern for every instruction,
duplicating the information about which particular add operands
map best to which particular instruction.

This patch enhances TableGen to support a new PatFrags class, which
can be used to encapsulate multiple alternative patterns that may
all match to the same instruction.  It operates the same way as the
existing PatFrag class, except that it accepts a list of DAG patterns
to match instead of just a single one.  As an example, we can now define
a PatFrags to match either an "add-with-overflow" or a regular add
operation:

  def z_sadd : PatFrags<(ops node:$src1, node:$src2),
                        [(z_saddo node:$src1, node:$src2),
                         (add node:$src1, node:$src2)]>;

and then use this in the add instruction pattern:

  defm AR : BinaryRRAndK<"ar", 0x1A, 0xB9F8, z_sadd, GR32, GR32>;

These SystemZ target changes are implemented here as well.


Note that PatFrag is now defined as a subclass of PatFrags, which
means that some users of internals of PatFrag need to be updated.
(E.g. instead of using PatFrag.Fragment you now need to use
!head(PatFrag.Fragments).)


The implementation is based on the following main ideas:
- InlinePatternFragments may now replace each original pattern
  with several result patterns, not just one.
- parseInstructionPattern delays calling InlinePatternFragments
  and InferAllTypes.  Instead, it extracts a single DAG match
  pattern from the main instruction pattern.
- Processing of the DAG match pattern part of the main instruction
  pattern now shares most code with processing match patterns from
  the Pattern class.
- Direct use of main instruction patterns in InferFromPattern and
  EmitResultInstructionAsOperand is removed; everything now operates
  solely on DAG match patterns.


Reviewed by: hfinkel

Differential Revision: https://reviews.llvm.org/D48545

llvm-svn: 336999
2018-07-13 13:18:00 +00:00
Francis Visoiu Mistrih
f144cbd9ce [XRay] Fix machine verifier issues in X86
I'm not sure if this fix is the right thing to do, but it seemed to me
that PATCHABLE_RET and PATCHABLE_TAIL_CALL don't have any defs.

Running the following:

```
LLVM_ENABLE_MACHINE_VERIFIER=1 ./build/bin/llvm-lit -v -a test/CodeGen/X86/xray-*
```

results in the following tests to fail (along others):

```
LLVM :: CodeGen/X86/xray-attribute-instrumentation.ll
LLVM :: CodeGen/X86/xray-custom-log.ll
LLVM :: CodeGen/X86/xray-log-args.ll
LLVM :: CodeGen/X86/xray-loop-detection.ll
LLVM :: CodeGen/X86/xray-multiplerets-in-blocks.mir
LLVM :: CodeGen/X86/xray-section-group.ll
LLVM :: CodeGen/X86/xray-selective-instrumentation.ll
LLVM :: CodeGen/X86/xray-tail-call-sled.ll
LLVM :: CodeGen/X86/xray-typed-event-log.ll
```

The errors are:

```
*** Bad machine code: Explicit definition must be a register ***
- function:    fn
- basic block: %bb.0  (0x7fa31a84d908)
- instruction: PATCHABLE_RET 2560, $eax
- operand 0:   2560
```

and

```
*** Bad machine code: Explicit definition must be a register ***
- function:    caller
- basic block: %bb.0  (0x7fbff3044108)
- instruction: PATCHABLE_TAIL_CALL 3009, @callee, <regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...>, implicit $rsp, implicit $ssp, implicit $rsp, implicit $ssp, implicit $edi
- operand 0:   3009
```

Differential Revision: https://reviews.llvm.org/D49187

llvm-svn: 336906
2018-07-12 14:36:43 +00:00
Jessica Paquette
2c6ef18647 [MachineOutliner] Add support for target-default outlining.
This adds functionality to the outliner that allows targets to
specify certain functions that should be outlined from by default.

If a target supports default outlining, then it specifies that in
its TargetOptions. In the case that it does, and the user hasn't
specified that they *never* want to outline, the outliner will
be added to the pass pipeline and will run on those default functions.

This is a preliminary patch for turning the outliner on by default
under -Oz for AArch64.

https://reviews.llvm.org/D48776

llvm-svn: 336040
2018-06-30 03:56:03 +00:00
Jessica Paquette
848840c089 [MachineOutliner] Define MachineOutliner support in TargetOptions
Targets should be able to define whether or not they support the outliner
without the outliner being added to the pass pipeline. Before this, the
outliner pass would be added, and ask the target whether or not it supports the
outliner.

After this, it's possible to query the target in TargetPassConfig, before the
outliner pass is created. This ensures that passing -enable-machine-outliner
will not modify the pass pipeline of any target that does not support it.

https://reviews.llvm.org/D48683

llvm-svn: 335887
2018-06-28 17:45:43 +00:00
Matthias Braun
719039ac2c SelectionDAGBuilder, mach-o: Skip trap after noreturn call (for Mach-O)
Add NoTrapAfterNoreturn target option which skips emission of traps
behind noreturn calls even if TrapUnreachable is enabled.

Enable the feature on Mach-O to save code size; Comments suggest it is
not possible to enable it for the other users of TrapUnreachable.

rdar://41530228

DifferentialRevision: https://reviews.llvm.org/D48674
llvm-svn: 335877
2018-06-28 17:00:45 +00:00
Aditya Nandakumar
ec43439ab4 [GISel]: Add G_ADDRSPACE_CAST Opcode
Added IRTranslator support for addrspacecast.

https://reviews.llvm.org/D48469

reviewed by: volkan

llvm-svn: 335388
2018-06-22 20:58:51 +00:00
Daniel Sanders
50735a1161 [globalisel][tablegen] Add support for C++ predicates on PatFrags and use it to support BFC on ARM.
So far, we've only handled special cases of PatFrag like ImmLeaf. This patch
adds support for the remaining cases using similar mechanisms.

Like most C++ code from SelectionDAG, GISel and DAGISel expect to operate on
different types and representations and as such the code is not compatible
between the two. It's therefore necessary to add an alternative implementation
in the GISelPredicateCode field.

The target test for this feature could easily be done with IntImmLeaf and this
would save on a little boilerplate. The reason I've chosen to implement this
using PatFrag.GISelPredicateCode and not IntImmLeaf is because I was unable to
find a rule that was blocked solely by lack of support for PatFrag predicates. I
found that the ones I investigated as being likely candidates for the test
were further blocked by other things.

llvm-svn: 334871
2018-06-15 23:13:43 +00:00
Clement Courbet
e3e2fa9c0a [TableGen] Emit a fatal error on inconsistencies in resource units vs cycles.
Summary:
For targets I'm not familiar with, I've automatically made the "default to 1 for each resource" behaviour explicit in the td files.
For more obvious cases, I've ventured a fix.

Some notes:
 - Exynos is especially fishy.
 - AArch64SchedThunderX2T99.td had some truncated entries. If I understand correctly, the person who wrote that interpreted the ResourceCycle as a range. I made the decision to use the upper/lower bound for consistency with the 'Latency' value. I'm sure there is a better choice.
 - The change to X86ScheduleBtVer2.td is an NFC, it just makes values more explicit.

Also see PR37310.

Reviewers: RKSimon, craig.topper, javed.absar

Subscribers: kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D46356

llvm-svn: 334586
2018-06-13 09:41:49 +00:00
Andrea Di Biagio
c26fd3adc6 [RFC][Patch 1/3] Add a new class of predicates for variant scheduling classes.
This patch is the first of a sequence of three patches described by the LLVM-dev
RFC "MC support for variant scheduling classes".
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123181.html

The goal of this patch is to introduce a new class of scheduling predicates for
SchedReadVariant and SchedWriteVariant.

An MCSchedPredicate can be used instead of a normal SchedPredicate to model
checks on the instruction (either a MachineInstr or a MCInst).
Internally, an MCSchedPredicate encapsulates an MCInstPredicate definition.
MCInstPredicate allows the definition of expressions with a well-known semantic,
that can be used to generate code for both MachineInstr and MCInst.

This is the first step toward teaching to tools like lllvm-mca how to resolve
variant scheduling classes.

Differential Revision: https://reviews.llvm.org/D46695

llvm-svn: 333282
2018-05-25 15:55:37 +00:00
Petar Jovanovic
8c311a61bc [X86][MIPS][ARM] New machine instruction property 'isMoveReg'
This property is needed in order to follow values movement between
registers. This property is used in TII to implement method that
returns true if simple copy like instruction is recognized, along
with source and destination machine operands.

Patch by Nikola Prica.

Differential Revision: https://reviews.llvm.org/D45204

llvm-svn: 333093
2018-05-23 15:28:28 +00:00
Simon Dardis
b7fb5f7be7 [FastISel] Permit instructions to be skipped for FastISel generation.
Some ISA's such as microMIPS32(R6) have instructions which are near identical
for code generation purposes, e.g. xor and xor16. These instructions take the
same value types for operands and return values, have the same
instruction predicates and map to the same ISD opcode. (These instructions do
differ by register classes.)

In such cases, the FastISel generator rejects the instruction definition.

This patch borrows the 'FastIselShouldIgnore' bit from rL129692 and enables
applying it to an instruction definition.

Reviewers: mcrosier

Differential Revision: https://reviews.llvm.org/D46953

llvm-svn: 332983
2018-05-22 14:36:58 +00:00
Peter Collingbourne
7f9b89f0e2 CodeGen: Add a dwo output file argument to addPassesToEmitFile and hook it up to dwo output.
Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47089

llvm-svn: 332881
2018-05-21 20:16:41 +00:00
Shiva Chen
1127e5026d [DebugInfo] Convert intrinsic llvm.dbg.label to MachineInstr.
In order to convert LLVM IR to MachineInstr, we need a new TargetOpcode,
DBG_LABEL, to ‘lower’ intrinsic llvm.dbg.label. The patch
creates this new TargetOpcode and convert intrinsic llvm.dbg.label to
MachineInstr through SelectionDAG.

In SelectionDAG, debug information is stored in SDDbgInfo. We create a
new data member of SDDbgInfo for labels and use the new data member,
SDDbgLabel, to create DBG_LABEL MachineInstr.

The new DBG_LABEL MachineInstr uses label metadata from LLVM IR as its
parameter. So, the backend could get metadata information of labels from
DBG_LABEL MachineInstr.

Differential Revision: https://reviews.llvm.org/D45341

Patch by Hsiangkai Wang.

llvm-svn: 331842
2018-05-09 02:41:08 +00:00
Daniel Sanders
96623363e9 [globalisel] Update GlobalISel emitter to match new representation of extending loads
Summary:
Previously, a extending load was represented at (G_*EXT (G_LOAD x)).
This had a few drawbacks:
* G_LOAD had to be legal for all sizes you could extend from, even if
  registers didn't naturally hold those sizes.
* All sizes you could extend from had to be allocatable just in case the
  extend went missing (e.g. by optimization).
* At minimum, G_*EXT and G_TRUNC had to be legal for these sizes. As we
  improve optimization of extends and truncates, this legality requirement
  would spread without considerable care w.r.t when certain combines were
  permitted.
* The SelectionDAG importer required some ugly and fragile pattern
  rewriting to translate patterns into this style.

This patch changes the representation to:
* (G_[SZ]EXTLOAD x)
* (G_LOAD x) any-extends when MMO.getSize() * 8 < ResultTy.getSizeInBits()
which resolves these issues by allowing targets to work entirely in their
native register sizes, and by having a more direct translation from
SelectionDAG patterns.

Each extending load can be lowered by the legalizer into separate extends
and loads, however a target that supports s1 will need the any-extending
load to extend to at least s8 since LLVM does not represent memory accesses
smaller than 8 bit. The legalizer can widenScalar G_LOAD into an
any-extending load but sign/zero-extending loads need help from something
else like a combiner pass. A follow-up patch that adds combiner helpers for
for this will follow.

The new representation requires that the MMO correctly reflect the memory
access so this has been corrected in a couple tests. I've also moved the
extending loads to their own tests since they are (mostly) separate opcodes
now. Additionally, the re-write appears to have invalidated two tests from
select-with-no-legality-check.mir since the matcher table no longer contains
loads that result in s1's and they aren't legal in AArch64 anymore.

Depends on D45540

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar

Reviewed By: rtereshin

Subscribers: javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D45541

llvm-svn: 331601
2018-05-05 20:53:24 +00:00
Adrian Prantl
076a6683eb Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Craig Topper
1a57227400 [X86] Restrict many of the InstAliases to either to only att or intel syntax. NFCI
Many of these aliases exist to give one syntax or the other a slightly different mnemonic and the other variant gets a duplicate of its normal mnemonic

This patch restricts a lot of these to only one variant so we don't get the duplication.

This removes a lot of duplicate entries from the matcher table. It also reduces the number of warnings printed when you enable the ambiguous match warning in tablegen.

llvm-svn: 331117
2018-04-28 18:46:11 +00:00
Daniel Sanders
491fd6384e [globalisel][legalizerinfo] Introduce dedicated extending loads and add lowerings for them
Summary:
Previously, a extending load was represented at (G_*EXT (G_LOAD x)).
This had a few drawbacks:
* G_LOAD had to be legal for all sizes you could extend from, even if
  registers didn't naturally hold those sizes.
* All sizes you could extend from had to be allocatable just in case the
  extend went missing (e.g. by optimization).
* At minimum, G_*EXT and G_TRUNC had to be legal for these sizes. As we
  improve optimization of extends and truncates, this legality requirement
  would spread without considerable care w.r.t when certain combines were
  permitted.
* The SelectionDAG importer required some ugly and fragile pattern
  rewriting to translate patterns into this style.

This patch begins changing the representation to:
* (G_[SZ]EXTLOAD x)
* (G_LOAD x) any-extends when MMO.getSize() * 8 < ResultTy.getSizeInBits()
which resolves these issues by allowing targets to work entirely in their
native register sizes, and by having a more direct translation from
SelectionDAG patterns.

This patch introduces the new generic instructions and new variation on
G_LOAD and adds lowering for them to convert back to the existing
representations.

Depends on D45466

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, aemerson, javed.absar

Reviewed By: aemerson

Subscribers: aemerson, kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D45540

llvm-svn: 331115
2018-04-28 18:14:50 +00:00
Eric Christopher
9e3701d460 Remove unused argument from emitModuleMetadata.
NFCI.

llvm-svn: 330470
2018-04-20 19:07:57 +00:00
Keith Wyss
2c7eab97b2 [XRay] Typed event logging intrinsic
Summary:
Add an LLVM intrinsic for type discriminated event logging with XRay.
Similar to the existing intrinsic for custom events, but also accepts
a type tag argument to allow plugins to be aware of different types
and semantically interpret logged events they know about without
choking on those they don't.

Relies on a symbol defined in compiler-rt patch D43668. I may wait
to submit before I can see demo everything working together including
a still to come clang patch.

Reviewers: dberris, pelikan, eizan, rSerge, timshen

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45633

llvm-svn: 330219
2018-04-17 21:30:29 +00:00
Clement Courbet
221e6ccd99 [MC][TableGen] Add optional libpfm counter names for ProcResUnits.
Summary:
Subtargets can define the libpfm counter names that can be used to
measure cycles and uops issued on ProcResUnits.
This allows making llvm-exegesis available on more targets.
Fixes PR36984.

Reviewers: gchatelet, RKSimon, andreadb, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45360

llvm-svn: 329675
2018-04-10 08:16:37 +00:00
Andrea Di Biagio
01d033c489 [MC][Tablegen] Allow models to describe the retire control unit for llvm-mca.
This patch adds the ability to describe properties of the hardware retire
control unit.

Tablegen class RetireControlUnit has been added for this purpose (see
TargetSchedule.td).

A RetireControlUnit specifies the size of the reorder buffer, as well as the
maximum number of opcodes that can be retired every cycle.

A zero (or negative) value for the reorder buffer size means: "the size is
unknown". If the size is unknown, then llvm-mca defaults it to the value of
field SchedMachineModel::MicroOpBufferSize.  A zero or negative number of
opcodes retired per cycle means: "there is no restriction on the number of
instructions that can be retired every cycle".

Models can optionally specify an instance of RetireControlUnit. There can only
be up-to one RetireControlUnit definition per scheduling model.

Information related to the RCU (RetireControlUnit) is stored in (two new fields
of) MCExtraProcessorInfo.  llvm-mca loads that information when it initializes
the DispatchUnit / RetireControlUnit (see Dispatch.h/Dispatch.cpp).

This patch fixes PR36661.

Differential Revision: https://reviews.llvm.org/D45259

llvm-svn: 329304
2018-04-05 15:41:41 +00:00
Andrea Di Biagio
f425ba9f0f [MC][Tablegen] Allow the definition of processor register files in the scheduling model for llvm-mca
This patch allows the description of register files in processor scheduling
models. This addresses PR36662.

A new tablegen class named 'RegisterFile' has been added to TargetSchedule.td.
Targets can optionally describe register files for their processors using that
class. In particular, class RegisterFile allows to specify:
 - The total number of physical registers.
 - Which target registers are accessible through the register file.
 - The cost of allocating a register at register renaming stage.

Example (from this patch - see file X86/X86ScheduleBtVer2.td)

  def FpuPRF : RegisterFile<72, [VR64, VR128, VR256], [1, 1, 2]>

Here, FpuPRF describes a register file for MMX/XMM/YMM registers. On Jaguar
(btver2), a YMM register definition consumes 2 physical registers, while MMX/XMM
register definitions only cost 1 physical register.

The syntax allows to specify an empty set of register classes.  An empty set of
register classes means: this register file models all the registers specified by
the Target.  For each register class, users can specify an optional register
cost. By default, register costs default to 1.  A value of 0 for the number of
physical registers means: "this register file has an unbounded number of
physical registers".

This patch is structured in two parts.

* Part 1 - MC/Tablegen *

A first part adds the tablegen definition of RegisterFile, and teaches the
SubtargetEmitter how to emit information related to register files.

Information about register files is accessible through an instance of
MCExtraProcessorInfo.
The idea behind this design is to logically partition the processor description
which is only used by external tools (like llvm-mca) from the processor
information used by the llvm machine schedulers.
I think that this design would make easier for targets to get rid of the extra
processor information if they don't want it.

* Part 2 - llvm-mca related *

The second part of this patch is related to changes to llvm-mca.

The main differences are:
 1) class RegisterFile now needs to take into account the "cost of a register"
when allocating physical registers at register renaming stage.
 2) Point 1. triggered a minor refactoring which lef to the removal of the
"maximum 32 register files" restriction.
 3) The BackendStatistics view has been updated so that we can print out extra
details related to each register file implemented by the processor.

The effect of point 3. is also visible in tests register-files-[1..5].s.

Differential Revision: https://reviews.llvm.org/D44980

llvm-svn: 329067
2018-04-03 13:36:24 +00:00
David Blaikie
938d7fe477 Fix layering of MachineValueType.h by moving it from CodeGen to Support
This is used by llvm tblgen as well as by LLVM Targets, so the only
common place is Support for now. (maybe we need another target for these
sorts of things - but for now I'm at least making them correct & we can
make them better if/when people have strong feelings)

llvm-svn: 328395
2018-03-23 23:58:25 +00:00
David Blaikie
af9abae7af Fix layering by moving Support/CodeGenCWrappers.h to Target
This includes llvm-c/TargetMachine.h which is logically part of
libTarget (since libTarget implements llvm-c/TargetMachine.h's
functions).

llvm-svn: 328394
2018-03-23 23:58:21 +00:00
David Blaikie
0bf763875a Move TargetLoweringObjectFile from CodeGen to Target to fix layering
It's implemented in Target & include from other Target headers, so the
header should be in Target.

llvm-svn: 328392
2018-03-23 23:58:19 +00:00
Krzysztof Parzyszek
e7d626ae4a [X86] Add phony registers for high halves of regs with low halves
Registers E[A-D]X, E[SD]I, E[BS]P, and EIP have 16-bit subregisters
that cover the low halves of these registers. This change adds artificial
subregisters for the high halves in order to differentiate (in terms of
register units) between the 32- and the low 16-bit registers.

This patch contains parts that aim to preserve the calculated register
pressure. This is in order to preserve the current codegen (minimize the
impact of this patch). The approach of having artificial subregisters
could be used to fix PR23423, but the pressure calculation would need
to be changed.

Differential Revision: https://reviews.llvm.org/D43353

llvm-svn: 328016
2018-03-20 18:46:55 +00:00
Nicolai Haehnle
a64a4772e8 TableGen: Check the dynamic type of !cast<Rec>(string)
Summary:
The docs already claim that this happens, but so far it hasn't. As a
consequence, existing TableGen files get this wrong a lot, but luckily
the fixes are all reasonably straightforward.

To make this work with all the existing forms of self-references (since
the true type of a record is only built up over time), the lookup of
self-references in !cast is delayed until the final resolving step.

Change-Id: If5923a72a252ba2fbc81a889d59775df0ef31164

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D44475

llvm-svn: 327849
2018-03-19 14:14:20 +00:00
Craig Topper
ead1756da7 [TableGen] When trying to reuse a scheduler class for instructions from an InstRW, make sure we haven't already seen another InstRW containing this instruction on this CPU.
This is similar to the check later when we remap some of the instructions from one class to a new one. But if we reuse the class we don't get to do that check.

So many CPUs have violations of this check that I had to add a flag to the SchedMachineModel to allow it to be disabled. Hopefully we can get those cleaned up quickly and remove this flag.

A lot of the violations are due to overlapping regular expressions, but that's not the only kind of issue it found.

llvm-svn: 327808
2018-03-18 19:56:15 +00:00
Matt Arsenault
3a3a83a86f TargetMachine: Add address space to getPointerSize
llvm-svn: 327467
2018-03-14 00:36:23 +00:00
Peter Collingbourne
1834ad0e4a Use branch funnels for virtual calls when retpoline mitigation is enabled.
The retpoline mitigation for variant 2 of CVE-2017-5715 inhibits the
branch predictor, and as a result it can lead to a measurable loss of
performance. We can reduce the performance impact of retpolined virtual
calls by replacing them with a special construct known as a branch
funnel, which is an instruction sequence that implements virtual calls
to a set of known targets using a binary tree of direct branches. This
allows the processor to speculately execute valid implementations of the
virtual function without allowing for speculative execution of of calls
to arbitrary addresses.

This patch extends the whole-program devirtualization pass to replace
certain virtual calls with calls to branch funnels, which are
represented using a new llvm.icall.jumptable intrinsic. It also extends
the LowerTypeTests pass to recognize the new intrinsic, generate code
for the branch funnels (x86_64 only for now) and lay out virtual tables
as required for each branch funnel.

The implementation supports full LTO as well as ThinLTO, and extends the
ThinLTO summary format used for whole-program devirtualization to
support branch funnels.

For more details see RFC:
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120672.html

Differential Revision: https://reviews.llvm.org/D42453

llvm-svn: 327163
2018-03-09 19:11:44 +00:00
Volkan Keles
6a307ed4fd GlobalISel: IRTranslate llvm.fabs.* intrinsic
Summary:
Fabs is a common floating-point operation, especially for some expansions. This patch adds
a new generic opcode for llvm.fabs.* intrinsic in order to avoid building/matching this intrinsic.

Reviewers: qcolombet, aditya_nandakumar, dsanders, rovka

Reviewed By: aditya_nandakumar

Subscribers: kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D43864

llvm-svn: 326749
2018-03-05 22:31:55 +00:00
Chih-Hung Hsieh
13e607925f [TLS] use emulated TLS if the target supports only this mode
Emulated TLS is enabled by llc flag -emulated-tls,
which is passed by clang driver.
When llc is called explicitly or from other drivers like LTO,
missing -emulated-tls flag would generate wrong TLS code for targets
that supports only this mode.
Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether
emulated TLS code should be generated.
Unit tests are modified to run with and without the -emulated-tls flag.

Differential Revision: https://reviews.llvm.org/D42999

llvm-svn: 326341
2018-02-28 17:48:55 +00:00
Geoff Berry
097bf66bf4 [MachineOperand][Target] MachineOperand::isRenamable semantics changes
Summary:
Add a target option AllowRegisterRenaming that is used to opt in to
post-register-allocation renaming of registers.  This is set to 0 by
default, which causes the hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq
fields of all opcodes to be set to 1, causing
MachineOperand::isRenamable to always return false.

Set the AllowRegisterRenaming flag to 1 for all in-tree targets that
have lit tests that were effected by enabling COPY forwarding in
MachineCopyPropagation (AArch64, AMDGPU, ARM, Hexagon, Mips, PowerPC,
RISCV, Sparc, SystemZ and X86).

Add some more comments describing the semantics of the
MachineOperand::isRenamable function and how it is set and maintained.

Change isRenamable to check the operand's opcode
hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq bit directly instead of
relying on it being consistently reflected in the IsRenamable bit
setting.

Clear the IsRenamable bit when changing an operand's register value.

Remove target code that was clearing the IsRenamable bit when changing
registers/opcodes now that this is done conservatively by default.

Change setting of hasExtraSrcRegAllocReq in AMDGPU target to be done in
one place covering all opcodes that have constant pipe read limit
restrictions.

Reviewers: qcolombet, MatzeB

Subscribers: aemerson, arsenm, jyknight, mcrosier, sdardis, nhaehnle, javed.absar, tpr, arichardson, kristof.beyls, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, escha, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D43042

llvm-svn: 325931
2018-02-23 18:25:08 +00:00
Oliver Stannard
1a1cf03fd4 [AArch64] Improve v8.1-A code-gen for atomic load-and
Armv8.1-A added an atomic load-clear instruction (which performs bitwise
and with the complement of it's operand), but not a load-and
instruction. Our current code-generation for atomic load-and always
inserts an MVN instruction to invert its argument, even if it could be
folded into a constant or another instruction.

This adds lowering early in selection DAG to convert a load-and
operation into an xor with -1 and a load-clear, allowing the normal DAG
optimisations to work on it.

To do this, I've had to add a new ISD opcode, ATOMIC_LOAD_CLR. I don't
see any easy way to do this with an AArch64-specific ISD node, because
the code-generation for atomic operations assumes the SDNodes are of
type AtomicSDNode.

I've left the old tablegen patterns in because they are still needed for
global isel.

Differential revision: https://reviews.llvm.org/D42478

llvm-svn: 324908
2018-02-12 17:03:11 +00:00
Krzysztof Parzyszek
8b0ac822d5 [NFC] Fix comment of class InstrStage
Patch by Wei-Ren Chen.

Differential Revision: https://reviews.llvm.org/D42905

llvm-svn: 324894
2018-02-12 15:02:49 +00:00
Clement Courbet
d7ed5a1734 [TargetSchedule] Expose sub-units of a ProcResGroup in MCProcResourceDesc.
Summary:
Right now using a ProcResource automatically counts as usage of all
super ProcResGroups. All this is done during codegen, so there is no
way for schedulers to get this information at runtime.

This adds the information of which individual ProcRes units are
contained in a ProcResGroup in MCProcResourceDesc.

Reviewers: gchatelet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43023

llvm-svn: 324582
2018-02-08 08:46:48 +00:00
Volkan Keles
4b7dec5ac6 Add a TargetOption to enable/disable GlobalISel
Summary:
This patch adds a new target option in order to control GlobalISel.
This will allow the users to enable/disable GlobalISel prior to the
backend by calling `TargetMachine::setGlobalISel(bool Enable)`.

No test case as there is already a test to check GlobalISel
command line options.
See: CodeGen/AArch64/GlobalISel/gisel-commandline-option.ll.

Reviewers: qcolombet, aemerson, ab, dsanders

Reviewed By: qcolombet

Subscribers: rovka, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42137

llvm-svn: 322773
2018-01-17 22:34:21 +00:00
Volkan Keles
eea46f246c [GlobalISel][TableGen] Add support for SDNodeXForm
Summary:
This patch adds CustomRenderer which renders the matched
operands to the specified instruction.

Targets can enable the matching of SDNodeXForm by adding
a definition that inherits from GICustomOperandRenderer and
GISDNodeXFormEquiv as follows.

def gi_imm8 : GICustomOperandRenderer<"renderImm8”>,
                       GISDNodeXFormEquiv<imm8_xform>;

Custom renderer functions should be of the form:
void render(MachineInstrBuilder &MIB, const MachineInstr &I);

Reviewers: dsanders, ab, rovka

Reviewed By: dsanders

Subscribers: kristof.beyls, javed.absar, llvm-commits, mgrang, qcolombet

Differential Revision: https://reviews.llvm.org/D42012

llvm-svn: 322582
2018-01-16 18:44:05 +00:00
Sanjoy Das
df40ece177 (Re-landing) Expose a TargetMachine::getTargetTransformInfo function
Re-land r321234.  It had to be reverted because it broke the shared
library build.  The shared library build broke because there was a
missing LLVMBuild dependency from lib/Passes (which calls
TargetMachine::getTargetIRAnalysis) to lib/Target.  As far as I can
tell, this problem was always there but was somehow masked
before (perhaps because TargetMachine::getTargetIRAnalysis was a
virtual function).

Original commit message:

This makes the TargetMachine interface a bit simpler.  We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.

See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html

I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.

Reviewers: echristo, MatzeB, hfinkel

Reviewed By: hfinkel

Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D41464

llvm-svn: 321375
2017-12-22 18:21:59 +00:00
Sanjoy Das
259bcf37bc Revert "Expose a TargetMachine::getTargetTransformInfo function"
This reverts commit r321234.  It breaks the -DBUILD_SHARED_LIBS=ON build.

llvm-svn: 321243
2017-12-21 02:34:39 +00:00
Sanjoy Das
90359bc4d6 Expose a TargetMachine::getTargetTransformInfo function
Summary:
This makes the TargetMachine interface a bit simpler.  We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.

See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html

I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.

Reviewers: echristo, MatzeB, hfinkel

Reviewed By: hfinkel

Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D41464

llvm-svn: 321234
2017-12-21 01:06:58 +00:00
Matt Arsenault
422c27e8aa TableGen: Allow setting SDNodeProperties on intrinsics
Allows preserving MachineMemOperands on intrinsics
through selection. For reasons I don't understand, this
is a static property of the pattern and the selector
deliberately goes out of its way to drop if not present.

Intrinsics already inherit from SDPatternOperator allowing
them to be used directly in instruction patterns. SDPatternOperator
has a list of SDNodeProperty, but you currently can't set them on
the intrinsic. Without SDNPMemOperand, when the node is selected
any memory operands are always dropped. Allowing setting this
on the intrinsics avoids needing to introduce another equivalent
target node just to have SDNPMemOperand set.

llvm-svn: 321212
2017-12-20 19:36:28 +00:00
Alex Bradbury
5a992b5fc4 [TableGen] Give the option of tolerating duplicate register names
A number of architectures re-use the same register names (e.g. for both 32-bit 
FPRs and 64-bit FPRs). They are currently unable to use the tablegen'erated 
MatchRegisterName and MatchRegisterAltName, as tablegen (when built with 
asserts enabled) will fail.

When the AllowDuplicateRegisterNames in AsmParser is set, duplicated register 
names will be tolerated. A backend can then coerce registers to the desired 
register class by (for instance) implementing validateTargetOperandClass.

At least the in-tree Sparc backend could benefit from this, as does RISC-V 
(single and double precision floating point registers).

Differential Revision: https://reviews.llvm.org/D39845

llvm-svn: 320018
2017-12-07 09:51:55 +00:00
Daniel Sanders
f0a9960826 Revert r319691: [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
Some concerns were raised with the direction. Revert while we discuss it and look into an alternative

llvm-svn: 319739
2017-12-05 05:52:07 +00:00
Daniel Sanders
37451b86e7 Allow similar TargetOpcodes to use inheritance to factor out commonality. NFC.
Summary:
While implementing atomicrmw in https://reviews.llvm.org/D40092 I found that
inheritance is unusable for all the Generic Opcodes in GlobalISel. This is
because the whole header is included inside a 'let mayLoad = 0, mayStore = 0 ... in'
block. In TableGen, the order of precedence for field assignments is:
  1. Values from classes the record inherits from.
  2. Values from 'let Name=Value in { ... }'
  3. Values from 'let Name=Value;'
As such the 'let mayLoad = 0, mayStore = 0, ... in' surrounding the
'include "GenericOpcodes.td"' was overriding any values provided via inheritance.
We hadn't noticed this before because we were only using 'let Name=Value;' to
specialize opcodes.

Fix this by moving the default values to the lowest precedence. This is
accomplished by moving the values to a common base class
(StandardPseudoInstruction for most TargetOpcodes, and GenericOpcode for
GlobalISel specific TargetOpcodes)

Reviewers: qcolombet

Reviewed By: qcolombet

Subscribers: llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D40096

llvm-svn: 319701
2017-12-04 21:40:57 +00:00
Daniel Sanders
2a3d1acd34 [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
This patch splits atomics out of the generic G_LOAD/G_STORE and into their own
G_ATOMIC_LOAD/G_ATOMIC_STORE. This is a pragmatic decision rather than a
necessary one. Atomic load/store has little in implementation in common with
non-atomic load/store. They tend to be handled very differently throughout the
backend. It also has the nice side-effect of slightly improving the common-case
performance at ISel since there's no longer a need for an atomicity check in the
matcher table.

All targets have been updated to remove the atomic load/store check from the
G_LOAD/G_STORE path. AArch64 has also been updated to mark
G_ATOMIC_LOAD/G_ATOMIC_STORE legal.

There is one issue with this patch though which also affects the extending loads
and truncating stores. The rules only match when an appropriate G_ANYEXT is
present in the MIR. For example,
  (G_ATOMIC_STORE (G_TRUNC:s16 (G_ANYEXT:s32 (G_ATOMIC_LOAD:s16 X))))
will match but:
  (G_ATOMIC_STORE (G_ATOMIC_LOAD:s16 X))
will not. This shouldn't be a problem at the moment, but as we get better at
eliminating extends/truncates we'll likely start failing to match in some
cases. The current plan is to fix this in a patch that changes the
representation of extending-load/truncating-store to allow the MMO to describe
a different type to the operation.

llvm-svn: 319691
2017-12-04 20:39:32 +00:00
Volkan Keles
c660180f94 GlobalISel: Enable the legalization of G_MERGE_VALUES and G_UNMERGE_VALUES
Summary: LegalizerInfo assumes all G_MERGE_VALUES and G_UNMERGE_VALUES instructions are legal, so it is not possible to legalize vector operations on illegal vector types. This patch fixes the problem by removing the related check and adding default actions for G_MERGE_VALUES and G_UNMERGE_VALUES.

Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, t.p.northover, kristof.beyls

Reviewed By: dsanders

Subscribers: rovka, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39823

llvm-svn: 319524
2017-12-01 08:19:10 +00:00
Daniel Sanders
f43958d13f [globalisel][tablegen] Add support for relative AtomicOrderings
No test yet because the relevant rules are blocked on the atomic_load,
and atomic_store nodes.

llvm-svn: 319475
2017-11-30 21:05:59 +00:00
Daniel Sanders
fe50afca94 [aarch64][globalisel] Legalize G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMICRMW_*
G_ATOMICRMW_* is generally legal on AArch64. The exception is G_ATOMICRMW_NAND.

G_ATOMIC_CMPXCHG_WITH_SUCCESS needs to be lowered to G_ATOMIC_CMPXCHG with an
external comparison.

Note that IRTranslator doesn't generate these instructions yet.

llvm-svn: 319466
2017-11-30 20:11:42 +00:00
Sean Eveson
d3fdef109a [MC] Function stack size section.
Re applying after fixing issues in the diff, sorry for any painful conflicts/merges!

Original RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-August/117028.html

This change adds a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. The section contains pairs of function symbol references (8 byte) and stack sizes (unsigned LEB128).

The contents of this section can be used to measure changes to stack sizes between different versions of the compiler or a source base. The advantage of having a section is that we can extract this information when examining binaries that we didn't build, and it allows users and tools easy access to that information just by referencing the binary.

There is a follow up change to add an option to clang.

Thanks.

Reviewers: hfinkel, MatzeB

Reviewed By: MatzeB

Subscribers: thegameg, asb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39788

llvm-svn: 319430
2017-11-30 13:05:14 +00:00
Sean Eveson
b9a62958c9 Revert r319423: [MC] Function stack size section.
I messed up the diff.

llvm-svn: 319429
2017-11-30 12:43:25 +00:00
Sean Eveson
4b5d214bc5 [MC] Function stack size section.
Summary:
Original RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-August/117028.html

I wasn't sure who to put as reviewers, so please add/remove people as appropriate.

This change adds a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. The section contains pairs of function symbol references (8 byte) and stack sizes (unsigned LEB128).

The contents of this section can be used to measure changes to stack sizes between different versions of the compiler or a source base. The advantage of having a section is that we can extract this information when examining binaries that we didn't build, and it allows users and tools easy access to that information just by referencing the binary.

There is a follow up change to add an option to clang.

Thanks.

Reviewers: hfinkel, MatzeB

Reviewed By: MatzeB

Subscribers: thegameg, asb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39788

llvm-svn: 319423
2017-11-30 12:01:16 +00:00
Daniel Sanders
ce08391be8 [globalisel][tablegen] Add support for importing G_ATOMIC_CMPXCHG, G_ATOMICRMW_* rules from SelectionDAG.
GIM_CheckNonAtomic has been replaced by GIM_CheckAtomicOrdering to allow it to support a wider
range of orderings. This has then been used to import patterns using nodes such
as atomic_cmp_swap, atomic_swap, and atomic_load_*.

llvm-svn: 319232
2017-11-28 22:07:05 +00:00
Daniel Sanders
cd1fb2fd55 [aarch64][globalisel] Define G_ATOMIC_CMPXCHG and G_ATOMICRMW_* and make them legal
The IRTranslator cannot generate these instructions at the moment so there's no
issue with not having implemented ISel for them yet. D40092 will add
G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMICRMW_* to the IRTranslator and a
further patch will add support for lowering G_ATOMIC_CMPXCHG_WITH_SUCCESS into
G_ATOMIC_CMPXCHG with an external success check via the `Lower` action.

The separation of G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMIC_CMPXCHG is
to import SelectionDAG rules while still supporting targets that prefer to
custom lower the original LLVM-IR-like operation.

llvm-svn: 319216
2017-11-28 20:21:15 +00:00
Daniel Sanders
3676c002af [globalisel][tablegen] Generalize pointer-type inference by introducing ptypeN. NFC
ptypeN is functionally the same as typeN except that it informs the
SelectionDAG importer that an operand should be treated as a pointer even
if it was written as iN. This is important for patterns that use iN instead
of iPTR to represent pointers. E.g.:
  (set GPR64:$dst, (load GPR64:$addr))

Previously, this was handled as a hardcoded special case for the appropriate
operands to G_LOAD and G_STORE.

llvm-svn: 318574
2017-11-18 00:16:44 +00:00
David Blaikie
e01dc73ad2 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
Alex Bradbury
8f39d57648 Set hasSideEffects=0 for TargetOpcode::{CFI_INSTRUCTION,EH_LABEL,GC_LABEL,ANNOTATION_LABEL}
D37065 (committed as rL317674) explicitly set hasSideEffects for all 
TargetOpcode::* instructions where it was inferred previously. This is a 
follow-up to that patch, setting hasSideEffects=0 for CFI_INSTRUCTION, 
EH_LABEL, GC_LABEL and ANNOTATION_LABEL. All LLVM tests pass after this 
change.

This patch also modifies MachineInstr::isLabel returns true for a 
TargetOpcode::ANNOTATION_LABEL, which ensures that an annotation label won't 
be incorrectly considered safe to move.

Differential Revision: https://reviews.llvm.org/D39941

llvm-svn: 318174
2017-11-14 19:16:08 +00:00
Daniel Sanders
052a3d35ec [tablegen] Handle atomic predicates for ordering inside tablegen. NFC.
Similar to r315841, GlobalISel and SelectionDAG require different code for the
common atomic predicates due to differences in the representation.
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.

This patch moves the implementation of the common atomic predicates related to
ordering into tablegen so that it can handle these differences.

It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.

llvm-svn: 318102
2017-11-13 23:03:47 +00:00
Daniel Sanders
5d4ac8cff2 [tablegen] Handle atomic predicates for memory type inside tablegen. NFC.
Similar to r315841, GlobalISel and SelectionDAG require different code for the
common atomic predicates due to differences in the representation.
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.

This patch moves the implementation of the common atomic predicates related to
memory type into tablegen so that it can handle these differences.

It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.

llvm-svn: 318095
2017-11-13 22:26:13 +00:00
Jonas Paulsson
1b514619dd [RegAlloc, SystemZ] Increase number of LOCRs by passing "hard" regalloc hints.
* The method getRegAllocationHints() is now of bool type instead of void. If
true is returned, regalloc (AllocationOrder) will *only* try to allocate the
hints, as opposed to merely trying them before non-hinted registers.

* TargetRegisterInfo::getRegAllocationHints() is implemented for SystemZ with
an increase in number of LOCRs.

In this case, it is desired to force the hints even though there is a slight
increase in spilling, because if a non-hinted register would be allocated,
the LOCRMux pseudo would have to be expanded with a jump sequence. The LOCR
(Load On Condition) SystemZ instruction must have both operands in either the
low or high part of the 64 bit register.

Reviewers: Quentin Colombet and Ulrich Weigand
https://reviews.llvm.org/D36795

llvm-svn: 317879
2017-11-10 08:46:26 +00:00
Reid Kleckner
c37cd8d78d Revert "Correct dwarf unwind information in function epilogue for X86"
This reverts r317579, originally committed as r317100.

There is a design issue with marking CFI instructions duplicatable. Not
all targets support the CFIInstrInserter pass, and targets like Darwin
can't cope with duplicated prologue setup CFI instructions. The compact
unwind info emission fails.

When the following code is compiled for arm64 on Mac at -O3, the CFI
instructions end up getting tail duplicated, which causes compact unwind
info emission to fail:
  int a, c, d, e, f, g, h, i, j, k, l, m;
  void n(int o, int *b) {
    if (g)
      f = 0;
    for (; f < o; f++) {
      m = a;
      if (l > j * k > i)
        j = i = k = d;
      h = b[c] - e;
    }
  }

We get assembly that looks like this:
; BB#1:                                 ; %if.then
Lloh3:
	adrp	x9, _f@GOTPAGE
Lloh4:
	ldr	x9, [x9, _f@GOTPAGEOFF]
	mov	 w8, wzr
Lloh5:
	str		wzr, [x9]
	stp	x20, x19, [sp, #-16]!   ; 8-byte Folded Spill
	.cfi_def_cfa_offset 16
	.cfi_offset w19, -8
	.cfi_offset w20, -16
	cmp		w8, w0
	b.lt	LBB0_3
	b	LBB0_7
LBB0_2:                                 ; %entry.if.end_crit_edge
Lloh6:
	adrp	x8, _f@GOTPAGE
Lloh7:
	ldr	x8, [x8, _f@GOTPAGEOFF]
Lloh8:
	ldr		w8, [x8]
	stp	x20, x19, [sp, #-16]!   ; 8-byte Folded Spill
	.cfi_def_cfa_offset 16
	.cfi_offset w19, -8
	.cfi_offset w20, -16
	cmp		w8, w0
	b.ge	LBB0_7
LBB0_3:                                 ; %for.body.lr.ph

Note the multiple .cfi_def* directives. Compact unwind info emission
can't handle that.

llvm-svn: 317726
2017-11-08 21:31:14 +00:00
Alex Bradbury
1e5ce63ec5 Set hasSideEffects=0 for PHI and fix affected passes
Previously, hasSideEffects was ? for TargetOpcode::PHI and would be inferred 
as 1. D37065 sets the previously inferred properties explicitly. This patch sets 
hasSideEffects=0 for PHI, as it is for G_PHI. MachineInstr::isSafeToMove has 
been updated so it still returns false for PHI.

Additionally, HexagonBitSimplify relied on a PHI node having the 
hasUnmodeledSideEffects property. This patch fixes that assumption.

Differential Revision: https://reviews.llvm.org/D37097

llvm-svn: 317721
2017-11-08 20:19:16 +00:00
Alex Bradbury
1b9512e762 [NFCI] Ensure TargetOpcode::* are compatible with guessInstructionProperties=0
rL162640 introduced CodeGenTarget::guessInstructionProperties. If a target 
sets guessInstructionProperties=0 in its FooInstrInfo, tablegen will error if 
it has to guess properties from patterns. Unfortunately, 
guessInstructionProperties=0 can't be used with current upstream LLVM as 
instructions in the TargetOpcode namespace are always included and sometimes 
have inferred properties for mayLoad, mayStore, and hasSideEffects. This patch 
provides the simplest possible fix to this problem, setting default values for 
these fields in the TargetOpcode scope. There is no intended functional 
change, as the explicitly set properties should match what was previously 
inferred. A number of the instructions had hasSideEffects=1 inferred 
unintentionally. This patch makes it explicit, while future patches (such as 
D37097) correct the property.

Differential Revision: https://reviews.llvm.org/D37065

llvm-svn: 317674
2017-11-08 09:26:06 +00:00
Matt Arsenault
38099e08bc DAG: Add computeKnownBitsForFrameIndex
Some of the AMDGPU stack addressing modes require knowing the sign
bit is zero. We used to accomplish this by custom lowering
frame indexes, and then putting an AssertZext around a
TargetFrameIndex. This required specifically looking for
the AssextZext + frame index pattern which was moderately
disgusting. The same could probably be accomplished
with a target specific node, but would still
require special handling of frame indexes.

llvm-svn: 317671
2017-11-08 08:52:31 +00:00
David Blaikie
45b647d5eb Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering
This header includes CodeGen headers, and is not, itself, included by
any Target headers, so move it into CodeGen to match the layering of its
implementation.

llvm-svn: 317647
2017-11-08 01:01:31 +00:00
Petar Jovanovic
5b6a90db77 Reland "Correct dwarf unwind information in function epilogue for X86"
Reland r317100 with minor fix regarding ComputeCommonTailLength function in
BranchFolding.cpp. Skipping top CFI instructions block needs to executed on
several more return points in ComputeCommonTailLength().

Original r317100 message:

"Correct dwarf unwind information in function epilogue for X86"

This patch aims to provide correct dwarf unwind information in function
epilogue for X86.

It consists of two parts. The first part inserts CFI instructions that set
appropriate cfa offset and cfa register in emitEpilogue() in
X86FrameLowering. This part is X86 specific.

The second part is platform independent and ensures that:

- CFI instructions do not affect code generation
- Unwind information remains correct when a function is modified by
  different passes. This is done in a late pass by analyzing information
  about cfa offset and cfa register in BBs and inserting additional CFI
  directives where necessary.

Changed CFI instructions so that they:

- are duplicable
- are not counted as instructions when tail duplicating or tail merging
- can be compared as equal

Added CFIInstrInserter pass:

- analyzes each basic block to determine cfa offset and register valid at
  its entry and exit
- verifies that outgoing cfa offset and register of predecessor blocks match
  incoming values of their successors
- inserts additional CFI directives at basic block beginning to correct the
  rule for calculating CFA

Having CFI instructions in function epilogue can cause incorrect CFA
calculation rule for some basic blocks. This can happen if, due to basic
block reordering, or the existence of multiple epilogue blocks, some of the
blocks have wrong cfa offset and register values set by the epilogue block
above them.

CFIInstrInserter is currently run only on X86, but can be used by any target
that implements support for adding CFI instructions in epilogue.

Patch by Violeta Vukobrat.

llvm-svn: 317579
2017-11-07 14:40:27 +00:00
David Blaikie
10180bb2a1 Move TargetFrameLowering.h to CodeGen where it's implemented
This header already includes a CodeGen header and is implemented in
lib/CodeGen, so move the header there to match.

This fixes a link error with modular codegeneration builds - where a
header and its implementation are circularly dependent and so need to be
in the same library, not split between two like this.

llvm-svn: 317379
2017-11-03 22:32:11 +00:00
Petar Jovanovic
8c7ceb5bd9 Revert "Correct dwarf unwind information in function epilogue for X86"
This reverts r317100 as it introduced sanitizer-x86_64-linux-autoconf
buildbot failure (build #15606).

llvm-svn: 317136
2017-11-01 23:05:52 +00:00
Petar Jovanovic
d61746acde Correct dwarf unwind information in function epilogue for X86
This patch aims to provide correct dwarf unwind information in function
epilogue for X86.

It consists of two parts. The first part inserts CFI instructions that set
appropriate cfa offset and cfa register in emitEpilogue() in
X86FrameLowering. This part is X86 specific.

The second part is platform independent and ensures that:

- CFI instructions do not affect code generation
- Unwind information remains correct when a function is modified by
  different passes. This is done in a late pass by analyzing information
  about cfa offset and cfa register in BBs and inserting additional CFI
  directives where necessary.

Changed CFI instructions so that they:

- are duplicable
- are not counted as instructions when tail duplicating or tail merging
- can be compared as equal

Added CFIInstrInserter pass:

- analyzes each basic block to determine cfa offset and register valid at
  its entry and exit
- verifies that outgoing cfa offset and register of predecessor blocks match
  incoming values of their successors
- inserts additional CFI directives at basic block beginning to correct the
  rule for calculating CFA

Having CFI instructions in function epilogue can cause incorrect CFA
calculation rule for some basic blocks. This can happen if, due to basic
block reordering, or the existence of multiple epilogue blocks, some of the
blocks have wrong cfa offset and register values set by the epilogue block
above them.

CFIInstrInserter is currently run only on X86, but can be used by any target
that implements support for adding CFI instructions in epilogue.


Patch by Violeta Vukobrat.

Differential Revision: https://reviews.llvm.org/D35844

llvm-svn: 317100
2017-11-01 16:04:11 +00:00
David Blaikie
5974607431 TargetOpcodes.h: Don't mark header functions as file local
llvm-svn: 316514
2017-10-24 21:29:19 +00:00
Daniel Sanders
572038831c [globalisel][tablegen] Import stores and allow GISel to automatically substitute zero regs like WZR/XZR/$zero.
This patch enables the import of stores. Unfortunately, doing so by itself,
loses an optimization where storing 0 to memory makes use of WZR/XZR.

To mitigate this, this patch also introduces a new feature that allows register
operands to nominate a zero register. When this is done, GlobalISel will
substitute (G_CONSTANT 0) with the nominated register automatically. This
is currently configured to only apply to the stores.

Applying it to GPR32/GPR64 register classes in general will be done after
review see (https://reviews.llvm.org/D39150).

llvm-svn: 316360
2017-10-23 18:19:24 +00:00
Marina Yatsina
b5c80eef49 Add logic to greedy reg alloc to avoid bad eviction chains
This fixes bugzilla 26810
https://bugs.llvm.org/show_bug.cgi?id=26810

This is intended to prevent sequences like:
movl %ebp, 8(%esp) # 4-byte Spill
movl %ecx, %ebp
movl %ebx, %ecx
movl %edi, %ebx
movl %edx, %edi
cltd
idivl %esi
movl %edi, %edx
movl %ebx, %edi
movl %ecx, %ebx
movl %ebp, %ecx
movl 16(%esp), %ebp # 4 - byte Reload

Such sequences are created in 2 scenarios:

Scenario #1:
vreg0 is evicted from physreg0 by vreg1
Evictee vreg0 is intended for region splitting with split candidate physreg0 (the reg vreg0 was evicted from)
Region splitting creates a local interval because of interference with the evictor vreg1 (normally region spliiting creates 2 interval, the "by reg" and "by stack" intervals. Local interval created when interference occurs.)
one of the split intervals ends up evicting vreg2 from physreg1
Evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills

Scenario #2
vreg0 is evicted from physreg0 by vreg1
vreg2 is evicted from physreg2 by vreg3 etc
Evictee vreg0 is intended for region splitting with split candidate physreg1
Region splitting creates a local interval because of interference with the evictor vreg1
one of the split intervals ends up evicting back original evictor vreg1 from physreg0 (the reg vreg0 was evicted from)
Another evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills

As compile time was a concern, I've added a flag to control weather we do cost calculations for local intervals we expect to be created (it's on by default for X86 target, off for the rest).

Differential Revision: https://reviews.llvm.org/D35816

Change-Id: Id9411ff7bbb845463d289ba2ae97737a1ee7cc39
llvm-svn: 316295
2017-10-22 17:59:38 +00:00
Daniel Sanders
ca098ff70b [globalisel][tablegen] Map ld and st to G_LOAD and G_STORE. NFC
Summary:
There is an important mismatch between ISD::LOAD and G_LOAD (and likewise for
ISD::STORE and G_STORE). In SelectionDAG, ISD::LOAD is a non-atomic load
and atomic loads are handled by a separate node. However, this is not true of
GlobalISel's G_LOAD. For G_LOAD, the MachineMemOperand indicates the atomicity
of the operation. As a result, this mapping must also add a predicate that
checks for non-atomic MachineMemOperands.

This is NFC since these nodes always have predicates in practice and are
therefore always rejected at the moment.

Depends on D37443

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: kristof.beyls, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37445

llvm-svn: 315843
2017-10-15 02:41:12 +00:00
Daniel Sanders
fc690adc20 [tablegen] Handle common load/store predicates inside tablegen. NFC.
Summary:
GlobalISel and SelectionDAG require different code for the common
load/store predicates due to differences in the representation.
For example:
   SelectionDAG: (load<signext,i8>:i32 GPR32:$addr) // The <> denote properties of the SDNode that are not printed in the DAG
   GlobalISel: (G_SEXT:s32 (G_LOAD:s8 GPR32:$addr))
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.

This patch moves the implementation of the common load/store predicates
into tablegen so that it can handle these differences.

It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.

Depends on D36618

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Subscribers: llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37443

Includes a partial revert of r315826 since this patch makes it necessary for
getPredCode() to return a std::string and getImmCode() should have the same
interface as getPredCode().

llvm-svn: 315841
2017-10-15 02:06:44 +00:00
Daniel Sanders
eb33df745e [globalisel][tablegen] Add support for fpimm and import of APInt/APFloat based ImmLeaf.
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D36569

llvm-svn: 315761
2017-10-13 21:28:03 +00:00
Daniel Sanders
4c2aa5c732 [aarch64] Support APInt and APFloat in ImmLeaf subclasses and make AArch64 use them.
Summary:
The purpose of this patch is to expose more information about ImmLeaf-like
PatLeaf's so that GlobalISel can learn to import them. Previously, ImmLeaf
could only be used to test int64_t's produced by sign-extending an APInt.
Other tests on immediates had to use the generic PatLeaf and extract the
constant using C++.

With this patch, tablegen will know how to generate predicates for APInt,
and APFloat. This will allow it to 'do the right thing' for both SelectionDAG
and GlobalISel which require different methods of extracting the immediate
from the IR.

This is NFC for SelectionDAG since the new code is equivalent to the
previous code. It's also NFC for FastISel because FastIselShouldIgnore is 1
for the ImmLeaf subclasses. Enabling FastIselShouldIgnore == 0 for these new
subclasses will require a significant re-factor of FastISel.

For GlobalISel, it's currently NFC because the relevant code to import the
affected rules is not yet present. This will be added in a later patch.

Depends on D36086

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: bjope, aemerson, rengolin, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D36534

llvm-svn: 315747
2017-10-13 20:42:18 +00:00
Matt Arsenault
1086f79305 DAG: Add opcode and source type to isFPExtFree
This is only currently used for mad/fma transforms.
This is the only case where it should be used for AMDGPU,
so add an opcode to be sure.

llvm-svn: 315740
2017-10-13 19:55:45 +00:00
Craig Topper
bbfe102642 [SelectionDAG] Const-correct the DemandedMask argument to one of the overloads of SimplifyDemandedBits. NFC
llvm-svn: 315641
2017-10-12 23:46:05 +00:00
Matthias Braun
6cdc04a20c Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine"
Reverting to investigate layering effects of MCJIT not linking
libCodeGen but using TargetMachine::getNameWithPrefix() breaking the
lldb bots.

This reverts commit r315633.

llvm-svn: 315637
2017-10-12 22:57:28 +00:00
Matthias Braun
e1c491ab05 TargetMachine: Merge TargetMachine and LLVMTargetMachine
Merge LLVMTargetMachine into TargetMachine.

- There is no in-tree target anymore that just implements TargetMachine
  but not LLVMTargetMachine.
- It should still be possible to stub out all the various functions in
  case a target does not want to use lib/CodeGen
- This simplifies the code and avoids methods ending up in the wrong
  interface.

Differential Revision: https://reviews.llvm.org/D38489

llvm-svn: 315633
2017-10-12 22:28:54 +00:00
Wei Ding
725c541dbe Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.
Differential Revision: http://reviews.llvm.org/D37348

llvm-svn: 315610
2017-10-12 19:37:14 +00:00
Wei Mi
f44870d1d1 Revert r307036 because of PR34919.
llvm-svn: 315540
2017-10-12 00:24:52 +00:00
Alex Bradbury
d4ed8e7017 [TargetLowering] Correctly track NumFixedArgs field of CallLoweringInfo
The NumFixedArgs field of CallLoweringInfo is used by
TargetLowering::LowerCallTo to determine whether a given argument is passed
using the vararg calling convention or not (specifically, to set IsFixed for
each ISD::OutputArg).

Firstly, CallLoweringInfo::setLibCallee and CallLoweringInfo::setCallee both
incorrectly set NumFixedArgs based on the _previous_ args list. Secondly,
TargetLowering::LowerCallTo failed to increment NumFixedArgs when modifying
the argument list so a pointer is passed for the return value.

If your backend uses the IsFixed property or directly accesses NumFixedArgs, 
it is _possible_ this change could result in codegen changes (although the 
previous behaviour would have been incorrect). No such cases have been
identified during code review for any in-tree architecture.

Differential Revision: https://reviews.llvm.org/D37898

llvm-svn: 315457
2017-10-11 13:48:45 +00:00
Oliver Stannard
20a7d7760f [AsmParser] Add DiagnosticString to register classes in tablegen
This allows a DiagnosticType and/or DiagnosticString to be associated
with a RegisterClass in tablegen, so that we can emit diagnostics in the
assembler when a register operand is incorrect.

DiagnosticType creates a predictable enum value, which gets returned as
the error code when an operand does not match, and can be used by the
assembly parser to map to a user-facing diagnostic. DiagnosticString
creates an anonymous enum value (currently based on the tablegen class
name), and a function to map from enum values to strings will be
generated. Both of these work the same was as they do for AsmOperand.

This isn't used by any targets yet, but has one (positive) side-effect.
It improves the diagnostic codes returned by validateOperandClass - we
always want to emit the diagnostic that relates to the expected operand
class, but this wasn't always being done when the expected and actual
classes were completely different (token/register/custom). This causes a
few AArch64 diagnostics to be improved, as Match_InvalidOperand was
being returned instead of a specific diagnostic type.

Differential revision: https://reviews.llvm.org/D36691

llvm-svn: 315295
2017-10-10 11:00:40 +00:00
Jessica Paquette
5ca1dd864d [MachineOutliner] Disable outlining from LinkOnceODRs by default
Say you have two identical linkonceodr functions, one in M1 and one in M2.
Say that the outliner outlines A,B,C from one function, and D,E,F from another
function (where letters are instructions). Now those functions are not
identical, and cannot be deduped. Locally to M1 and M2, these outlining
choices would be good-- to the whole program, however, this might not be true!

To mitigate this, this commit makes it so that the outliner sees linkonceodr
functions as unsafe to outline from. It also adds a flag,
-enable-linkonceodr-outlining, which allows the user to specify that they
want to outline from such functions when they know what they're doing.

Changing this handles most code size regressions in the test suite caused by
competing with linker dedupe. It also doesn't have a huge impact on the code
size improvements from the outliner. There are 6 tests that regress > 5% from
outlining WITH linkonceodrs to outlining WITHOUT linkonceodrs. Overall, most
tests either improve or are not impacted.

Not outlined vs outlined without linkonceodrs:
https://hastebin.com/raw/qeguxavuda

Not outlined vs outlined with linkonceodrs:
https://hastebin.com/raw/edepoqoqic

Outlined with linkonceodrs vs outlined without linkonceodrs:
https://hastebin.com/raw/awiqifiheb

Numbers generated using compare.py with -m size.__text. Tests run for AArch64
with -Oz -mllvm -enable-machine-outliner -mno-red-zone.

llvm-svn: 315136
2017-10-07 00:16:34 +00:00
Oliver Stannard
a8cc0cc3a1 [AsmParser] Add DiagnosticString to AsmOperands in tablegen
This adds a DiagnosticString member to the AsmOperand tablegen class, so
that the diagnostic text to be used when an assembly operand is
incorrect can be stored in the tablegen description of the operand,
rather than in a separate switch statement in the AsmParser.

If DiagnosticString is used for any operands, tablegen will emit a
getMatchKindDiag function, to map from diagnostic enums to strings.

Differential revision: https://reviews.llvm.org/D31606

llvm-svn: 314803
2017-10-03 14:34:57 +00:00
Oliver Stannard
14e90c0414 [Assembler] Report multiple near misses for invalid instructions
The current table-generated assembly instruction matcher returns a
64-bit error code when matching fails. Since multiple instruction
encodings with the same mnemonic can fail for different reasons, it uses
some heuristics to decide which message is important.

This heuristic does not work well for targets that have many encodings
with the same mnemonic but different operands, or which have different
versions of instructions controlled by subtarget features, as it is hard
to know which encoding the user was intending to use.

Instead of trying to improve the heuristic in the table-generated
matcher, this patch changes it to report a list of near-miss encodings.
This list contains an entry for each encoding with the correct mnemonic,
but with exactly one thing preventing it from being valid. This thing
could be a single invalid operand, a missing target feature or a failed
target-specific validation function.

The target-specific assembly parser can then report an error message
giving multiple options for instruction variants that the user may have
been trying to use. For example, I am working on a patch to use this for
ARM, which can give this error for an invalid instruction for ARMv6-M:

  <stdin>:8:3: error: invalid instruction, multiple near-miss encodings found
    adds r0, r1, #0x8
    ^
  <stdin>:8:3: note: for one encoding: instruction requires: thumb2
    adds r0, r1, #0x8
    ^
  <stdin>:8:16: note: for one encoding: expected an integer in range [0, 7]
    adds r0, r1, #0x8
                 ^
  <stdin>:8:16: note: for one encoding: expected a register in range [r0, r7]
    adds r0, r1, #0x8
                 ^

This also allows the target-specific assembly parser to apply its own
heuristics to suppress some errors. For example, the error "instruction
requires: arm-mode" is never going to be useful when targeting an
M-profile architecture (which does not have ARM mode).

This patch just adds the target-independent mechanism for doing this,
all targets still use the old mechanism. I've added a bit in the
AsmParser tablegen class to allow targets to switch to this new
mechanism. To use this, the target-specific assembly parser will have to
be modified for the change in signature of MatchInstructionImpl, and to
report errors based on the list of near-misses.

Differential revision: https://reviews.llvm.org/D27620

llvm-svn: 314774
2017-10-03 09:33:12 +00:00
Jonas Paulsson
46e43049cf [SystemZ] implement shouldCoalesce()
Implement shouldCoalesce() to help regalloc avoid running out of GR128
registers.

If a COPY involving a subreg of a GR128 is coalesced, the live range of the
GR128 virtual register will be extended. If this happens where there are
enough phys-reg clobbers present, regalloc will run out of registers (if
there is not a single GR128 allocatable register available).

This patch tries to allow coalescing only when it can prove that this will be
safe by checking the (local) interval in question.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D37899
https://bugs.llvm.org/show_bug.cgi?id=34610

llvm-svn: 314516
2017-09-29 14:31:39 +00:00
Jessica Paquette
0fa8483b0c [MachineOutliner] AArch64: Avoid saving + restoring LR if possible
This commit allows the outliner to avoid saving and restoring the link register
on AArch64 when it is dead within an entire class of candidates.

This introduces changes to the way the outliner interfaces with the target.
For example, the target now interfaces with the outliner using a
MachineOutlinerInfo struct rather than by using getOutliningCallOverhead and
getOutliningFrameOverhead.

This also improves several comments on the outliner's cost model.

https://reviews.llvm.org/D36721

llvm-svn: 314341
2017-09-27 20:47:39 +00:00
Quentin Colombet
ed9a033b66 [GlobalISel] Update the documentation and comment for G_[UN]MERGE_VALUES
In r296921, we added the G_[UN]MERGE_VALUES node, but did not update the
documentation. Fixing that.

NFC.

llvm-svn: 314168
2017-09-25 22:03:06 +00:00
Quentin Colombet
2e2793f727 [GlobalISel] Update the documentation and comments for G_EXTRACT
In r297100, G_EXTRACT changed from a multiple results instruction to a
single result one. Update the documentation accordingly.

NFC.

llvm-svn: 314166
2017-09-25 22:03:01 +00:00
Tim Shen
7c634910d0 [XRay] support conditional return on PPC.
Summary: Conditional returns were not taken into consideration at all. Implement them by turning them into jumps and normal returns. This means there is a slightly higher performance penalty for conditional returns, but this is the best we can do, and it still disturbs little of the rest.

Reviewers: dberris, echristo

Subscribers: sanjoy, nemanjai, hiraditya, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D38102

llvm-svn: 314005
2017-09-22 18:30:02 +00:00
Daniel Sanders
ca6a043f07 [globalisel] Add a G_BSWAP instruction and support bswap using it.
llvm-svn: 313633
2017-09-19 14:25:15 +00:00
Daniel Sanders
a591b205f2 [globalisel] Add support for intrinsic_void
llvm-svn: 313629
2017-09-19 13:23:01 +00:00
Daniel Sanders
4b1144e7ae [globalisel] Add support for intrinsic_w_chain.
This maps directly to G_INTRINSIC_W_SIDE_EFFECTS.

llvm-svn: 313627
2017-09-19 12:56:36 +00:00
Sanjay Patel
7ae3cb976f [DAG, x86] allow store merging before and after legalization (PR34217)
rL310710 allowed store merging to occur after legalization to catch stores that are created late,
but this exposes a logic hole seen in PR34217:
https://bugs.llvm.org/show_bug.cgi?id=34217

We will miss merging stores if the target lowers vector extracts into target-specific operations.
This patch allows store merging to occur both before and after legalization if the target chooses
to get maximum merging.

I don't think the potential regressions in the other tests are relevant. The tests are for
correctness of weird IR constructs rather than perf tests, and I think those are still correct.

Differential Revision: https://reviews.llvm.org/D37987

llvm-svn: 313564
2017-09-18 20:54:26 +00:00
Jan Sjodin
242b2dcd0b Add AddresSpace to PseudoSourceValue.
Differential Revision: https://reviews.llvm.org/D35089

llvm-svn: 313297
2017-09-14 20:53:51 +00:00
Krzysztof Parzyszek
97d598594a Subtarget support for parameterized register class information
Implement "checkFeatures" and emitting HW mode check code.

Differential Revision: https://reviews.llvm.org/D31959

llvm-svn: 313295
2017-09-14 20:44:20 +00:00
Benjamin Kramer
b6a866ac7b Remove usages of deprecated std::unary_function and std::binary_function.
These are removed in C++17. We still have some users of
unary_function::argument_type, so just spell that typedef out. No
functionality change intended.

Note that many of the argument types are actually wrong :)

llvm-svn: 313287
2017-09-14 18:33:25 +00:00
Krzysztof Parzyszek
f817a9b5c3 TableGen support for parameterized register class information
This replaces TableGen's type inference to operate on parameterized
types instead of MVTs, and as a consequence, some interfaces have
changed:
- Uses of MVTs are replaced by ValueTypeByHwMode.
- EEVT::TypeSet is replaced by TypeSetByHwMode.

This affects the way that types and type sets are printed, and the
tests relying on that have been updated.

There are certain users of the inferred types outside of TableGen
itself, namely FastISel and GlobalISel. For those users, the way
that the types are accessed have changed. For typical scenarios,
these replacements can be used:
- TreePatternNode::getType(ResNo) -> getSimpleType(ResNo)
- TreePatternNode::hasTypeSet(ResNo) -> hasConcreteType(ResNo)
- TypeSet::isConcrete -> TypeSetByHwMode::isValueTypeByHwMode(false)

For more information, please refer to the review page.

Differential Revision: https://reviews.llvm.org/D31951

llvm-svn: 313271
2017-09-14 16:56:21 +00:00
Stanislav Mekhanoshin
fbfa163a41 Allow target to decide when to cluster loads/stores in misched
MachineScheduler when clustering loads or stores checks if base
pointers point to the same memory. This check is done through
comparison of base registers of two memory instructions. This
works fine when instructions have separate offset operand. If
they require a full calculated pointer such instructions can
never be clustered according to such logic.

Changed shouldClusterMemOps to accept base registers as well and
let it decide what to do about it.

Differential Revision: https://reviews.llvm.org/D37698

llvm-svn: 313208
2017-09-13 22:20:47 +00:00
Ahmed Bougacha
698d6831e1 [AArch64][GlobalISel] Select all fptruncs.
We already support these in tablegen, but we're matching the wrong
operator (libm ftrunc).  Fix that.

While there, drop the c++ code, support COPYs of FPR16, and add tests
for the other types.

llvm-svn: 313073
2017-09-12 21:04:10 +00:00
Sam Clegg
7b5d755577 [WebAssembly] Remove flags from MCSectionWasm
Looks like these were copied from the ELF sections but
don't apply to Wasm and were not used anywhere.

Also remove unused Wasm methods in MCContext.

Differential Revision: https://reviews.llvm.org/D37633

llvm-svn: 313058
2017-09-12 18:31:24 +00:00
Krzysztof Parzyszek
613168dabd Fix a couple of comments, NFC
llvm-svn: 313030
2017-09-12 14:10:48 +00:00
Reid Kleckner
a0dfc5a477 Add llvm.codeview.annotation to implement MSVC __annotation
Summary:
This intrinsic represents a label with a list of associated metadata
strings. It is modelled as reading and writing inaccessible memory so
that it won't be removed as dead code. I think the intention is that the
annotation strings should appear at most once in the debug info, so I
marked it noduplicate. We are allowed to inline code with annotations as
long as we strip the annotation, but that can be done later.

Reviewers: majnemer

Subscribers: eraman, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D36904

llvm-svn: 312569
2017-09-05 20:14:58 +00:00
Matt Arsenault
1286bf2696 DAG: Fix naming crime
Because isOperationCustom was only checking for custom
lowering on illegal types, this was behaving inconsistently
with the other isOperation* functions, so that
isOperationLegalOrCustom != (isOperationLegal || isOperationCustom)

Luckily this is only used in one place which already checks the
type legality on its own.

llvm-svn: 311743
2017-08-25 01:26:13 +00:00
Sanjay Patel
7708490317 [DAG] convert vector select-of-constants to logic/math
This goes back to a discussion about IR canonicalization. We'd like to preserve and convert
more IR to 'select' than we currently do because that's likely the best choice in IR:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105335.html
...but that's often not true for codegen, so we need to account for this pattern coming in
to the backend and transform it to better DAG ops.

Steps in this patch:

  1. Add an EVT param to the existing convertSelectOfConstantsToMath() TLI hook to more finely
     enable this transform. Other targets will probably want that anyway to distinguish scalars
     from vectors. We're using that here to exclude AVX512 targets, but it may not be necessary.

  2. Convert a vselect to ext+add. This eliminates a constant load/materialization, and the
     vector ext is often free.

Implementing a more general fold using xor+and can be a follow-up for targets that don't have
a legal vselect. It's also possible that we can remove the TLI hook for the special case fold
implemented here because we're eliminating a constant, but it needs to be tested on other
targets.

Differential Revision: https://reviews.llvm.org/D36840

llvm-svn: 311731
2017-08-24 23:24:43 +00:00
Aditya Nandakumar
f4dfc939c9 [GISEl]: Translate phi into G_PHI
G_PHI has the same semantics as PHI but also has types.
This lets us verify that the types in the G_PHI are consistent.
This also allows specifying legalization actions for G_PHIs.

https://reviews.llvm.org/D36990

llvm-svn: 311596
2017-08-23 20:45:48 +00:00
Matthias Braun
ccf939b56f TargetInstrInfo: Change duplicate() to work on bundles.
Adds infrastructure to clone whole instruction bundles rather than just
single instructions. This fixes a bug where tail duplication would
unbundle instructions while cloning.

This should unbreak the "Clang Stage 1: cmake, RA, with expensive checks
enabled" build on greendragon. The bot broke with r311139 hitting this
pre-existing bug.

A proper testcase will come next.

llvm-svn: 311511
2017-08-22 23:56:30 +00:00
Matt Arsenault
1708c62dd1 IPRA: Allow target to enable IPRA by default
llvm-svn: 310876
2017-08-14 19:54:47 +00:00
Craig Topper
84c8374521 [X86][ARM][TargetLowering] Add SrcVT to isExtractSubvectorCheap
Summary:
Without the SrcVT its hard to know what is really being asked for. For example if your target has 128, 256, and 512 bit vectors. Maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not.

For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors.

Reviewers: RKSimon, zvi, efriedma

Reviewed By: RKSimon

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D36649

llvm-svn: 310793
2017-08-13 17:29:07 +00:00
Nirav Dave
c5dcb602f8 [X86][DAG] Switch X86 Target to post-legalized store merge
Move store merge to happen after intrinsic lowering to allow lowered
stores to be merged.

Some regressions due in MergeConsecutiveStores to missing
insert_subvector that are addressed in follow up patch.

Reviewers: craig.topper, efriedma, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34559

llvm-svn: 310710
2017-08-11 13:21:35 +00:00
Krzysztof Parzyszek
3786e693af Add "Restored" flag to CalleeSavedInfo
The liveness-tracking code assumes that the registers that were saved
in the function's prolog are live outside of the function. Specifically,
that registers that were saved are also live-on-exit from the function.
This isn't always the case as illustrated by the LR register on ARM.

Differential Revision: https://reviews.llvm.org/D36160

llvm-svn: 310619
2017-08-10 16:17:32 +00:00
Jonas Paulsson
54a000e514 [LSR / TTI / SystemZ] Eliminate TargetTransformInfo::isFoldableMemAccess()
isLegalAddressingMode() has recently gained the extra optional Instruction*
parameter, and therefore it can now do the job that previously only
isFoldableMemAccess() could do.

The SystemZ implementation of isLegalAddressingMode() has gained the
functionality of checking for offsets, which used to be done with
isFoldableMemAccess().

The isFoldableMemAccess() hook has been removed everywhere.

Review: Quentin Colombet, Ulrich Weigand
https://reviews.llvm.org/D35933

llvm-svn: 310463
2017-08-09 11:28:01 +00:00
Daniel Sanders
364773e4e5 [globalisel][tablegen] Add support for importing 'imm' operands.
Summary:
This patch enables the import of rules containing 'imm' operands that do not
constrain the acceptable values using predicates. Support for ImmLeaf will
arrive in a later patch.

Depends on D35681

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35833

llvm-svn: 310343
2017-08-08 10:44:31 +00:00
Joel Jones
cee6711d56 [AArch64] LSE Atomics reorg - part 1
Add memory synchronization semantics to LSE Atomics.

The memory semantics feature will be added in a subsequent patch.

In this patch, several corrections were added to the existing LSE Atomics
implementation, based on the ARM Errata D11904 from 05/12/2017.

Patch by: steleman

Differential Revision: https://reviews.llvm.org/D35319

llvm-svn: 310167
2017-08-05 04:30:55 +00:00
Rafael Espindola
f2011a3ae7 Delete Default and JITDefault code models
IMHO it is an antipattern to have a enum value that is Default.

At any given piece of code it is not clear if we have to handle
Default or if has already been mapped to a concrete value. In this
case in particular, only the target can do the mapping and it is nice
to make sure it is always done.

This deletes the two default enum values of CodeModel and uses an
explicit Optional<CodeModel> when it is possible that it is
unspecified.

llvm-svn: 309911
2017-08-03 02:16:21 +00:00
Quentin Colombet
6d5f774fb6 [TargetPassConfig] Feature generic options to setup start/stop-after/before
This patch refactors the code used in llc such that all the users of the
addPassesToEmitFile API have access to a homogeneous way of handling
start/stop-after/before options right out of the box.

In particular, just invoking addPassesToEmitFile will set the proper
pipeline without additional effort (modulo parsing a .mir file if the
start-before/after options are used.

NFC.

Differential Revision: https://reviews.llvm.org/D30913

llvm-svn: 309599
2017-07-31 18:24:07 +00:00
Jessica Paquette
09730f1b79 [MachineOutliner] NFC: Change IsTailCall to a call class + frame class
This commit

- Removes IsTailCall and replaces it with a target-defined unsigned
- Refactors getOutliningCallOverhead and getOutliningFrameOverhead so that they don't use IsTailCall
- Adds a call class + frame class classification to OutlinedFunction and Candidate respectively

This accomplishes a couple things.

Firstly, we don't need the notion of *tail call* in the general outlining algorithm.

Secondly, we now can have different "outlining classes" for each candidate within a set of candidates.
This will make it easy to add new ways to outline sequences for certain targets and dynamically choose
an appropriate cost model for a sequence depending on the context that that sequence lives in.

Ultimately, this should get us closer to being able to do something like, say avoid saving the link
register when outlining AArch64 instructions.

llvm-svn: 309475
2017-07-29 02:55:46 +00:00
Jessica Paquette
ffc8e4d730 [MachineOutliner] NFC: Split up getOutliningBenefit
This is some more cleanup in preparation for some actual
functional changes. This splits getOutliningBenefit into
two cost functions: getOutliningCallOverhead and
getOutliningFrameOverhead. These functions return the
number of instructions that would be required to call
a specific function and the number of instructions
that would be required to construct a frame for a
specific funtion. The actual outlining benefit logic
is moved into the outliner, which calls these functions.

The goal of refactoring getOutliningBenefit is to:

- Get us closer to getting rid of the IsTailCall flag

- Further split up "target-specific" things and
"general algorithm" things

llvm-svn: 309356
2017-07-28 03:21:58 +00:00
Peter Collingbourne
b84653db71 Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI.
This was a use-after-free waiting to happen.

llvm-svn: 309159
2017-07-26 19:15:29 +00:00
Zvi Rackover
f421e46c8f DAGCombiner: Extend reduceBuildVecToTrunc to handle non-zero offset
Summary:
Adding support for combining power2-strided build_vector's where the
first build_vectori's operand is extracted from a non-zero index.

Example:

 v4i32 build_vector((extract_elt V, 1),
                    (extract_elt V, 3),
                    (extract_elt V, 5),
                    (extract_elt V, 7))
 -->
 v4i32 truncate (bitcast (shuffle<1,u,3,u,5,u,7,u> V, u) to v4i64)

Reviewers: delena, RKSimon, guyblank

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35700

llvm-svn: 309108
2017-07-26 12:57:03 +00:00
Zvi Rackover
422bf896b2 TargetLowering: Change isShuffleMaskLegal's mask argument type to ArrayRef<int>. NFCI.
Changing mask argument type from const SmallVectorImpl<int>& to
ArrayRef<int>.

This came up in D35700 where a mask is received as an ArrayRef<int> and
we want to pass it to TargetLowering::isShuffleMaskLegal().
Also saves a few lines of code.

llvm-svn: 309085
2017-07-26 08:06:58 +00:00
Jonas Paulsson
c38a4eb7d4 [SystemZ, LoopStrengthReduce]
This patch makes LSR generate better code for SystemZ in the cases of memory
intrinsics, Load->Store pairs or comparison of immediate with memory.

In order to achieve this, the following common code changes were made:

 * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if
 LSR should do instruction-based addressing evaluations by calling
 isLegalAddressingMode() with the Instruction pointers.
 * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy
 as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address,
 not just loads or stores.

SystemZ changes:

 * isLSRCostLess() implemented with Insns first, and without ImmCost.
 * New function supportedAddressingMode() that is a helper for TTI methods
 looking at Instructions passed via pointers.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D35262
https://reviews.llvm.org/D35049

llvm-svn: 308729
2017-07-21 11:59:37 +00:00
Daniel Sanders
5bdf5fa992 [globalisel][tablegen] Enable the import of rules involving fma.
Summary:
G_FMA was recently added to GlobalISel which enables the import of rules
involving fma. Add the mapping to allow it.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35130

llvm-svn: 308308
2017-07-18 14:10:07 +00:00
Haicheng Wu
907a054583 [TTI] Refine the cost of EXT in getUserCost()
Now, getUserCost() only checks the src and dst types of EXT to decide it is free
or not. This change first checks the types, then calls isExtFreeImpl(), and
check if EXT can form ExtLoad at last. Currently, only AArch64 has customized
implementation of isExtFreeImpl() to check if EXT can be folded into its use.

Differential Revision: https://reviews.llvm.org/D34458

llvm-svn: 308076
2017-07-15 02:12:16 +00:00
Geoff Berry
b564df0f8b [TargetLowering] Add hook for adding target MMO flags when doing ISel.
Summary: Add TargetLowering hook getMMOFlags() to add target specific
MMO flags to load/store instructions created by ISel.

Reviewers: bogner, hfinkel, qcolombet, MatzeB

Subscribers: mcrosier, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D34962

llvm-svn: 307879
2017-07-13 03:49:42 +00:00
Geoff Berry
9a32a3bacd [MIR] Add support for printing and parsing target MMO flags
Summary: Add target hooks for printing and parsing target MMO flags.
Targets may override getSerializableMachineMemOperandTargetFlags() to
return a mapping from string to flag value for target MMO values that
should be serialized/parsed in MIR output.

Add implementation of this hook for AArch64 SuppressPair MMO flag.

Reviewers: bogner, hfinkel, qcolombet, MatzeB

Subscribers: mcrosier, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D34962

llvm-svn: 307877
2017-07-13 02:28:54 +00:00
Nirav Dave
c55f8bfffc Add DAG argument to canMergeStoresTo NFC.
llvm-svn: 307583
2017-07-10 20:25:54 +00:00
Daniel Sanders
cf625ba818 [globalisel][tablegen] Import rules containing intrinsic_wo_chain.
Summary:
As of this patch, 1018 out of 3938 rules are currently imported.

Depends on D32275

Reviewers: qcolombet, kristof.beyls, rovka, t.p.northover, ab, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: dberris, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D32278

llvm-svn: 307240
2017-07-06 08:12:20 +00:00
Vadim Chugunov
1a77669bc6 Fix libcall expansion creating DAG nodes with invalid type post type legalization.
If we are lowering a libcall after legalization, we'll split the return type into a pair of legal values.

Patch by Jatin Bhateja and Eli Friedman.

Differential Revision: https://reviews.llvm.org/D34240

llvm-svn: 307207
2017-07-05 22:01:49 +00:00
Zvi Rackover
e1f310fac6 DAGCombine: Combine BUILD_VECTOR to TRUNCATE
Summary:
Add a combine for creating a truncate to replace a build_vector composed of extracts with
indices that form a stride-2^N series.

Example:
v8i32 V = ...

v4i32 build_vector((extract_elt V, 0), (extract_elt V, 2), (extract_elt V, 4), (extract_elt V, 6))
-->
v4i32 truncate (bitcast V to v4i64)

Related discussion in llvm-dev about canonicalizing shuffles to
truncates in LLVM IR:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/108936.html.

Reviewers: spatel, RKSimon, efriedma, igorb, craig.topper, wolfgangp, delena

Reviewed By: delena

Subscribers: guyblank, delena, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D34077

llvm-svn: 307036
2017-07-03 15:47:40 +00:00
Hiroshi Inoue
6c3ccb62c6 fix trivial typos, NFC
llvm-svn: 306952
2017-07-01 07:12:15 +00:00
Tim Northover
f5949ef891 GlobalISel: add G_IMPLICIT_DEF instruction.
It looks like there are two target-independent but not GISel instructions that
need legalization, IMPLICIT_DEF and PHI. These are already anomalies since
their operands have important LLTs attached, so to make things more uniform it
seems like a good idea to add generic variants. Starting with G_IMPLICIT_DEF.

llvm-svn: 306875
2017-06-30 20:27:36 +00:00
Simon Pilgrim
177518f67c Remove unnecessary commented out argument. NFCI.
llvm-svn: 306824
2017-06-30 13:26:17 +00:00
Aditya Nandakumar
7c00869517 [GISel]: New Opcode G_FLOG/G_FLOG2
https://reviews.llvm.org/D34837

llvm-svn: 306766
2017-06-29 23:43:44 +00:00
Daniel Jasper
8d76d09d77 Revert "r306529 - [X86] Correct dwarf unwind information in function epilogue"
I am 99% sure that this breaks the PPC ASAN build bot:
http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/3112/steps/64-bit%20check-asan/logs/stdio

If it doesn't go back to green, we can recommit (and fix the original
commit message at the same time :) ).

llvm-svn: 306676
2017-06-29 13:58:24 +00:00
Petar Jovanovic
0199002e6e [X86] Correct dwarf unwind information in function epilogue
CFI instructions that set appropriate cfa offset and cfa register are now
inserted in emitEpilogue() in X86FrameLowering.

Majority of the changes in this patch:

1. Ensure that CFI instructions do not affect code generation.
2. Enable maintaining correct information about cfa offset and cfa register
in a function when basic blocks are reordered, merged, split, duplicated.

These changes are target independent and described below.

Changed CFI instructions so that they:

1. are duplicable
2. are not counted as instructions when tail duplicating or tail merging
3. can be compared as equal

Add information to each MachineBasicBlock about cfa offset and cfa register
that are valid at its entry and exit (incoming and outgoing CFI info). Add
support for updating this information when basic blocks are merged, split,
duplicated, created. Add a verification pass (CFIInfoVerifier) that checks
that outgoing cfa offset and register of predecessor blocks match incoming
values of their successors.

Incoming and outgoing CFI information is used by a late pass
(CFIInstrInserter) that corrects CFA calculation rule for a basic block if
needed. That means that additional CFI instructions get inserted at basic
block beginning to correct the rule for calculating CFA. Having CFI
instructions in function epilogue can cause incorrect CFA calculation rule
for some basic blocks. This can happen if, due to basic block reordering,
or the existence of multiple epilogue blocks, some of the blocks have wrong
cfa offset and register values set by the epilogue block above them.

Patch by Violeta Vukobrat.

Differential Revision: https://reviews.llvm.org/D18046

llvm-svn: 306529
2017-06-28 10:21:17 +00:00
Aditya Nandakumar
83ff413d56 [GISel]: Add G_FEXP, G_FEXP2 opcodes
Also add IRTranslator support.
https://reviews.llvm.org/D34710

llvm-svn: 306475
2017-06-27 22:19:32 +00:00
Tim Northover
2454dd4f6e GlobalISel: remove G_SEQUENCE instruction.
It was trying to do too many things. The basic lumping together of values for
legalization purposes is now handled by G_MERGE_VALUES. More complex things
involving gaps and odd sizes are handled by G_INSERT sequences.

llvm-svn: 306120
2017-06-23 16:15:55 +00:00
whitequark
6e99d9f4a2 [X86] Add support for "probe-stack" attribute
This commit adds prologue code emission for stack probe function
calls.

Reviewed By: majnemer

Differential Revision: https://reviews.llvm.org/D34387

llvm-svn: 306010
2017-06-22 15:42:53 +00:00
Nirav Dave
58a9006547 [DAG] Add Target Store Merge pass ordering function
Allow targets to specify if they should merge stores before or after
legalization.

llvm-svn: 306006
2017-06-22 15:07:49 +00:00
Aditya Nandakumar
88f603bd96 [GISel]: NFC. Add comment to G_FMA opcode as requested in rL305824
llvm-svn: 305837
2017-06-20 19:52:29 +00:00
Aditya Nandakumar
48c7ec9aa0 [GISel]: Add G_FMA opcode for fused multiply adds
https://reviews.llvm.org/D34372

Reviewed by dsanders

llvm-svn: 305824
2017-06-20 19:25:23 +00:00
Eugene Zelenko
027b1deacd [Target] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 305757
2017-06-19 22:43:19 +00:00
Lei Huang
cdf47c5983 [MachineLICM] Hoist TOC-based address instructions
Add condition for MachineLICM to safely hoist instructions that utilize
non constant registers that are reserved.

On PPC, global variable access is done through the table of contents (TOC)
which is always in register X2.  The ABI reserves this register in any
functions that have calls or access global variables.

A call through a function pointer involves saving, changing and restoring
this register around the call and thus MachineLICM does not consider it to
be invariant. We can however guarantee the register is preserved across the
call and thus is invariant.

Differential Revision: https://reviews.llvm.org/D33562

llvm-svn: 305490
2017-06-15 18:29:59 +00:00
Peter Collingbourne
54103de7c1 IR: Replace the "Linker Options" module flag with "llvm.linker.options" named metadata.
The new metadata is easier to manipulate than module flags.

Differential Revision: https://reviews.llvm.org/D31349

llvm-svn: 305227
2017-06-12 20:10:48 +00:00
Simon Dardis
398cd5e620 Reland "[SelectionDAG] Enable target specific vector scalarization of calls and returns"
By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown,
backends can request that LLVM to scalarize vector types for calls
and returns.

The MIPS vector ABI requires that vector arguments and returns are passed in
integer registers. With SelectionDAG's new hooks, the MIPS backend can now
handle LLVM-IR with vector types in calls and returns. E.g.
'call @foo(<4 x i32> %4)'.

Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for
calls and returns if vector types were not legal. If vector types were legal,
a single 128bit vector argument would be assigned to a single 32 bit / 64 bit
integer register.

By teaching the MIPS backend to inspect the original types, it can now
implement the MIPS vector ABI which requires a particular method of
scalarizing vectors.

Previously, the MIPS backend relied on clang to scalarize types such as "call
@foo(<4 x float> %a) into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3,
i32 inreg %4)".

This patch enables the MIPS backend to take either form for vector types.

The previous version of this patch had a "conditional move or jump depends on
uninitialized value".

Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur

Differential Revision: https://reviews.llvm.org/D27845

llvm-svn: 305083
2017-06-09 14:37:08 +00:00
Saleem Abdulrasool
b3bb143c65 sink DebugCompressionType into MC for exposing to clang
This is a preparatory change to expose the debug compression style to
clang.  It requires exposing the enumeration and passing the actual
value through to the backend from the frontend in actual value form
rather than a boolean that selects the GNU style of debug info
compression.

Minor tweak to the ELF Object Writer to use a variable for re-used
values.  Add an assertion that debug information format is one of the
two currently known types if debug information is being compressed.

llvm-svn: 305038
2017-06-09 00:40:19 +00:00
Simon Pilgrim
18db4739a5 [DAG] Move SelectionDAG::isCommutativeBinOp to TargetLowering.
This will allow commutation of target-specific DAG nodes in future patches

Differential Revision: https://reviews.llvm.org/D33882

llvm-svn: 304911
2017-06-07 14:05:04 +00:00
Chandler Carruth
eb66b33867 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787
2017-06-06 11:49:48 +00:00
Matthias Braun
17b01651c8 CodeGen: Refactor MIR parsing
When parsing .mir files immediately construct the MachineFunctions and
put them into MachineModuleInfo.

This allows us to get rid of the delayed construction (and delayed error
reporting) through the MachineFunctionInitialzier interface.

Differential Revision: https://reviews.llvm.org/D33809

llvm-svn: 304758
2017-06-06 00:44:35 +00:00
Matthias Braun
13c1e17841 CodeGen/LLVMTargetMachine: Refactor ISel pass construction; NFCI
- Move ISel (and pre-isel) pass construction into TargetPassConfig
- Extract AsmPrinter construction into a helper function

Putting the ISel code into TargetPassConfig seems a lot more natural and
both changes together make make it easier to build custom pipelines
involving .mir in an upcoming commit. This moves MachineModuleInfo to an
earlier place in the pass pipeline which shouldn't have any effect.

llvm-svn: 304754
2017-06-06 00:26:13 +00:00
Matthias Braun
6d04c8c1dc TargetMachine: Indicate whether machine verifier passes.
This adds a callback to the LLVMTargetMachine that lets target indicate
that they do not pass the machine verifier checks in all cases yet.

This is intended to be a temporary measure while the targets are fixed
allowing us to enable the machine verifier by default with
EXPENSIVE_CHECKS enabled!

Differential Revision: https://reviews.llvm.org/D33696

llvm-svn: 304320
2017-05-31 18:41:23 +00:00
Zaara Syeda
39139cb634 [PPC] Inline expansion of memcmp
This patch does an inline expansion of memcmp.
It changes the memcmp library call into an inline expansion when the size is
known at compile time and is under a target specified threshold.
This expansion is implemented in CodeGenPrepare and expands into straight line
code. The target specifies a maximum load size and the expansion works by using
this size to load the two sources, compare, and exit early if a difference is
found. It also has a special case when the memcmp result is used in a compare
to zero equality.

Differential Revision: https://reviews.llvm.org/D28637

llvm-svn: 304313
2017-05-31 17:12:38 +00:00
Craig Topper
9d032a4588 [SelectionDAG] Fix off by one in a compare in getOperationAction.
If Op is equal to array_lengthof, the lookup would be out of bounds, but we were only checking for greater than. I suspect nothing ever passes in the equal value because its a sentinel to mark the end of the builtin opcodes and not a real opcode.

So really this fix is just so that the code looks right and makes sense.

llvm-svn: 303840
2017-05-25 05:38:40 +00:00
Nirav Dave
904f5d5652 [DAG] Add AddressSpace parameter to canMergeStoresTo. NFC.
llvm-svn: 303673
2017-05-23 18:53:02 +00:00
Daniel Sanders
4ecbdc4da7 Revert r303259 - [globalisel][tablegen] Import rules containing intrinsic_wo_chain.
It's causing some buildbots to timeout whenever tablegen needs re-compilation,
particularly those with -fsanitize=memory but not only them. A compile time
regression was expected since it triples the amount of SelectionDAG rules we
are able to import but it's currently too high.

llvm-svn: 303542
2017-05-22 10:14:33 +00:00
Daniel Sanders
d328e1e55a Re-commit: [globalisel][tablegen] Import rules containing intrinsic_wo_chain.
Summary:
As of this patch, 1018 out of 3938 rules are currently imported.

Depends on D32275

Reviewers: qcolombet, kristof.beyls, rovka, t.p.northover, ab, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: dberris, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D32278

The previous commit failed on test-suite/Bitcode/simd_ops/AArch64_halide_runtime.bc
because isImmOperandEqual() assumed MO was a register operand and that's not
always true.

llvm-svn: 303341
2017-05-18 10:33:36 +00:00
Quentin Colombet
5b90920e34 Revert "[globalisel][tablegen] Import rules containing intrinsic_wo_chain."
This reverts commit r303259.

This breaks the GISel bot:
http://lab.llvm.org:8080/green/job/Compiler_Verifiers_GlobalISEL/5163/consoleFull#-134276167849ba4694-19c4-4d7e-bec5-911270d8a58c

llvm-svn: 303313
2017-05-17 23:17:29 +00:00
Daniel Sanders
c6132632b6 [globalisel][tablegen] Import rules containing intrinsic_wo_chain.
Summary:
As of this patch, 1018 out of 3938 rules are currently imported.

Depends on D32275

Reviewers: qcolombet, kristof.beyls, rovka, t.p.northover, ab, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: dberris, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D32278

llvm-svn: 303259
2017-05-17 13:39:49 +00:00
Sam Kolton
396cd8fa77 [TableGen] Add EncoderMethod to RegisterOperand
Reviewers: stoklund, grosbach, vpykhtin

Differential Revision: https://reviews.llvm.org/D32493

llvm-svn: 303044
2017-05-15 10:13:07 +00:00
Tim Shen
1a093915b1 [PPC] Move the combine "a << (b % (sizeof(a) * 8)) -> (PPCshl a, b)" to the backend. NFC.
Summary:
Eli pointed out that it's unsafe to combine the shifts to ISD::SHL etc.,
because those are not defined for b > sizeof(a) * 8, even after some of
the combiners run.

However, PPCISD::SHL defines that behavior (as the instructions themselves).
Move the combination to the backend.

The tests in shift_mask.ll still pass.

Reviewers: echristo, hfinkel, efriedma, iteratee

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D33076

llvm-svn: 302937
2017-05-12 19:25:37 +00:00
Tim Shen
5f6285f048 [Atomic] Remove IsStore/IsLoad in the interface, and pass the instruction instead. NFC.
Now both emitLeadingFence and emitTrailingFence take the instruction
itself, instead of taking IsLoad/IsStore pairs.
Instruction::mayReadFromMemory and Instrucion::mayWriteToMemory are used
for determining those two booleans.

The instruction argument is also useful for later D32763, in
emitTrailingFence. For emitLeadingFence, it seems to have cleaner
interface with the proposed change.

Differential Revision: https://reviews.llvm.org/D32762

llvm-svn: 302539
2017-05-09 15:27:17 +00:00
Serge Pavlov
b8ce9ec478 Add extra operand to CALLSEQ_START to keep frame part set up previously
Using arguments with attribute inalloca creates problems for verification
of machine representation. This attribute instructs the backend that the
argument is prepared in stack prior to  CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However CALLSEQ_END still keeps total frame size, as caller can
be responsible for cleanup of entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame size and the difference is treated by
MachineVerifier as stack error. Currently there is no way to distinguish
this case from actual errors.

This patch adds additional argument to CALLSEQ_START and its
target-specific counterparts to keep size of stack that is set up prior to
the call frame sequence. This argument allows MachineVerifier to calculate
actual frame size associated with frame setup instruction and correctly
process the case of inalloca arguments.

The changes made by the patch are:
- Frame setup instructions get the second mandatory argument. It
  affects all targets that use frame pseudo instructions and touched many
  files although the changes are uniform.
- Access to frame properties are implemented using special instructions
  rather than calls getOperand(N).getImm(). For X86 and ARM such
  replacement was made previously.
- Changes that reflect appearance of additional argument of frame setup
  instruction. These involve proper instruction initialization and
  methods that access instruction arguments.
- MachineVerifier retrieves frame size using method, which reports sum of
  frame parts initialized inside frame instruction pair and outside it.

The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests failed with machine verifier enabled and listed
in PR27481.

Differential Revision: https://reviews.llvm.org/D32394

llvm-svn: 302527
2017-05-09 13:35:13 +00:00
Geoff Berry
06e750c92e Fix comment typos.
llvm-svn: 302432
2017-05-08 15:33:08 +00:00
Dean Michael Berris
5083dffff0 [XRay] Custom event logging intrinsic
This patch introduces an LLVM intrinsic and a target opcode for custom event
logging in XRay. Initially, its use case will be to allow users of XRay to log
some type of string ("poor man's printf"). The target opcode compiles to a noop
sled large enough to enable calling through to a runtime-determined relative
function call. At runtime, when X-Ray is enabled, the sled is replaced by
compiler-rt with a trampoline to the logic for creating the custom log entries.

Future patches will implement the compiler-rt parts and clang-side support for
emitting the IR corresponding to this intrinsic.

Reviewers: timshen, dberris

Subscribers: igorb, pelikan, rSerge, timshen, echristo, dberris, llvm-commits

Differential Revision: https://reviews.llvm.org/D27503

llvm-svn: 302405
2017-05-08 05:45:21 +00:00
Daniel Sanders
7215d4496b [globalisel][tablegen] Add several GINodeEquiv's for operators that do not require additional support.
Summary:
As of this patch, 350 out of 3938 rules are currently imported.

Depends on D32229

Reviewers: qcolombet, kristof.beyls, rovka, t.p.northover, ab, aditya_nandakumar

Reviewed By: ab

Subscribers: dberris, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D32275

llvm-svn: 302154
2017-05-04 14:24:50 +00:00
Tim Shen
0808c94b7f [PowerPC, DAGCombiner] Fold a << (b % (sizeof(a) * 8)) back to a single instruction
Summary:
This is the corresponding llvm change to D28037 to ensure no performance
regression.

Reviewers: bogner, kbarton, hfinkel, iteratee, echristo

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D28329

llvm-svn: 301990
2017-05-03 00:07:02 +00:00
Daniel Sanders
6ad3513c78 [globalisel][tablegen] Compute available feature bits correctly.
Summary:
Predicate<> now has a field to indicate how often it must be recomputed.
Currently, there are two frequencies, per-module (RecomputePerFunction==0)
and per-function (RecomputePerFunction==1). Per-function predicates are
currently recomputed more frequently than necessary since the only predicate
in this category is cheap to test. Per-module predicates are now computed in
getSubtargetImpl() while per-function predicates are computed in selectImpl().

Tablegen now manages the PredicateBitset internally. It should only be
necessary to add the required includes.

Also fixed a problem revealed by the test case where
constrainSelectedInstRegOperands() would attempt to tie operands that
BuildMI had already tied.

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D32491

llvm-svn: 301750
2017-04-29 17:30:09 +00:00
Matthias Braun
61aeea9876 TargetLowering: Add finalizeLowering() function; NFC
Adds a new method finalizeLowering to TargetLoweringBase. This is in
preparation for an upcoming commit.

This function is meant for target specific adjustments to
MachineFrameInfo or register reservations.

Move the freezeRegisters() and the hasCopyImplyingStackAdjustment()
handling into the new function to prove the concept. As an added bonus
GlobalISel no longer missed the hasCopyImplyingStackAdjustment()
handling with this.

Differential Revision: https://reviews.llvm.org/D32621

llvm-svn: 301679
2017-04-28 20:25:05 +00:00
Jun Bum Lim
93f61e6588 [InlineCost] Improve the cost heuristic for Switch
Summary:
The motivation example is like below which has 13 cases but only 2 distinct targets

```
lor.lhs.false2:                                   ; preds = %if.then
  switch i32 %Status, label %if.then27 [
    i32 -7012, label %if.end35
    i32 -10008, label %if.end35
    i32 -10016, label %if.end35
    i32 15000, label %if.end35
    i32 14013, label %if.end35
    i32 10114, label %if.end35
    i32 10107, label %if.end35
    i32 10105, label %if.end35
    i32 10013, label %if.end35
    i32 10011, label %if.end35
    i32 7008, label %if.end35
    i32 7007, label %if.end35
    i32 5002, label %if.end35
  ]
```
which is compiled into a balanced binary tree like this on AArch64 (similar on X86)

```
.LBB853_9:                              // %lor.lhs.false2
        mov     w8, #10012
        cmp             w19, w8
        b.gt    .LBB853_14
// BB#10:                               // %lor.lhs.false2
        mov     w8, #5001
        cmp             w19, w8
        b.gt    .LBB853_18
// BB#11:                               // %lor.lhs.false2
        mov     w8, #-10016
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#12:                               // %lor.lhs.false2
        mov     w8, #-10008
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#13:                               // %lor.lhs.false2
        mov     w8, #-7012
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_14:                             // %lor.lhs.false2
        mov     w8, #14012
        cmp             w19, w8
        b.gt    .LBB853_21
// BB#15:                               // %lor.lhs.false2
        mov     w8, #-10105
        add             w8, w19, w8
        cmp             w8, #9          // =9
        b.hi    .LBB853_17
// BB#16:                               // %lor.lhs.false2
        orr     w9, wzr, #0x1
        lsl     w8, w9, w8
        mov     w9, #517
        and             w8, w8, w9
        cbnz    w8, .LBB853_23
.LBB853_17:                             // %lor.lhs.false2
        mov     w8, #10013
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_18:                             // %lor.lhs.false2
        mov     w8, #-7007
        add             w8, w19, w8
        cmp             w8, #2          // =2
        b.lo    .LBB853_23
// BB#19:                               // %lor.lhs.false2
        mov     w8, #5002
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#20:                               // %lor.lhs.false2
        mov     w8, #10011
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_21:                             // %lor.lhs.false2
        mov     w8, #14013
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#22:                               // %lor.lhs.false2
        mov     w8, #15000
        cmp             w19, w8
        b.ne    .LBB853_3
```
However, the inline cost model estimates the cost to be linear with the number
of distinct targets and the cost of the above switch is just 2 InstrCosts.
The function containing this switch is then inlined about 900 times.

This change use the general way of switch lowering for the inline heuristic. It
etimate the number of case clusters with the suitability check for a jump table
or bit test. Considering the binary search tree built for the clusters, this
change modifies the model to be linear with the size of the balanced binary
tree. The model is off by default for now :
  -inline-generic-switch-cost=false

This change was originally proposed by Haicheng in D29870.

Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier

Reviewed By: hans

Subscribers: joerg, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D31085

llvm-svn: 301649
2017-04-28 16:04:03 +00:00
Craig Topper
af680f3cea [SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits
This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.

This is largely a mechanical transformation from KnownZero to Known.Zero.

Differential Revision: https://reviews.llvm.org/D32569

llvm-svn: 301620
2017-04-28 05:31:46 +00:00
Krzysztof Parzyszek
6f5b00014e Move value type list from TargetRegisterClass to TargetRegisterInfo
Differential Revision: https://reviews.llvm.org/D31937

llvm-svn: 301234
2017-04-24 19:51:12 +00:00
Krzysztof Parzyszek
d4ea61ba19 Revert r301231: Accidentally committed stale files
I forgot to commit local changes before commit.

llvm-svn: 301232
2017-04-24 19:48:51 +00:00
Krzysztof Parzyszek
334675baa0 Move value type list from TargetRegisterClass to TargetRegisterInfo
Differential Revision: https://reviews.llvm.org/D31937

llvm-svn: 301231
2017-04-24 19:43:45 +00:00
Krzysztof Parzyszek
ce1e95e40d Move size and alignment information of regclass to TargetRegisterInfo
1. RegisterClass::getSize() is split into two functions:
   - TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const;
   - TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const;
2. RegisterClass::getAlignment() is replaced by:
   - TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const;

This will allow making those values depend on subtarget features in the
future.

Differential Revision: https://reviews.llvm.org/D31783

llvm-svn: 301221
2017-04-24 18:55:33 +00:00
Yaxun Liu
c5f291f5da CodeGen: Add a hook for getFenceOperandTy
Currently the operand type for ATOMIC_FENCE assumes value type of a pointer in address space 0.
This is fine for most targets. However for amdgcn target, the size of pointer in address space 0
depends on triple environment. For amdgiz environment, it is 64 bit but for other environment it is
32 bit. On the other hand, amdgcn target expects 32 bit fence operands independent of the target
triple environment. Therefore a hook is need in target lowering for getting the fence operand type.

This patch has no effect on targets other than amdgcn.

Differential Revision: https://reviews.llvm.org/D32186

llvm-svn: 301215
2017-04-24 18:26:27 +00:00
Daniel Sanders
4cd719403f [globalisel][tablegen] Revise API for ComplexPattern operands to improve flexibility.
Summary:
Some targets need to be able to do more complex rendering than just adding an
operand or two to an instruction. For example, it may need to insert an
instruction to extract a subreg first, or it may need to perform an operation
on the operand.

In SelectionDAG, targets would create SDNode's to achieve the desired effect
during the complex pattern predicate. This worked because SelectionDAG had a
form of garbage collection that would take care of SDNode's that were created
but not used due to a later predicate rejecting a match. This doesn't translate
well to GlobalISel and the churn was wasteful.

The API changes in this patch enable GlobalISel to accomplish the same thing
without the waste. The API is now:
	InstructionSelector::OptionalComplexRendererFn selectArithImmed(MachineOperand &Root) const;
where Root is the root of the match. The return value can be omitted to
indicate that the predicate failed to match, or a function with the signature
ComplexRendererFn can be returned. For example:
	return OptionalComplexRendererFn(
	       [=](MachineInstrBuilder &MIB) { MIB.addImm(Immed).addImm(ShVal); });
adds two immediate operands to the rendered instruction. Immed and ShVal are
captured from the predicate function.

As an added bonus, this also reduces the amount of information we need to
provide to GIComplexOperandMatcher.

Depends on D31418

Reviewers: aditya_nandakumar, t.p.northover, qcolombet, rovka, ab, javed.absar

Reviewed By: ab

Subscribers: dberris, kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D31761

llvm-svn: 301079
2017-04-22 15:11:04 +00:00
Hans Wennborg
9099fd3422 Re-commit r301040 "X86: Don't emit zero-byte functions on Windows"
In addition to the original commit, tighten the condition for when to
pad empty functions to COFF Windows.  This avoids running into problems
when targeting e.g. Win32 AMDGPU, which caused test failures when this
was committed initially.

llvm-svn: 301047
2017-04-21 21:48:41 +00:00
Hans Wennborg
c49631b624 Revert r301040 "X86: Don't emit zero-byte functions on Windows"
This broke almost all bots. Reverting while fixing.

llvm-svn: 301041
2017-04-21 21:10:37 +00:00