1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00
Commit Graph

33038 Commits

Author SHA1 Message Date
Craig Topper
ef6d65ff72 [X86] Rename the VEX scalar fma builtins to end with a '3' to match gcc
I think we need to use different builtins for the FMA4 instructions since those instructions zero the upper bits and FMA3 instructions pass the bits through.

So this moves the existing builtins to be the FMA3 versions. New versions will be added for FMA4.

llvm-svn: 317765
2017-11-09 04:10:42 +00:00
Craig Topper
4a6c481dbb [X86] X86MaskedGatherSDNode shouldn't inherit from MaskedGatherScatterSDNode
The classof implementation in MaskedGatherScatterSDNode doesn't consider X86MaskedGatherSDNode so its misleading.

llvm-svn: 317733
2017-11-08 22:26:41 +00:00
Adrian Prantl
4f3b6de7f9 Let replaceVTableHolder accept any type.
In Rust, a trait can be implemented for any type, and if a trait
object pointer is used for the type, then a virtual table will be
emitted for that trait/type combination.

We would like debuggers to be able to inspect trait objects, which
requires finding the concrete type associated with a given vtable.

This patch changes LLVM so that any type can be passed to
replaceVTableHolder. This allows the Rust compiler to emit the needed
debug info -- associating a vtable with the concrete type for which it
was emitted.

This is a DWARF extension: DWARF only specifies the meaning of
DW_AT_containing_type in one specific situation. This style of DWARF
extension is routine, though, and LLVM already has one such case for
DW_AT_containing_type.

Patch by Tom Tromey!

Differential Revision: https://reviews.llvm.org/D39503

llvm-svn: 317730
2017-11-08 22:04:43 +00:00
Dan Gohman
39c48b9d3e Add an @llvm.sideeffect intrinsic
This patch implements Chandler's idea [0] for supporting languages that
require support for infinite loops with side effects, such as Rust, providing
part of a solution to bug 965 [1].

Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual
effect, but which appears to optimization passes to have obscure side effects,
such that they don't optimize away loops containing it. It also teaches
several optimization passes to ignore this intrinsic, so that it doesn't
significantly impact optimization in most cases.

As discussed on llvm-dev [2], this patch is the first of two major parts.
The second part, to change LLVM's semantics to have defined behavior
on infinite loops by default, with a function attribute for opting into
potential-undefined-behavior, will be implemented and posted for review in
a separate patch.

[0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
[1] https://bugs.llvm.org/show_bug.cgi?id=965
[2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html

Differential Revision: https://reviews.llvm.org/D38336

llvm-svn: 317729
2017-11-08 21:59:51 +00:00
Reid Kleckner
c37cd8d78d Revert "Correct dwarf unwind information in function epilogue for X86"
This reverts r317579, originally committed as r317100.

There is a design issue with marking CFI instructions duplicatable. Not
all targets support the CFIInstrInserter pass, and targets like Darwin
can't cope with duplicated prologue setup CFI instructions. The compact
unwind info emission fails.

When the following code is compiled for arm64 on Mac at -O3, the CFI
instructions end up getting tail duplicated, which causes compact unwind
info emission to fail:
  int a, c, d, e, f, g, h, i, j, k, l, m;
  void n(int o, int *b) {
    if (g)
      f = 0;
    for (; f < o; f++) {
      m = a;
      if (l > j * k > i)
        j = i = k = d;
      h = b[c] - e;
    }
  }

We get assembly that looks like this:
; BB#1:                                 ; %if.then
Lloh3:
	adrp	x9, _f@GOTPAGE
Lloh4:
	ldr	x9, [x9, _f@GOTPAGEOFF]
	mov	 w8, wzr
Lloh5:
	str		wzr, [x9]
	stp	x20, x19, [sp, #-16]!   ; 8-byte Folded Spill
	.cfi_def_cfa_offset 16
	.cfi_offset w19, -8
	.cfi_offset w20, -16
	cmp		w8, w0
	b.lt	LBB0_3
	b	LBB0_7
LBB0_2:                                 ; %entry.if.end_crit_edge
Lloh6:
	adrp	x8, _f@GOTPAGE
Lloh7:
	ldr	x8, [x8, _f@GOTPAGEOFF]
Lloh8:
	ldr		w8, [x8]
	stp	x20, x19, [sp, #-16]!   ; 8-byte Folded Spill
	.cfi_def_cfa_offset 16
	.cfi_offset w19, -8
	.cfi_offset w20, -16
	cmp		w8, w0
	b.ge	LBB0_7
LBB0_3:                                 ; %for.body.lr.ph

Note the multiple .cfi_def* directives. Compact unwind info emission
can't handle that.

llvm-svn: 317726
2017-11-08 21:31:14 +00:00
Alex Bradbury
1e5ce63ec5 Set hasSideEffects=0 for PHI and fix affected passes
Previously, hasSideEffects was ? for TargetOpcode::PHI and would be inferred 
as 1. D37065 sets the previously inferred properties explicitly. This patch sets 
hasSideEffects=0 for PHI, as it is for G_PHI. MachineInstr::isSafeToMove has 
been updated so it still returns false for PHI.

Additionally, HexagonBitSimplify relied on a PHI node having the 
hasUnmodeledSideEffects property. This patch fixes that assumption.

Differential Revision: https://reviews.llvm.org/D37097

llvm-svn: 317721
2017-11-08 20:19:16 +00:00
Adrian McCarthy
8a8ad647e9 NFC: Rename MCSafeSEHFragment to MCSymbolIdFragment
Summary:
This fragment emits a symbol ID and will be useful for more than just Safe SEH
tables (e.g., I plan to re-use it for Control Flow Guard tables).  This is
simply a rename refactor.

Reviewers: rnk

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D39770

llvm-svn: 317703
2017-11-08 18:57:02 +00:00
Alex Bradbury
1b9512e762 [NFCI] Ensure TargetOpcode::* are compatible with guessInstructionProperties=0
rL162640 introduced CodeGenTarget::guessInstructionProperties. If a target 
sets guessInstructionProperties=0 in its FooInstrInfo, tablegen will error if 
it has to guess properties from patterns. Unfortunately, 
guessInstructionProperties=0 can't be used with current upstream LLVM as 
instructions in the TargetOpcode namespace are always included and sometimes 
have inferred properties for mayLoad, mayStore, and hasSideEffects. This patch 
provides the simplest possible fix to this problem, setting default values for 
these fields in the TargetOpcode scope. There is no intended functional 
change, as the explicitly set properties should match what was previously 
inferred. A number of the instructions had hasSideEffects=1 inferred 
unintentionally. This patch makes it explicit, while future patches (such as 
D37097) correct the property.

Differential Revision: https://reviews.llvm.org/D37065

llvm-svn: 317674
2017-11-08 09:26:06 +00:00
Matt Arsenault
38099e08bc DAG: Add computeKnownBitsForFrameIndex
Some of the AMDGPU stack addressing modes require knowing the sign
bit is zero. We used to accomplish this by custom lowering
frame indexes, and then putting an AssertZext around a
TargetFrameIndex. This required specifically looking for
the AssextZext + frame index pattern which was moderately
disgusting. The same could probably be accomplished
with a target specific node, but would still
require special handling of frame indexes.

llvm-svn: 317671
2017-11-08 08:52:31 +00:00
Rafael Espindola
88fc6d16de Convert FileOutputBuffer::commit to Error.
llvm-svn: 317656
2017-11-08 01:50:29 +00:00
Rafael Espindola
b0606b4221 Convert FileOutputBuffer to Expected. NFC.
llvm-svn: 317649
2017-11-08 01:05:44 +00:00
David Blaikie
45b647d5eb Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering
This header includes CodeGen headers, and is not, itself, included by
any Target headers, so move it into CodeGen to match the layering of its
implementation.

llvm-svn: 317647
2017-11-08 01:01:31 +00:00
Justin Lebar
8f48b0a3d7 [NVPTX] Implement __nvvm_atom_add_gen_d builtin.
Summary:
This just seems to have been an oversight.  We already supported the f64
atomic add with an explicit scope (e.g. "cta"), but not the scopeless
version.

Reviewers: tra

Subscribers: jholewinski, sanjoy, cfe-commits, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D39638

llvm-svn: 317623
2017-11-07 22:10:54 +00:00
Mitch Phillips
c4d16d7686 Extend SpecialCaseList to allow users to blame matches on entries in the file.
Summary:
Extends SCL functionality to allow users to find the line number in the file the SCL is built from through SpecialCaseList::inSectionBlame(...).

Also removes the need to compile the SCL before use. As the matcher now contains a list of regexes to test against instead of a single regex, the regexes can be individually built on each insertion rather than one large compilation at the end of construction.

This change also fixes a bug where blank lines would cause the parser to become out-of-sync with the line number. An error on line `k` was being reported as being on line `k - num_blank_lines_before_k`.

Note: This change has a cyclical dependency on D39486. Both these changes must be submitted at the same time to avoid a build breakage.

Reviewers: vlad.tsyrklevich

Reviewed By: vlad.tsyrklevich

Subscribers: kcc, pcc, llvm-commits

Differential Revision: https://reviews.llvm.org/D39485

llvm-svn: 317617
2017-11-07 21:16:46 +00:00
Paul Robinson
fa59d7afc5 [DWARFv5] Support DW_FORM_strp in the .debug_line header.
Supporting this form in .debug_line.dwo will be done as a follow-up.

Differential Revision: https://reviews.llvm.org/D33155

llvm-svn: 317607
2017-11-07 19:57:12 +00:00
Petar Jovanovic
5b6a90db77 Reland "Correct dwarf unwind information in function epilogue for X86"
Reland r317100 with minor fix regarding ComputeCommonTailLength function in
BranchFolding.cpp. Skipping top CFI instructions block needs to executed on
several more return points in ComputeCommonTailLength().

Original r317100 message:

"Correct dwarf unwind information in function epilogue for X86"

This patch aims to provide correct dwarf unwind information in function
epilogue for X86.

It consists of two parts. The first part inserts CFI instructions that set
appropriate cfa offset and cfa register in emitEpilogue() in
X86FrameLowering. This part is X86 specific.

The second part is platform independent and ensures that:

- CFI instructions do not affect code generation
- Unwind information remains correct when a function is modified by
  different passes. This is done in a late pass by analyzing information
  about cfa offset and cfa register in BBs and inserting additional CFI
  directives where necessary.

Changed CFI instructions so that they:

- are duplicable
- are not counted as instructions when tail duplicating or tail merging
- can be compared as equal

Added CFIInstrInserter pass:

- analyzes each basic block to determine cfa offset and register valid at
  its entry and exit
- verifies that outgoing cfa offset and register of predecessor blocks match
  incoming values of their successors
- inserts additional CFI directives at basic block beginning to correct the
  rule for calculating CFA

Having CFI instructions in function epilogue can cause incorrect CFA
calculation rule for some basic blocks. This can happen if, due to basic
block reordering, or the existence of multiple epilogue blocks, some of the
blocks have wrong cfa offset and register values set by the epilogue block
above them.

CFIInstrInserter is currently run only on X86, but can be used by any target
that implements support for adding CFI instructions in epilogue.

Patch by Violeta Vukobrat.

llvm-svn: 317579
2017-11-07 14:40:27 +00:00
Kristof Beyls
80ff878c32 [GlobalISel] Enable legalizing non-power-of-2 sized types.
This changes the interface of how targets describe how to legalize, see
the below description.

1. Interface for targets to describe how to legalize.

In GlobalISel, the API in the LegalizerInfo class is the main interface
for targets to specify which types are legal for which operations, and
what to do to turn illegal type/operation combinations into legal ones.

For each operation the type sizes that can be legalized without having
to change the size of the type are specified with a call to setAction.
This isn't different to how GlobalISel worked before. For example, for a
target that supports 32 and 64 bit adds natively:

  for (auto Ty : {s32, s64})
    setAction({G_ADD, 0, s32}, Legal);

or for a target that needs a library call for a 32 bit division:

  setAction({G_SDIV, s32}, Libcall);

The main conceptual change to the LegalizerInfo API, is in specifying
how to legalize the type sizes for which a change of size is needed. For
example, in the above example, how to specify how all types from i1 to
i8388607 (apart from s32 and s64 which are legal) need to be legalized
and expressed in terms of operations on the available legal sizes
(again, i32 and i64 in this case). Before, the implementation only
allowed specifying power-of-2-sized types (e.g. setAction({G_ADD, 0,
s128}, NarrowScalar).  A worse limitation was that if you'd wanted to
specify how to legalize all the sized types as allowed by the LLVM-IR
LangRef, i1 to i8388607, you'd have to call setAction 8388607-3 times
and probably would need a lot of memory to store all of these
specifications.

Instead, the legalization actions that need to change the size of the
type are specified now using a "SizeChangeStrategy".  For example:

   setLegalizeScalarToDifferentSizeStrategy(
       G_ADD, 0, widenToLargerAndNarrowToLargest);

This example indicates that for type sizes for which there is a larger
size that can be legalized towards, do it by Widening the size.
For example, G_ADD on s17 will be legalized by first doing WidenScalar
to make it s32, after which it's legal.
The "NarrowToLargest" indicates what to do if there is no larger size
that can be legalized towards. E.g. G_ADD on s92 will be legalized by
doing NarrowScalar to s64.

Another example, taken from the ARM backend is:
   for (unsigned Op : {G_SDIV, G_UDIV}) {
     setLegalizeScalarToDifferentSizeStrategy(Op, 0,
         widenToLargerTypesUnsupportedOtherwise);
     if (ST.hasDivideInARMMode())
       setAction({Op, s32}, Legal);
     else
       setAction({Op, s32}, Libcall);
   }

For this example, G_SDIV on s8, on a target without a divide
instruction, would be legalized by first doing action (WidenScalar,
s32), followed by (Libcall, s32).

The same principle is also followed for when the number of vector lanes
on vector data types need to be changed, e.g.:

   setAction({G_ADD, LLT::vector(8, 8)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(16, 8)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(4, 16)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(8, 16)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(2, 32)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(4, 32)}, LegalizerInfo::Legal);
   setLegalizeVectorElementToDifferentSizeStrategy(
       G_ADD, 0, widenToLargerTypesUnsupportedOtherwise);

As currently implemented here, vector types are legalized by first
making the vector element size legal, followed by then making the number
of lanes legal. The strategy to follow in the first step is set by a
call to setLegalizeVectorElementToDifferentSizeStrategy, see example
above.  The strategy followed in the second step
"moreToWiderTypesAndLessToWidest" (see code for its definition),
indicating that vectors are widened to more elements so they map to
natively supported vector widths, or when there isn't a legal wider
vector, split the vector to map it to the widest vector supported.

Therefore, for the above specification, some example legalizations are:
  * getAction({G_ADD, LLT::vector(3, 3)})
    returns {WidenScalar, LLT::vector(3, 8)}
  * getAction({G_ADD, LLT::vector(3, 8)})
    then returns {MoreElements, LLT::vector(8, 8)}
  * getAction({G_ADD, LLT::vector(20, 8)})
    returns {FewerElements, LLT::vector(16, 8)}


2. Key implementation aspects.

How to legalize a specific (operation, type index, size) tuple is
represented by mapping intervals of integers representing a range of
size types to an action to take, e.g.:

       setScalarAction({G_ADD, LLT:scalar(1)},
                       {{1, WidenScalar},  // bit sizes [ 1, 31[
                        {32, Legal},       // bit sizes [32, 33[
                        {33, WidenScalar}, // bit sizes [33, 64[
                        {64, Legal},       // bit sizes [64, 65[
                        {65, NarrowScalar} // bit sizes [65, +inf[
                       });

Please note that most of the code to do the actual lowering of
non-power-of-2 sized types is currently missing, this is just trying to
make it possible for targets to specify what is legal, and how non-legal
types should be legalized.  Probably quite a bit of further work is
needed in the actual legalizing and the other passes in GlobalISel to
support non-power-of-2 sized types.

I hope the documentation in LegalizerInfo.h and the examples provided in the
various {Target}LegalizerInfo.cpp and LegalizerInfoTest.cpp explains well
enough how this is meant to be used.

This drops the need for LLT::{half,double}...Size().


Differential Revision: https://reviews.llvm.org/D30529

llvm-svn: 317560
2017-11-07 10:34:34 +00:00
Adrian Prantl
201051f526 Make DIExpression::createFragmentExpression() return an Optional.
We can't safely split arithmetic into multiple fragments because we
can't express carry-over between fragments.

llvm-svn: 317534
2017-11-07 00:45:34 +00:00
Davide Italiano
42befee304 [IPO/LowerTypesTest] Skip blockaddress(es) when replacing uses.
Blockaddresses refer to the function itself, therefore replacing them
would cause an assertion in doRAUW.

Fixes https://bugs.llvm.org/show_bug.cgi?id=35201

This was found when trying CFI on a proprietary kernel by Dmitry Mikulin.

Differential Revision:  https://reviews.llvm.org/D39695

llvm-svn: 317527
2017-11-07 00:09:25 +00:00
Vedant Kumar
de7b7d7dae [DebugInfo] Unify logic to merge DILocations. NFC.
This makes DILocation::getMergedLocation() do what its comment says it
does when merging locations for an Instruction: set the common inlineAt
scope. This simplifies Instruction::applyMergedLocation() a bit.

Testing: check-llvm, check-clang

Differential Revision: https://reviews.llvm.org/D39628

llvm-svn: 317524
2017-11-06 23:15:21 +00:00
Bjorn Pettersson
491c957c8a [MIRPrinter] Use %subreg.xxx syntax for subregister index operands
Summary:
Print %subreg.<subregidxname> instead of just the subregister
index when printing immediate operands corresponding to subreg
indices in INSERT_SUBREG, EXTRACT_SUBREG, SUBREG_TO_REG and
REG_SEQUENCE.

Reviewers: qcolombet, MatzeB

Reviewed By: MatzeB

Subscribers: nhaehnle, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D39696

llvm-svn: 317513
2017-11-06 21:46:06 +00:00
Sanjay Patel
fd69991264 [IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html
and again more recently:
http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html

...this is a step in cleaning up our fast-math-flags implementation in IR to better match
the capabilities of both clang's user-visible flags and the backend's flags for SDNode.

As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the 
'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic 
reassociation - 'AllowReassoc'.

We're also adding a bit to allow approximations for library functions called 'ApproxFunc' 
(this was initially proposed as 'libm' or similar).

...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did 
look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits), 
but that's apparently already used for other purposes. Also, I don't think we can just 
add a field to FPMathOperator because Operator is not intended to be instantiated. 
We'll defer movement of FMF to another day.

We keep the 'fast' keyword. I thought about removing that, but seeing IR like this:
%f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2
...made me think we want to keep the shortcut synonym.

Finally, this change is binary incompatible with existing IR as seen in the 
compatibility tests. This statement:
"Newer releases can ignore features from older releases, but they cannot miscompile 
them. For example, if nsw is ever replaced with something else, dropping it would be 
a valid way to upgrade the IR." 
( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility )
...provides the flexibility we want to make this change without requiring a new IR 
version. Ie, we're not loosening the FP strictness of existing IR. At worst, we will 
fail to optimize some previously 'fast' code because it's no longer recognized as 
'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'.

Note: an inter-dependent clang commit to use the new API name should closely follow 
commit.

Differential Revision: https://reviews.llvm.org/D39304

llvm-svn: 317488
2017-11-06 16:27:15 +00:00
Martin Storsjo
da26b595ac [ObjectYAML] Map relocation types for COFF ARMNT and ARM64
Differential Revision: https://reviews.llvm.org/D39668

llvm-svn: 317459
2017-11-06 07:20:58 +00:00
David L. Jones
389d85f7ec [PassManager, SimplifyCFG] Revert r316908 and r316869.
These cause Clang to crash with a segfault. See PR35210 for details.

llvm-svn: 317444
2017-11-06 00:32:01 +00:00
Harlan Haskins
fab532d4a0 Use code voice for DIBuilder in LLVM C API
(This is a test commit)

llvm-svn: 317422
2017-11-04 20:31:20 +00:00
Aaron Ballman
a931962faf Move the srpm, ocaml_make_directory, llvm_vcsrevision_h, and llvm-headers projects into the Misc folder on IDEs like Visual Studio rather than leave them in the root directory. NFC.
llvm-svn: 317416
2017-11-04 19:59:14 +00:00
Sean Fertile
8d0d8c3f01 [LTO][ThinLTO] Use the linker resolutions to mark global values as dso_local.
Now that we have a way to mark GlobalValues as local we can use the symbol
resolutions that the linker plugin provides as part of lto/thinlto link
step to refine the compilers view on what symbols will end up being local.

Originally commited as r317374, but reverted in r317395 to update some missed
tests.

Differential Revision: https://reviews.llvm.org/D35702

llvm-svn: 317408
2017-11-04 17:04:39 +00:00
Sean Fertile
896ef740a9 Revert "[LTO][ThinLTO] Use the linker resolutions to mark global values ..."
Changes more tests then expected on one of the build bots.
reverting to investigate.

This reverts https://llvm.org/svn/llvm-project/llvm/trunk@317374

llvm-svn: 317395
2017-11-04 01:54:20 +00:00
David Blaikie
10180bb2a1 Move TargetFrameLowering.h to CodeGen where it's implemented
This header already includes a CodeGen header and is implemented in
lib/CodeGen, so move the header there to match.

This fixes a link error with modular codegeneration builds - where a
header and its implementation are circularly dependent and so need to be
in the same library, not split between two like this.

llvm-svn: 317379
2017-11-03 22:32:11 +00:00
Sean Fertile
e9996bd6a1 [LTO][ThinLTO] Use the linker resolutions to mark global values as dso_local.
Now that we have a way to mark GlobalValues as local we can use the symbol
resolutions that the linker plugin provides as part of lto/thinlto link
step to refine the compilers view on what symbols will end up being local.

Differential Revision: https://reviews.llvm.org/D35702

llvm-svn: 317374
2017-11-03 21:45:55 +00:00
Peter Collingbourne
08823f2e0d Revert r317046, "Object: Move some code from ELF.h into ELF.cpp."
This change resulted in a measured 1.5-2% perf regression linking
chrome.

llvm-svn: 317371
2017-11-03 21:30:06 +00:00
David Blaikie
2b916d7011 GCOV: Move GCOV from IR & Support into ProfileData to fix layering
This class was split between libIR and libSupport, which breaks under
modular code generation. Move it into the one library that uses it,
ProfileData, to resolve this issue.

llvm-svn: 317366
2017-11-03 20:57:10 +00:00
Jun Bum Lim
3e63efac67 Recommit r317351 : Add CallSiteSplitting pass
This recommit r317351 after fixing a buildbot failure.

Original commit message:

    Summary:
    This change add a pass which tries to split a call-site to pass
    more constrained arguments if its argument is predicated in the control flow
    so that we can expose better context to the later passes (e.g, inliner, jump
    threading, or IPA-CP based function cloning, etc.).
    As of now we support two cases :

    1) If a call site is dominated by an OR condition and if any of its arguments
    are predicated on this OR condition, try to split the condition with more
    constrained arguments. For example, in the code below, we try to split the
    call site since we can predicate the argument (ptr) based on the OR condition.

    Split from :
          if (!ptr || c)
            callee(ptr);
    to :
          if (!ptr)
            callee(null ptr)  // set the known constant value
          else if (c)
            callee(nonnull ptr)  // set non-null attribute in the argument

    2) We can also split a call-site based on constant incoming values of a PHI
    For example,
    from :
          BB0:
           %c = icmp eq i32 %i1, %i2
           br i1 %c, label %BB2, label %BB1
          BB1:
           br label %BB2
          BB2:
           %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ]
           call void @bar(i32 %p)
    to
          BB0:
           %c = icmp eq i32 %i1, %i2
           br i1 %c, label %BB2-split0, label %BB1
          BB1:
           br label %BB2-split1
          BB2-split0:
           call void @bar(i32 0)
           br label %BB2
          BB2-split1:
           call void @bar(i32 1)
           br label %BB2
          BB2:
           %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ]

llvm-svn: 317362
2017-11-03 20:41:16 +00:00
David Blaikie
6bc180a4d3 Modularize: Include some required headers
DenseMaps require the definition of a type to be available when using a
pointer to that type as a key to know how many bits are available for
tombstone/etc.

llvm-svn: 317360
2017-11-03 20:24:19 +00:00
Aaron Ballman
9583dc1ef1 Correcting some CRLFs that snuck in with my previous commit; NFC.
llvm-svn: 317357
2017-11-03 20:05:51 +00:00
Aaron Ballman
b071b10879 Add llvm::for_each as a range-based extensions to <algorithm> and make use of it in some cases where it is a more clear alternative to std::for_each.
llvm-svn: 317356
2017-11-03 20:01:25 +00:00
Jun Bum Lim
aa115afcd5 Revert "Add CallSiteSplitting pass"
Revert due to Buildbot failure.

This reverts commit r317351.

llvm-svn: 317353
2017-11-03 19:17:11 +00:00
Jun Bum Lim
d171e076c9 Add CallSiteSplitting pass
Summary:
This change add a pass which tries to split a call-site to pass
more constrained arguments if its argument is predicated in the control flow
so that we can expose better context to the later passes (e.g, inliner, jump
threading, or IPA-CP based function cloning, etc.).
As of now we support two cases :

1) If a call site is dominated by an OR condition and if any of its arguments
are predicated on this OR condition, try to split the condition with more
constrained arguments. For example, in the code below, we try to split the
call site since we can predicate the argument (ptr) based on the OR condition.

Split from :
      if (!ptr || c)
        callee(ptr);
to :
      if (!ptr)
        callee(null ptr)  // set the known constant value
      else if (c)
        callee(nonnull ptr)  // set non-null attribute in the argument

2) We can also split a call-site based on constant incoming values of a PHI
For example,
from :
      BB0:
       %c = icmp eq i32 %i1, %i2
       br i1 %c, label %BB2, label %BB1
      BB1:
       br label %BB2
      BB2:
       %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ]
       call void @bar(i32 %p)
to
      BB0:
       %c = icmp eq i32 %i1, %i2
       br i1 %c, label %BB2-split0, label %BB1
      BB1:
       br label %BB2-split1
      BB2-split0:
       call void @bar(i32 0)
       br label %BB2
      BB2-split1:
       call void @bar(i32 1)
       br label %BB2
      BB2:
       %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ]

Reviewers: davidxl, huntergr, chandlerc, mcrosier, eraman, davide

Reviewed By: davidxl

Subscribers: sdesmalen, ashutosh.nema, fhahn, mssimpso, aemerson, mgorny, mehdi_amini, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D39137

llvm-svn: 317351
2017-11-03 19:01:57 +00:00
Mikael Holmen
390f369a75 [ADCE] Use MapVector for BlockInfo to make iteration order deterministic
Summary:
Also added a reserve() method to MapVector since we want to use that from
ADCE.

DenseMap does not provide deterministic iteration order so with that
we will handle the members of BlockInfo in random order, eventually
leading to random order of the blocks in the predecessor lists.

Without this change, I get the same predecessor order in about 90% of the
time when I compile a certain reproducer and in 10% I get a different one.

No idea how to make a proper test case for this.

Reviewers: kuhar, david2050

Reviewed By: kuhar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39593

llvm-svn: 317323
2017-11-03 14:15:08 +00:00
Clement Courbet
8d32100fc8 re-land [ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."
Fix undefined references: ExpandMemCmp belongs to CodeGen/, not Scalar/.

llvm-svn: 317318
2017-11-03 12:12:27 +00:00
Puyan Lotfi
2591fd6099 mir-canon: First commit.
mir-canon (MIRCanonicalizerPass) is a pass designed to reorder instructions and
rename operands so that two similar programs will diff more cleanly after being
run through mir-canon than they would otherwise. This project is still a work
in progress and there are ideas still being discussed for improving diff
quality.

M    include/llvm/InitializePasses.h
M    lib/CodeGen/CMakeLists.txt
M    lib/CodeGen/CodeGen.cpp
A    lib/CodeGen/MIRCanonicalizerPass.cpp

llvm-svn: 317285
2017-11-02 23:37:32 +00:00
Hiroshi Yamauchi
9e5d6805dd Irreducible loop metadata for more accurate block frequency under PGO.
Summary:
Currently the block frequency analysis is an approximation for irreducible
loops.

The new irreducible loop metadata is used to annotate the irreducible loop
headers with their header weights based on the PGO profile (currently this is
approximated to be evenly weighted) and to help improve the accuracy of the
block frequency analysis for irreducible loops.

This patch is a basic support for this.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: mehdi_amini, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D39028

llvm-svn: 317278
2017-11-02 22:26:51 +00:00
Adrian Prantl
3022c20faf Clean up comments in include/llvm-c/DebugInfo.h
Patch by Harlan Haskins!

Differential Revision: https://reviews.llvm.org/D39568

llvm-svn: 317271
2017-11-02 21:35:37 +00:00
Adrian Prantl
2e3db2532b Add missing header guards.
llvm-svn: 317267
2017-11-02 20:58:58 +00:00
Mitch Phillips
628b75abff Fixed line length style issue.
Reviewers: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39395

llvm-svn: 317223
2017-11-02 18:04:44 +00:00
Chad Rosier
413f2d7828 [TargetParser][AArch64] Reorder enum to preserve 5.0.0 libLLVM ABI.
This is required for backporting r311659 to the 5.0.1 release.
PR35060

Differential Revision: https://reviews.llvm.org/D39558

llvm-svn: 317222
2017-11-02 17:52:27 +00:00
Clement Courbet
7d1aa061eb Revert "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."
undefined reference to `llvm::TargetPassConfig::ID' on
clang-ppc64le-linux-multistage

This reverts commit eea333c33fa73ad225ef28607795984829f65688.

llvm-svn: 317213
2017-11-02 15:53:10 +00:00
Clement Courbet
f47d6470dc [ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass.
Summary:
This is mostly a noop (most of the test diffs are renamed blocks).
There are a few temporary register renames (eax<->ecx) and a few blocks are
shuffled around.

See the discussion in PR33325 for more details.

Reviewers: spatel

Subscribers: mgorny

Differential Revision: https://reviews.llvm.org/D39456

llvm-svn: 317211
2017-11-02 15:02:51 +00:00
Yichao Yu
bce23aee65 Allow inaccessiblememonly and inaccessiblemem_or_argmemonly to be overwriten on call site with operand bundle
Summary:
Similar to argmemonly, readonly and readnone.

Fix PR35128

Reviewers: andrew.w.kaylor, chandlerc, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D39434

llvm-svn: 317201
2017-11-02 12:18:33 +00:00
NAKAMURA Takumi
10f211b68b llvm-c/DebugInfo.h: Fix warning. [-Wdocumentation]
llvm-svn: 317191
2017-11-02 08:03:12 +00:00
Jake Ehrlich
9584905a27 [yaml2obj][ELF] Add support for setting alignment in program headers
Sometimes program headers have larger alignments than any of the
sections they contain. Currently yaml2obj can't produce such files. A
bug recently appeared in llvm-objcopy that failed in such a case. I'd
like to be able to add tests to llvm-objcopy for such cases.

This change adds an optional alignment parameter to program headers that
will be used instead of calculating the alignment.

Differential Revision: https://reviews.llvm.org/D39130

llvm-svn: 317139
2017-11-01 23:14:48 +00:00
Petar Jovanovic
8c7ceb5bd9 Revert "Correct dwarf unwind information in function epilogue for X86"
This reverts r317100 as it introduced sanitizer-x86_64-linux-autoconf
buildbot failure (build #15606).

llvm-svn: 317136
2017-11-01 23:05:52 +00:00
whitequark
ee01810f86 [LLVM-C] Expose functions to create debug locations via DIBuilder.
These include:
  * Several functions for creating an LLVMDIBuilder,
  * LLVMDIBuilderCreateCompileUnit,
  * LLVMDIBuilderCreateFile,
  * LLVMDIBuilderCreateDebugLocation.

Patch by Harlan Haskins.

Differential Revision: https://reviews.llvm.org/D32368

llvm-svn: 317135
2017-11-01 22:18:52 +00:00
Daniel Sanders
eb37434978 [globalisel][regbank] Warn about MIR ambiguities when register bank/class names clash.
llvm-svn: 317132
2017-11-01 22:13:05 +00:00
Rui Ueyama
4509288779 Rewrite FileOutputBuffer as two separate classes.
This patch is to rewrite FileOutputBuffer as two separate classes;
one for file-backed output buffer and the other for memory-backed
output buffer. I think the new code is easier to follow because two
different implementations are now actually separated as different
classes.

Unlike the previous implementation, the class that does not replace the
final output file using rename(2) does not create a temporary file at
all. Instead, it allocates memory using mmap(2) and use it. I think
this is an improvement because it is now guaranteed that the temporary
memory region doesn't trigger any I/O and there's now zero chance to
leave a temporary file behind. Also, it shouldn't impose new restrictions
because were using mmap IO too.

Differential Revision: https://reviews.llvm.org/D39449

llvm-svn: 317127
2017-11-01 21:38:14 +00:00
Dehao Chen
6503490d34 Include GUIDs from the same module when computing GUIDs that needs to be imported.
Summary: In the compile phase of SamplePGO+ThinLTO, ICP is not invoked. Instead, indirect call targets will be included as function metadata for ThinIndex to buidl the call graph. This should not only include functions defined in other modules, but also functions defined in the same module, otherwise ThinIndex may find the callee dead and eliminate it, while ICP in backend will revive the symbol, which leads to undefined symbol.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: sanjoy, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D39480

llvm-svn: 317118
2017-11-01 20:26:47 +00:00
Daniel Sanders
afac3fef9a [globalisel][tablegen] Add support for multi-insn emission
The importer will now accept nested instructions in the result pattern such as
(ADDWrr $a, (SUBWrr $b, $c)). This is only valid when the nested instruction
def's a single vreg and the parent instruction consumes a single vreg where a
nested instruction is specified. The importer will automatically create a vreg
to connect the two using the type information from the pattern. This vreg will
be constrained to the register classes given in the instruction definitions*.

* REG_SEQUENCE is explicitly rejected because of this. The definition doesn't
  constrain to a register class and it therefore needs special handling.

llvm-svn: 317117
2017-11-01 19:57:57 +00:00
Petar Jovanovic
d61746acde Correct dwarf unwind information in function epilogue for X86
This patch aims to provide correct dwarf unwind information in function
epilogue for X86.

It consists of two parts. The first part inserts CFI instructions that set
appropriate cfa offset and cfa register in emitEpilogue() in
X86FrameLowering. This part is X86 specific.

The second part is platform independent and ensures that:

- CFI instructions do not affect code generation
- Unwind information remains correct when a function is modified by
  different passes. This is done in a late pass by analyzing information
  about cfa offset and cfa register in BBs and inserting additional CFI
  directives where necessary.

Changed CFI instructions so that they:

- are duplicable
- are not counted as instructions when tail duplicating or tail merging
- can be compared as equal

Added CFIInstrInserter pass:

- analyzes each basic block to determine cfa offset and register valid at
  its entry and exit
- verifies that outgoing cfa offset and register of predecessor blocks match
  incoming values of their successors
- inserts additional CFI directives at basic block beginning to correct the
  rule for calculating CFA

Having CFI instructions in function epilogue can cause incorrect CFA
calculation rule for some basic blocks. This can happen if, due to basic
block reordering, or the existence of multiple epilogue blocks, some of the
blocks have wrong cfa offset and register values set by the epilogue block
above them.

CFIInstrInserter is currently run only on X86, but can be used by any target
that implements support for adding CFI instructions in epilogue.


Patch by Violeta Vukobrat.

Differential Revision: https://reviews.llvm.org/D35844

llvm-svn: 317100
2017-11-01 16:04:11 +00:00
Geoff Berry
126122d5fd [BranchProbabilityInfo] Handle irreducible loops.
Summary:
Compute the strongly connected components of the CFG and fall back to
use these for blocks that are in loops that are not detected by
LoopInfo when computing loop back-edge and exit branch probabilities.

Reviewers: dexonsmith, davidxl

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D39385

llvm-svn: 317094
2017-11-01 15:16:50 +00:00
NAKAMURA Takumi
6e3809e6fd Fix warnings discovered by rL317076. [-Wunused-private-field]
llvm-svn: 317091
2017-11-01 13:47:55 +00:00
NAKAMURA Takumi
870473aa74 Reformat.
llvm-svn: 317078
2017-11-01 05:14:35 +00:00
NAKAMURA Takumi
72cbe234f4 Revert rL317019, "[ADT] Split optional to only include copy mechanics and dtor for non-trivial types."
Seems g++-4.8 (eg. Ubuntu 14.04) doesn't like this.

llvm-svn: 317077
2017-11-01 05:14:31 +00:00
Peter Collingbourne
c96a69b45d Object: Move some code from ELF.h into ELF.cpp.
Differential Revision: https://reviews.llvm.org/D39271

llvm-svn: 317046
2017-10-31 22:49:23 +00:00
Peter Collingbourne
b997552a11 Inline compareAddr function into its only caller. NFCI.
llvm-svn: 317045
2017-10-31 22:49:09 +00:00
Benjamin Kramer
ec4a9b4088 Revert "[DWARF] Now that Optional is standard layout, put it into an union instead of splatting it."
GCC doesn't like it. This reverts commit r317028.

llvm-svn: 317030
2017-10-31 19:55:08 +00:00
Benjamin Kramer
142ac789a5 [DWARF] Now that Optional is standard layout, put it into an union instead of splatting it.
No functionality change intended.

llvm-svn: 317028
2017-10-31 19:40:03 +00:00
Benjamin Kramer
f423bb4d8a [ADT] Split optional to only include copy mechanics and dtor for non-trivial types.
This makes uses of Optional more transparent to the compiler (and
clang-tidy) and generates slightly smaller code.

llvm-svn: 317019
2017-10-31 18:35:54 +00:00
Wolfgang Pieb
2c53dfefa1 [Metadata][NFC] Make MDNode::resolve() public in preparation for the fix to PR33930.
Reviewers: aprantl
llvm-svn: 317018
2017-10-31 18:25:28 +00:00
Teresa Johnson
f1bd32e79d [ThinLTO] Double bits of module hash used for renaming
Summary:
Use 64 instead of 32 bits of the module hash as the suffix when renaming
after promotion to reduce the likelihood of a collision (which we
observed in a binary when using 32 bits).

Reviewers: pcc

Subscribers: llvm-commits, inglorion

Differential Revision: https://reviews.llvm.org/D39443

llvm-svn: 316996
2017-10-31 12:56:09 +00:00
David Green
90105ec5c8 [LoopUnroll] Clean up remarks for unroll remainder
The optimisation remarks for loop unrolling with an unrolled remainder looks something like:

test.c:7:18: remark: completely unrolled loop with 3 iterations [-Rpass=loop-unroll]
            C[i] += A[i*N+j];
                 ^
test.c:6:9: remark: unrolled loop by a factor of 4 with run-time trip count [-Rpass=loop-unroll]
        for(int j = 0; j < N; j++)
        ^
This removes the first of the two messages.

Differential revision: https://reviews.llvm.org/D38725

llvm-svn: 316986
2017-10-31 10:47:46 +00:00
Max Kazantsev
568017d4d5 Reapply "[GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors"
This patch fixes the miscompile that happens when PRE hoists loads across guards and
other instructions that don't always pass control flow to their successors. PRE is now prohibited
to hoist across such instructions because there is no guarantee that the load standing after such
instruction is still valid before such instruction. For example, a load from under a guard may be
invalid before the guard in the following case:
  int array[LEN];
  ...
  guard(0 <= index && index < LEN);
  use(array[index]);

Differential Revision: https://reviews.llvm.org/D37460

llvm-svn: 316975
2017-10-31 05:07:56 +00:00
Daniel Neilson
0ad57a67a0 Create instruction classes for identifying any atomicity of memory intrinsic. (NFC)
Summary:
For reference, see: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116589.html

This patch fleshes out the instruction class hierarchy with respect to atomic and
non-atomic memory intrinsics. With this change, the relevant part of the class
hierarchy becomes:

IntrinsicInst
  -> MemIntrinsicBase (methods-only class)
    -> MemIntrinsic (non-atomic intrinsics)
      -> MemSetInst
      -> MemTransferInst
        -> MemCpyInst
        -> MemMoveInst
    -> AtomicMemIntrinsic (atomic intrinsics)
      -> AtomicMemSetInst
      -> AtomicMemTransferInst
        -> AtomicMemCpyInst
        -> AtomicMemMoveInst
    -> AnyMemIntrinsic (both atomicities)
      -> AnyMemSetInst
      -> AnyMemTransferInst
        -> AnyMemCpyInst
        -> AnyMemMoveInst

This involves some class renaming:
    ElementUnorderedAtomicMemCpyInst -> AtomicMemCpyInst
    ElementUnorderedAtomicMemMoveInst -> AtomicMemMoveInst
    ElementUnorderedAtomicMemSetInst -> AtomicMemSetInst
A script for doing this renaming in downstream trees is included below.

An example of where the Any* classes should be used in LLVM is when reasoning
about the effects of an instruction (ex: aliasing).

---
Script for renaming AtomicMem* classes:
PREFIXES="[<,([:space:]]"
CLASSES="MemIntrinsic|MemTransferInst|MemSetInst|MemMoveInst|MemCpyInst"
SUFFIXES="[;)>,[:space:]]"

REGEX="(${PREFIXES})ElementUnorderedAtomic(${CLASSES})(${SUFFIXES})"
REGEX2="visitElementUnorderedAtomic(${CLASSES})"

FILES=$( grep -E "(${REGEX}|${REGEX2})" -r . | tr ':' ' ' | awk '{print $1}' | sort | uniq )

SED_SCRIPT="s~${REGEX}~\1Atomic\2\3~g"
SED_SCRIPT2="s~${REGEX2}~visitAtomic\1~g"

for f in $FILES; do
    echo "Processing: $f"
    sed  -i ".bak" -E "${SED_SCRIPT};${SED_SCRIPT2};${EA_SED_SCRIPT};${EA_SED_SCRIPT2}" $f
done

Reviewers: sanjoy, deadalnix, apilipenko, anna, skatkov, mkazantsev

Reviewed By: sanjoy

Subscribers: hfinkel, jholewinski, arsenm, sdardis, nhaehnle, JDevlieghere, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38419

llvm-svn: 316950
2017-10-30 19:51:48 +00:00
Clement Courbet
32210b316a [CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).
- Targets that want to support memcmp expansions now return the list of
   supported load sizes.
 - Expansion codegen does not assume that all power-of-two load sizes
   smaller than the max load size are valid. For examples, this is not the
   case for x86(32bit)+sse2.

Fixes PR34887.

llvm-svn: 316905
2017-10-30 14:19:33 +00:00
Sanjay Patel
52b396ea52 [(new) Pass Manager] instantiate SimplifyCFG with the same options as the old PM
The old PM sets the options of what used to be known as "latesimplifycfg" on the 
instantiation after the vectorizers have run, so that's what we'redoing here.

FWIW, there's a later SimplifyCFGPass instantiation in both PMs where we do not 
set the "late" options. I'm not sure if that's intentional or not.

Differential Revision: https://reviews.llvm.org/D39407

llvm-svn: 316869
2017-10-29 20:49:31 +00:00
Saleem Abdulrasool
92084746fc ADT: add a helper to check if the Triple is ARM64
Add a trivial helper for checking if the architecture is AArch64 Little
Endian or Big Endian.

llvm-svn: 316837
2017-10-28 19:15:05 +00:00
Sanjay Patel
ab3266d1be [SimplifyCFG] use pass options and remove the latesimplifycfg pass
This is no-functional-change-intended.

This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and 
D35411 (defer folding unconditional branches) with pass parameters rather than a named
"latesimplifycfg" pass. Now that we have individual options to control the functionality,
we could decouple when these fire (but that's an independent patch if desired). 

The next planned step would be to add another option bit to disable the sinking transform
mentioned in D38566. This should also make it clear that the new pass manager needs to
be updated to limit simplifycfg in the same way as the old pass manager.

Differential Revision: https://reviews.llvm.org/D38631

llvm-svn: 316835
2017-10-28 18:43:07 +00:00
Eugene Zelenko
df1c971e39 [ADT] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316818
2017-10-28 00:24:26 +00:00
David Blaikie
4faf7716f4 Add a few missing headers for modularization/IWYU/etc
Several cases where class definitions are required for DenseMap pointer
traits handling.

llvm-svn: 316803
2017-10-27 22:12:46 +00:00
whitequark
198497f55f [LLVM-C] Publicly expose getters of MetadataType, TokenType
Patch by Robert Widmann.

Expose getters for MetadataType and TokenType publicly in the C API.
Discovered a need for these while trying to wrap the intrinsics API.

Differential Revision: https://reviews.llvm.org/D38809

llvm-svn: 316762
2017-10-27 11:51:40 +00:00
NAKAMURA Takumi
af734a1671 llvm/CodeGen/GlobalISel/InstructionSelectorImpl.h: Fix -fmodules build introduced in rL316715.
llvm-svn: 316743
2017-10-27 05:45:11 +00:00
Sean Fertile
76045150cf Add subclass data to the FoldingSetNode for MemIntrinsicSDNodes.
Not having the subclass data on an MemIntrinsicSDNodes means it was possible
to try to fold 2 nodes with the same operands but differing MMO flags. This
would trip an assertion when trying to refine the alignment between the 2
MachineMemOperands.

Differential Revision: https://reviews.llvm.org/D38898

llvm-svn: 316737
2017-10-27 04:02:51 +00:00
Eugene Zelenko
d1a2d71da9 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316724
2017-10-27 01:09:08 +00:00
David Blaikie
12654d6d71 InstructionSelectorImpl.h: Modularize/remove ODR violations by using a static member function to expose the debug name
llvm-svn: 316715
2017-10-26 23:39:54 +00:00
David Blaikie
93be1aab08 MCCodePadder.h: Include definition of type for use with DenseMap
Pointer traits require a full definition of a type to function
correctly, so the header must be included rather than only a forward
declaration.

llvm-svn: 316714
2017-10-26 23:39:52 +00:00
Aditya Nandakumar
b4f531a783 [GISel]: Missed checking if it's okay to create a G_CONSTANT of DstTy in the legalizationCombiner
llvm-svn: 316694
2017-10-26 20:13:54 +00:00
Sean Fertile
6a96e17cac Represent runtime preemption in the IR.
Currently we do not represent runtime preemption in the IR, which has several
drawbacks:

  1) The semantics of GlobalValues differ depending on the object file format
     you are targeting (as well as the relocation-model and -fPIE value).
  2) We have no way of disabling inlining of run time interposable functions,
     since in the IR we only know if a function is link-time interposable.
     Because of this llvm cannot support elf-interposition semantics.
  3) In LTO builds of executables we will have extra knowledge that a symbol
     resolved to a local definition and can't be preemptable, but have no way to
     propagate that knowledge through the compiler.

This patch adds preemptability specifiers to the IR with the following meaning:

dso_local --> means the compiler may assume the symbol will resolve to a
 definition within the current linkage unit and the symbol may be accessed
 directly even if the definition is not within this compilation unit.

dso_preemptable --> means that the compiler must assume the GlobalValue may be
replaced with a definition from outside the current linkage unit at runtime.

To ease transitioning dso_preemptable is treated as a 'default' in that
low-level codegen will still do the same checks it did previously to see if a
symbol should be accessed indirectly. Eventually when IR producers emit the
specifiers on all Globalvalues we can change dso_preemptable to mean 'always
access indirectly', and remove the current logic.

Differential Revision: https://reviews.llvm.org/D20217

llvm-svn: 316668
2017-10-26 15:00:26 +00:00
Eugene Zelenko
6c6ba843a4 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316630
2017-10-26 00:55:39 +00:00
Jonas Devlieghere
e7361cd27e Re-land "[dwarfdump] Add -lookup option"
Add the option to lookup an address in the debug information and print
out the file, function, block and line table details.

Differential revision: https://reviews.llvm.org/D38409

llvm-svn: 316619
2017-10-25 21:56:41 +00:00
Sanjoy Das
7a6287ec04 Add a comment to clarify a future change
llvm-svn: 316614
2017-10-25 21:40:59 +00:00
Aditya Nandakumar
28d9faf26b Make the combiner check if shifts are legal before creating them
Summary: Make sure shifts are legal/specified by the legalizerinfo before creating it

Reviewers: qcolombet, dsanders, rovka, t.p.northover

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39264

llvm-svn: 316602
2017-10-25 18:49:18 +00:00
Peter Collingbourne
656d1cb74c Remove dead function declaration.
llvm-svn: 316597
2017-10-25 17:42:00 +00:00
Matthew Simpson
05237a905d Add CalledValuePropagation pass
This patch adds a new pass for attaching !callees metadata to indirect call
sites. The pass propagates values to call sites by performing an IPSCCP-like
analysis using the generic sparse propagation solver. For indirect call sites
having a small set of possible callees, the attached metadata indicates what
those callees are. The metadata can be used to facilitate optimizations like
intersecting the function attributes of the possible callees, refining the call
graph, performing indirect call promotion, etc.

Differential Revision: https://reviews.llvm.org/D37355

llvm-svn: 316576
2017-10-25 13:40:08 +00:00
Daniil Fukalov
b62bacb924 [inlineasm] Fix crash when number of matched input constraint operands overflows signed char
In a case when number of output constraint operands that has matched input operands
doesn't fit to signed char, TargetLowering::ParseConstraints() can try to access
ConstraintOperands (that is std::vector) with negative index.

Reviewers: rampitec, arsenm

Differential Review: https://reviews.llvm.org/D39125

llvm-svn: 316574
2017-10-25 12:51:32 +00:00
George Rimar
938c776d94 [llvm-dwarfdump] - Fix array out of bounds access crash.
This fixes possible out of bound access in
DWARFDie::getFirstChild()
which might happen when .debug_info section is corrupted,
like shown in testcase.

Differential revision: https://reviews.llvm.org/D39185

llvm-svn: 316566
2017-10-25 10:23:49 +00:00
Peter Collingbourne
251c0e9aa5 llvm-readobj: Add support for reading relocations in the Android packed format.
This is in preparation for testing lld's upcoming relocation packing
feature (D39152). I have verified that this implementation correctly
unpacks the relocations from a Chromium DSO built with gold and the
Android relocation packer for ARM32 and ARM64.

Differential Revision: https://reviews.llvm.org/D39272

llvm-svn: 316543
2017-10-25 03:37:12 +00:00
Adrian Prantl
74a7e5f6f8 Implement salavageDebugInfo functionality for SelectionDAG.
Similar to how llvm::salvagDebugInfo hooks into InstCombine, this adds
a hook that can be invoked before an SDNode that is associated with an
SDDbgValue is erased to capture the effect of the deleted node in a
DIExpression.

The motivating example is an SDDebugValue attached to an ADD operation
that gets folded into a LOAD+OFFSET operation.

rdar://problem/32121503

llvm-svn: 316525
2017-10-24 22:55:12 +00:00
Sam Clegg
4e1a08d54c Add Triple::isOSUnknown
Subscribers: aheejin

Differential Revision: https://reviews.llvm.org/D39256

llvm-svn: 316524
2017-10-24 22:48:19 +00:00
David Blaikie
f6357ae275 ValueMapper.h: Don't mark header functions as file local
llvm-svn: 316516
2017-10-24 21:29:21 +00:00
David Blaikie
c01b46c73e Transforms/Utils/Local.h: Don't mark header functions as file local
llvm-svn: 316515
2017-10-24 21:29:20 +00:00
David Blaikie
5974607431 TargetOpcodes.h: Don't mark header functions as file local
llvm-svn: 316514
2017-10-24 21:29:19 +00:00
David Blaikie
81c6e7d462 Printable.h: Don't mark header functions as file local
llvm-svn: 316513
2017-10-24 21:29:19 +00:00
David Blaikie
8c783a9730 ConvertUTF.h: Don't mark header functions as file local
llvm-svn: 316512
2017-10-24 21:29:18 +00:00
David Blaikie
2c20cd4637 AtomicOrdering.h: Don't mark header functions as file local
llvm-svn: 316511
2017-10-24 21:29:18 +00:00
David Blaikie
2afa6a2927 LaneBitmask.h: Don't mark header functions as file local
llvm-svn: 316510
2017-10-24 21:29:17 +00:00
David Blaikie
728a415814 Type.h: Don't mark header functions as file local
llvm-svn: 316509
2017-10-24 21:29:16 +00:00
David Blaikie
4f809b1482 RegisterUsageInfo.h: Add missing header for complete type needed for DenseMap traits
llvm-svn: 316504
2017-10-24 21:29:10 +00:00
Simon Pilgrim
c62f67e118 Fix Wdocumentation warning. NFCI.
llvm-svn: 316498
2017-10-24 20:56:09 +00:00
Artem Belevich
dab780c7cb [NVPTX] allow address space inference for volatile loads/stores.
If particular target supports volatile memory access operations, we can
avoid AS casting to generic AS. Currently it's only enabled in NVPTX for
loads and stores that access global & shared AS.

Differential Revision: https://reviews.llvm.org/D39026

llvm-svn: 316495
2017-10-24 20:31:44 +00:00
David Blaikie
463fb86f87 BinaryFormat/MachO.h Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline                                                                   function in a header and also creates binary bloat from duplicate definitions.

llvm-svn: 316477
2017-10-24 17:29:14 +00:00
David Blaikie
86450fc685 ValueTracking.h Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline                                                                   function in a header and also creates binary bloat from duplicate definitions.

llvm-svn: 316476
2017-10-24 17:29:14 +00:00
David Blaikie
610d150edd MemoryBuiltins.h: Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline                                                                   function in a header and also creates binary bloat from duplicate definitions.

llvm-svn: 316475
2017-10-24 17:29:13 +00:00
David Blaikie
32330f0e28 IndirectCallSiteVisitor.h:findIndirectCallSites Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline                                                                   function in a header and also creates binary bloat from duplicate definitions.

llvm-svn: 316474
2017-10-24 17:29:12 +00:00
David Blaikie
592131e872 StringExtras.h Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline                                                                   function in a header and also creates binary bloat from duplicate definitions.

llvm-svn: 316473
2017-10-24 17:29:12 +00:00
David Blaikie
27e1b52093 SmallVector.h:capacity_in_bytes Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline
function in a header and also creates binary bloat from duplicate definitions.

llvm-svn: 316472
2017-10-24 17:29:11 +00:00
David Blaikie
d2b320df06 DenseMap.h:capacity_in_bytes Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline
function in a header and also creates binary bloat from duplicate definitions.

llvm-svn: 316471
2017-10-24 17:29:11 +00:00
David Blaikie
79b38636a9 BitVector.h:capacity_in_bytes Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another
inline function in a header and also creates binary bloat from duplicate
definitions.

llvm-svn: 316470
2017-10-24 17:29:08 +00:00
Marek Olsak
f5ea109f9c AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1)
Summary:
Kill the thread if operand 0 == false.
llvm.amdgcn.wqm.vote can be applied to the operand.

Also allow kill in all shader stages.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D38544

llvm-svn: 316427
2017-10-24 10:27:13 +00:00
Marek Olsak
68ff4e0dd2 AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D38543

llvm-svn: 316426
2017-10-24 10:26:59 +00:00
Sam McCall
420b16c6d1 Support formatv of TimePoint with strftime-style formats.
Summary:
Support formatv of TimePoint with strftime-style formats.

Extensions for millis/micros/nanos are added.
Inital use case is HH:MM:SS.MMM timestamps in clangd logs.

Reviewers: bkramer, ilya-biryukov

Subscribers: labath, llvm-commits

Differential Revision: https://reviews.llvm.org/D38992

llvm-svn: 316419
2017-10-24 08:30:19 +00:00
Bruno Cardoso Lopes
6bb62ee96b [Modules] Add module for Config/llvm-config.h
Besides all the goodness from modularizing a header, this is necessary
to compile ToT with modules with the clang host compiler from Xcode 9 in
macOS 10.13, which our bots don't use yet.

rdar://problem/35038151

llvm-svn: 316414
2017-10-24 06:18:52 +00:00
Omer Paparo Bivas
3f4c58083e [MC] Adding code padding for performance stability - infrastructure. NFC.
Infrastructure designed for padding code with nop instructions in key places such that preformance improvement will be achieved.
The infrastructure is implemented such that the padding is done in the Assembler after the layout is done and all IPs and alignments are known.
This patch by itself in a NFC. Future patches will make use of this infrastructure to implement required policies for code padding.

Reviewers:
aaboud
zvi
craig.topper
gadi.haber

Differential revision: https://reviews.llvm.org/D34393

Change-Id: I92110d0c0a757080a8405636914a93ef6f8ad00e
llvm-svn: 316413
2017-10-24 06:16:03 +00:00
Bob Haarman
01985f083c [raw_fd_ostream] report actual error in error messages
Summary:
Previously, we would emit error messages like "IO failure on output
stream". This change causes use to include information about what
actually went wrong, e.g. "No space left on device".

Reviewers: sunfish, rnk

Reviewed By: rnk

Subscribers: mehdi_amini, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D39203

llvm-svn: 316404
2017-10-24 01:26:22 +00:00
Reid Kleckner
eb9ed33777 [codeview] Add support for inlinee lists
This adds type index discovery and dumper support for symbol record kind
0x1168, which is a list of inlined function ids. This symbol kind is
undocumented, but S_INLINEES is consistent with the existing
nomenclature.

Fixes PR34222

llvm-svn: 316398
2017-10-23 23:43:40 +00:00
Justin Lebar
666ae11aa6 [PM] Fix Typo
Patch by Nick Sarnie.

llvm-svn: 316397
2017-10-23 23:42:05 +00:00
Bob Wilson
45f054feaf Add a new Simulator entry for the target triple environment.
Apple's iOS, tvOS and watchOS simulator platforms have never been clearly
distinguished in the target triples. Even though they are intended to
behave similarly to the corresponding device platforms, they have separate
SDKs and are really separate platforms from the compiler's perspective.
Clang now defines a macro when building for one of these simulator platforms
(r297866) but that relies on the very indirect mechanism of checking to see
which option was used to specify the minimum deployment target. That is not
so great. Swift would also like to distinguish these simulator platforms in
a similar way, but unlike Clang, Swift does not use a separate option to
specify the minimum deployment target -- it uses a -target option to
specify the target triple directly, including the OS version number.
Using a different target triple for the simulator platforms is a much
more direct and obvious way to specify this. Putting the "simulator" in
the environment component of the triple means the OS values can stay the
same and existing code the looks at the OS field will not be affected.

https://reviews.llvm.org/D39143
rdar://problem/34729432

llvm-svn: 316380
2017-10-23 21:51:50 +00:00
Daniel Sanders
572038831c [globalisel][tablegen] Import stores and allow GISel to automatically substitute zero regs like WZR/XZR/$zero.
This patch enables the import of stores. Unfortunately, doing so by itself,
loses an optimization where storing 0 to memory makes use of WZR/XZR.

To mitigate this, this patch also introduces a new feature that allows register
operands to nominate a zero register. When this is done, GlobalISel will
substitute (G_CONSTANT 0) with the nominated register automatically. This
is currently configured to only apply to the stores.

Applying it to GPR32/GPR64 register classes in general will be done after
review see (https://reviews.llvm.org/D39150).

llvm-svn: 316360
2017-10-23 18:19:24 +00:00
Sam McCall
cde259879b Support formatting formatv_objects.
Summary:
Support formatting formatv_objects.

While here, fix documentation about member-formatters, and attempted
perfect-forwarding (I think).

Reviewers: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38997

llvm-svn: 316330
2017-10-23 15:40:44 +00:00
George Rimar
b702d7b84c [llvm-dwarfdump] - Teach tool about few GNU call_sites constants.
This teaches tool about following consants: 
DW_TAG_GNU_call_site,
DW_TAG_GNU_call_site_parameter,
DW_AT_GNU_call_site_value,
DW_AT_GNU_all_call_sites.

Constants documented here: https://sourceware.org/elfutils/DwarfExtensions

Differential revision: https://reviews.llvm.org/D39119

llvm-svn: 316321
2017-10-23 11:24:14 +00:00
Benjamin Kramer
ad3ad3c311 Create fewer copies of StringMaps. No functionality change intended.
llvm-svn: 316301
2017-10-22 20:16:28 +00:00
Sanjay Patel
aabdcd8b1c [SimplifyCFG] delay switch condition forwarding to -latesimplifycfg
As discussed in D39011:
https://reviews.llvm.org/D39011
...replacing constants with a variable is inverting the transform done
by other IR passes, so we definitely don't want to do this early. 
In fact, it's questionable whether this transform belongs in SimplifyCFG 
at all. I'll look at moving this to codegen as a follow-up step.

llvm-svn: 316298
2017-10-22 19:10:07 +00:00
Marina Yatsina
b5c80eef49 Add logic to greedy reg alloc to avoid bad eviction chains
This fixes bugzilla 26810
https://bugs.llvm.org/show_bug.cgi?id=26810

This is intended to prevent sequences like:
movl %ebp, 8(%esp) # 4-byte Spill
movl %ecx, %ebp
movl %ebx, %ecx
movl %edi, %ebx
movl %edx, %edi
cltd
idivl %esi
movl %edi, %edx
movl %ebx, %edi
movl %ecx, %ebx
movl %ebp, %ecx
movl 16(%esp), %ebp # 4 - byte Reload

Such sequences are created in 2 scenarios:

Scenario #1:
vreg0 is evicted from physreg0 by vreg1
Evictee vreg0 is intended for region splitting with split candidate physreg0 (the reg vreg0 was evicted from)
Region splitting creates a local interval because of interference with the evictor vreg1 (normally region spliiting creates 2 interval, the "by reg" and "by stack" intervals. Local interval created when interference occurs.)
one of the split intervals ends up evicting vreg2 from physreg1
Evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills

Scenario #2
vreg0 is evicted from physreg0 by vreg1
vreg2 is evicted from physreg2 by vreg3 etc
Evictee vreg0 is intended for region splitting with split candidate physreg1
Region splitting creates a local interval because of interference with the evictor vreg1
one of the split intervals ends up evicting back original evictor vreg1 from physreg0 (the reg vreg0 was evicted from)
Another evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills

As compile time was a concern, I've added a flag to control weather we do cost calculations for local intervals we expect to be created (it's on by default for X86 target, off for the rest).

Differential Revision: https://reviews.llvm.org/D35816

Change-Id: Id9411ff7bbb845463d289ba2ae97737a1ee7cc39
llvm-svn: 316295
2017-10-22 17:59:38 +00:00
Eugene Zelenko
e08f7dbc57 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316253
2017-10-21 00:57:46 +00:00
Krzysztof Parzyszek
231a63268d [Packetizer] Add function to check for aliasing between instructions
llvm-svn: 316243
2017-10-20 22:08:40 +00:00
Eugene Zelenko
ef120e7012 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316241
2017-10-20 21:47:29 +00:00
Sam Clegg
a4980eecee [WebAssembly] MC: Fix crash when -g specified.
At this point we don't output any debug sections or thier
relocations.

Differential Revision: https://reviews.llvm.org/D39076

llvm-svn: 316240
2017-10-20 21:28:38 +00:00
Daniel Sanders
ff8715aa62 [globalisel][tablegen] Fix small spelling nits. NFC
ComplexRendererFn -> ComplexRendererFns
Corrected a couple lingering references to tied operands that were missed.

llvm-svn: 316237
2017-10-20 20:55:29 +00:00
Peter Collingbourne
3760a48055 COFF: Add type server pdb files to linkrepro tar file.
Differential Revision: https://reviews.llvm.org/D38977

llvm-svn: 316233
2017-10-20 19:48:26 +00:00
Lang Hames
1d2150fbc9 [ExecutionEngine] After a heroic dev-meeting hack session, the JIT supports TLS.
Turns on EmulatedTLS support by default in EngineBuilder. ;)

llvm-svn: 316200
2017-10-20 00:53:16 +00:00
Eugene Zelenko
d9ca107b8e [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316187
2017-10-19 21:21:30 +00:00
Sam McCall
c45bdbcd89 Const fix for YAMLParser.
llvm-svn: 316151
2017-10-19 08:13:49 +00:00
whitequark
71ed228afe [MergeFunctions] Don't blindly RAUW a GlobalValue with a ConstantExpr.
MergeFunctions uses (through FunctionComparator) a map of GlobalValues
to identifiers because it needs to compare functions and globals
do not have an inherent total order. Thus, FunctionComparator
(through GlobalNumberState) has a ValueMap<GlobalValue *>.

r315852 added a RAUW on globals that may have been previously
encountered by the FunctionComparator, which would replace
a GlobalValue * key with a ConstantExpr *, which is illegal.

This commit adjusts that code path to remove the function being
replaced from the ValueMap as well.

llvm-svn: 316145
2017-10-19 04:47:48 +00:00
Vedant Kumar
f1c5f8681a [llvm-cov] Move LineCoverageIterator to libCoverage. NFC.
LineCoverageIterator makes it easy for clients of coverage data to
determine line execution counts for a file or function. The coverage
iteration logic is tricky enough that it really pays not to have
multiple copies of it. Hopefully having just one implementation in LLVM
will make the iteration logic easier to test, reuse, and update.

This commit is NFC but I've added a unit test to go along with it just
because it's easy to do now.

llvm-svn: 316141
2017-10-18 23:58:28 +00:00
Sanjoy Das
d41e33114f Revert "[ScalarEvolution] Handling for ICmp occuring in the evolution chain."
This reverts commit r316054.  There was some confusion over the review process:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171016/495884.html

llvm-svn: 316129
2017-10-18 22:00:57 +00:00
Eugene Zelenko
74914fb100 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316128
2017-10-18 21:46:47 +00:00
Konstantin Zhuravlyov
762f1d9a9a AMDGPU: Rename MaxFlatWorkgroupSize to MaxFlatWorkGroupSize for consistency
Differential Revision: https://reviews.llvm.org/D38957

llvm-svn: 316097
2017-10-18 17:31:09 +00:00
Jatin Bhateja
02e129e082 [ScalarEvolution] Handling for ICmp occuring in the evolution chain.
Summary:
 If a compare instruction is same or inverse of the compare in the
 branch of the loop latch, then return a constant evolution node.
 Currently scope of evaluation is limited to SCEV computation for
 PHI nodes.

 This shall facilitate computations of loop exit counts in cases
 where compare appears in the evolution chain of induction variables.

 Will fix PR 34538
Reviewers: sanjoy, hfinkel, junryoungju

Reviewed By: junryoungju

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38494

llvm-svn: 316054
2017-10-18 01:36:16 +00:00
Michael Zolotukhin
bcf5523bc8 [GlobalDCE] Use DenseMap instead of unordered_multimap for GVDependencies.
Summary:
std::unordered_multimap happens to be very slow when the number of elements
grows large. On one of our internal applications we observed a 17x compile time
improvement from changing it to DenseMap.

Reviewers: mehdi_amini, serge-sans-paille, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38916

llvm-svn: 316045
2017-10-17 23:47:06 +00:00
Eugene Zelenko
66256b9583 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316034
2017-10-17 21:27:42 +00:00
Sam McCall
a8580da010 Fix pthread_[gs]etname_np detection
llvm-svn: 316005
2017-10-17 15:32:17 +00:00
Yichao Yu
5ecae3bc91 Fix FaultMaps crash when the out streamer is reused
Summary:
Make sure the map is cleared before processing a new module. Similar to what is done on `StackMaps`.

This issue is similar to D38588, though this time for FaultMaps (on x86) rather than ARM/AArch64. Other than possible mixing of information between modules, the crash is caused by the pointers values in the map that was allocated by the bump pointer allocator that is unwinded when emitting the next file. This issue has been around since 3.8.

This issue is likely much harder to write a test for since AFAICT it requires emitting something much more compilcated (and possibly real code) instead of just some random bytes.

Reviewers: skatkov, sanjoy

Reviewed By: skatkov, sanjoy

Subscribers: sanjoy, aemerson, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D38924

llvm-svn: 315990
2017-10-17 11:44:34 +00:00
Philip Reames
3a2efccab1 Revert 315440 on behalf of mkazantsev
This patch reverts rL315440 because of the bug described at
https://bugs.llvm.org/show_bug.cgi?id=34937

The fix for the bug is on review as D38944, but not yet ready.  Given this is a regression reverting until a fix is ready is called for.

Max would have done the revert himself, but is having trouble doing a build of fresh LLVM for some reason.  I did the build and test to ensure the revert worked as expected on his behalf.

llvm-svn: 315974
2017-10-17 06:21:07 +00:00
Sanjoy Das
e62cbedf0c Revert "[SCEV] Maintain and use a loop->loop invalidation dependency"
This reverts commit r315713.  It causes PR34968.

I think I know what the problem is, but I don't think I'll have time to fix it
this week.

llvm-svn: 315962
2017-10-17 01:03:56 +00:00
Matthew Simpson
88ecaf14af Add !callees metadata
This patch adds a new kind of metadata that indicates the possible callees of
indirect calls.

Differential Revision: https://reviews.llvm.org/D37354

llvm-svn: 315944
2017-10-16 22:22:11 +00:00
Eugene Zelenko
584203c7eb [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 315940
2017-10-16 21:34:24 +00:00
Krzysztof Parzyszek
23820a3959 Replace make_range in MachineRegisterInfo with ArrayRef, NFC
llvm-svn: 315938
2017-10-16 21:19:40 +00:00
Tony Tye
9d875a6ad4 Add base relative relocation record that can be used for the following case (OpenCL example):
static __global int Var = 0; 
__global int* Ptr[] = {&Var};
...

In this case Var is a non premptable symbol and so its address can be used as the value of Ptr, with a base relative relocation that will add the delta between the ELF address and the actual load address. Such relocations do not require a symbol.

Differential Revision: https://reviews.llvm.org/D38909

llvm-svn: 315935
2017-10-16 20:44:29 +00:00
Krzysztof Parzyszek
7791aebbb3 Add iterator range MachineRegisterInfo::liveins(), adopt users, NFC
llvm-svn: 315927
2017-10-16 19:08:41 +00:00
Anna Thomas
1811e7f3ce [SCEV] Rename getMaxBECount and update comments. NFC
Post commit review comments at D38825.

llvm-svn: 315920
2017-10-16 17:47:17 +00:00
Matthew Simpson
b85949d044 [SparsePropagation] Enable interprocedural analysis
This patch adds the ability to perform IPSCCP-like interprocedural analysis to
the generic sparse propagation solver. The patch gives clients the ability to
define their own custom LatticeKey types that the generic solver maps to custom
LatticeVal types. The custom lattice keys can be used, for example, to
distinguish among mappings for regular values, values returned from functions,
and values stored in global variables. Clients are responsible for defining how
to convert between LatticeKeys and LLVM Values by providing a specialization of
the LatticeKeyInfo template.

The added unit tests demonstrate how the generic solver can be used to perform
a simplified version of interprocedural constant propagation.

Differential Revision: https://reviews.llvm.org/D37353

llvm-svn: 315919
2017-10-16 17:44:17 +00:00
Andrew V. Tischenko
1787c97059 This patch is a result of D37262: The issues with X86 prefixes. It closes PR7709, PR17697, PR19251, PR32809 and PR21640. There could be other bugs closed by this patch.
llvm-svn: 315899
2017-10-16 11:14:29 +00:00
Daniel Sanders
83019be957 Re-commit r315885: [globalisel][tblgen] Add support for iPTR and implement am_unscaled* and am_indexed*
Summary:
iPTR is a pointer of subtarget-specific size to any address space. Therefore
type checks on this size derive the SizeInBits from a subtarget hook.

At this point, we can import the simplests G_LOAD rules and select load
instructions using them. Further patches will support for the predicates to
enable additional loads as well as the stores.

The previous commit failed on MSVC due to a failure to convert an
initializer_list to a std::vector. Hopefully, MSVC will accept this version.

Depends on D37457

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37458

llvm-svn: 315887
2017-10-16 03:36:29 +00:00
Daniel Sanders
281e3744f7 Revert r315885: [globalisel][tblgen] Add support for iPTR and implement am_unscaled* and am_indexed*
MSVC doesn't like one of the constructors.

llvm-svn: 315886
2017-10-16 02:15:39 +00:00
Daniel Sanders
2b6560d286 [globalisel][tblgen] Add support for iPTR and implement am_unscaled* and am_indexed*
Summary:
iPTR is a pointer of subtarget-specific size to any address space. Therefore
type checks on this size derive the SizeInBits from a subtarget hook.

At this point, we can import the simplests G_LOAD rules and select load
instructions using them. Further patches will support for the predicates to
enable additional loads as well as the stores.

Depends on D37457

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37458

llvm-svn: 315885
2017-10-16 01:16:35 +00:00
Daniel Sanders
74e6f55971 [globalisel][tablegen] Implement unindexed load, non-extending load, and MemVT checks
Summary:
This includes some context-sensitivity in the MVT to LLT conversion so that
pointer types are tested correctly.
FIXME: I'm not happy with the way this is done since everything is a
       special-case. I've yet to find a reasonable way to implement it.

select-load.mir fails because <1 x s64> loads in tablegen get priority over s64
loads. This is fixed in the next patch and as such they should be committed
together, I've posted them separately to help with the review.

Depends on D37456

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37457

llvm-svn: 315884
2017-10-16 00:56:30 +00:00
Daniel Sanders
55b609f419 Re-commit r315863: [globalisel][tablegen] Import ComplexPattern when used as an operator
Summary:
It's possible for a ComplexPattern to be used as an operator in a match
pattern. This is used by the load/store patterns in AArch64 to name the
suboperands returned by ComplexPattern predicate so that they can be broken
apart and referenced independently in the result pattern.

This patch adds support for this in order to enable the import of load/store
patterns.

Depends on D37445

Hopefully fixed the ambiguous constructor that a large number of bots reported.

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D37456

llvm-svn: 315869
2017-10-15 18:22:54 +00:00
Daniel Sanders
11bf61fbdb Revert r315863: [globalisel][tablegen] Import ComplexPattern when used as an operator
A large number of bots are failing on an ambiguous constructor call.

llvm-svn: 315866
2017-10-15 17:51:07 +00:00
Daniel Sanders
a7247bc07b [globalisel][tablegen] Import ComplexPattern when used as an operator
Summary:
It's possible for a ComplexPattern to be used as an operator in a match
pattern. This is used by the load/store patterns in AArch64 to name the
suboperands returned by ComplexPattern predicate so that they can be broken
apart and referenced independently in the result pattern.

This patch adds support for this in order to enable the import of load/store
patterns.

Depends on D37445

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D37456

llvm-svn: 315863
2017-10-15 17:03:36 +00:00
Aaron Ballman
1dbcb12601 Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people.
Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1

llvm-svn: 315854
2017-10-15 14:32:27 +00:00
Hongbin Zheng
467ee2c357 [LoopInfo][Refactor] Make SetLoopAlreadyUnrolled a member function of the Loop Pass, NFC.
This avoid code duplication and allow us to add the disable unroll metadata elsewhere.

Differential Revision: https://reviews.llvm.org/D38928

llvm-svn: 315850
2017-10-15 07:31:02 +00:00
Daniel Sanders
ca098ff70b [globalisel][tablegen] Map ld and st to G_LOAD and G_STORE. NFC
Summary:
There is an important mismatch between ISD::LOAD and G_LOAD (and likewise for
ISD::STORE and G_STORE). In SelectionDAG, ISD::LOAD is a non-atomic load
and atomic loads are handled by a separate node. However, this is not true of
GlobalISel's G_LOAD. For G_LOAD, the MachineMemOperand indicates the atomicity
of the operation. As a result, this mapping must also add a predicate that
checks for non-atomic MachineMemOperands.

This is NFC since these nodes always have predicates in practice and are
therefore always rejected at the moment.

Depends on D37443

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: kristof.beyls, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37445

llvm-svn: 315843
2017-10-15 02:41:12 +00:00
Daniel Sanders
fc690adc20 [tablegen] Handle common load/store predicates inside tablegen. NFC.
Summary:
GlobalISel and SelectionDAG require different code for the common
load/store predicates due to differences in the representation.
For example:
   SelectionDAG: (load<signext,i8>:i32 GPR32:$addr) // The <> denote properties of the SDNode that are not printed in the DAG
   GlobalISel: (G_SEXT:s32 (G_LOAD:s8 GPR32:$addr))
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.

This patch moves the implementation of the common load/store predicates
into tablegen so that it can handle these differences.

It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.

Depends on D36618

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Subscribers: llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37443

Includes a partial revert of r315826 since this patch makes it necessary for
getPredCode() to return a std::string and getImmCode() should have the same
interface as getPredCode().

llvm-svn: 315841
2017-10-15 02:06:44 +00:00
Konstantin Zhuravlyov
6ed2d5b556 AMDGPU: Add AMDGPU HSA Kernel Descriptor
- Update docs to match llvm coding style
  - Add missing FP16_OVFL bit for gfx9
  - Fix the size of the kernel descriptor in the docs

Differential Revision: https://reviews.llvm.org/D38902

llvm-svn: 315822
2017-10-14 19:17:08 +00:00
Konstantin Zhuravlyov
d68ff516d3 AMDGPU: Bring HSA metadata on par with the specification
Differential Revision: https://reviews.llvm.org/D38753

llvm-svn: 315821
2017-10-14 19:03:51 +00:00
Jakub Kuderski
2aa97100da [Dominators] Remove the NCA check
Summary:
This patch removes the `verifyNCD` check.

The reason for this is that the other checks are sufficient to prove or disprove correctness of any DominatorTree, and that `verifyNCD` doesn't provide (in my option) better error messages then the other ones.
Additionally, this should give a (small) improvement to the total verification time, as the check is O(n), and checking the sibling property takes O(n^3).

Reviewers: dberlin, grosser, davide, brzycki

Reviewed By: brzycki

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38802

llvm-svn: 315790
2017-10-14 03:00:56 +00:00
Daniel Sanders
7f4765cd80 [globalisel][tablegen] Fix an unused variable warning caused by a typo (corrected OtherInsnID->OtherOpIdx).
The tests were passing by luck since the instruction ID and operand index happened to be the same.

llvm-svn: 315788
2017-10-14 01:51:46 +00:00
Daniel Sanders
5b568c3f23 [globalisel][tablegen] Fix undefined references to dump()
Two debugging statements snuck into the commit.

llvm-svn: 315783
2017-10-14 00:56:01 +00:00
Daniel Sanders
d400a290e5 [globalisel][tablegen] Simplify named operand/operator lookups and fix a wrong-code bug this revealed.
Summary:
Operand variable lookups are now performed by the RuleMatcher rather than
searching the whole matcher hierarchy for a match. This revealed a wrong-code
bug that currently affects ARM and X86 where patterns that use a variable more
than once in the match pattern will be imported but won't check that the
operands are identical. This can cause the tablegen-erated matcher to
accept matches that should be rejected.

Depends on D36569

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Subscribers: aemerson, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D36618

llvm-svn: 315780
2017-10-14 00:31:58 +00:00
Daniel Sanders
eb33df745e [globalisel][tablegen] Add support for fpimm and import of APInt/APFloat based ImmLeaf.
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D36569

llvm-svn: 315761
2017-10-13 21:28:03 +00:00
Eugene Zelenko
ffefeab297 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 315760
2017-10-13 21:17:07 +00:00
Quentin Colombet
1a1679d9f5 [RegisterBankInfo] Cache the getMinimalPhysRegClass information
TargetRegisterInfo::getMinimalPhysRegClass is actually pretty expensive
because it has to iterate over all the register classes.
Cache this information as we need and get it so that we limit its usage.
Right now, we heavily rely on it, because this is how we get the mapping
for vregs defined by copies from physreg (i.e., the one that are ABI
related).

Improve compile time by up to 10% for that pass.

NFC

llvm-svn: 315759
2017-10-13 21:16:15 +00:00
Daniel Sanders
4c2aa5c732 [aarch64] Support APInt and APFloat in ImmLeaf subclasses and make AArch64 use them.
Summary:
The purpose of this patch is to expose more information about ImmLeaf-like
PatLeaf's so that GlobalISel can learn to import them. Previously, ImmLeaf
could only be used to test int64_t's produced by sign-extending an APInt.
Other tests on immediates had to use the generic PatLeaf and extract the
constant using C++.

With this patch, tablegen will know how to generate predicates for APInt,
and APFloat. This will allow it to 'do the right thing' for both SelectionDAG
and GlobalISel which require different methods of extracting the immediate
from the IR.

This is NFC for SelectionDAG since the new code is equivalent to the
previous code. It's also NFC for FastISel because FastIselShouldIgnore is 1
for the ImmLeaf subclasses. Enabling FastIselShouldIgnore == 0 for these new
subclasses will require a significant re-factor of FastISel.

For GlobalISel, it's currently NFC because the relevant code to import the
affected rules is not yet present. This will be added in a later patch.

Depends on D36086

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: bjope, aemerson, rengolin, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D36534

llvm-svn: 315747
2017-10-13 20:42:18 +00:00
Benjamin Kramer
a3c69c204d [SmallPtrSet] Add iterator epoch tracking.
This will detect invalid iterators when ABI breaking checks are enabled.

llvm-svn: 315746
2017-10-13 20:37:52 +00:00
Matt Arsenault
1086f79305 DAG: Add opcode and source type to isFPExtFree
This is only currently used for mad/fma transforms.
This is the only case where it should be used for AMDGPU,
so add an opcode to be sure.

llvm-svn: 315740
2017-10-13 19:55:45 +00:00
Sanjay Patel
9bac033ffa [LLVMCore] fix description for OverflowingBinaryOperator; NFC
llvm-svn: 315726
2017-10-13 18:25:23 +00:00
Matthew Simpson
013daef6ec [IPSCCP] Move common functions to ValueLatticeUtils (NFC)
This patch moves some common utility functions out of IPSCCP and makes them
available globally. The functions determine if interprocedural data-flow
analyses can propagate information through function returns, arguments, and
global variables.

Differential Revision: https://reviews.llvm.org/D37638

llvm-svn: 315719
2017-10-13 17:53:44 +00:00
Sanjoy Das
7341e7e3ca [SCEV] Maintain and use a loop->loop invalidation dependency
Summary:
This change uses the loop use list added in the previous change to remember the
loops that appear in the trip count expressions of other loops; and uses it in
forgetLoop.  This lets us not scan every loop in the function on a forgetLoop
call.

With this change we no longer invalidate clear out backedge taken counts on
forgetValue.  I think this is fine -- the contract is that SCEV users must call
forgetLoop(L) if their change to the IR could have changed the trip count of L;
solely calling forgetValue on a value feeding into the backedge condition of L
is not enough.  Moreover, I don't think we can strengthen forgetValue to be
sufficient for invalidating trip counts without significantly re-architecting
SCEV.  For instance, if we have the loop:

  I = *Ptr;
  E = I + 10;
  do {
    // ...
  } while (++I != E);

then the backedge taken count of the loop is 9, and it has no reference to
either I or E, i.e. there is no way in SCEV today to re-discover the dependency
of the loop's trip count on E or I.  So a SCEV client cannot change E to (say)
"I + 20", call forgetValue(E) and expect the loop's trip count to be updated.

Reviewers: atrick, sunfish, mkazantsev

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38435

llvm-svn: 315713
2017-10-13 17:13:44 +00:00
Anna Thomas
b4afc533f7 [SCEV] Teach SCEV to find maxBECount when loop endbound is variant
Summary:
This patch teaches SCEV to calculate the maxBECount when the end bound
of the loop can vary. Note that we cannot calculate the exactBECount.

This will only be done when both conditions are satisfied:
1. the loop termination condition is strictly LT.
2. the IV is proven to not overflow.

This provides more information to users of SCEV and can be used to
improve identification of finite loops.

Reviewers: sanjoy, mkazantsev, silviu.baranga, atrick

Reviewed by: mkazantsev

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38825

llvm-svn: 315683
2017-10-13 14:30:43 +00:00
Daniel Jasper
3863a9d4f9 Revert r314923: "Recommit : Use the basic cost if a GEP is not used as addressing mode"
Significantly reduces performancei (~30%) of gipfeli
(https://github.com/google/gipfeli)

I have not yet managed to reproduce this regression with the open-source
version of the benchmark on github, but will work with others to get a
reproducer to you later today.

llvm-svn: 315680
2017-10-13 14:04:21 +00:00
Sanjoy Das
9cb64b47cf [SCEV] Maintain loop use lists, and use them in forgetLoop
Summary:
Currently we do not correctly invalidate memoized results for add recurrences
that were created directly (i.e. they were not created from a `Value`).  This
change fixes this by keeping loop use lists and using the loop use lists to
determine which SCEV expressions to invalidate.

Here are some statistics on the number of uses of in the use lists of all loops
on a clang bootstrap (config: release, no asserts):

     Count: 731310
       Min: 1
      Mean: 8.555150
50th %time: 4
95th %tile: 25
99th %tile: 53
       Max: 433

Reviewers: atrick, sunfish, mkazantsev

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38434

llvm-svn: 315672
2017-10-13 05:50:52 +00:00
Matt Morehouse
e184e29f2e [llvm-isel-fuzzer] Use "--" as separator rather than '='.
Summary: OSS-Fuzz doesn't support '=' in filenames.

Reviewers: bogner, kcc

Reviewed By: kcc

Subscribers: javed.absar, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D38866

llvm-svn: 315647
2017-10-13 00:18:32 +00:00
Adam Nemet
3799746ee9 Add DK_Remark to SMDiagnostic
Swift uses SMDiagnostic for diagnostic messages. For
https://github.com/apple/swift/pull/12294, we need remark support.

I picked the color that clang uses to display them.

Differential Revision: https://reviews.llvm.org/D38865

llvm-svn: 315642
2017-10-12 23:56:02 +00:00
Craig Topper
bbfe102642 [SelectionDAG] Const-correct the DemandedMask argument to one of the overloads of SimplifyDemandedBits. NFC
llvm-svn: 315641
2017-10-12 23:46:05 +00:00
Eugene Zelenko
16fc947602 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 315640
2017-10-12 23:30:03 +00:00
Matthias Braun
6cdc04a20c Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine"
Reverting to investigate layering effects of MCJIT not linking
libCodeGen but using TargetMachine::getNameWithPrefix() breaking the
lldb bots.

This reverts commit r315633.

llvm-svn: 315637
2017-10-12 22:57:28 +00:00
Matthias Braun
e1c491ab05 TargetMachine: Merge TargetMachine and LLVMTargetMachine
Merge LLVMTargetMachine into TargetMachine.

- There is no in-tree target anymore that just implements TargetMachine
  but not LLVMTargetMachine.
- It should still be possible to stub out all the various functions in
  case a target does not want to use lib/CodeGen
- This simplifies the code and avoids methods ending up in the wrong
  interface.

Differential Revision: https://reviews.llvm.org/D38489

llvm-svn: 315633
2017-10-12 22:28:54 +00:00
Craig Topper
bb29109c88 [X86] Add CLWB intrinsic. llvm part
llvm-svn: 315613
2017-10-12 20:08:31 +00:00
Wei Ding
725c541dbe Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.
Differential Revision: http://reviews.llvm.org/D37348

llvm-svn: 315610
2017-10-12 19:37:14 +00:00
Konstantin Zhuravlyov
2c85ba5119 AMDGPU/NFC: Move AMDGPU specific note types to ELF.h
Differential Revision: https://reviews.llvm.org/D38747

llvm-svn: 315608
2017-10-12 18:59:54 +00:00
Artem Belevich
848056d8ad [NVPTX] Implemented wmma intrinsics and instructions.
WMMA = "Warp Level Matrix Multiply-Accumulate".
These are the new instructions introduced in PTX6.0 and available
on sm_70 GPUs.

Differential Revision: https://reviews.llvm.org/D38645

llvm-svn: 315601
2017-10-12 18:27:55 +00:00
Don Hinton
16622c817e [dump] Remove NDEBUG from test to enable dump methods [NFC]
Summary:
Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with
LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP.

Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods.

Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so
it'll be picked up by public headers.

Differential Revision: https://reviews.llvm.org/D38406

llvm-svn: 315590
2017-10-12 16:16:06 +00:00