1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

88222 Commits

Author SHA1 Message Date
Derek Schuff
cd46ca19fb [WebAssembly] Stackify code emitted by eliminateFrameIndex and SP writeback
Summary:
MRI::eliminateFrameIndex can emit several instructions to do address
calculations; these can usually be stackified. Because instructions with
FI operands can have subsequent operands which may be expression trees,
find the top of the leftmost tree and insert the code before it, to keep
the LIFO property.

Also use stackified registers when writing back the SP value to memory
in the epilog; it's unnecessary because SP will not be used after the
epilog, and it results in better code.

Differential Revision: http://reviews.llvm.org/D18234

llvm-svn: 263725
2016-03-17 17:00:29 +00:00
David Majnemer
1cd89cdff1 [COFF] Refactor section alignment calculation
Section alignment isn't completely trivial, let it live in one place so
that we may reuse it in LLVM.

llvm-svn: 263722
2016-03-17 16:55:18 +00:00
David Majnemer
3542a361ce Forgot to commit this with r263692
llvm-svn: 263721
2016-03-17 16:55:11 +00:00
Changpeng Fang
7f678be718 AMDGPU/SI: Do not generate s_waitcnt after ds_permute/ds_bpermute
Symmary:
  ds_permute/ds_bpermute do not read memory so s_waitcnt is not needed.

Reviewers
  arsenm, tstellarAMD

Subscribers
  llvm-commits, arsenm

Differential Revision:
  http://reviews.llvm.org/D18197

llvm-svn: 263720
2016-03-17 16:43:50 +00:00
Nicolai Haehnle
b24081a638 AMDGPU: mark atomic instructions as sources of divergence
Summary:
As explained by the comment, threads will typically see different values
returned by atomic instructions even if the arguments are equal.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18156

llvm-svn: 263719
2016-03-17 16:21:59 +00:00
Simon Pilgrim
f9f3d37f61 [X86][SSE] Simplified blend-with-zero combining
We were being too aggressive in trying to combine a shuffle into a blend-with-zero pattern, often resulting in a endless loop of contrasting combines

This patch stops the combine if we already have a blend in place (means we miss some domain corrections)

llvm-svn: 263717
2016-03-17 15:59:36 +00:00
Sanjay Patel
1aa6c6adf1 propagate 'unpredictable' metadata on select instructions
This is similar to D18133 where we allowed profile weights on select instructions. 
This extends that change to also allow the 'unpredictable' attribute of branches to apply to selects.

A test to check that 'unpredictable' metadata is preserved when cloning instructions was checked in at:
http://reviews.llvm.org/rL263648

Differential Revision: http://reviews.llvm.org/D18220

llvm-svn: 263716
2016-03-17 15:30:52 +00:00
Saleem Abdulrasool
a3ae5ba6d2 ARM: Revert SVN r253865, 254158, fix windows division
The two changes together weakened the test and caused a regression with division
handling in MSVC mode.  They were applied to avoid an assertion being triggered
in the block frequency analysis.  However, the underlying problem was simply
being masked rather than solved properly.  Address the actual underlying problem
and revert the changes.  Rather than analyze the cause of the assertion, the
division failure was assumed to be an overflow.

The underlying issue was a subtle bug in the BB construction in the emission of
the div-by-zero check (WIN__DBZCHK).  We did not construct the proper successor
information in the basic blocks, nor did we update the PHIs associated with the
basic block when we split them.  This would result in assertions being triggered
in the block frequency analysis pass.

Although the original tests are being removed, the tests themselves performed
very little in terms of validation but merely tested that we did not assert when
generating code.  Update this with new tests that actually ensure that we do not
regress on the code generation.

llvm-svn: 263714
2016-03-17 14:10:49 +00:00
Simon Atanasyan
ac8aa5d9e9 [mips] Use formatImm call to print immediate value in the MipsInstPrinter
That allows, for example, to print hex-formatted immediates using
llvm-objdump --print-imm-hex command line option.

Differential Revision: http://reviews.llvm.org/D18195

llvm-svn: 263704
2016-03-17 10:43:36 +00:00
Scott Egerton
805aef719b [mips] Eliminate instances of "potentially uninitialised local variable" warnings, NFC
Summary:
This should eliminate all occurrences of this within LLVMMipsAsmParser.
This patch is in response to http://reviews.llvm.org/D17983. I was unable
to reproduce the warnings on my machine so please advise if this fixes the
warnings.

Reviewers: ariccio, vkalintiris, dsanders

Subscribers: dblaikie, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D18087

llvm-svn: 263703
2016-03-17 10:37:51 +00:00
Sanjoy Das
3634ad5d88 [Statepoints] Separate out logic for statepoint directives; NFC
This splits out the logic that maps the `"statepoint-id"` attribute into
the actual statepoint ID, and the `"statepoint-num-patch-bytes"`
attribute into the number of patchable bytes the statpeoint is lowered
into.  The new home of this logic is in IR/Statepoint.cpp, and this
refactoring will support similar functionality when lowering calls with
deopt operand bundles in the future.

llvm-svn: 263685
2016-03-17 01:56:10 +00:00
Sanjoy Das
ae5062e675 [Statepoints] Minor NFC cleanups
Mostly code simplifcations, and bringing up IR/Statepoints.cpp up to
LLVM coding style.

llvm-svn: 263683
2016-03-17 00:47:18 +00:00
Sanjoy Das
c5c53e4e0f [SelectionDAG] Remove visitStatepoint; NFC
This way we have a single entry point into StatepointLowering.  The
method was a direct dispatch to LowerStatepoint anyway.

llvm-svn: 263682
2016-03-17 00:47:14 +00:00
Chris Bieneman
b6bde406c3 Upgrade TBAA *before* upgrading intrinsics
Summary: If TBAA is on an intrinsic and it gets upgraded and drops the TBAA we hit an odd assert. We should just upgrade the TBAA first because it doesn't have side-effects.

Reviewers: reames, apilipenko, manmanren

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18229

llvm-svn: 263673
2016-03-16 23:17:54 +00:00
Sanjoy Das
990420994e Fix indentation; NFC
llvm-svn: 263672
2016-03-16 23:11:21 +00:00
Sanjoy Das
f404934b85 Extract out a SelectionDAGBuilder::LowerAsStatepoint; NFC
Summary:
This is a step towards implementing "direct" lowering of calls and
invokes with deopt operand bundles into STATEPOINT nodes (as opposed to
having them mandatorily pass through RewriteStatepointsForGC, which is
the case today).

This change extracts out a `SelectionDAGBuilder::LowerAsStatepoint`
helper function that is able to lower a "statepoint like thing", and
uses it to lower `gc.statepoint` calls.  This is an NFC now, but in a
later change we will use `LowerAsStatepoint` to directly lower calls and
invokes with operand bundles without going through an intermediate
`gc.statepoint` IR representation.

FYI: I expect `SelectionDAGBuilder::StatepointInfo` will evolve as I add
support for lowering non gc.statepoints, right now it is fairly tightly
coupled with an IR level `gc.statepoint`.

Reviewers: reames, pgavlin, JosephTremoulet

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18106

llvm-svn: 263671
2016-03-16 23:08:00 +00:00
Xinliang David Li
70f512af3e Variable name cleanup /NFC
llvm-svn: 263666
2016-03-16 22:13:41 +00:00
James Y Knight
4b163a5aa3 Tweak some atomics functions in preparation for larger changes; NFC.
- Rename getATOMIC to getSYNC, as llvm will soon be able to emit both
  '__sync' libcalls and '__atomic' libcalls, and this function is for
  the '__sync' ones.

- getInsertFencesForAtomic() has been replaced with
  shouldInsertFencesForAtomic(Instruction), so that the decision can be
  made per-instruction. This functionality will be used soon.

- emitLeadingFence/emitTrailingFence are no longer called if
  shouldInsertFencesForAtomic returns false, and thus don't need to
  check the condition themselves.

llvm-svn: 263665
2016-03-16 22:12:04 +00:00
Sanjoy Das
16fc176033 [SelectionDAG] Extract out populateCallLoweringInfo; NFC
SelectionDAGBuilder::populateCallLoweringInfo is now used instead of
SelectionDAGBuilder::lowerCallOperands.  The populateCallLoweringInfo
interface is more composable in face of design changes like
http://reviews.llvm.org/D18106

llvm-svn: 263663
2016-03-16 20:49:31 +00:00
Vedant Kumar
ade0280b1f [ProfileData] Make a utility method public, NFC
The swift frontend needs to be able to look up PGO function name
variables based on the original raw function name. That's because it's
not possible to create PGO function name variables while emitting swift
IR. Instead, we have to create the name variables while lowering swift
IR to llvm IR, at which point we fix up all calls to the increment
intrinsic to point to the right name variable.

llvm-svn: 263662
2016-03-16 20:49:26 +00:00
Nicolai Haehnle
dec2f57950 AMDGPU: Prevent uniform loops from becoming infinite
Summary:
Uniform loops where the branch leaving the loop is predicated on VCCNZ
must be skipped if EXEC = 0, otherwise they will be infinite.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18137

llvm-svn: 263658
2016-03-16 20:14:33 +00:00
Colin LeMahieu
397dce7d53 [Hexagon] Adding missing break in switch statement. Extra operands would have been appended to the end.
llvm-svn: 263657
2016-03-16 20:00:38 +00:00
Chad Rosier
705414d47a [SLP] Make DataLayout a member variable.
llvm-svn: 263656
2016-03-16 19:48:42 +00:00
Geoff Berry
e30b3db816 Revert "[LSR] Create fewer redundant instructions."
This reverts commit r263644.  Investigating bootstrap failures.

llvm-svn: 263655
2016-03-16 19:21:47 +00:00
Simon Pilgrim
bfb2dc296d Removed trailing whitespace
llvm-svn: 263650
2016-03-16 18:37:44 +00:00
Sanjay Patel
bd226877f7 fix function names; NFC
llvm-svn: 263646
2016-03-16 18:00:09 +00:00
Evgeniy Stepanov
274dac12b9 [msan] Add a comment with a bug link.
llvm-svn: 263645
2016-03-16 17:39:17 +00:00
Geoff Berry
715fd0c25a [LSR] Create fewer redundant instructions.
Summary:
Fix LSRInstance::HoistInsertPosition() to check the original insert
position block first for a canonical insertion point that is dominated
by all inputs.  This leads to SCEV being able to reuse more instructions
since it currently tracks the instructions it creates for reuse by
keeping a table of <Value, insert point> pairs.

Reviewers: atrick

Subscribers: mcrosier, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18001

llvm-svn: 263644
2016-03-16 17:29:49 +00:00
Michel Danzer
8851024ddd AMDGPU: Verify instructions in non-debug builds as well
And emit an error if it fails.

This prevents illegal instructions from getting sent to the GPU, which
would potentially result in a hang.

This is a candidate for the stable branch(es).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
llvm-svn: 263627
2016-03-16 09:10:42 +00:00
Michel Danzer
fe529a11c9 AMDGPU/SI: Clean up indentation in SIInstrInfo::getDefaultRsrcDataFormat
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 263626
2016-03-16 09:10:35 +00:00
Igor Breger
bf48be46fb AVX512BW: Fix SRA v64i8 lowering. Use PCMPGTM (cmp result in k register) for 512bit vector because PCMPGT supported only for 128/256bit.
Differential Revision: http://reviews.llvm.org/D18204

llvm-svn: 263624
2016-03-16 08:48:26 +00:00
Haicheng Wu
7db39c64f9 [JumpThreading] See through Cast Instructions
To capture more jump-thread opportunity.

llvm-svn: 263618
2016-03-16 04:52:52 +00:00
Lang Hames
69d7550bd6 [Support] Add the 'Error' class for structured error handling.
This patch introduces the Error classs for lightweight, structured,
recoverable error handling. It includes utilities for creating, manipulating
and handling errors. The scheme is similar to exceptions, in that errors are
described with user-defined types. Unlike exceptions however, errors are
represented as ordinary return types in the API (similar to the way
std::error_code is used).

For usage notes see the LLVM programmer's manual, and the Error.h header.
Usage examples can be found in unittests/Support/ErrorTest.cpp.

Many thanks to David Blaikie, Mehdi Amini, Kevin Enderby and others on the
llvm-dev and llvm-commits lists for lots of discussion and review.

llvm-svn: 263609
2016-03-16 01:02:46 +00:00
Haicheng Wu
ec5b54aa40 Revert "[JumpThreading] Simplify Instructions first in ComputeValueKnownInPredecessors()"
Not sure it handles undef properly.

llvm-svn: 263605
2016-03-15 23:38:47 +00:00
Adam Nemet
d9b1474a1b Turn LoopLoadElimination on again
The latent bug that LLE exposed in the LoopVectorizer was resolved
(PR26952).

The pass can be disabled with -mllvm -enable-loop-load-elim=0

llvm-svn: 263595
2016-03-15 22:26:12 +00:00
Mike Aizatsky
34cbd6a60e [libfuzzer] speeding up corpus load
llvm-svn: 263591
2016-03-15 21:47:21 +00:00
Bjorn Steinbrink
a31b1fe444 Also handle the new Rust pers fn to isCatchAll()
llvm-svn: 263585
2016-03-15 20:57:07 +00:00
Bjorn Steinbrink
1148767aed Add Rust's personality function to the list of known personality functions
Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18192

llvm-svn: 263581
2016-03-15 20:35:45 +00:00
Evgeniy Stepanov
ad30e96a63 [msan] Don't put module constructors in comdats.
There is something strange going on with debug info (.eh_frame_hdr)
disappearing when msan.module_ctor are placed in comdat sections.

Moving this functionality under flag, disabled by default.

llvm-svn: 263579
2016-03-15 20:25:47 +00:00
Teresa Johnson
4ce6e8f7a7 [ThinLTO] Record all global variable defs in the summary
Record all variable defs with a summary record to aid in building a
complete reference graph and locating constant variable defs to import.

llvm-svn: 263576
2016-03-15 19:35:45 +00:00
Chris Bieneman
a31da631cf [CMake] Add PACKAGE_VENDOR for customizing version output
Summary: This change adds a PACKAGE_VENDOR variable. When set it makes the version output more closely resemble the clang version output.

Reviewers: aprantl, bogner

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18159

llvm-svn: 263566
2016-03-15 18:07:46 +00:00
Adam Nemet
269fc43aff [LV] Preserve LoopInfo when store predication is used
This was a latent bug that got exposed by the change to add LoopSimplify
as a dependence to LoopLoadElimination.  Since LoopInfo was corrupted
after LV, LoopSimplify mis-compiled nbench in the test-suite (more
details in the PR).

The problem was that when we create the blocks for predicated stores we
didn't add those to any loops.

The original testcase for store predication provides coverage for this
assuming we verify LI on the way out of LV.

Fixes PR26952.

llvm-svn: 263565
2016-03-15 18:06:20 +00:00
Davide Italiano
9708c4a28b [MC] Rename TLSDESC as it's not ARM specific.
Similarly to what was done for TLSCALL in r263515.

llvm-svn: 263564
2016-03-15 17:29:52 +00:00
Changpeng Fang
8c6e4aca72 AMDGPU/SI: Implement GroupStaticSize Intrinsic for Dynamic LDS
Summary:
  Static LDS size is saved in MachineFunctionInfo::LDSSize,
We define a pseudo instruction with usesCustomInserter bit set. Then, in EmitInstrWithCustomInserter,
we replace this pseudo instruction with a mov of MachineFunctionInfo::LDSSize.

Reviewers:
    arsenm
    tstellarAMD

Subscribers
    llvm-commits, arsenm

Differential Revision:
   http://reviews.llvm.org/D18064

llvm-svn: 263563
2016-03-15 17:28:44 +00:00
Douglas Katzman
d81fc2168d Myriad: Add new sparc CPU kinds.
llvm-svn: 263557
2016-03-15 16:41:47 +00:00
Benjamin Kramer
aa4ca53d10 [GlobalOpt] Don't look through aliases when sorting names of globals.
If both are different aliases to the same value the sorting becomes
non-deterministic as array_pod_sort is not stable.

llvm-svn: 263550
2016-03-15 14:18:26 +00:00
Chad Rosier
a1658d4235 [SLP] Update comment to reflect reality. NFC.
llvm-svn: 263548
2016-03-15 13:27:58 +00:00
Lang Hames
e98d2dfdbc [MachO] Extend the alt_entry support for aliases added in r263521 to
expressions of the form 'a = .' and 'a = Ltmp'.

llvm-svn: 263528
2016-03-15 04:20:49 +00:00
Eric Christopher
032166f634 Use some braces to format this a little better.
llvm-svn: 263527
2016-03-15 03:01:31 +00:00
Teresa Johnson
402880e765 BitcodeWriter dyn_cast cleanup for r263275 (NFC)
Address review suggestions from dblaikie: change a few dyn_cast to cast
and fold a cast into if condition.

llvm-svn: 263526
2016-03-15 02:41:29 +00:00
Eric Christopher
773d4a559f Fix llvm/llvm/lib/Transforms/Utils/LoopUnroll.cpp:285:53: error: suggest
parentheses around '&&' within '||' [-Werror=parentheses].

llvm-svn: 263525
2016-03-15 02:19:06 +00:00
Teresa Johnson
93b1615239 Move global ID computation from Function to GlobalValue (NFC)
Since the static getGlobalIdentifier and getGUID methods are now called
for global values other than functions, reflect that by moving these
methods to the GlobalValue class.

llvm-svn: 263524
2016-03-15 02:13:19 +00:00
Lang Hames
e025323d29 [MachO] Add MachO alt-entry directive support.
This patch adds support for the MachO .alt_entry assembly directive, and uses
it for global aliases with non-zero GEP offsets. The alt_entry flag indicates
that a symbol should be layed out immediately after the preceding symbol.
Conceptually it introduces an alternate entry point for a function or data
structure. E.g.:

safe_foo:
  // check preconditions for foo
.alt_entry fast_foo
fast_foo:
  // body of foo, can assume preconditions.

The .alt_entry flag is also implicitly set on assembly aliases of the form:

a = b + C

where C is a non-zero constant, since these have the same effect as an
alt_entry symbol: they introduce a label that cannot be moved relative to the
preceding one. Setting the alt_entry flag on aliases of this form fixes
http://llvm.org/PR25381.

llvm-svn: 263521
2016-03-15 01:43:05 +00:00
Kostya Serebryany
b66ed58c06 [libFuzzer] use max_len exactly equal to the max size of input. Fix 32-bit build
llvm-svn: 263518
2016-03-15 01:28:00 +00:00
Sanjoy Das
a4372ed421 [StatepointLowering] Move an assertion; NFCI
Instead of running an explicit loop over `gc.relocate` calls hanging off
of a `gc.statepoint`, assert the validity of the type of the value being
relocated in `visitRelocate`.

llvm-svn: 263516
2016-03-15 01:16:31 +00:00
Davide Italiano
ec0101c58a [MC] Rename TLSCALL as it's not ARM specific.
`MCSymbolRefExpr` variant kind for TLSCALL is prefixed with 
_ARM_ since this is how it was originally implemented.
The X86_64 version is exactly the same so there's no reason
to create a new variant, we can just rename the existing
one to be machine-independent.
This generalization is the first step to implement support
for GNU2 TLS dialect in MC.

Differential Revision:  http://reviews.llvm.org/D18160

llvm-svn: 263515
2016-03-15 00:25:22 +00:00
Teresa Johnson
89e5e1dadb [ThinLTO] Renaming of function index to module summary index (NFC)
(Resubmitting after fixing missing file issue)

With the changes in r263275, there are now more than just functions in
the summary. Completed the renaming of data structures (started in
r263275) to reflect the wider scope. In particular, changed the
FunctionIndex* data structures to ModuleIndex*, and renamed related
variables and comments. Also renamed the files to reflect the changes.

A companion clang patch will immediately succeed this patch to reflect
this renaming.

llvm-svn: 263513
2016-03-15 00:04:37 +00:00
Eric Christopher
ddb99141b3 Temporarily Revert "[X86][SSE] Simplify vector LOAD + EXTEND on
pre-SSE41 hardware" as it seems to be causing crashes during code
generation in halide. PR forthcoming.

This reverts commit r263303.

llvm-svn: 263512
2016-03-14 23:59:57 +00:00
Justin Lebar
19453c8511 [LoopUnroll] Respect the convergent attribute.
Summary:
Specifically, when we perform runtime loop unrolling of a loop that
contains a convergent op, we can only unroll k times, where k divides
the loop trip multiple.

Without this change, we'll happily unroll e.g. the following loop

  for (int i = 0; i < N; ++i) {
    if (i == 0) convergent_op();
    foo();
  }

into

  int i = 0;
  if (N % 2 == 1) {
    convergent_op();
    foo();
    ++i;
  }
  for (; i < N - 1; i += 2) {
    if (i == 0) convergent_op();
    foo();
    foo();
  }.

This is unsafe, because we've just added a control-flow dependency to
the convergent op in the prelude.

In general, runtime unrolling loops that contain convergent ops is safe
only if we don't have emit a prelude, which occurs when the unroll count
divides the trip multiple.

Reviewers: resistor

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17526

llvm-svn: 263509
2016-03-14 23:15:34 +00:00
Amaury Sechet
424f38f304 Imporove load to store => memcpy
Summary: This now try to reorder instructions in order to help create the optimizable pattern.

Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer

Differential Revision: http://reviews.llvm.org/D16523

llvm-svn: 263503
2016-03-14 22:52:27 +00:00
Manuel Jacob
6c0fc2f192 Re-add ConstantFoldInstOperands form taking opcode and return type.
Summary:
This form was replaced by a form taking an instruction instead of opcode and
return type in r258391.  After committing this change (and some depending,
follow-up changes) it turned out in the review thread to be controversial.  The
discussion didn't come to a conclusion yet.  I'm re-adding the old form to fix
the API regression and to provide a better base for discussion, possibly on
llvm-dev.

A difference to the original function is that it can't be called with GEPs
(similarly to how it was already the case for compares).  In order to support
opaque pointers in the future, folding GEPs needs to be passed the source
element type, which is not possible with the current API.

Reviewers: dberlin, reames

Subscribers: dblaikie, eddyb

Differential Revision: http://reviews.llvm.org/D17901

llvm-svn: 263501
2016-03-14 22:34:17 +00:00
Amaury Sechet
fb38990f13 Factor out MachineBlockPlacement::fillWorkLists. NFC
Summary: There are places in MachineBlockPlacement where a worklist is filled in pretty much identical way. The code is duplicated. This refactor it so that the same code is used in both scenarii.

Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18077

llvm-svn: 263495
2016-03-14 21:24:11 +00:00
Teresa Johnson
6632706d71 Revert "[ThinLTO] Renaming of function index to module summary index (NFC)"
This reverts commit r263490. Missed a file.

llvm-svn: 263493
2016-03-14 21:18:10 +00:00
Teresa Johnson
7f5c7bac22 [ThinLTO] Renaming of function index to module summary index (NFC)
With the changes in r263275, there are now more than just functions in
the summary. Completed the renaming of data structures (started in
r263275) to reflect the wider scope. In particular, changed the
FunctionIndex* data structures to ModuleIndex*, and renamed related
variables and comments. Also renamed the files to reflect the changes.

A companion clang patch will immediately succeed this patch to reflect
this renaming.

llvm-svn: 263490
2016-03-14 21:05:56 +00:00
Adam Nemet
6e95e2f1af Revert "Turn LoopLoadElimination on again"
This reverts commit r263472.

There is an LNT failure on clang-ppc64be-linux-lnt.  Turn this off,
while I am investigating.

llvm-svn: 263485
2016-03-14 20:38:55 +00:00
Sanjay Patel
4b03bb7c12 allow branch weight metadata on select instructions (PR26636)
As noted in:
https://llvm.org/bugs/show_bug.cgi?id=26636

This doesn't accomplish anything on its own. It's the first step towards preserving 
and using branch weights with selects.

The next step would be to make sure we're propagating the info in all of the other
places where we create selects (SimplifyCFG, InstCombine, etc). I don't think there's
an easy fix to make this happen; we have to look at each transform individually to 
determine how to correctly propagate the weights.

Along with that step, we need to then use the weights when making subsequent transform
decisions such as discussed in http://reviews.llvm.org/D16836.

The inliner test is independent but closely related. It verifies that metadata is
preserved when both branches and selects are cloned.

Differential Revision: http://reviews.llvm.org/D18133

llvm-svn: 263482
2016-03-14 20:18:59 +00:00
Justin Lebar
7e369d0dca [attrs] Handle convergent CallSites.
Summary:
Previously we had a notion of convergent functions but not of convergent
calls.  This is insufficient to correctly analyze calls where the target
is unknown, e.g. indirect calls.

Now a call is convergent if it targets a known-convergent function, or
if it's explicitly marked as convergent.  As usual, we can remove
convergent where we can prove that no convergent operations are
performed in the call.

Originally landed as r261544, then reverted in r261544 for (incidental)
build breakage.  Re-landed here with no changes.

Reviewers: chandlerc, jingyue

Subscribers: llvm-commits, tra, jhen, hfinkel

Differential Revision: http://reviews.llvm.org/D17739

llvm-svn: 263481
2016-03-14 20:18:54 +00:00
Ulrich Weigand
cbd02a6d7c [SystemZ] Add missing isBranch flags to certain instruction
Some instructions were missing isBranch, isCall, or isTerminator
flags.  This didn't really affect code generation since most of
the affected patterns were used only for the AsmParser and/or
disassembler.

However, it could affect tools using the MC layer to disassemble
and parse binary code (e.g. via MCInstrDesc::mayAffectControlFlow).

llvm-svn: 263478
2016-03-14 20:16:30 +00:00
Keno Fischer
482f181653 [SLPVectorizer] Fix dependency list
Summary:
DemandedBits was added to the requirements of SLPVectorizer in rL261212
(and various earlier version of it), but the appropriate initialization
statement was accidentally forgotten.

Ref [[ https://github.com/JuliaLang/julia/issues/14998 | JuliaLang/julia#14998 ]].

Patch by Yichao Yu.
Reviewers: mssimpso
Differential Revision: http://reviews.llvm.org/D18152

llvm-svn: 263476
2016-03-14 20:04:24 +00:00
Adam Nemet
a0e0c0cd76 Turn LoopLoadElimination on again
The two issues that were discovered got fixed (r263058, r263173).

The pass can be disabled with -mllvm -enable-loop-load-elim=0

llvm-svn: 263472
2016-03-14 19:40:25 +00:00
Michael Kuperstein
c117351c9c [AliasSetTracker] Do not strip pointer casts when processing MemSetInst
This fixes PR26843.

llvm-svn: 263462
2016-03-14 18:34:29 +00:00
Chad Rosier
41407e5c3d [AArch64] Refactor AArch64FrameLowering::emitPrologue. NFC.
http://reviews.llvm.org/D18125
Patch by Aditya Kumar.

llvm-svn: 263461
2016-03-14 18:24:34 +00:00
Quentin Colombet
cca4e6c42d [SpillPlacement] Fix a quadratic behavior in spill placement.
The bad behavior happens when we have a function with a long linear chain of
basic blocks, and have a live range spanning most of this chain, but with very
few uses.
Let say we have only 2 uses.
The Hopfield network is only seeded with two active blocks where the uses are,
and each iteration of the outer loop in `RAGreedy::growRegion()` only adds two
new nodes to the network due to the completely linear shape of the CFG.
Meanwhile, `SpillPlacer->iterate()` visits the whole set of discovered nodes,
which adds up to a quadratic algorithm.

This is an historical accident effect from r129188.

When the Hopfield network is expanding, most of the action is happening on the
frontier where new nodes are being added. The internal nodes in the network are
not likely to be flip-flopping much, or they will at least settle down very
quickly. This means that while `SpillPlacer->iterate()` is recomputing all the
nodes in the network, it is probably only the two frontier nodes that are
changing their output.

Instead of recomputing the whole network on each iteration, we can maintain a
SparseSet of nodes that need to be updated:

- `SpillPlacement::activate()` adds the node to the todo list.
- When a node changes value (i.e., `update()` returns true), its neighbors are
  added to the todo list.
- `SpillPlacement::iterate()` only updates the nodes in the list.

The result of Hopfield iterations is not necessarily exact. It should converge
to a local minimum, but there is no guarantee that it will find a global
minimum. It is possible that updating nodes in a different order will cause us
to switch to a different local minimum. In other words, this is not NFC, but
although I saw a few runtime improvements and regressions when I benchmarked
this change, those were side effects and actually the performance change is in
the noise as expected.

Huge thanks to Jakob Stoklund Olesen <stoklund@2pi.dk> for his feedbacks,
guidance and time for the review.

llvm-svn: 263460
2016-03-14 18:21:25 +00:00
Chad Rosier
1c035c4cf5 [AArch64] Break the dependency between FP and SP when possible.
When the SP in not changed because of realignment/VLAs etc., we restore the SP
by using the previous value of SP and not the FP. Breaking the dependency will
help in cases when the epilog of a callee is close to the epilog of the caller;
for then "sub sp, fp, #" depends on the load restoring the FP in the epilog of
the callee.

http://reviews.llvm.org/D18060
Patch by Aditya Kumar and Evandro Menezes.

llvm-svn: 263458
2016-03-14 18:17:41 +00:00
Chad Rosier
3b7d2db4ef [Mips] Fix -Wunused-private-field warning after r263444.
llvm-svn: 263454
2016-03-14 18:10:20 +00:00
Sanjay Patel
f7ad46820f [DAG] use !isUndef() ; NFCI
llvm-svn: 263453
2016-03-14 18:09:43 +00:00
Sanjay Patel
f22bc14a47 [DAG] use isUndef() ; NFCI
llvm-svn: 263448
2016-03-14 17:28:46 +00:00
Tom Stellard
1cea59b42c AMDGPU/SI: Handle wait states required for DPP instructions
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17543

llvm-svn: 263447
2016-03-14 17:05:56 +00:00
Sanjay Patel
f3adc07abf [x86, AVX] replace masked load with full vector load when possible
Converting masked vector loads to regular vector loads for x86 AVX should always be a win.
I raised the legality issue of reading the extra memory bytes on llvm-dev. I did not see any
objections.

1. x86 already does this kind of optimization for multiple scalar loads -> vector load.
2. If other targets have the same flexibility, we could move this transform up to CGP or DAGCombiner.

Differential Revision: http://reviews.llvm.org/D18094

llvm-svn: 263446
2016-03-14 16:54:43 +00:00
Daniel Sanders
0983c108c8 [mips] MIPS32R6 compact branch support
Summary:
MIPSR6 introduces a class of branches called compact branches. Unlike the
traditional MIPS branches which have a delay slot, compact branches do not
have a delay slot. The instruction following the compact branch is only
executed if the branch is not taken and must not be a branch.

It works by generating compact branches for MIPS32R6 when the delay slot
filler cannot fill a delay slot. Then, inspecting the generated code for
forbidden slot hazards (a compact branch with an adjacent branch or other
CTI) and inserting nops to clear this hazard.

Patch by Simon Dardis.

Reviewers: vkalintiris, dsanders

Subscribers: MatzeB, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D16353

llvm-svn: 263444
2016-03-14 16:24:05 +00:00
Marek Olsak
64405cd52f AMDGPU/SI: Incomplete shader binaries need to finish execution at the end
Reviewers: tstellarAMD, arsenm

Subscribers: arsenm

Differential Revision: http://reviews.llvm.org/D18058

llvm-svn: 263441
2016-03-14 15:57:14 +00:00
Nicolai Haehnle
176749e5b2 AMDGPU: mark llvm.amdgcn.image.atomic.* as a source of divergence
Summary:
When multiple threads perform an atomic op with the same arguments, they
will usually see different return values.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18101

llvm-svn: 263440
2016-03-14 15:37:18 +00:00
Vasileios Kalintiris
b93f6a836b [mips] Use range-based for loops. NFC.
llvm-svn: 263438
2016-03-14 15:05:30 +00:00
Benjamin Kramer
8ea7e842e3 Revert "Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26715 at r263379."
This reverts commit r263424. Breaks self-host.

llvm-svn: 263437
2016-03-14 14:58:36 +00:00
Ulrich Weigand
5020f81c76 [SystemZ] Avoid LER on z13 due to partial register dependencies
On the z13, it turns out to be more efficient to access a full
floating-point register than just the upper half (as done e.g.
by the LE and LER instructions).

Current code already takes this into account when loading from
memory by using the LDE instruction in place of LE.  However,
we still generate LER, which shows the same performance issues
as LE in certain circumstances.

This patch changes the back-end to emit LDR instead of LER to
implement FP32 register-to-register copies on z13.

llvm-svn: 263431
2016-03-14 13:50:03 +00:00
Chad Rosier
e4cdbb48f0 [CVP] Replace nonnegative with positive, per Philip's request. NFC.
llvm-svn: 263430
2016-03-14 13:48:00 +00:00
Zlatko Buljan
cc5a6a9d1a [mips] Fix an issue with long double when function roundl is defined
Differential Revision: http://reviews.llvm.org/D17760

llvm-svn: 263428
2016-03-14 12:50:23 +00:00
Daniel Sanders
85d0b438a1 [mips] Range check uimm16_64
Summary:

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D17725

llvm-svn: 263427
2016-03-14 12:44:44 +00:00
Amjad Aboud
97fd9dd46d Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info."
After fixing PR26715 at r263379.

llvm-svn: 263424
2016-03-14 12:03:20 +00:00
Daniel Sanders
eda4dc3658 [mips] Simplify ordering of range checked immediate classes.
Summary:
With the addition of checks to ensure that operands have a strict ordering
it has become tricky to manage the order in the way I originally intended.

This patch linearizes the ordering which simplifies the implementation but
requires an order that is arbitrary in places. Here are some examples:
* uimm4 < uimm5 < uimm6
* simm4 < uimm4 < simm5 < uimm5
* uimm5 < uimm5_plus1 (1..32) < uimm5_plus32 (32..63) < uimm6
  The term 'superset' starts to break down here since the *_plus* classes
  are not true supersets of uimm5 (but they are still subsets of uimm6).
* uimm5 < uimm5_64, and uimm5 < vsplat_uimm5
  This is entirely arbitrary. We need an ordering and what we pick is
  unimportant since only one is possible for a given mnemonic.

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D17723

llvm-svn: 263423
2016-03-14 11:46:30 +00:00
Nikolay Haustov
bade203fd0 [AMDGPU] Assembler: SOP* instruction fixes
s_bitset0_b64, s_bitset1_b64 has 32-bit src0, not 64-bit.
s_rfe_b64 has just one destination operand and no source.
Uncomment S_BITCMP* and S_SETVSKIP, adjust SOPC_* classes for that.
Add s_memrealtime test and change comments in smem.s to follow common style.
Change test for s_memtime to use non-zero register to make it really test encoding.
Add tests for s_buffer_load*.
Add tests for SOPC instructions (same for SI and VI)

Differential Revision: http://reviews.llvm.org/D18040

llvm-svn: 263420
2016-03-14 11:17:19 +00:00
Daniel Sanders
dec4e4aa9c [mips] Range check uimm6_lsl2.
Summary:

Reviewers: vkalintiris

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D17291

llvm-svn: 263419
2016-03-14 11:16:56 +00:00
Hans Wennborg
3b13eb7175 Try to fix build of WebAssemblyRegStackify.cpp on Windows
It's failing to build on VS2015 with:

C:\b\build\slave\ClangToTWin\build\src\third_party\llvm\lib\Target\WebAssembly\WebAssemblyRegStackify.cpp(520):
error C2668: 'llvm::make_reverse_iterator': ambiguous call to overloaded function
C:\b\build\slave\ClangToTWin\build\src\third_party\llvm\include\llvm/ADT/STLExtras.h(217):
note: could be 'std::reverse_iterator<llvm::MachineBasicBlock::iterator>
llvm::make_reverse_iterator<llvm::MachineInstrBundleIterator<llvm::MachineInstr>>(IteratorTy)'
        with
        [
            IteratorTy=llvm::MachineInstrBundleIterator<llvm::MachineInstr>
        ]
C:\b\depot_tools\win_toolchain\vs_files\391bbf1220d3edcd3cc3fccdb56224181e3b13a7\win_sdk\bin\..\..\VC\include\xutility(1217):
note: or 'std::reverse_iterator<llvm::MachineBasicBlock::iterator>
std::make_reverse_iterator<llvm::MachineInstrBundleIterator<llvm::MachineInstr>>(_RanIt)' [found using argument-dependent lookup]
        with
        [
            _RanIt=llvm::MachineInstrBundleIterator<llvm::MachineInstr>
        ]

I don't have VS2015 locally at the moment, but hopefully this will help.

llvm-svn: 263418
2016-03-14 11:04:15 +00:00
Igor Breger
e61fb42d0b AVX512: icmp operation should be always lowered to CMPM (AVX-512) instruction on SKX.
implemented by delena

Differential Revision: http://reviews.llvm.org/D18054

llvm-svn: 263417
2016-03-14 10:26:39 +00:00
Valery Pykhtin
73eb61d4ef [AMDGPU] AsmParser: Factor out parseRegister. NFC.
llvm-svn: 263411
2016-03-14 07:43:42 +00:00
Valery Pykhtin
fa61c51d9e [AMDGPU] AsmParser: refactor post push_back vector access. NFC.
llvm-svn: 263409
2016-03-14 05:25:44 +00:00
David Majnemer
00a7fd7b32 [CodeView] Consistently handle overly large symbol names
Overly large symbol names weren't correctly handled for leaf function
records.

llvm-svn: 263408
2016-03-14 05:15:09 +00:00
Valery Pykhtin
95a8f2b685 [AMDGPU] AsmParser: remove redundant isReg checks. NFC.
llvm-svn: 263407
2016-03-14 05:01:45 +00:00
Haicheng Wu
9e87f080cd [CVP] Convert an SDiv to a UDiv if both operands are known to be nonnegative
The motivating example is this

for (j = n; j > 1; j = i) {
   i = j / 2;
}

The signed division is safely to be changed to an unsigned division (j is known
to be larger than 1 from the loop guard) and later turned into a single shift
without considering the sign bit.

llvm-svn: 263406
2016-03-14 03:24:28 +00:00
Amaury Sechet
608702a15b Add facility to add/remove/check attribute on function and arguments.
Summary: This comes from work to make attribute manipulable via the C API.

Reviewers: gottesmm, hfinkel, baldrick, echristo, tejohnson

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18128

llvm-svn: 263404
2016-03-14 01:37:29 +00:00
Mehdi Amini
12d624a26a Remove PreserveNames template parameter from IRBuilder
This reapplies r263258, which was reverted in r263321 because
of issues on Clang side.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263393
2016-03-13 21:05:13 +00:00
Simon Pilgrim
176a4e0ec3 [X86][SSE41] Avoid variable blend for constant v8i16 shifts
The SSE41 v8i16 shift lowering using (v)pblendvb is great for non-constant shift amounts, but if it is constant then we can efficiently reduce the VSELECT to shuffles with the pre-SSE41 lowering.

llvm-svn: 263383
2016-03-13 18:35:59 +00:00
Amjad Aboud
9bfdf16f5b Fixed DIBuilder to verify that same imported entity will not be added twice to the "imports" list of the DICompileUnit.
Differential Revision: http://reviews.llvm.org/D17884

llvm-svn: 263379
2016-03-13 11:11:39 +00:00
David Majnemer
4253ad553a [CodeView] Truncate display names
Fundamentally, the length of a variable or function name is bound by the
maximum size of a record: 0xffff.  However, the name doesn't live in a
vacuum; other data is associated with the name, lowering the bound
further.

We would naively attempt to emit the name, causing us to assert because
the record would no-longer fit in 16-bits.  Instead, truncate the name
but preserve as much as we can.

While I have tested this locally, I've decided to not commit it due to
the test's size.

N.B.  While this behavior is undesirable, it is better than MSVC's
behavior.  They seem to truncate to ~4000 characters.

llvm-svn: 263378
2016-03-13 10:53:30 +00:00
David Majnemer
01978bba2d [Bitcode] Make writeComdats less strange
It had a weird artificial limitation on the write side: the comdat name
couldn't be bigger than 2**16.  However, the reader had no such
limitation.  Make the reader and the writer agree.

llvm-svn: 263377
2016-03-13 08:01:03 +00:00
Fiona Glaser
4634b04513 ConstantFoldInstruction: avoid wasted calls to ConstantFoldConstantExpression
Check to see if all operands are constant before calling simplify on them
so that we don't perform wasted simplifications.

llvm-svn: 263374
2016-03-13 05:36:15 +00:00
Matt Arsenault
26aa8f6035 APFloat: Fix ilogb for denormals
llvm-svn: 263370
2016-03-13 05:12:32 +00:00
Matt Arsenault
b9effef6fc APFloat: Fix scalbn handling of denormals
This was incorrect for denormals, and also failed
on longer exponent ranges.

llvm-svn: 263369
2016-03-13 05:11:51 +00:00
Craig Topper
f34a1d74e9 [X86] Remove many operands that represent memory stores from outs to ins. These operands are the registers and immediates that specify the memory address not the memory itself thus they are inputs.
llvm-svn: 263354
2016-03-13 02:56:31 +00:00
Amaury Sechet
483902dcce Use templated version of unwrap instead of cats in the Core.cpp. NFC
llvm-svn: 263349
2016-03-13 00:54:40 +00:00
Amaury Sechet
577518b8ab Move LLVMConstStructInContext so that declarationa nd definition order match. NFC
llvm-svn: 263348
2016-03-13 00:40:12 +00:00
Sanjay Patel
b0eaf59441 fix documentation comments; NFC
llvm-svn: 263346
2016-03-12 20:44:58 +00:00
Sanjay Patel
2077c270a0 remove unnecessary cast; NFC
llvm-svn: 263343
2016-03-12 18:17:41 +00:00
Sanjay Patel
3d21d4f960 fix formatting; NFC
llvm-svn: 263342
2016-03-12 18:05:53 +00:00
Sanjay Patel
f900096568 use range loops; NFCI
llvm-svn: 263341
2016-03-12 16:52:17 +00:00
Sanjay Patel
60bcb48851 [x86, InstCombine] delete x86 SSE2 masked store with zero mask
This follows up on the related AVX instruction transforms, but this
one is too strange to do anything more with. Intel's behavioral
description of this instruction in its Software Developer's Manual
is tragi-comic.

llvm-svn: 263340
2016-03-12 15:16:59 +00:00
Nemanja Ivanovic
eaa9a4f81a Fix for PR 26378
This patch corresponds to review:
http://reviews.llvm.org/D17712

We were not clearing the TOC vector in PPCAsmPrinter when initializing it. This
caused duplicate definition asserts when the pass is reused on the module
(i.e. with -compile-twice or in JIT contexts).

llvm-svn: 263338
2016-03-12 10:23:07 +00:00
Sanjoy Das
fe692ab86e Make gc relocates more strongly typed; NFC
Don't use a `Value *` where we can use a stronger `GCRelocateInst *`
type.

llvm-svn: 263327
2016-03-12 02:54:27 +00:00
Quentin Colombet
aaf2db6c80 [X86] Make sure we do not clobber RBX with cmpxchg when used as a base pointer.
cmpxchg[8|16]b uses RBX as one of its argument.
In other words, using this instruction clobbers RBX as it is defined to hold one
the input. When the backend uses dynamically allocated stack, RBX is used as a
reserved register for the base pointer. 

Reserved registers have special semantic that only the target understands and
enforces, because of that, the register allocator don’t use them, but also,
don’t try to make sure they are used properly (remember it does not know how
they are supposed to be used).

Therefore, when RBX is used as a reserved register but defined by something that
is not compatible with that use, the register allocator will not fix the
surrounding code to make sure it gets saved and restored properly around the
broken code. This is the responsibility of the target to do the right thing with
its reserved register.

To fix that, when the base pointer needs to be preserved, we use a different
pseudo instruction for cmpxchg that save rbx.
That pseudo takes two more arguments than the regular instruction:
- One is the value to be copied into RBX to set the proper value for the
  comparison.
- The other is the virtual register holding the save of the value of RBX as the
  base pointer. This saving is done as part of isel (i.e., we emit a copy from
  rbx).

cmpxchg_save_rbx <regular cmpxchg args>, input_for_rbx_reg, save_of_rbx_as_bp

This gets expanded into:
rbx = copy input_for_rbx_reg
cmpxchg <regular cmpxchg args>
rbx = save_of_rbx_as_bp

Note: The actual modeling of the pseudo is a bit more complicated to make sure
the interferes that appears after the pseudo gets expanded are properly modeled
before that expansion.

This fixes PR26883.

llvm-svn: 263325
2016-03-12 02:25:27 +00:00
Kostya Serebryany
5b93d4b15a [libFuzzer] try to use max_len based on the items of the corpus instead of blindly defaulting to 64 bytes.
llvm-svn: 263323
2016-03-12 01:57:04 +00:00
Eric Christopher
559b0be562 Temporarily revert:
commit ae14bf6488e8441f0f6d74f00455555f6f3943ac
Author: Mehdi Amini <mehdi.amini@apple.com>
Date:   Fri Mar 11 17:15:50 2016 +0000

    Remove PreserveNames template parameter from IRBuilder

    Summary:
    Following r263086, we are now relying on a flag on the Context to
    discard Value names in release builds.

    Reviewers: chandlerc

    Subscribers: mzolotukhin, llvm-commits

    Differential Revision: http://reviews.llvm.org/D18023

    From: Mehdi Amini <mehdi.amini@apple.com>

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263258
    91177308-0d34-0410-b5e6-96231b3b80d8

until we can figure out what to do about clang and Release build testing.

This reverts commit 263258.

llvm-svn: 263321
2016-03-12 01:47:22 +00:00
Mehdi Amini
607b3e284c Minor cleanup and documentation to IRMover (NFC)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263304
2016-03-11 22:19:06 +00:00
Simon Pilgrim
d62ab3da09 [X86][SSE] Simplify vector LOAD + EXTEND on pre-SSE41 hardware
Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions.

We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT.

Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718).

Differential Revision: http://reviews.llvm.org/D17932

llvm-svn: 263303
2016-03-11 22:18:05 +00:00
Ahmed Bougacha
a2af74a580 [AArch64] Don't blindly lower f16/f128 FCCMPs.
Instead, extend f16 (like we do when lowering a standalone SETCC),
and let f128 be legalized to the RT calls.

Fixes PR26803.

llvm-svn: 263301
2016-03-11 22:02:58 +00:00
Dan Gohman
0b3d2a0666 [WebAssembly] Add final keywords to a few more subclasses, for consistency.
llvm-svn: 263287
2016-03-11 19:45:37 +00:00
George Burgess IV
e00d07860b [MemorySSA] Make a return type reflect reality. NFC.
llvm-svn: 263286
2016-03-11 19:34:03 +00:00
Sanjoy Das
3b791814db Introduce @llvm.experimental.deoptimize
Summary:
This intrinsic, together with deoptimization operand bundles, allow
frontends to express transfer of control and frame-local state from
one (typically more specialized, hence faster) version of a function
into another (typically more generic, hence slower) version.

In languages with a fully integrated managed runtime this intrinsic can
be used to implement "uncommon trap" like functionality.  In unmanaged
languages like C and C++, this intrinsic can be used to represent the
slow paths of specialized functions.

Note: this change does not address how `@llvm.experimental_deoptimize`
is lowered.  That will be done in a later change.

Reviewers: chandlerc, rnk, atrick, reames

Subscribers: llvm-commits, kmod, mjacob, maksfb, mcrosier, JosephTremoulet

Differential Revision: http://reviews.llvm.org/D17732

llvm-svn: 263281
2016-03-11 19:08:34 +00:00
Vedant Kumar
bf7c40f75a [PGO] Skip value profile instrumentation of inline asm
Value profile instrumentation treats inline asm calls like they are
indirect calls. This causes problems when the 'Callee' is passed to a
ptrtoint cast -- the verifier rightly claims that this is bogus and
crashes opt.

llvm-svn: 263278
2016-03-11 18:57:48 +00:00
Teresa Johnson
e1affa2f8c [ThinLTO] Support for reference graph in per-module and combined summary.
Summary:
This patch adds support for including a full reference graph including
call graph edges and other GV references in the summary.

The reference graph edges can be used to make importing decisions
without materializing any source modules, can be used in the plugin
to make file staging decisions for distributed build systems, and is
expected to have other uses.

The call graph edges are recorded in each function summary in the
bitcode via a list of <CalleeValueIds, StaticCount> tuples when no PGO
data exists, or <CalleeValueId, StaticCount, ProfileCount> pairs when
there is PGO, where the ValueId can be mapped to the function GUID via
the ValueSymbolTable. In the function index in memory, the call graph
edges reference the target via the CalleeGUID instead of the
CalleeValueId.

The reference graph edges are recorded in each summary record with a
list of referenced value IDs, which can be mapped to value GUID via the
ValueSymbolTable.

Addtionally, a new summary record type is added to record references
from global variable initializers. A number of bitcode records and data
structures have been renamed to reflect the newly expanded scope of the
summary beyond functions. More cleanup will follow.

Reviewers: joker.eph, davidxl

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D17212

llvm-svn: 263275
2016-03-11 18:52:24 +00:00
Simon Pilgrim
9da9e86e49 Fix spelling.
llvm-svn: 263266
2016-03-11 17:31:43 +00:00
Quentin Colombet
8cb0c9cc03 [IRTranslator] Translate unconditional branches.
llvm-svn: 263265
2016-03-11 17:28:03 +00:00
Quentin Colombet
292c6c729f [MachineIRBuilder] Rework buildInstr API to maximize code reuse.
llvm-svn: 263264
2016-03-11 17:27:58 +00:00
Quentin Colombet
81284f84b6 [IRTranslator] Update getOrCreateVReg API to use references.
A value that we want to keep in a virtual register cannot be null.
Reflect that in the API.

llvm-svn: 263263
2016-03-11 17:27:54 +00:00
Quentin Colombet
59f835d8b1 [MachineIRBuilder] Rename the setter of MF for consistency with the getter.
llvm-svn: 263262
2016-03-11 17:27:51 +00:00
Quentin Colombet
87dd1b9264 [MachineIRBuilder] Rename the setter for MBB for consistency with the getter.
llvm-svn: 263261
2016-03-11 17:27:47 +00:00
Quentin Colombet
4c20724407 [IRTranslator] Update getOrCreateBB API to use references.
A null basic block is invalid, so just pass a reference.

llvm-svn: 263260
2016-03-11 17:27:43 +00:00
Mehdi Amini
1f82b794e4 Remove PreserveNames template parameter from IRBuilder
Summary:
Following r263086, we are now relying on a flag on the Context to
discard Value names in release builds.

Reviewers: chandlerc

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18023

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263258
2016-03-11 17:15:50 +00:00
Mehdi Amini
58e5289560 Do not specialize IRBuilder to strip names in SROA
Summary:
Following r263086, we are replacing this by a runtime check.
More cleanup will follow on the IRBuilder itself, but I submitted
this patch separately as SROA has a fancy "prefixInserter" class
that needs extra-love.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18022

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263256
2016-03-11 17:15:34 +00:00
Chad Rosier
acf4769353 [misched] Fix a truncation issue from r263021.
The truncation was causing the sorting algorithm to behave oddly when comparing
positive and negative offsets.  Fortunately, this doesn't currently happen in
practice and was exposed by a WIP.  Thus, I can't test this change now, but the
follow on patch will.

llvm-svn: 263255
2016-03-11 16:54:07 +00:00
Chandler Carruth
7f1a01ecd3 [PM] Sink the "Expression" type for GVN into the class as a private
member type.

Because of how this type is used by the ValueTable, it cannot actually
have hidden visibility. GCC actually nicely warns about this but Clang
just silently ... I don't even know. =/ We should do a better job either
way though.

This should resolve a bunch of the GCC warnings about visibility that
the port of GVN triggered and make the visibility story a bit more
correct.

llvm-svn: 263250
2016-03-11 16:25:19 +00:00
Marianne Mailhot-Sarrasin
eb1485c31e More UTF string conversion wrappers
Added new string conversion wrappers that convert between `std::string` (of UTF-8 bytes) and `std::wstring`, which is particularly useful for Win32 interop. Also fixed a missing string conversion for `getenv` on Win32, using these new wrappers.
The motivation behind this is to provide the support functions required for LLDB to work properly on Windows with non-ASCII data; however, the functions are not LLDB specific.

Patch by cameron314

Differential Revision: http://reviews.llvm.org/D17549

llvm-svn: 263247
2016-03-11 15:59:32 +00:00
Valery Pykhtin
7abb03febb [AMDGPU] Fix VOPC instruction operand namings
Differential Revision: http://reviews.llvm.org/D17966

llvm-svn: 263242
2016-03-11 14:53:28 +00:00
Simon Pilgrim
d1894c7f7a [X86][AVX] Fixed issue where a long chain of shuffles could attempt to combine to a single (illegal) PSHUFB instruction.
Its not enough that we test for SSSE3 - that's only OK for 128-bit vectors - we also need to test for AVX2 / AVX512BW for 256/512 bit vector cases.

llvm-svn: 263239
2016-03-11 14:39:10 +00:00
Chandler Carruth
0997ba3d13 [AA] Make BasicAA just require domtree.
This doesn't change how many times we construct domtrees in the normal
pipeline, and it removes fragility and instability where basic-aa may
not be run in time to see domtrees because they happen to be constructed
afterward.

This isn't quite as clean as the change to memdep because there is
a mode where basic-aa specifically runs without domtrees -- in the
hacking version used by function-attrs with the legacy pass manager.

llvm-svn: 263234
2016-03-11 13:53:18 +00:00
Chandler Carruth
a2bec20b29 [memdep] Just require domtree for memdep.
This doesn't cause us to construct dominator trees any more often in the
normal pipeline, and removes an entire mode of memdep that needed to be
reasoned about and maintained. Perhaps more importantly, it removes the
ability for the results of memdep to be different because of accidental
pass scheduling goofs or the order of evaluation of 'getResult' calls.

Essentially, 'getCachedResult', unless across IR-unit boundaries, is
extremely dangerous. We need to work much harder to avoid it (or its
analog in the old pass manager).

llvm-svn: 263232
2016-03-11 13:46:00 +00:00
Chandler Carruth
edc84248f8 [PM] The order of evaluation of these analyses is actually significant,
much to my horror, so use variables to fix it in place.

This terrifies me. Both basic-aa and memdep will provide more precise
information when the domtree and/or the loop info is available. Because
of this, if your pass (like GVN) requires domtree, and then queries
memdep or basic-aa, it will get more precise results. If it does this in
the other order, it gets less precise results.

All of the ideas I have for fixing this are, essentially, terrible. Here
I've just caused us to stop having unspecified behavior as different
implementations evaluate the order of these arguments differently. I'm
actually rather glad that they do, or the fragility of memdep and
basic-aa would have gone on unnoticed. I've left comments so we don't
immediately break this again. This should fix bots whose host compilers
evaluate the order of arguments differently from Clang.

llvm-svn: 263231
2016-03-11 13:26:47 +00:00
Vasileios Kalintiris
f63ff65eda [mips] MIPSR6 Instruction itineraries
Summary: Defines instruction itineraries for common MIPSR6 instructions.

Patch by Simon Dardis.

Reviewers: vkalintiris

Subscribers: MatzeB, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D17198

llvm-svn: 263229
2016-03-11 13:05:06 +00:00
Daniel Sanders
ed44705402 [mips] Range check simm4.
Summary:

Reviewers: vkalintiris

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D16811

llvm-svn: 263220
2016-03-11 11:37:50 +00:00
Chandler Carruth
6150530377 [PM] Make the AnalysisManager parameter to run methods a reference.
This was originally a pointer to support pass managers which didn't use
AnalysisManagers. However, that doesn't realistically come up much and
the complexity of supporting it doesn't really make sense.

In fact, *many* parts of the pass manager were just assuming the pointer
was never null already. This at least makes it much more explicit and
clear.

llvm-svn: 263219
2016-03-11 11:05:24 +00:00
Chandler Carruth
0c3020180a [PM] Rename the CRTP mixin base classes for the new pass manager to
clarify their purpose.

Firstly, call them "...Mixin" types so it is clear that there is no
type hierarchy being formed here. Secondly, use the term 'Info' to
clarify that they aren't adding any interesting *semantics* to the
passes or analyses, just exposing APIs used by the management layer to
get information about the pass or analysis.

Thanks to Manuel for helping pin down the naming confusion here and come
up with effective names to address it.

In case you already have some out-of-tree stuff, the following should be
roughly what you want to update:

  perl -pi -e 's/\b(Pass|Analysis)Base\b/\1InfoMixin/g'

llvm-svn: 263217
2016-03-11 10:33:22 +00:00