1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00
Commit Graph

142663 Commits

Author SHA1 Message Date
Davide Italiano
d2cdaa14df [NewGVN] Remove unneeded newline from assertion message.
llvm-svn: 290755
2016-12-30 15:01:17 +00:00
Abhilash Bhandari
b2431386bd [ADT] Fix for compilation error when operator++(int) (post-increment function) of SmallPtrSetIterator is used.
The bug was introduced in r289619.

Reviewers: Mehdi Amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28134

llvm-svn: 290749
2016-12-30 12:34:36 +00:00
David Majnemer
6cb01daae8 [InstCombine] Address post-commit feedback
llvm-svn: 290741
2016-12-30 03:36:17 +00:00
Mehdi Amini
b6f8739c4a Fix test change in r290736: restore index generation
I remove one extra line, but because annoyingly llvm-lit does not
clean the output directory before running the test, it didn't fail
locally (the file was present from a previous run).

llvm-svn: 290740
2016-12-30 01:15:50 +00:00
Kostya Serebryany
8bf798611b [libFuzzer] cleaner implementation of -print_pcs=1
llvm-svn: 290739
2016-12-30 01:13:07 +00:00
Michael Kuperstein
173ab48710 [LICM] When promoting scalars, allow inserting stores to thread-local allocas.
This is similar to the allocfn case - if an alloca is not captured, then it's
necessarily thread-local.

Differential Revision: https://reviews.llvm.org/D28170

llvm-svn: 290738
2016-12-30 01:03:17 +00:00
Dehao Chen
6e6c58680c Use continuous boosting factor for complete unroll.
Summary:
The current loop complete unroll algorithm checks if unrolling complete will reduce the runtime by a certain percentage. If yes, it will apply a fixed boosting factor to the threshold (by discounting cost). The problem for this approach is that the threshold abruptly. This patch makes the boosting factor a function of runtime reduction percentage, capped by a fixed threshold. In this way, the threshold changes continuously.

The patch also simplified the code by reducing one parameter in UP.

The patch only affects code-gen of two speccpu2006 benchmark:

445.gobmk binary size decreases 0.08%, no performance change.
464.h264ref binary size increases 0.24%, no performance change.

Reviewers: mzolotukhin, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26989

llvm-svn: 290737
2016-12-30 00:50:28 +00:00
Mehdi Amini
79dcb50b47 Replace test from using llvm-lto to use llvm-link (NFC)
Some incoming changes in ThinLTO will break this test.
Instead of relying on the heuristic to import, we
force the importing to happen with llvm-link.

llvm-svn: 290736
2016-12-30 00:45:26 +00:00
Michael Kuperstein
3233a544cc [LICM] Remove unneeded tracking of whether changes were made. NFC.
"Changed" doesn't actually change within the loop, so there's
no reason to keep track of it - we always return false during
analysis and true after the transformation is made.

llvm-svn: 290735
2016-12-30 00:43:22 +00:00
Michael Kuperstein
53fdea54fe [LICM] Make logic in promoteLoopAccessesToScalars easier to follow. NFC.
llvm-svn: 290734
2016-12-30 00:39:00 +00:00
David Majnemer
a1ff7a6ad1 [InstCombine] More thoroughly canonicalize the position of zexts
We correctly canonicalized (add (sext x), (sext y)) to (sext (add x, y))
where possible.  However, we didn't perform the same canonicalization
for zexts or for muls.

llvm-svn: 290733
2016-12-30 00:28:58 +00:00
Dylan McKay
e1112141cb [AVR] Optimize 16-bit ORs with '0'
Summary: Fixes PR 31344

Authored by Anmol P. Paralkar

Reviewers: dylanmckay

Subscribers: fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D28121

llvm-svn: 290732
2016-12-30 00:21:56 +00:00
Reid Kleckner
48c3376e2c Simplify FunctionLoweringInfo.cpp with range for loops
I'm preparing to add some pattern matching code here, so simplify the
code before I do. NFC

llvm-svn: 290731
2016-12-30 00:21:38 +00:00
Reid Kleckner
80d1fe591c Include <algorithm> for std::max etc
llvm-svn: 290730
2016-12-30 00:15:40 +00:00
Michael Kuperstein
213104a282 [LICM] Compute exit blocks for promotion eagerly. NFC.
This moves the exit block and insertion point computation to be eager,
instead of after seeing the first scalar we can promote.

The cost is relatively small (the computation happens anyway, see discussion
on D28147), and the code is easier to follow, and can bail out earlier
if there's a catchswitch present.

llvm-svn: 290729
2016-12-29 23:11:19 +00:00
Michael Kuperstein
6a5beb0fcb [LICM] Don't try to promote in loops where we have no chance to promote. NFC.
We would check whether we have a prehader *or* dedicated exit blocks,
and go into the promotion loop. Then, for each alias set we'd check
if we have a preheader *and* dedicated exit blocks, and bail if not.

Instead, bail immediately if we don't have both.

llvm-svn: 290728
2016-12-29 22:51:22 +00:00
Michael Kuperstein
e1104b3705 [LICM] Only recompute LCSSA when we actually promoted something.
We want to recompute LCSSA only when we actually promoted a value.
This means we only need to look at changes made by promotion when
deciding whether to recompute it or not, not at regular sinking/hoisting.

(This was what the code was documented as doing, just not what it did)

Hopefully NFC.

llvm-svn: 290726
2016-12-29 22:37:13 +00:00
Daniel Berlin
aede96b453 NewGVN: Fix PR 31491 by ensuring that we touch the right instructions. Change to one based numbering so we can assert we don't cause the same bug again.
llvm-svn: 290724
2016-12-29 22:15:12 +00:00
Craig Topper
52ec7b9641 [Analysis] Remove repeated text from a comment. NFC
llvm-svn: 290723
2016-12-29 21:48:28 +00:00
Bryant Wong
0a9ac83d7a Fix indentation in r290716.
Use two-space indentation like the rest of the file.

llvm-svn: 290722
2016-12-29 20:05:51 +00:00
Justin Lebar
9746bebf2f [ADT] Rewrite IntrusiveRefCntPtr's comments. NFC
Edit for voice, and also add examples.  In particular, add an
explanation for why you might want to specialize IntrusiveRefCntPtrInfo,
which is not obvious.

llvm-svn: 290720
2016-12-29 19:59:38 +00:00
Justin Lebar
61e5b34b7e [ADT] Rename RefCountedBase::ref_cnt to RefCount. NFC
This makes it comply with the LLVM style guide, and also makes it
consistent with ThreadSafeRefCountedBase below.

llvm-svn: 290719
2016-12-29 19:59:34 +00:00
Justin Lebar
c54ca36396 [ADT] clang-format IntrusiveRefCntrPtr.h. NFC
This file had some strange indentation.

Also remove some unnecessary whitespace between one-line member
functions.

llvm-svn: 290718
2016-12-29 19:59:30 +00:00
Justin Lebar
71e5133921 [ADT] Delete RefCountedBaseVPTR.
Summary:
This class is unnecessary.

Its comment indicated that it was a compile error to allocate an
instance of a class that inherits from RefCountedBaseVPTR on the stack.
This may have been true at one point, but it's not today.

Moreover you really do not want to allocate *any* refcounted object on
the stack, vptrs or not, so if we did have a way to prevent these
objects from being stack-allocated, we'd want to apply it to regular
RefCountedBase too, obviating the need for a separate RefCountedBaseVPTR
class.

It seems that the main way RefCountedBaseVPTR provides safety is by
making its subclass's destructor virtual.  This may have been helpful at
one point, but these days clang will emit an error if you define a class
with virtual functions that inherits from RefCountedBase but doesn't
have a virtual destructor.

Reviewers: compnerd, dblaikie

Subscribers: cfe-commits, klimek, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D28162

llvm-svn: 290717
2016-12-29 19:59:26 +00:00
Bryant Wong
5695b1421e Correctly handle multi-lined RUN lines.
`utils/update_{llc_test,test}_checks` ought to be able to handle RUN commands
that span multiple lines, as shown in the example at
http://llvm.org/docs/CommandGuide/FileCheck.html#the-filecheck-check-prefix-option

Differential Revision: https://reviews.llvm.org/D26523

llvm-svn: 290716
2016-12-29 19:32:34 +00:00
Justin Lebar
f25a278287 [ADT] Use memcpy for type punning in MathExtras.
Summary: Previously we type-punned through a union, which is not safe.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28161

llvm-svn: 290715
2016-12-29 18:15:34 +00:00
Reid Kleckner
dd916c40a8 Revert "[COFF] Use 32-bit jump table entries in .rdata for Win64"
This reverts commit r290694. It broke sanitizer tests on Win64. I'll
probably bring this back, but the jump tables will just live in .text
like they do for MSVC.

llvm-svn: 290714
2016-12-29 17:07:10 +00:00
Sanjoy Das
8c493b5ff3 [TBAAVerifier] Be stricter around verifying scalar nodes
This fixes the issue exposed in PR31393, where we weren't trying
sufficiently hard to diagnose bad TBAA metadata.

This does reduce the variety in the error messages we print out, but I
think the tradeoff of verifying more, simply and quickly overrules the
need for more helpful error messags here.

llvm-svn: 290713
2016-12-29 15:47:05 +00:00
Sanjoy Das
2d9d8d6e59 [TBAAVerifier] Make things const-consistent; NFC
llvm-svn: 290712
2016-12-29 15:47:01 +00:00
Sanjoy Das
f2cdbf0522 [TBAAVerifier] Memoize validity of scalar tbaa nodes; NFCI
llvm-svn: 290711
2016-12-29 15:46:57 +00:00
Artem Tamazov
7abf7635ee [AMDGPU][mc] Enable absolute expressions in .hsa_code_object_isa directive
Among other stuff, this allows to use predefined .option.machine_version_major
/minor/stepping symbols in the directive.

Relevant test expanded at once (also file renamed for clarity).

Differential Revision: https://reviews.llvm.org/D28140

llvm-svn: 290710
2016-12-29 15:41:52 +00:00
Igor Laevsky
64a0622c9c Fix documentation generator warnings after rL290708.
llvm-svn: 290709
2016-12-29 15:08:57 +00:00
Igor Laevsky
cde17c8a3e Introduce element-wise atomic memcpy intrinsic
This change adds a new intrinsic which is intended to provide memcpy functionality
with additional atomicity guarantees. Please refer to the review thread
or language reference for further details.

Differential Revision: https://reviews.llvm.org/D27133

llvm-svn: 290708
2016-12-29 14:31:07 +00:00
Craig Topper
c31995bf02 [InstCombine] Use getVectorNumElements instead of explicitly casting to VectorType and calling getNumElements. NFC
llvm-svn: 290707
2016-12-29 07:03:18 +00:00
Craig Topper
7e34057ad3 [InstCombine] Fix typo in comment. NFC
llvm-svn: 290706
2016-12-29 05:38:31 +00:00
Craig Topper
5b7df5e739 [InstCombine] Use a 32-bits instead of 64-bits for storing the number of elements in VectorType for a ShuffleVector. While there getVectorNumElements to avoid an explicit cast. NFC
llvm-svn: 290705
2016-12-29 04:24:32 +00:00
Craig Topper
84af66a480 [InstCombine][X86] If the lowest element of a scalar intrinsic isn't used make sure we add it to the worklist so we can DCE it sooner.
We bypassed the intrinsic and returned the passthru operand, but we should also add the intrinsic to the worklist since its now dead. This can allow DCE to find it sooner and remove it. Similar was done for InsertElement when the inserted element isn't demanded.

llvm-svn: 290704
2016-12-29 03:30:17 +00:00
Kostya Serebryany
4ee731d6c9 [libFuzzer] make __sanitizer_cov_trace_switch more predictable
llvm-svn: 290703
2016-12-29 02:50:35 +00:00
Craig Topper
412163fcd4 [InstCombine] Fix some of the AVX-512 scalar arithmetic test cases to do a better job of testing what they intended to test.
The accidentally had trivially dead code. Also needed to adjust the rounding mode to not CUR_DIRECTION so the intrinsics don't get converted to native operations before going through SimplifyDemandedVectorElts.

llvm-svn: 290702
2016-12-29 02:29:04 +00:00
Mehdi Amini
cff3bece5c Remove BitstreamWriter::Emit64(), it was never called (NFC)
llvm-svn: 290701
2016-12-29 01:40:53 +00:00
Reid Kleckner
158f53fe75 Fix mingw build by moving the static const data member before the bitfields
Apparently GCC targeting Windows breaks bitfields on static data members:
  struct Foo {
    unsigned X : 16;
    static const int M = 42;
    unsigned Y : 16;
  };
  static_assert(sizeof(Foo) == 4, "asdf"); // fails

Who knew.

llvm-svn: 290700
2016-12-29 01:14:41 +00:00
Daniel Berlin
23bf9929d1 NewGVN: Sort Dominator Tree in RPO order, and use that for generating order.
Summary:
The optimal iteration order for this problem is RPO order. We want to
process as many preds of a backedge as we can before we process the
backedge.

At the same time, as we add predicate handling, we want to be able to
touch instructions that are dominated by a given block by
ranges (because a change in value numbering a predicate possibly
affects all users we dominate that are using that predicate).
If we don't do it this way, we can't do value inference over
backedges (the paper covers this in depth).

The newgvn branch currently overshoots the last part, and guarantees
that it will touch *at least* the right set of instructions, but it
does touch more.  This is because the bitvector instruction ranges are
currently generated in RPO order (so we take the max and the min of
the ranges of dominated blocks, which means there are some in the
middle we didn't have to touch that we did).

We can do better by sorting the dominator tree, and then just using
dominator tree order.

As a preliminary, the dominator tree has some RPO guarantees, but not
enough. It guarantees that for a given node, your idom must come
before you in the RPO ordering. It guarantees no relative RPO ordering
for siblings.  We add siblings in whatever order they appear in the module.

So that is what we fix.

We sort the children array of the domtree into RPO order, and then use
the dominator tree for ordering, instead of RPO, since the dominator
tree is now a valid RPO ordering.

Note: This would help any other pass that iterates a forward problem
in dominator tree order.  Most of them are single pass.  It will still
maximize whatever result they compute.  We could also build the
dominator tree in this order, but our incremental updates would still
put it out of sort order, and recomputing the sort order is almost as
hard as general incremental updates of the domtree.

Also note that the sorting does not affect any tests, etc. Nothing
depends on domtree order, including the verifier, the equals
functions for domtree nodes, etc.

How much could this matter, you ask?
Here are the current numbers.
This is generated by running NewGVN over all files in LLVM.

Note that once we propagate equalities, the differences go up by an
order of magnitude or two (IE instead of 29, the max ends up in the
thousands, since the worst case we add a factor of N, where N is the
number of branch predicates).  So while it doesn't look that stark for
the default ordering, it gets *much much* worse.  There are also
programs in the wild where the difference is already pretty stark
(2 iterations vs hundreds).

RPO ordering:
759040 Number of iterations is 1
112908 Number of iterations is 2

Default dominator tree ordering:
755081 Number of iterations is 1
116234 Number of iterations is 2
   603 Number of iterations is 3
    27 Number of iterations is 4
     2 Number of iterations is 5
     1 Number of iterations is 7

Dominator tree sorted:
759040 Number of iterations is 1
112908 Number of iterations is 2
<yay!>

Really bad ordering (sort domtree siblings in postorder. not quite the
worst possible, but yeah):
754008 Number of iterations is 1
    21 Number of iterations is 10
     8 Number of iterations is 11
     6 Number of iterations is 12
     5 Number of iterations is 13
     2 Number of iterations is 14
     2 Number of iterations is 15
     3 Number of iterations is 16
     1 Number of iterations is 17
     2 Number of iterations is 18
 96642 Number of iterations is 2
     1 Number of iterations is 20
     2 Number of iterations is 21
     1 Number of iterations is 22
     1 Number of iterations is 29
 17266 Number of iterations is 3
  2598 Number of iterations is 4
   798 Number of iterations is 5
   273 Number of iterations is 6
   186 Number of iterations is 7
    80 Number of iterations is 8
    42 Number of iterations is 9

Reviewers: chandlerc, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28129

llvm-svn: 290699
2016-12-29 01:12:36 +00:00
Reid Kleckner
36f848cda1 Add a static_assert about the sizeof(GlobalValue)
I added one for Value back in r262045, and I'm starting to think we
should have these for any class with bitfields whose memory efficiency
really matters.

llvm-svn: 290698
2016-12-29 00:55:51 +00:00
Daniel Berlin
231815c580 Update equalsStoreHelper for the fact that only one branch can be true
llvm-svn: 290697
2016-12-29 00:49:32 +00:00
Justin Lebar
85f5dbd306 [GlobalValue] Move HasLLVMReservedName into existing bitfield. NFC
Summary:
Follow-up to r290691, where I introduced HasLLVMReservedName.  rnk
pointed out that that patch added an extra word to GlobalValue on MSVC,
because it doesn't pack bitfields with different types.

This patch moves HasLLVMReservedName into the existing bitfield, where
we appear to have plenty of bits to spare.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28149

llvm-svn: 290696
2016-12-29 00:30:46 +00:00
Justin Lebar
a399eb32c6 [IR] Clarify that Value::getName() is not actually cheap.
It involves a hashtable lookup when the Value has a name.

llvm-svn: 290695
2016-12-29 00:30:42 +00:00
Reid Kleckner
551887db33 [COFF] Use 32-bit jump table entries in .rdata for Win64
Summary:
We were already using 32-bit jump table entries, but this was a
consequence of the default PIC model on Win64, and not an intentional
design decision. This patch ensures that we always use 32-bit label
difference jump table entries on Win64 regardless of the PIC model. This
is a good idea because it saves executable size and object file size.

Moving the jump tables to .rdata cleans up the disassembled object code
and reduces the available ROP targets, but it requires adding one more
RIP-relative lea to the code.  COFF doesn't have relocations to express
the difference between two arbitrary symbols, so we can't use the jump
table label in the label difference like we do elsewhere.

Fixes PR31488

Reviewers: majnemer, compnerd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28141

llvm-svn: 290694
2016-12-29 00:12:39 +00:00
Mehdi Amini
16e26549a6 Change Metadata Index emission in the bitcode to use 2x32 bits for the placeholder
The Bitstream reader and writer are limited to handle a "size_t" at
most, which means that we can't backpatch and read back a 64bits
value on 32 bits platform.

llvm-svn: 290693
2016-12-28 23:45:54 +00:00
Piotr Padlewski
84cc9a1e8e Revert "[NewGVN] replace emplace_back with push_back"
llvm-svn: 290692
2016-12-28 23:24:02 +00:00
Justin Lebar
e38b309355 Speed up Function::isIntrinsic() by adding a bit to GlobalValue. NFC
Summary:
Previously isIntrinsic() called getName().  This involves a hashtable
lookup, so is nontrivially expensive.  And isIntrinsic() is called
frequently, particularly by dyn_cast<IntrinsicInstr>.

This patch steals a bit of IntID and uses that to store whether or not
getName() starts with "llvm."

Reviewers: bogner, arsenm, joker-eph

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D22949

llvm-svn: 290691
2016-12-28 22:59:45 +00:00