1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
Commit Graph

94444 Commits

Author SHA1 Message Date
Reid Kleckner
0669cf2688 [codeview] Emit vtable shape information
The shape of the vtable is passed down as the size of the
__vtbl_ptr_type. This special pointer type appears both as the pointee
type of the vptr type, and by itself in every dynamic class. For classes
with multiple vtables, only the shape of the primary vftable is
included, as the shape of all secondary vftables will be the same as in
the base class.

Fixes PR28150

llvm-svn: 280254
2016-08-31 15:59:30 +00:00
Philip Reames
686905fa4d [statepoints][experimental] Add support for live-in semantics of values in deopt bundles
This is a first step towards supporting deopt value lowering and reporting entirely with the register allocator. I hope to build on this in the near future to support live-on-return semantics, but I have a use case which allows me to test and investigate code quality with just the live-in semantics so I've chosen to start there. For those curious, my use cases is our implementation of the "__llvm_deoptimize" function we bind to @llvm.deoptimize. I'm choosing not to hard code that fact in the patch and instead make it configurable via function attributes.

The basic approach here is modelled on what is done for the "Live In" values on stackmaps and patchpoints. (A secondary goal here is to remove one of the last barriers to merging the pseudo instructions.) We start by adding the operands directly to the STATEPOINT SDNode. Once we've lowered to MI, we extend the remat logic used by the register allocator to fold virtual register uses into StackMap::Indirect entries as needed. This does rely on the fact that the register allocator rematerializes. If it didn't along some code path, we could end up with more vregs than physical registers and fail to allocate.

Today, we *only* fold in the register allocator. This can create some weird effects when combined with arguments passed on the stack because we don't fold them appropriately. I have an idea how to fix that, but it needs this patch in place to work on that effectively. (There's some weird interaction with the scheduler as well, more investigation needed.)

My near term plan is to land this patch off-by-default, experiment in my local tree to identify any correctness issues and then start fixing codegen problems one by one as I find them. Once I have the live-in lowering fully working (both correctness and code quality), I'm hoping to move on to the live-on-return semantics. Note: I don't have any *known* miscompiles with this patch enabled, but I'm pretty sure I'll find at least a couple. Thus, the "experimental" tag and the fact it's off by default.

Differential Revision: https://reviews.llvm.org/D24000

llvm-svn: 280250
2016-08-31 15:12:17 +00:00
Simon Pilgrim
a931b633c1 [X86][SSE] Improve awareness of (v)cvtpd2ps implicit zeroing of upper 64-bits of xmm result
Associate x86_sse2_cvtpd2ps with X86ISD::VFPROUND to avoid inserting unnecessary zeroing shuffles.

Differential Revision: https://reviews.llvm.org/D23797

llvm-svn: 280249
2016-08-31 15:09:34 +00:00
Chad Rosier
66d09c898e [SLP] Arguments should be camel case, and start with an upper case letter. NFC.
llvm-svn: 280248
2016-08-31 15:06:58 +00:00
Sjoerd Meijer
b739719e02 Clang patch r280064 introduced ways to set the FP exceptions and denormal
types. This is the LLVM counterpart and it adds options that map onto FP
exceptions and denormal build attributes allowing better fp math library
selections.

Differential Revision: https://reviews.llvm.org/D24070

llvm-svn: 280246
2016-08-31 14:17:38 +00:00
Krzysztof Parzyszek
5d37c679be Fixed spill stack objects are mutable
Differential Revision: https://reviews.llvm.org/D24039

llvm-svn: 280244
2016-08-31 13:52:17 +00:00
James Molloy
6a93446586 Revert "[SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases"
This reverts commit r280218. This *also* causes buildbot errors. Sigh. Not a successful day all around!

llvm-svn: 280239
2016-08-31 13:32:28 +00:00
James Molloy
0cb02140c1 Revert "[SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd"
This reverts commit r280216 - it caused buildbot failures.

llvm-svn: 280234
2016-08-31 13:16:52 +00:00
James Molloy
7f2f60f9b1 Revert "[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches"
This reverts commit r280217. r280216 caused buildbot failures - backing out the entire chain.

llvm-svn: 280233
2016-08-31 13:16:45 +00:00
James Molloy
2366f91976 Revert "[SimplifyCFG] Add a workaround to fix PR30188"
This reverts commit r280219. r280216 caused buildbot failures - backing out the entire chain.

llvm-svn: 280232
2016-08-31 13:16:36 +00:00
James Molloy
351927e4cd Revert "[SimplifyCFG] Fix bootstrap failure after r280220"
This reverts commit r280228. r280216 caused buildbot failures - backing out the entire sequence.

llvm-svn: 280231
2016-08-31 13:16:30 +00:00
Diana Picus
da60ac4ab8 Use abstraction in AArch64AsmPrinter::lowerSTACKMAP. NFCI
Use functionality from StackMapOpers instead of hardcoding an operand access.

llvm-svn: 280230
2016-08-31 12:43:49 +00:00
Diana Picus
13a702a2c8 Typo fixes. NFC
llvm-svn: 280229
2016-08-31 12:43:44 +00:00
James Molloy
5252babd4e [SimplifyCFG] Fix bootstrap failure after r280220
We check that a sinking candidate is used by only one PHI node during our legality checks. However for instructions that are used by other sinking candidates our heuristic is less conservative. This can result in a candidate actually being illegal when we come to sink it because of how we sunk a predecessor. Do the used-by-only-one-PHI checks again during sinking to ensure we don't crash.

llvm-svn: 280228
2016-08-31 12:33:48 +00:00
Nikolay Haustov
3027d29492 AMDGPU/SI: Handle aliases in AMDGPUAlwaysInlinePass
Summary:
Simply replace usage of aliases to functions with aliasee.
This came up when bitcode linking to builtin library and
calls to aliases not being resolved.

Also made minor improvements to existing test.

Reviewers: tstellarAMD, alex-t, vpykhtin

Subscribers: arsenm, wdng, rampitec

Differential Revision: https://reviews.llvm.org/D24023

llvm-svn: 280221
2016-08-31 11:18:33 +00:00
James Molloy
323dc48eb8 [SimplifyCFG] Add a workaround to fix PR30188
We're sinking stores, which is a good thing, but in the process creating selects for the store address operand, which SROA/Mem2Reg can't look through, which caused serious regressions.

The real fix is in SROA, which I'll be looking into.

llvm-svn: 280219
2016-08-31 10:46:45 +00:00
James Molloy
97b3566778 [SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases
A very important case is not handled here: multiple arcs to a single block with a PHI. Consider:

    a:
      %1 = icmp %b, 1
      br %1, label %c, label %e
    c:
      %2 = icmp %b, 2
      br %2, label %d, label %e
    d:
      br %e
    e:
      phi [0, %a], [1, %c], [2, %d]

FoldValueComparisonIntoPredecessors will refuse to fold this, as it doesn't know how to deal with two arcs to a common destination with different PHI values. The answer is obvious - just split all conflicting arcs.

llvm-svn: 280218
2016-08-31 10:46:39 +00:00
James Molloy
bfb4ed7a07 [SimplifyCFG] Handle tail-sinking of more than 2 incoming branches
This was a real restriction in the original version of SinkIfThenCodeToEnd. Now it's been rewritten, the restriction can be lifted.

As part of this, we handle a very common and useful case where one of the incoming branches is actually conditional. Consider:

   if (a)
     x(1);
   else if (b)
     x(2);

This produces the following CFG:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \    |  /
          [ end ]

[end] has two unconditional predecessor arcs and one conditional. The conditional refers to the implicit empty 'else' arc. This same pattern can also be caused by an empty default block in a switch.

We can't sink the call to x() down to end because no call to x() happens on the third incoming arc (assume that x() has sideeffects for the sake of argument; if something is safe to speculate we could indeed sink nevertheless but this cannot happen in the general case and causes many extra selects).

We are now able to detect this case and split off the unconditional arcs to a common successor:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \   /    |
     [sink.split] |
           \     /
           [ end ]

Now we can sink the call to x() into %sink.split. This can cause significant code simplification in many testcases.

llvm-svn: 280217
2016-08-31 10:46:33 +00:00
James Molloy
0803b4eca8 [SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd
r279460 rewrote this function to be able to handle more than two incoming edges and took pains to ensure this didn't regress anything.

This time we change the logic for determining if an instruction should be sunk. Previously we used a single pass greedy algorithm - sink instructions until one requires more than one PHI node or we run out of instructions to sink.

This had the problem that sinking instructions that had non-identical but trivially the same operands needed extra logic so we sunk them aggressively. For example:

    %a = load i32* %b          %d = load i32* %b
    %c = gep i32* %a, i32 0    %e = gep i32* %d, i32 1

Sinking %c and %e would naively require two PHI merges as %a != %d. But the loads are obviously equivalent (and maybe can't be hoisted because there is no common predecessor).

This is why we implemented the fairly complex function areValuesTriviallySame(), to look through trivial differences like this. However it's just not clever enough.

Instead, throw areValuesTriviallySame away, use pointer equality to check equivalence of operands and switch to a two-stage algorithm.

In the "scan" stage, we look at every sinkable instruction in isolation from end of block to front. If it's sinkable, we keep track of all operands that required PHI merging.

In the "sink" stage, we iteratively sink the last non-terminator in the source blocks. But when calculating how many PHIs are actually required to be inserted (to work out if we should stop or not) we remove any values that have already been sunk from the set of PHI-merges required, which allows us to be more aggressive.

This turns an algorithm with potentially recursive lookahead (looking through GEPs, casts, loads and any other instruction potentially not CSE'd) to two linear scans.

llvm-svn: 280216
2016-08-31 10:46:23 +00:00
James Molloy
6315c274f3 [SimplifyCFG] Tail-merge calls with sideeffects
This was deliberately disabled during my rewrite of SinkIfThenToEnd to keep behaviour
at least vaguely consistent with the previous version and keep it as close to NFC as
I could.

There's no real reason not to merge sideeffect calls though, so let's do it! Small fixup
along the way to ensure we don't create indirect calls.

Should fix PR28964.

llvm-svn: 280215
2016-08-31 10:46:16 +00:00
Simon Pilgrim
dcb63f85f3 [X86][SSE] Improve awareness of fptrunc implicit zeroing of upper 64-bits of xmm result
Add patterns to avoid inserting unnecessary zeroing shuffles when lowering fptrunc to (v)cvtpd2ps

Differential Revision: https://reviews.llvm.org/D23797

llvm-svn: 280214
2016-08-31 10:35:13 +00:00
Igor Kudrin
972cd75cf9 [Coverage] Make sorting criteria for CounterMappingRegions local.
Move the comparison function into the only place there it is used,
i.e. the call to std::stable_sort in CoverageMappingWriter::write().

Add sorting by region kinds as it is required to ensure stable order
in our tests and to simplify D23987.

Differential Revision: https://reviews.llvm.org/D24034

llvm-svn: 280198
2016-08-31 07:01:17 +00:00
Craig Topper
6de2ad51c7 [AVX-512] Add patterns to select masked logical operations if the select has a floating point type.
This is needed in order to replace the masked floating point logical op intrinsics with native IR.

llvm-svn: 280195
2016-08-31 05:37:52 +00:00
Dean Michael Berris
63093a6577 [XRay] Support multiple return instructions in a single basic block
Add a .mir test to catch this case, and fix the xray-instrumentation
pass to handle it appropriately.

llvm-svn: 280192
2016-08-31 05:20:08 +00:00
David Majnemer
d770d3137d [Loads] Properly populate the visited set in isDereferenceableAndAlignedPointer
There were paths where we wouldn't populate the visited set, causing us
to recurse forever if an SSA variable was defined in terms of itself.

This fixes PR30210.

llvm-svn: 280191
2016-08-31 03:22:32 +00:00
Hal Finkel
cddd83b2d6 [PowerPC] Don't spill the frame pointer twice
When a function contains something, such as inline asm, which explicitly
clobbers the register used as the frame pointer, don't spill it twice. If we
need a frame pointer, it will be saved/restored in the prologue/epilogue code.
Explicitly spilling it again will reuse the same spill slot used by the
prologue/epilogue code, thus clobbering the saved value. The same applies
to the base-pointer or PIC-base register.

Partially fixes PR26856. Thanks to Ulrich for his analysis and the small
inline-asm reproducer.

llvm-svn: 280188
2016-08-31 00:52:03 +00:00
Gor Nishanov
2f5f55d9bd [Coroutines] Part 10: Add coroutine promise support.
Summary:
1) CoroEarly now lowers llvm.coro.promise intrinsic that allows to obtain
a coroutine promise pointer from a coroutine frame and vice versa.

2) CoroFrame now interprets Promise argument of llvm.coro.begin to
place CoroutinPromise alloca at a deterministic offset from the coroutine frame.

Now, the coroutine promise example from docs\Coroutines.rst compiles and produces expected result (see test/Transform/Coroutines/ex4.ll).

Reviewers: majnemer

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23993

llvm-svn: 280184
2016-08-31 00:35:41 +00:00
Sanjay Patel
ad21d55ed2 [InstCombine] clean up InsertRangeTest; NFCI
It's much less code and easier to read if we don't duplicate
everything between the 'Inside' and not 'Inside' cases.

As noted with the FIXME, the goal is to make this vector-friendly
in a follow-up patch.

llvm-svn: 280183
2016-08-31 00:19:35 +00:00
Alina Sbirlea
82dac894c9 [LoadStoreVectorizer] Change VectorSet to Vector to match head and tail positions. Resolves PR29148.
Summary:
LSV was using two vector sets (heads and tails) to track pairs of adjiacent position to vectorize.
A recent optimization is trying to obtain the longest chain to vectorize and assumes the positions
in heads(H) and tails(T) match, which is not the case is there are multiple tails for the same head.

e.g.:
i1: store a[0]
i2: store a[1]
i3: store a[1]
Leads to:
H: i1
T: i2 i3
Instead of:
H: i1 i1
T: i2 i3
So the positions for instructions that follow i3 will have different indexes in H/T.
This patch resolves PR29148.

This issue also surfaced the fact that if the chain is too long, and TLI
returns a "not-fast" answer, the whole chain will be abandoned for
vectorization, even though a smaller one would be beneficial.
Added a testcase and FIXME for this.

Reviewers: tstellarAMD, arsenm, jlebar

Subscribers: mzolotukhin, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D24057

llvm-svn: 280179
2016-08-30 23:53:59 +00:00
Reid Kleckner
ee70f4de4f [codeview] Remove redundant TypeTable lookup
As written, the code should assert if this lookup would have ever
succeeded.  Without looking through composite types, the type graph
should be acyclic.

llvm-svn: 280168
2016-08-30 21:48:14 +00:00
Kevin Enderby
f77b75655a Next set of additional error checks for invalid Mach-O files for bad LC_DYSYMTAB’s.
This contains the missing checks for LC_DYSYMTAB load command fields.

llvm-svn: 280161
2016-08-30 21:28:30 +00:00
Tim Northover
aaa2a927bc GlobalISel: combine extracts & sequences created for legalization
Legalization ends up creating many G_SEQUENCE/G_EXTRACT pairs which leads to
inefficient codegen (even for -O0), so add a quick pass over the function to
remove them again.

llvm-svn: 280155
2016-08-30 20:51:25 +00:00
Matt Arsenault
ea8814c4f7 AMDGPU: Relax SGPR asm constraint register class
s should be SReg_32 to be as general as possible. This can avoid a copy
from m0.

llvm-svn: 280154
2016-08-30 20:50:08 +00:00
Mike Aizatsky
b9c35c415b [libfuzzer] simplified unit truncation; do not write trunc items to disc
Differential Revision: https://reviews.llvm.org/D24049

llvm-svn: 280153
2016-08-30 20:49:07 +00:00
Michael Kuperstein
39c8e8b0c7 [LoopVectorizer] Predicate instructions in blocks with several incoming edges
We don't need to limit predication to blocks that have a single incoming
edge, we just need to use the right mask.
This fixes PR30172.

Differential Revision: https://reviews.llvm.org/D24009

llvm-svn: 280148
2016-08-30 20:22:21 +00:00
David Majnemer
7622ad4008 [COFFObjectFile] Ignore broken symbol table
When binaries are compressed by UPX, information about symbol table
offset and symbol count remain unchanged (but became invalid due to
compression).
This causes failure in the constructor and the rest of the binary cannot
be processed.

Instead, reset symbol related information (symbol/string table pointers,
sizes) - this should disable the related iterators and functions while
the rest of the binary can still be processed.

Patch by Bandzi Michal!

llvm-svn: 280147
2016-08-30 20:20:24 +00:00
Duncan P. N. Exon Smith
1463620411 CodeGen: Fixup for r280128, since GCC isn't as permissive as Clang
Fixes the bots, e.g.:
http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-buildserver/builds/10055

llvm-svn: 280135
2016-08-30 19:11:11 +00:00
Tim Northover
4f03f012c7 GlobalISel: forbid physical registers on generic MIs.
We're intending to move to a world where the type of a register is determined
by its (unique) def. This is incompatible with physregs, which are untyped.

It also means the other passes don't have to worry quite so much about
register-class compatibility and inserting COPYs appropriately.

llvm-svn: 280132
2016-08-30 18:52:46 +00:00
Duncan P. N. Exon Smith
9df0e64f25 ADT: Split ilist_node_traits into alloc and callback, NFC
Many lists want to override only allocation semantics, or callbacks for
iplist.  Split these up to prevent code duplication.
- Specialize ilist_alloc_traits to change the implementations of
  deleteNode() and createNode().
- One common desire is to do nothing deleteNode() and disable
  createNode().  Specialize ilist_alloc_traits to inherit from
  ilist_noalloc_traits for that behaviour.
- Specialize ilist_callback_traits to use the addNodeToList(),
  removeNodeFromList(), and transferNodesFromList() callbacks.

As a drive-by, add some coverage to the callback-related unit tests.

llvm-svn: 280128
2016-08-30 18:40:47 +00:00
Kyle Butt
6a05633b20 TailDuplication: Extract Indirect-Branch block limit as option. NFC
The existing code hard-coded a limit of 20 instructions for duplication
when a block ended with an indirect branch. Extract this as an option.
No functional change intended.

llvm-svn: 280125
2016-08-30 18:18:54 +00:00
Duncan P. N. Exon Smith
fa87381dee ADT: Guarantee transferNodesFromList is only called on transfers
Guarantee that ilist_traits<T>::transferNodesFromList is only called
when nodes are actually changing lists.

I also moved all the callbacks to occur *first*, before the operation.
This is the only choice for iplist<T>::merge, so we might as well be
consistent.  I expect this to have no effect in practice, although it
simplifies the logic in both iplist<T>::transfer and iplist<T>::insert.

llvm-svn: 280122
2016-08-30 18:00:45 +00:00
Sanjay Patel
a2b59dfe55 [InstCombine] replace divide-by-constant checks with asserts; NFC
These folds already have tests for scalar and vector types, except 
for the vector div-by-0 case, so I'm adding tests for that.

llvm-svn: 280115
2016-08-30 17:31:34 +00:00
Sanjay Patel
f098368a5a [InstCombine] clean up foldICmpDivConstant; NFCI
1. Fix comments to match variable names
2. Remove redundant CmpRHS variable
3. Add FIXME to replace some checks with asserts

llvm-svn: 280112
2016-08-30 17:10:49 +00:00
NAKAMURA Takumi
35c1fea391 Fixup r279618, instantiate *AnalysisManagerProxy<*AnalysisManager,LazyCallGraph::SCC>, instead of *AnalysisManagerProxy<*AnalysisManager,LazyCallGraph::SCC,LazyCallGraph&>, for PassID.
Or they were not instantiated as expected;

  llvm::InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::Function>, llvm::LazyCallGraph::SCC>::PassID
  llvm::InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::Function>, llvm::LazyCallGraph::SCC>::PassID

llvm-svn: 280105
2016-08-30 15:47:13 +00:00
Valery Pykhtin
7ff35e6fee [AMDGPU] Refactor SOP instructions TD files.
Differential revision: https://reviews.llvm.org/D23617

llvm-svn: 280101
2016-08-30 15:20:31 +00:00
Kostya Serebryany
733e18adcb [libFuzzer] fix a bug when running a single unit of N bytes with -max_len=M, M<N, caused a buffer overflow
llvm-svn: 280098
2016-08-30 14:52:05 +00:00
Kostya Serebryany
1d077e5054 [libFuzzer] stop using bits for memcmp's value profile -- seems to blow up the corpus too much
llvm-svn: 280096
2016-08-30 14:39:33 +00:00
Nirav Dave
0a2093c3a1 [MC] Move parser helper functions from Asmparser to MCAsmParser
NFC Intended.

llvm-svn: 280092
2016-08-30 14:15:43 +00:00
Chad Rosier
a507852652 [Reassociate] Add additional debug output. NFC.
llvm-svn: 280090
2016-08-30 13:58:35 +00:00
NAKAMURA Takumi
0be0197c94 SILoadStoreOptimizer.cpp: Fix a warning in r279991. [-Wunused-variable]
llvm-svn: 280075
2016-08-30 11:50:21 +00:00