1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00
Commit Graph

143831 Commits

Author SHA1 Message Date
Davide Italiano
d5a8d17b60 [obj2yaml] Produce correct output for invalid relocations.
R_X86_64_NONE can be emitted without a symbol associated (well,
in theory it should never be emitted in an ABI-compliant relocatable
object). So, if there's no symbol associated to a reloc, emit one
with an empty name, instead of crashing.

Ack'ed by Michael Spencer offline.

PR: 31768
llvm-svn: 293224
2017-01-26 23:12:53 +00:00
Krzysztof Parzyszek
9ef94361a2 [Hexagon] Require IPO library in Hexagon build
This should unbreak the Hexagon build bots.

llvm-svn: 293221
2017-01-26 23:03:22 +00:00
Daniel Berlin
e04316482c NewGVN: Fix bug exposed by PR31761
Summary:
This does not actually fix the testcase in PR31761 (discussion is
ongoing on the testcase), but does fix a bug it exposes, where stores
were not properly clobbering loads.

We accomplish this by unifying the memory equivalence infratructure
back into the normal congruence infrastructure, and then properly
destroying congruence classes when memory state leaders disappear.

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29195

llvm-svn: 293216
2017-01-26 22:21:48 +00:00
Sanjay Patel
66560a79a5 [InstCombine] fold (X >>u C) << C --> X & (-1 << C)
We already have this fold when the lshr has one use, but it doesn't need that
restriction. We may be able to remove some code from foldShiftedShift().

Also, move the similar:
(X << C) >>u C --> X & (-1 >>u C)
...directly into visitLShr to help clean up foldShiftByConstOfShiftByConst().

That whole function seems questionable since it is called by commonShiftTransforms(),
but there's really not much in common if we're checking the shift opcodes for every
fold.

llvm-svn: 293215
2017-01-26 22:08:10 +00:00
Ahmed Bougacha
bab0dceb1b [GlobalISel] Remove duplicate function using variadic templates. NFC.
I think the initial version of r293172 was trying:
  std::forward<Args...>(args)...
which doesn't compile.  This seems like the correct way:
  std::forward<Args>(args)...

llvm-svn: 293214
2017-01-26 22:07:37 +00:00
Krzysztof Parzyszek
136e254cdb [Hexagon] Add Hexagon-specific loop idiom recognition pass
llvm-svn: 293213
2017-01-26 21:41:10 +00:00
Daniel Berlin
613048e2b7 NewGVN: Add algorithm overview
llvm-svn: 293212
2017-01-26 21:39:49 +00:00
Sanjay Patel
afb8de4915 [InstCombine] use m_APInt to allow (X << C) >>u C --> X & (-1 >>u C) with splat vectors
llvm-svn: 293208
2017-01-26 20:52:27 +00:00
Zvi Rackover
b7932e32fe [Doc][LangRef] Fix typo-ish error in description of Masked Gather
Summary: Fix the example of equivalent expansion for when mask is all ones.

Reviewers: delena

Reviewed By: delena

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29179

llvm-svn: 293206
2017-01-26 20:29:15 +00:00
Sanjay Patel
a94fdc692e [InstCombine] add tests for shift-shift folds; NFC
llvm-svn: 293205
2017-01-26 20:10:55 +00:00
Balaram Makam
95d42eb2a2 [AArch64] Refine Kryo Machine Model
Summary: Refine floating point SQRT and DIV with accurate latency information.

Reviewers: mcrosier

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D29191

llvm-svn: 293204
2017-01-26 20:10:41 +00:00
Kyle Butt
3ddc108df8 [IfConversion] Use reverse_iterator to simplify. NFC
This simplifies skipping debug instructions and shrinking ranges.

llvm-svn: 293202
2017-01-26 20:02:47 +00:00
Sean Fertile
6ef8fab619 [PPC] cleanup of mayLoad/mayStore flags and memory operands.
1) Explicitly sets mayLoad/mayStore property in the tablegen files on load/store
   instructions.
2) Updated the flags on a number of intrinsics indicating that they write
    memory.
3) Added SDNPMemOperand flags for some target dependent SDNodes so that they
   propagate their memory operand

Review: https://reviews.llvm.org/D28818
llvm-svn: 293200
2017-01-26 18:59:15 +00:00
Daniel Berlin
0ac47a0236 NewGVN: Fix output of pr31578 testcase now that we mark unreachable blocks as unreachable
llvm-svn: 293198
2017-01-26 18:49:03 +00:00
Daniel Berlin
3eedb093ba NewGVN: Make unreachable blocks be marked with unreachable
llvm-svn: 293196
2017-01-26 18:30:29 +00:00
Stanislav Mekhanoshin
4b31377e87 Replace addEarlyAsPossiblePasses callback with adjustPassManager
This change introduces adjustPassManager target callback giving a
target an opportunity to tweak PassManagerBuilder before pass
managers are populated.

This generalizes and replaces addEarlyAsPossiblePasses target
callback. In particular that can be used to add custom passes to
extension points other than EP_EarlyAsPossible.

Differential Revision: https://reviews.llvm.org/D28336

llvm-svn: 293189
2017-01-26 16:49:08 +00:00
Nirav Dave
2a565d7a4e Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
This reverts commit r293184 which is failing in LTO builds

llvm-svn: 293188
2017-01-26 16:46:13 +00:00
Serge Rogatch
67e972e487 [XRay][Arm32] Reduce the portion of the stub and implement more staging for tail calls - in LLVM
Summary:
This patch provides more staging for tail calls in XRay Arm32 . When the logging part of XRay is ready for tail calls, its support in the core part of XRay Arm32 may be as easy as changing the number passed to the handler from 1 to 2.
Coupled patch:
- https://reviews.llvm.org/D28674

Reviewers: dberris, rengolin

Reviewed By: dberris

Subscribers: llvm-commits, iid_iunknown, aemerson, rengolin, dberris

Differential Revision: https://reviews.llvm.org/D28673

llvm-svn: 293185
2017-01-26 16:17:03 +00:00
Nirav Dave
c7f26fe4ae In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
* Simplify Consecutive Merge Store Candidate Search

    Now that address aliasing is much less conservative, push through
    simplified store merging search and chain alias analysis which only
    checks for parallel stores through the chain subgraph. This is cleaner
    as the separation of non-interfering loads/stores from the
    store-merging logic.

    When merging stores search up the chain through a single load, and
    finds all possible stores by looking down from through a load and a
    TokenFactor to all stores visited.

    This improves the quality of the output SelectionDAG and the output
    Codegen (save perhaps for some ARM cases where we correctly constructs
    wider loads, but then promotes them to float operations which appear
    but requires more expensive constant generation).

    Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

    Additional Minor Changes:

      1. Finishes removing unused AliasLoad code

      2. Unifies the chain aggregation in the merged stores across code
         paths

      3. Re-add the Store node to the worklist after calling
         SimplifyDemandedBits.

      4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
         arbitrary, but seems sufficient to not cause regressions in
         tests.

      5. Remove Chain dependencies of Memory operations on CopyfromReg
         nodes as these are captured by data dependence

      6. Forward loads-store values through tokenfactors containing
          {CopyToReg,CopyFromReg} Values.

      7. Peephole to convert buildvector of extract_vector_elt to
         extract_subvector if possible (see
         CodeGen/AArch64/store-merge.ll)

      8. Store merging for the ARM target is restricted to 32-bit as
         some in some contexts invalid 64-bit operations are being
         generated. This can be removed once appropriate checks are
         added.

    This finishes the change Matt Arsenault started in r246307 and
    jyknight's original patch.

    Many tests required some changes as memory operations are now
    reorderable, improving load-store forwarding. One test in
    particular is worth noting:

      CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
      forwarding converts a load-store pair into a parallel store and
      a memory-realized bitcast of the same value. However, because we
      lose the sharing of the explicit and implicit store values we
      must create another local store. A similar transformation
      happens before SelectionDAG as well.

    Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

llvm-svn: 293184
2017-01-26 16:02:24 +00:00
Rafael Espindola
9d69c5bee1 Use shouldAssumeDSOLocal in classifyGlobalReference.
And teach shouldAssumeDSOLocal that ppc has no copy relocations.

The resulting code handle a few more case than before. For example, it
knows that a weak symbol can be resolved to another .o file, but it
will still be in the main executable.

llvm-svn: 293180
2017-01-26 15:02:31 +00:00
Simon Pilgrim
b008f0941d [X86][SSE] Add support for combining ANDNP byte masks with target shuffles
llvm-svn: 293178
2017-01-26 14:31:12 +00:00
Daniil Fukalov
a685f298a9 [SCEV] Introduce add operation inlining limit
Inlining in getAddExpr() can cause abnormal computational time in some cases.
New parameter -scev-addops-inline-threshold is intruduced with default value 500.

Reviewers: sanjoy

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D28812

llvm-svn: 293176
2017-01-26 13:33:17 +00:00
Simon Pilgrim
7d26eaeb36 [X86][SSE] Pull out target shuffle resolve code into helper. NFCI.
Pulled out code that removed unused inputs from a target shuffle mask into a helper function to allow it to be reused in a future commit.

llvm-svn: 293175
2017-01-26 13:06:02 +00:00
Daniel Sanders
a3f89f8091 Remove a '#if 0' that wasn't intended for commit in r293173.
The '#if 0' contained the code I had intended to use but clang
rejects it (possibly incorrectly).

llvm-svn: 293174
2017-01-26 12:10:43 +00:00
Daniel Sanders
1248fb3533 Attempt to fix windows buildbots after r293172.
llvm-svn: 293173
2017-01-26 11:23:49 +00:00
Daniel Sanders
c777901882 [globalisel] Re-factor ISel matchers into a hierarchy. NFC
Summary:
This should make it possible to easily add everything needed to import all
the existing SelectionDAG rules. It should also serve the likely
kinds of GlobalISel rules (some of which are not currently representable
in SelectionDAG) once we've nailed down the tablegen definition for that.

The hierarchy is as follows:
  MatcherRule - A matching rule. Currently used to emit C++ ISel code but will
  |             also be used to emit test cases and tablegen definitions in the
  |             near future.
  |- Instruction(s) - Represents the instruction to be matched.
     |- Instruction Predicate(s) - Test the opcode, arithmetic flags, etc. of an
     |                             instruction.
     \- Operand(s) - Represents a particular operand of the instruction. In the
        |            future, there may be subclasses to test the same predicates
        |            on multiple operands (including for variadic instructions).
        \ Operand Predicate(s) - Test the type, register bank, etc. of an operand.
                                 This is where the ComplexPattern equivalent
                                 will be represented. It's also
                                 nested-instruction matching will live as a
                                 predicate that follows the DefUse chain to the
                                 Def and tests a MatcherRule from that position.

Support for multiple instruction matchers in a rule has been retained from
the existing code but has been adjusted to assert when it is used.
Previously it would silently drop all but the first instruction matcher.

The tablegen-erated file is not functionally changed but has more
parentheses and no longer attempts to format the if-statements since
keeping track of the indentation is tricky in the presence of the matcher
hierarchy. It would be nice to have CMakes tablegen() run the output
through clang-format (when available) so we don't have to complicate
TableGen with pretty-printing.

It's also worth mentioning that this hierarchy will also be able to emit
TableGen definitions and test cases in the near future. This is the reason
for favouring explicit emit*() calls rather than the << operator.

Reviewers: aditya_nandakumar, rovka, t.p.northover, qcolombet, ab

Reviewed By: ab

Subscribers: igorb, dberris, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D28942

llvm-svn: 293172
2017-01-26 11:10:14 +00:00
Valery Pykhtin
ab0bb6ac0c [AMDGPU] Fix typo in GCNSchedStrategy
Differential revision: https://reviews.llvm.org/D28980

llvm-svn: 293171
2017-01-26 10:51:47 +00:00
Simon Dardis
ed0555e547 Revert "[mips] N64 static relocation model support"
This reverts commit r293164. There are multiple tests failing.

llvm-svn: 293170
2017-01-26 10:46:07 +00:00
Chandler Carruth
b3230220b9 [LV] Fix an issue where forming LCSSA in the place that we did would
change the set of uniform instructions in the loop causing an assert
failure.

The problem is that the legalization checking also builds data
structures mapping various facts about the loop body. The immediate
cause was the set of uniform instructions. If these then change when
LCSSA is formed, the data structures would already have been built and
become stale. The included test case triggered an assert in loop
vectorize that was reduced out of the new PM's pipeline.

The solution is to form LCSSA early enough that no information is cached
across the changes made. The only really obvious position is outside of
the main logic to vectorize the loop. This also has the advantage of
removing one case where forming LCSSA could mutate the loop but we
wouldn't track that as a "Changed" state.

If it is significantly advantageous to do some legalization checking
prior to this, we can do a more careful positioning but it seemed best
to just back off to a safe position first.

llvm-svn: 293168
2017-01-26 10:41:09 +00:00
Simon Dardis
0f0a7a888b [mips] N64 static relocation model support
This patch makes one change to GOT handling and two changes to N64's
relocation model handling. Furthermore, the jumptable encodings have
been corrected for static N64.

Big GOT handling is now done via a new SDNode MipsGotHi - this node is
unconditionally lowered to an lui instruction.

The first change to N64's relocation handling is the lifting of the
restriction that N64 always uses PIC. Now it is possible to target static
environments.

The second change adds support for 64 bit symbols and enables them by
default. Previously N64 had patterns for sym32 mode only. In this mode all
symbols are assumed to have 32 bit addresses. sym32 mode support
is selectable with attribute 'sym32'. A follow on patch for clang will
add the necessary frontend parameter.

This partially resolves PR/23485.

Thanks to Brooks Davis for reporting the issue!

Reviewers: dsanders, seanbruno, zoran.jovanovic, vkalintiris

Differential Revision: https://reviews.llvm.org/D23652

llvm-svn: 293164
2017-01-26 10:19:02 +00:00
Diana Picus
b8228ddf1c [ARM] GlobalISel: Load i1, i8 and i16 args from stack
Add support for loading i1, i8 and i16 arguments from the stack, with or without
the ABI extension flags.

When the ABI extension flags are present, we load a 4-byte value, otherwise we
preserve the size of the load and let the instruction selector replace it with a
LDRB/LDRH. This generates the same thing as DAGISel.

Differential Revision: https://reviews.llvm.org/D27803

llvm-svn: 293163
2017-01-26 09:20:47 +00:00
Alexey Bataev
b7506f76d3 [SLP] Add one more reduction operation for extra argument test to make
it vectorizable.

llvm-svn: 293162
2017-01-26 09:18:41 +00:00
Chandler Carruth
579549a336 [PM] Use PoisoningVH correctly when merely deleting entries in a map
with it.

This code was dereferencing the PoisoningVH which isn't allowed once it
is poisoned. But the code itself really doesn't need to access the
pointer, it is just doing the safe stuff of clearing out data structures
keyed on the pointer value.

Change the code to use iterators to erase directly from a DenseMap. This
is also substantially more efficient as it avoids lots of hashing and
lookups to do the erasure. DenseMap supports iterating behind the
iteration which is fairly easy to implement.

Sadly, I don't have a test case here. I'm not even close and I don't
know that I ever will be. The issue is that several of the tricky
aspects of fixing this only show up when you cause the stack's
SmallVector to be in *EXACTLY* the right location. I only ever got
a reproduction for those with Clang, and only with *exactly* the right
command line flags. Any adjustment, even to seemingly unrelated flags,
would make partial and half-way solutions magically start to "work". In
good news, all of this was caught with the LLVM test suite. Also, there
is no *specific* code here that is untested, just that the old pattern
of code won't immediately fail on any test case I've managed to
contrive.

llvm-svn: 293160
2017-01-26 08:31:54 +00:00
NAKAMURA Takumi
a86dd54b5e Chapter3/KaleidoscopeJIT.h: Fix a warning. [-Wunused-lambda-capture]
"this", aka class members, is not referred in the body.

llvm-svn: 293159
2017-01-26 08:31:14 +00:00
Craig Topper
42fe18a2f3 [TargetTransformInfo] Add override keywords to supporess -Winconsistent-missing-override.
llvm-svn: 293158
2017-01-26 08:04:27 +00:00
Craig Topper
d519e39f21 [AVX-512] Move the combine that runs combineBitcastForMaskedOp to the last DAG combine phase where I had originally meant to put it.
llvm-svn: 293157
2017-01-26 07:17:58 +00:00
Craig Topper
ebed70c5a8 [X86] When bitcasting INSERT_SUBVECTOR/EXTRACT_SUBVECTOR to match masked operations, use the correct type for the immediate operand.
llvm-svn: 293156
2017-01-26 07:17:53 +00:00
Jonas Paulsson
1dc6fdc89f [TargetTransformInfo] Refactor and improve getScalarizationOverhead()
Refactoring to remove duplications of this method.

New method getOperandsScalarizationOverhead() that looks at the present unique
operands and add extract costs for them. Old behaviour was to just add extract
costs for one operand of the type always, which still happens in
getArithmeticInstrCost() if no operands are provided by the caller.

This is a good start of improving on this, but there are more places
that can be improved by using getOperandsScalarizationOverhead().

Review: Hal Finkel
https://reviews.llvm.org/D29017

llvm-svn: 293155
2017-01-26 07:03:25 +00:00
Alexey Bataev
d5721411da [SLP] Fixed test for extra arguments in horizontal reductions.
llvm-svn: 293153
2017-01-26 06:19:52 +00:00
Craig Topper
e87596f222 [DAGCombiner] Fold extract_subvector of undef to undef. Fold away inserting undef subvectors.
llvm-svn: 293152
2017-01-26 05:38:46 +00:00
Craig Topper
b6c9a3c954 [X86] Add demanded elts support for the inputs to pclmul intrinsic
This intrinsic uses bit 0 and bit 4 of an immediate argument to determine which bits of its inputs to read. This patch uses this information to simplify the demanded elements of the input vectors.

Differential Revision: https://reviews.llvm.org/D28979

llvm-svn: 293151
2017-01-26 05:17:13 +00:00
Taewook Oh
bf2c1121d3 Revert test commit
llvm-svn: 293150
2017-01-26 04:34:25 +00:00
Taewook Oh
ed84c38218 test commit
llvm-svn: 293148
2017-01-26 04:32:40 +00:00
Adam Nemet
9ca81099b8 [OptDiag] Predicates to check the same type of IR and MIR opt remarks
It will be used from clang.

llvm-svn: 293145
2017-01-26 04:03:18 +00:00
Peter Collingbourne
72e7a5adf5 gold-plugin: Fix test case.
llvm-svn: 293137
2017-01-26 02:15:08 +00:00
Chandler Carruth
774ae973b9 [PM] Simplify the new PM interface to the loop unroller and expose two
factory functions for the two modes the loop unroller is actually used
in in-tree: simplified full-unrolling and the entire thing including
partial unrolling.

I've also wired these up to nice names so you can express both of these
being in a pipeline easily. This is a precursor to actually enabling
these parts of the O2 pipeline.

Differential Revision: https://reviews.llvm.org/D28897

llvm-svn: 293136
2017-01-26 02:13:50 +00:00
Chandler Carruth
c22d160d50 [Loops] Restructure the LoopInfo verify function so that it more
directly walks the current loop structure verifying that a matching
structure can be found in a freshly computed version.

Also pull things out of containers when necessary once an issue is found
and print them directly.

This makes it substantially easier to debug verification failures as
the process stops at the exact point in the loop nest where they diverge
and has in easily accessed local variables (or printed to stderr
already) the loops and other information needed to analyze the failure.

Differential Revision: https://reviews.llvm.org/D29142

llvm-svn: 293133
2017-01-26 02:07:20 +00:00
Peter Collingbourne
49500b6e3c gold-plugin: Simplify naming of object files created with save-temps or obj-path.
Now we never append a number to the file name for task ID 0.

Differential Revision: https://reviews.llvm.org/D29160

llvm-svn: 293132
2017-01-26 02:07:05 +00:00
Rui Ueyama
8f8807628d Fix --Wunused-function.
llvm-svn: 293131
2017-01-26 02:03:58 +00:00
Kostya Serebryany
b0dd98aa0e [libFuzzer] remove a bit of stale code
llvm-svn: 293129
2017-01-26 01:45:54 +00:00