1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 20:43:44 +02:00
Commit Graph

145667 Commits

Author SHA1 Message Date
Vassil Vassilev
84e29fb5b4 Do not leak OpenedHandles.
llvm-svn: 296748
2017-03-02 14:30:05 +00:00
Matthew Simpson
729924f526 [LV] Considier non-consecutive but vectorizable accesses for VF selection
When computing the smallest and largest types for selecting the maximum
vectorization factor, we currently ignore loads and stores of pointer types if
the memory access is non-consecutive. We do this because such accesses must be
scalarized regardless of vectorization factor, and thus shouldn't be considered
when determining the factor. This patch makes this check less aggressive by
also considering non-consecutive accesses that may be vectorized, such as
interleaved accesses. Because we don't know at the time of the check if an
accesses will certainly be vectorized (this is a cost model decision given a
particular VF), we consider all accesses that can potentially be vectorized.

Differential Revision: https://reviews.llvm.org/D30305

llvm-svn: 296747
2017-03-02 13:55:05 +00:00
Andrew V. Tischenko
69e584bbd4 Added special test covering a problem with PIC relocation model on SLM architecture. The fix will come in D26855.
llvm-svn: 296746
2017-03-02 13:47:03 +00:00
Serge Pavlov
722a286698 Do not verify MachimeDominatorTree if it is not calculated
If dominator tree is not calculated or is invalidated, set corresponding
pointer in the pass state to nullptr. Such pointer value will indicate
that operations with dominator tree are not allowed. In particular, it
allows to skip verification for such pass state. The dominator tree is
not calculated if the machine dominator pass was skipped, it occures in
the case of entities with linkage available_externally.

The change fixes some test fails observed when expensive checks
are enabled.

Differential Revision: https://reviews.llvm.org/D29280

llvm-svn: 296742
2017-03-02 12:00:10 +00:00
Xin Tong
ac8410ceb7 Fix typo. NFCI
llvm-svn: 296735
2017-03-02 08:39:11 +00:00
Peter Collingbourne
41d9b0b540 cmake: Configure the ThinLTO cache directory when using ELF lld or gold.
Differential Revision: https://reviews.llvm.org/D30522

llvm-svn: 296730
2017-03-02 03:01:12 +00:00
Peter Collingbourne
dfbe8f3dfb LTO: When creating a local cache, create the cache directory if it does not already exist.
Differential Revision: https://reviews.llvm.org/D30519

llvm-svn: 296726
2017-03-02 02:02:38 +00:00
Matthias Braun
ea58cf0e44 LiveRegMatrix: Fix some subreg interference checks
Surprisingly, one of the three interference checks in LiveRegMatrix was
using the main live range instead of the apropriate subregister range
resulting in unnecessarily conservative results.

llvm-svn: 296722
2017-03-02 00:35:08 +00:00
Matthias Braun
939e72b029 LiveIntervalUnion: Remove unused function; NFC
llvm-svn: 296721
2017-03-02 00:15:06 +00:00
Eli Friedman
5d8cb8245c Revert r296708; causing test failures on ARM hosts.
Original commit message:

[ARM] Fix insert point for store rescheduling.
    
In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last
operation which we want to merge. If we break out of the loop because
an operation has the wrong offset, we shouldn't use that operation as
LastOp.
    
This patch fixes some cases where we would sink stores for no reason.

llvm-svn: 296718
2017-03-02 00:08:50 +00:00
Eugene Zelenko
ec780d0459 [Support] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 296714
2017-03-01 23:59:26 +00:00
Paul Robinson
271a695573 Remove spurious use of LLVM_FALLTHROUGH (NFC)
llvm-svn: 296713
2017-03-01 23:59:11 +00:00
Amaury Sechet
c49fa55b23 [DAGCombiner] mulhi + 1 never overflow.
Summary:
This can be used to optimize large multiplications after legalization.

Depends on D29565

Reviewers: mkuper, spatel, RKSimon, zvi, bkramer, aaboud, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29587

llvm-svn: 296711
2017-03-01 23:44:17 +00:00
Ahmed Bougacha
ee9842c4ba [GlobalISel] Add a way for targets to enable GISel.
Until now, we've had to use -global-isel to enable GISel.  But using
that on other targets that don't support it will result in an abort, as we
can't build a full pipeline.
Additionally, we want to experiment with enabling GISel by default for
some targets: we can't just enable GISel by default, even among those
target that do have some support, because the level of support varies.

This first step adds an override for the target to explicitly define its
level of support.  For AArch64, do that using
a new command-line option (I know..):
  -aarch64-enable-global-isel-at-O=<N>
Where N is the opt-level below which GISel should be used.

Default that to -1, so that we still don't enable GISel anywhere.
We're not there yet!

While there, remove a couple LLVM_UNLIKELYs.  Building the pipeline is
such a cold path that in practice that shouldn't matter at all.

llvm-svn: 296710
2017-03-01 23:33:08 +00:00
Amaury Sechet
d0508faaf6 Improve mulhi overflow test. NFC
llvm-svn: 296709
2017-03-01 23:31:19 +00:00
Eli Friedman
d150ed85fb [ARM] Fix insert point for store rescheduling.
In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last
operation which we want to merge. If we break out of the loop because
an operation has the wrong offset, we shouldn't use that operation as
LastOp.
    
This patch fixes some cases where we would sink stores for no reason.
    
Differential Revision: https://reviews.llvm.org/D30124

llvm-svn: 296708
2017-03-01 23:20:29 +00:00
Eli Friedman
c3db369888 [ARM] Check correct instructions for load/store rescheduling.
This code starts from the high end of the sorted vector of offsets, and
works backwards: it tries to find contiguous offsets, process them, then
pops them from the end of the vector. Most of the code agrees with this
order of processing, but one loop doesn't: it instead processes elements
from the low end of the vector (which are nodes with unrelated offsets).
Fix that loop to process the correct elements.
    
This has a few implications. One, we don't incorrectly return early when
processing multiple groups of offsets in the same block (which allows
rescheduling prera-ldst-insertpt.mir). Two, we pick the correct insert
point for loads, so they're correctly sorted (which affects the
scheduling of vldm-liveness.ll). I think it might also impact some of
the heuristics slightly.
    
Differential Revision: https://reviews.llvm.org/D30368

llvm-svn: 296701
2017-03-01 22:56:20 +00:00
Sanjay Patel
b91c17c096 [DAGCombiner] fold binops with constant into select-of-constants
This is part of the ongoing attempt to improve select codegen for all targets and select 
canonicalization in IR (see D24480 for more background). The transform is a subset of what
is done in InstCombine's FoldOpIntoSelect().

I first noticed a regression in the x86 avx512-insert-extract.ll tests with a patch that 
hopes to convert more selects to basic math ops. This appears to be a general missing DAG
transform though, so I added tests for all standard binops in rL296621 
(PowerPC was chosen semi-randomly; it has scripted FileCheck support, but so do ARM and x86).

The poor output for "sel_constants_shl_constant" is tracked with:
https://bugs.llvm.org/show_bug.cgi?id=32105

Differential Revision: https://reviews.llvm.org/D30502

llvm-svn: 296699
2017-03-01 22:51:31 +00:00
Reid Kleckner
05f6767063 [Constant Hoisting] Avoid inserting instructions before EH pads
Now that terminators can be EH pads, this code needs to iterate over the
immediate dominators of the EH pad to find a valid insertion point.

Fix for PR32107

Patch by Robert Olliff!

Differential Revision: https://reviews.llvm.org/D30511

llvm-svn: 296698
2017-03-01 22:41:12 +00:00
Eugene Zelenko
1445fb1065 [MC] Fix MachineLocation constructor broken in r294685 (NFC).
Problem spotted by Frej Drejhammar.

llvm-svn: 296697
2017-03-01 22:28:23 +00:00
Amaury Sechet
b2f95f13cf Add test case for mulhi's overflow. NFC
llvm-svn: 296696
2017-03-01 22:27:21 +00:00
Victor Leschuk
38a4ccf3d9 [DebugInfo] [DWARFv5] Unique abbrevs for DIEs with different implicit_const values
Take DW_FORM_implicit_const attribute value into account when profiling
DIEAbbrevData.

Currently if we have two similar types with implicit_const attributes and
different values we end up with only one abbrev in .debug_abbrev section.
For example consider two structures: S1 with implicit_const attribute ATTR
and value VAL1 and S2 with implicit_const ATTR and value VAL2.
The .debug_abbrev section will contain only 1 related record:

[N] DW_TAG_structure_type       DW_CHILDREN_yes
        DW_AT_ATTR        DW_FORM_implicit_const  VAL1
        // ....

This is incorrect as struct S2 (with VAL2) will use abbrev record with VAL1.

With this patch we will have two different abbreviations here:

[N] DW_TAG_structure_type       DW_CHILDREN_yes
        DW_AT_ATTR        DW_FORM_implicit_const  VAL1
        // ....

[M] DW_TAG_structure_type       DW_CHILDREN_yes
        DW_AT_ATTR        DW_FORM_implicit_const  VAL2
        // ....

llvm-svn: 296691
2017-03-01 22:13:42 +00:00
Benjamin Kramer
39c2511ce5 [DAGCombiner] Remove non-ascii character and reflow comment.
llvm-svn: 296690
2017-03-01 22:10:43 +00:00
Matthias Braun
dc879ab8b2 LIU:::Query: Query LiveRange instead of LiveInterval; NFC
- We only need the information from the base class, not the additional
  details in the LiveInterval class.
- Spread more `const`
- Some code cleanup

llvm-svn: 296684
2017-03-01 21:48:12 +00:00
Reid Kleckner
599ca91378 Elide argument copies during instruction selection
Summary:
Avoids tons of prologue boilerplate when arguments are passed in memory
and left in memory. This can happen in a debug build or in a release
build when an argument alloca is escaped.  This will dramatically affect
the code size of x86 debug builds, because X86 fast isel doesn't handle
arguments passed in memory at all. It only handles the x86_64 case of up
to 6 basic register parameters.

This is implemented by analyzing the entry block before ISel to identify
copy elision candidates. A copy elision candidate is an argument that is
used to fully initialize an alloca before any other possibly escaping
uses of that alloca. If an argument is a copy elision candidate, we set
a flag on the InputArg. If the the target generates loads from a fixed
stack object that matches the size and alignment requirements of the
alloca, the SelectionDAG builder will delete the stack object created
for the alloca and replace it with the fixed stack object. The load is
left behind to satisfy any remaining uses of the argument value. The
store is now dead and is therefore elided. The fixed stack object is
also marked as mutable, as it may now be modified by the user, and it
would be invalid to rematerialize the initial load from it.

Supersedes D28388

Fixes PR26328

Reviewers: chandlerc, MatzeB, qcolombet, inglorion, hans

Subscribers: igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D29668

llvm-svn: 296683
2017-03-01 21:42:00 +00:00
Adam Nemet
6702f17f53 New tool: opt-stats.py
I am planning to use this tool to find too noisy (missed) optimization
remarks.  Long term it may actually be better to just have another tool that
exports the remarks into an sqlite database and perform queries like this in
SQL.

This splits out the YAML parsing from opt-viewer.py into a new Python module
optrecord.py.

This is the result of the script on the LLVM testsuite:

Total number of remarks        714433

Top 10 remarks by pass:
  inline                         52%
  gvn                            24%
  licm                           13%
  loop-vectorize                  5%
  asm-printer                     3%
  loop-unroll                     1%
  regalloc                        1%
  inline-cost                     0%
  slp-vectorizer                  0%
  loop-delete                     0%

Top 10 remarks:
  gvn/LoadClobbered              20%
  inline/Inlined                 19%
  inline/CanBeInlined            18%
  inline/NoDefinition             9%
  licm/LoadWithLoopInvariantAddressInvalidated  6%
  licm/Hoisted                    6%
  asm-printer/InstructionCount    3%
  inline/TooCostly                3%
  gvn/LoadElim                    3%
  loop-vectorize/MissedDetails    2%

Beside some refactoring, I also changed optrecords not to use context to
access global data (max_hotness).  Because of the separate module this would
have required splitting context into two.  However it's not possible to access
the optrecord context from the SourceFileRenderer when calling back to
Remark.RelativeHotness.

llvm-svn: 296682
2017-03-01 21:35:00 +00:00
Zachary Turner
fac433a455 Re-enable BinaryStreamTest.StreamReaderObject.
This was failing because I was using memcmp to compare two
objects that included padding bytes, which were uninitialized.

llvm-svn: 296681
2017-03-01 21:30:06 +00:00
Craig Topper
860d21dff8 [APInt] Optimize APInt creation from uint64_t
Summary:
This patch moves the clearUnusedBits calls into the two different initialization paths for APInt from a uint64_t. This allows the compiler to better optimize the clearing of the unused bits for the single word case. And it puts the clearing for the multi word case into the initSlowCase function to save code. In the common case of initializing with 0 this allows the clearing to be completely optimized out for the single word case.

On my local x86 build this is showing a ~45kb reduction in the size of the opt binary.

Reviewers: RKSimon, hans, majnemer, davide, MatzeB

Reviewed By: hans

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30486

llvm-svn: 296677
2017-03-01 21:06:18 +00:00
Matthias Braun
82f79045b6 LIU::Query: Remove unused getter; NFC
llvm-svn: 296676
2017-03-01 21:02:56 +00:00
Matthias Braun
f1005cf89b LIU::Query: Remove always false member+getter; NFC
llvm-svn: 296675
2017-03-01 21:02:52 +00:00
Matthias Braun
8f086cb3df LiveIntervalUnion: Remove unused functions; NFC
Remove two unused functions that are in fact bad API and should not be
called anyway.

llvm-svn: 296674
2017-03-01 21:02:47 +00:00
Sanjay Patel
c6f8fa463c [InstCombine] use -instnamer and auto-generate complete checks; NFC
llvm-svn: 296673
2017-03-01 20:59:56 +00:00
Zachary Turner
3023830ec8 Disable BinaryStreamTest.StreamReaderObject.
llvm-svn: 296672
2017-03-01 20:58:28 +00:00
Sanjay Patel
78c43c1fcf [x86] add vector tests for more coverage of D30502; NFC
llvm-svn: 296671
2017-03-01 20:31:23 +00:00
Nemanja Ivanovic
ec52f511b1 Improve scheduling with branch coalescing
This patch adds a MachineSSA pass that coalesces blocks that branch
on the same condition.

Committing on behalf of Lei Huang.

Differential Revision: https://reviews.llvm.org/D28249

llvm-svn: 296670
2017-03-01 20:29:34 +00:00
Nirav Dave
6c2364a373 [DAG] Prevent Stale nodes from entering worklist
Add check that deleted nodes do not get added to worklist. This can
occur when a node's operand is simplified to an existing node.

This fixes PR32108.

Reviewers: jyknight, hfinkel, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30506

llvm-svn: 296668
2017-03-01 20:19:38 +00:00
Nirav Dave
eaddb9d459 Add test cases for merging stores of multiply used stores
llvm-svn: 296667
2017-03-01 20:18:14 +00:00
Krzysztof Parzyszek
66051b01ce [RDF] Replace {} with explicit constructor, since not all compilers like it
llvm-svn: 296666
2017-03-01 19:59:28 +00:00
Daniel Berlin
dac375e8a0 NewGVN: Add debug counter for value numbering
llvm-svn: 296665
2017-03-01 19:59:26 +00:00
Paul Robinson
970f809bcd [DWARF] Print leading zeros in type signature
llvm-svn: 296663
2017-03-01 19:43:29 +00:00
Krzysztof Parzyszek
c5cb06d0d7 [RDF] Add recursion limit to getAllReachingDefsRec
For large programs this function can take significant amounts of time.
Let it abort gracefully when the program is too complex.

llvm-svn: 296662
2017-03-01 19:30:42 +00:00
Zachary Turner
5e0cdce6b7 [PDB] Fix and re-enable BinaryStreamArray test.
This was due to the test stream choosing an arbitrary partition
index for introducing the discontinuity rather than choosing
an index that would be correctly aligned for the type of data.

Also added an assertion into FixedStreamArray so that this will
be caught on all bots in the future, and not just the UBSan bot.

llvm-svn: 296661
2017-03-01 19:29:11 +00:00
Paul Robinson
e5104b06a3 Reorder fields for better packing. (NFC)
llvm-svn: 296660
2017-03-01 19:26:41 +00:00
Bob Haarman
a41054c917 enable building with LTO on Windows using clang-cl and lld
Summary: With clang-cl gaining support for link-time optimization, we can now enable builds using LTO when using clang-cl and lld on Windows. To do this, we must not pass the -flto flag to the linker; lld-link does not understand it, but will perform LTO automatically when it encounters bitcode files. We also don't pass /Brepro when using LTO - the compiler doesn't generate object files for LTO, so passing the flag would only result in a warning about it being unused.

Reviewers: rnk, ruiu, hans

Reviewed By: hans

Subscribers: mgorny, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D30240

llvm-svn: 296658
2017-03-01 19:22:18 +00:00
Paul Robinson
7f74b62b4c Alphabetize some cases (NFC)
llvm-svn: 296655
2017-03-01 19:01:47 +00:00
Hans Wennborg
358597d3c9 Revert r296575 "[SLP] Fixes the bug due to absence of in order uses of scalars which needs to be available"
It caused miscompiles, e.g. in Chromium (PR32109).

llvm-svn: 296654
2017-03-01 18:57:16 +00:00
Paul Robinson
3386628429 [DWARF] Default lower bound should respect requested DWARF version.
DWARF may define a default lower-bound for arrays in languages defined
in a particular DWARF version.  But the logic to suppress an
unnecessary lower-bound attribute was looking at the hard-coded
default DWARF version, not the version that had been requested.

Also updated the list with all languages defined in DWARF v5.

Differential Revision: http://reviews.llvm.org/D30484

llvm-svn: 296652
2017-03-01 18:32:37 +00:00
Artur Pilipenko
3d8bddc9b0 [DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine
Resubmit r295336 after the bug with non-zero offset patterns on BE targets is fixed (r296336).

Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patters.

Reviewed By: filcab

Differential Revision: https://reviews.llvm.org/D29591

llvm-svn: 296651
2017-03-01 18:12:29 +00:00
Krzysztof Parzyszek
0ac992393c [Hexagon] Fix testcase accidentally broken by r296645
llvm-svn: 296647
2017-03-01 17:53:42 +00:00
Krzysztof Parzyszek
fc4d778298 [Hexagon] Fix lowering of formal arguments of type i1
On Hexagon, values of type i1 are passed in registers of type i32,
even though i1 is not a legal value for these registers. This is a
special case and needs special handling to maintain consistency of
the lowering information.

This fixes PR32089.

llvm-svn: 296645
2017-03-01 17:30:10 +00:00