1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00
Commit Graph

141973 Commits

Author SHA1 Message Date
Dehao Chen
a322040bdf revert r289669 which breaks bots
llvm-svn: 289676
2016-12-14 17:23:16 +00:00
Yaxun Liu
98de4b3c84 AMDGPU: Emit runtime metadata version 2 as YAML
Differential Revision: https://reviews.llvm.org/D25046

llvm-svn: 289674
2016-12-14 17:16:52 +00:00
Derek Schuff
faf0eebd7e lit.cfg: Check value of build config rather than converting to boolean
This is a CMake var which never evaluates to false.

llvm-svn: 289673
2016-12-14 17:05:34 +00:00
Matt Arsenault
6bc4304e35 AMDGPU: Make AllocationPriority of SGPRs higher than VGPRs
Since SGPRs should spill to VGPRs, they should be allocated first.
I don't think this is sufficient for SGPRs to always spill to
VGPRs though.

llvm-svn: 289671
2016-12-14 16:52:06 +00:00
Dehao Chen
3565f45b60 Create SampleProfileLoader pass in llvm instead of clang
Summary: We used to create SampleProfileLoader pass in clang. This makes LTO/ThinLTO unable to add this pass in the linker plugin. This patch moves the SampleProfileLoader pass creation from clang to llvm pass manager builder.

Reviewers: tejohnson, davidxl, dnovillo

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D27743

llvm-svn: 289669
2016-12-14 16:49:28 +00:00
Nirav Dave
9fd3ae9cf9 Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
Reverting due to ARM MCJIT and MIPS LLD error.

This reverts commit r289659.

llvm-svn: 289667
2016-12-14 16:43:44 +00:00
Matt Arsenault
c74ace61e1 AMDGPU: Change vintrp printing
llvm-svn: 289664
2016-12-14 16:36:12 +00:00
Derek Schuff
8d68fc19a9 Revert gold part of change, just liblto
llvm-svn: 289663
2016-12-14 16:20:25 +00:00
Derek Schuff
25d8d3aa71 Disable libLTO tests when libLTO is not built
Summary:
The current test only checks whether ld64 is available, causing tests
to fail when ld64 is avilable but libLTO is not built.

Reviewers: beanz, mehdi_amini

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D27739

llvm-svn: 289662
2016-12-14 16:20:22 +00:00
Robert Lougher
2012251e73 New API for merging debug locations. NFC.
Given two debug locations the function getMergedLocation combines the
locations into a single location (which may be an empty location).
Please see https://reviews.llvm.org/D26256 for the discussion leading
up to this API.

Note the function is currently a stub.  This allows optimisations to
use the API although no location will actually be used.

This is patch 1 out of 8 for D26256.  As suggested by David Blaikie,
each change in D26256 has been broken out into a separate patch.

llvm-svn: 289661
2016-12-14 16:14:17 +00:00
Nirav Dave
afe2eccae3 In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Retrying after fixing after removing load-store factoring through
token factors in favor of improved token factor operand pruning

Simplify Consecutive Merge Store Candidate Search

Now that address aliasing is much less conservative, push through
simplified store merging search which only checks for parallel stores
through the chain subgraph. This is cleaner as the separation of
non-interfering loads/stores from the store-merging logic.

Whem merging stores, search up the chain through a single load, and
finds all possible stores by looking down from through a load and a
TokenFactor to all stores visited. This improves the quality of the
output SelectionDAG and generally the output CodeGen (with some
exceptions).

Additional Minor Changes:

   1. Finishes removing unused AliasLoad code
   2. Unifies the the chain aggregation in the merged stores across
      code paths
   3. Re-add the Store node to the worklist after calling
      SimplifyDemandedBits.
   4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
      arbitrary, but seemed sufficient to not cause regressions in
      tests.

This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.

Many tests required some changes as memory operations are now
reorderable. Some tests relying on the order were changed to use
volatile memory operations

Noteworthy tests:

    CodeGen/AArch64/argument-blocks.ll -
      It's not entirely clear what the test_varargs_stackalign test is
      supposed to be asserting, but the new code looks right.

    CodeGen/AArch64/arm64-memset-inline.lli -
    CodeGen/AArch64/arm64-stur.ll -
    CodeGen/ARM/memset-inline.ll -

      The backend now generates *worse* code due to store merging
      succeeding, as we do do a 16-byte constant-zero store efficiently.

    CodeGen/AArch64/merge-store.ll -
      Improved, but there still seems to be an extraneous vector insert
      from an element to itself?

    CodeGen/PowerPC/ppc64-align-long-double.ll -
      Worse code emitted in this case, due to the improved store->load
      forwarding.

    CodeGen/X86/dag-merge-fast-accesses.ll -
    CodeGen/X86/MergeConsecutiveStores.ll -
    CodeGen/X86/stores-merging.ll -
    CodeGen/Mips/load-store-left-right.ll -
      Restored correct merging of non-aligned stores

    CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll -
      Improved. Correctly merges buffer_store_dword calls

    CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll -
      Improved. Sidesteps loading a stored value and
      merges two stores

    CodeGen/X86/pr18023.ll -
      This test has been removed, as it was asserting incorrect
      behavior. Non-volatile stores *CAN* be moved past volatile loads,
      and now are.

    CodeGen/X86/vector-idiv.ll -
    CodeGen/X86/vector-lzcnt-128.ll -
      It's basically impossible to tell what these tests are actually
      testing. But, looks like the code got better due to the memory
      operations being recognized as non-aliasing.

    CodeGen/X86/win32-eh.ll -
      Both loads of the securitycookie are now merged.

Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel

Differential Revision: https://reviews.llvm.org/D14834

llvm-svn: 289659
2016-12-14 15:44:26 +00:00
Simon Pilgrim
5b2a6b4bc7 Wdocumentation fix
llvm-svn: 289655
2016-12-14 15:14:44 +00:00
Simon Pilgrim
f05854a8a9 [DAGCombiner] Try to use SelectionDAG::isKnownToBeAPowerOfTwo instead of just APInt::isPowerOf2
Generalize sdiv/udiv/srem/urem combines using APInt::isPowerOf2, which only works for const/splat-const values, to call SelectionDAG::isKnownToBeAPowerOfTwo instead which recognises many more cases.

Added a DAGCombiner::BuildLogBase2 helper since PowerOf2 combines often involve taking the log2 of such a value.

Differential Revision: https://reviews.llvm.org/D27714

llvm-svn: 289654
2016-12-14 15:08:13 +00:00
Michael Zuckerman
2aaf4b3071 Fix bug 30945- [AVX512] Failure to flip vector comparison to remove not mask instruction
adding new optimization opportunity by adding new X86ISelLowering pattern. The test case was shown in https://llvm.org/bugs/show_bug.cgi?id=30945.

Test explanation:
Select gets three arguments mask, op and op2. In this case, the Mask is a result of ICMP. The ICMP instruction compares (with equal operand) the zero initializer vector and the result of the first ICMP.

In general, The result of "cmp eq, op1, zero initializers" is "not(op1)" where op1 is a mask. By rearranging of the two arguments inside the Select instruction, we can get the same result. Without the necessary of the middle phase ("cmp eq, op1, zero initializers").

Missed optimization opportunity: 
vpcmpled %zmm0, %zmm1, %k0
knotw %k0, %k1

can be combine to 
vpcmpgtd %zmm0, %zmm2, %k1

Reviewers: 
1. delena
2. igorb 

Commited after check all 
Differential Revision: https://reviews.llvm.org/D27160

llvm-svn: 289653
2016-12-14 14:57:10 +00:00
Simon Pilgrim
e235162ad6 [X86][SSE] Add AVX1 tests to sdiv/udiv srem/urem combine tests
As requested on D27714

llvm-svn: 289652
2016-12-14 14:39:51 +00:00
Renato Golin
8d9c51307d Revert "[AVR] Add the very first on-target test"
This reverts commit r289648, as it's an execution test and relies on the
emulator/dispatcher being available on all builders.

llvm-svn: 289651
2016-12-14 13:24:20 +00:00
Stephan Bergmann
760305752a Adapt to recent APFloat change
llvm-svn: 289649
2016-12-14 12:11:35 +00:00
Dylan McKay
ebd67f8b91 [AVR] Add the very first on-target test
This test runs on actual AVR hardware.

llvm-svn: 289648
2016-12-14 12:03:39 +00:00
Stephan Bergmann
aba15d97df Replace APFloatBase static fltSemantics data members with getter functions
At least the plugin used by the LibreOffice build
(<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly
uses those members (through inline functions in LLVM/Clang include files in turn
using them), but they are not exported by utils/extract_symbols.py on Windows,
and accessing data across DLL/EXE boundaries on Windows is generally
problematic.

Differential Revision: https://reviews.llvm.org/D26671

llvm-svn: 289647
2016-12-14 11:57:17 +00:00
Artur Pilipenko
6633f605fc Add a couple of assertions to the load combine code introduced by r289538
llvm-svn: 289646
2016-12-14 11:55:47 +00:00
Dylan McKay
39c862ae4e [AVR] Add the integrated testing tool to the .gitignore
We build it as an LLVM tool.

llvm-svn: 289645
2016-12-14 11:47:14 +00:00
Oliver Stannard
f0a90b93e6 [Assembler] Better error messages for .org directive
Currently, the error messages we emit for the .org directive when the
expression is not absolute or is out of range do not include the line
number of the directive, so it can be hard to track down the problem if
a file contains many .org directives.

This patch stores the source location in the MCOrgFragment, so that it
can be used for diagnostics emitted during layout.

Since layout is an iterative process, and the errors are detected during
each iteration, it would have been possible for errors to be reported
multiple times. To prevent this, I've made the assembler bail out after
each iteration if any errors have been reported. This will still allow
multiple unrelated errors to be reported in the common case where they
are all detected in the first round of layout.

Differential Revision: https://reviews.llvm.org/D27411

llvm-svn: 289643
2016-12-14 10:43:58 +00:00
Dylan McKay
fea5788d65 [AVR] Add a function instrumentation pass
This will be used for an on-chip test suite.

llvm-svn: 289641
2016-12-14 10:15:00 +00:00
Craig Topper
a7a943094c [X86][InstCombine] Handle demanded elements for operand of AVX-512 scalar floating point to integer conversion intrinsics.
llvm-svn: 289639
2016-12-14 07:46:12 +00:00
Hal Finkel
972f16ac78 [PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility)
This change aims to unify and correct our logic for when we need to allow for
the possibility of the linker adding a TOC restoration instruction after a
call. This comes up in two contexts:

 1. When determining tail-call eligibility. If we make a tail call (i.e.
    directly branch to a function) then there is no place for the linker to add
    a TOC restoration.
 2. When determining when we need to add a nop instruction after a call.
    Likewise, if there is a possibility that the linker might need to add a
    TOC restoration after a call, then we need to put a nop after the call
    (the bl instruction).

First problem: We were using similar, but different, logic to decide (1) and
(2). This is just wrong. Both the resideInSameModule function (used when
determining tail-call eligibility) and the isLocalCall function (used when
deciding if the post-call nop is needed) were supposed to be determining the
same underlying fact (i.e. might a TOC restoration be needed after the call).
The same logic should be used in both places.

Second problem: The logic in both places was wrong. We only know that two
functions will share the same TOC when both functions come from the same
section of the same object. Otherwise the linker might cause the functions to
use different TOC base addresses (unless the multi-TOC linker option is
disabled, in which case only shared-library boundaries are relevant). There are
a number of factors that can cause functions to be placed in different sections
or come from different objects (-ffunction-sections, explicitly-specified
section names, COMDAT, weak linkage, etc.). All of these need to be checked.
The existing logic only checked properties of the callee, but the properties of
the caller must also be checked (for example, calling from a function in a
COMDAT section means calling between sections).

There was a conceptual error in the resideInSameModule function in that it
allowed tail calls to functions with weak linkage and protected/hidden
visibility. While protected/hidden visibility does prevent the function
implementation from being replaced at runtime (via interposition), it does not
prevent the linker from using an alternate implementation at link time (i.e.
using some strong definition to replace the provided weak one during linking).
If this happens, then we're still potentially looking at a required TOC
restoration upon return.

Otherwise, in general, the post-call nop is needed wherever ELF interposition
needs to be supported. We don't currently support ELF interposition at the IR
level (see http://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html
for more information), and I don't think we should try to make it appear to
work in the backend in spite of that fact. This will yield subtle bugs if
interposition is attempted. As a result, regardless of whether we're in PIC
mode, we don't assume that we need to add the nop to support the possibility of
ELF interposition. However, the necessary check is in place (i.e. calling
GV->isInterposable and TM.shouldAssumeDSOLocal) so when we have functions for
which interposition is allowed at the IR level, we'll add the nop as necessary.
In the mean time, we'll generate more tail calls and fewer nops when compiling
position-independent code.

Differential Revision: https://reviews.llvm.org/D27231

llvm-svn: 289638
2016-12-14 07:24:50 +00:00
Craig Topper
402d6bf13b [X86][InstCombine] Teach SimplifyDemandedVectorElts to handle masked scalar add/sub/mul/div/max/min intrinsics better.
Now we can remove these intrinsics if element 0 isn't used. Also fix undef element tracking.

llvm-svn: 289636
2016-12-14 06:06:58 +00:00
Craig Topper
b9bbf2793c [X86][InstCombine] Handle scalar fmadd intrinsics correctly in SimplifyDemandedVectorElts.
Now we pass a modified version of DemandedElts to each operand and we calculate undef elts correctly.

llvm-svn: 289632
2016-12-14 05:43:05 +00:00
Mehdi Amini
1c61c19903 [ThinLTO] Add an API to trigger file-based API for returning objects to the linker
Summary:
The motivation is to support better the -object_path_lto option on
Darwin. The linker needs to write down the generate object files on
disk for later use by lldb or dsymutil (debug info are not present
in the final binary). We're moving this into libLTO so that we can
be smarter when a cache is enabled and hard-link when possible
instead of duplicating the files.

Reviewers: tejohnson, deadalnix, pcc

Subscribers: dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D27507

llvm-svn: 289631
2016-12-14 04:56:42 +00:00
Craig Topper
02f77b59fa [X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar round intrinsics more correctly.
Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused. Similarly we clear bit 0 for optimizing operand 0.

Also calculate UndefElts correctly.

Simplify InstCombineCalls for these instrinics to just call SimplifyDemandedVectorElts for the call instrution to reuse this support.

llvm-svn: 289629
2016-12-14 03:17:30 +00:00
Craig Topper
2678dd3b0f [X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar min/max/cmp intrinsics more correctly.
Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused.

Also calculate UndefElts correctly.

Simplify InstCombineCalls for these instrinics to just call SimplifyDemandedVectorElts for the call instrution to reuse this support.

llvm-svn: 289628
2016-12-14 03:17:27 +00:00
Mehdi Amini
cef3c53d65 Don't double-initialize cl::opt for iterating in reverse order to uncover non-determinism in codegen by default
Bots are broken and needs to be fixed before having this on by default.
The feature was committed in r289619.

I tried to disable it in r289624 and failed because it was initialized in two places.

llvm-svn: 289626
2016-12-14 02:35:32 +00:00
Mehdi Amini
70b653afcb Disable Iterating SmallPtrSet in reverse order to uncover non-determinism in codegen by default
Bots are broken and needs to be fixed before having this on by default.
The feature was committed in r289619.

llvm-svn: 289624
2016-12-14 02:02:28 +00:00
Kostya Serebryany
f2bc70c266 [libFuzzer] document one more desired feature of a fuzz target
llvm-svn: 289622
2016-12-14 01:31:21 +00:00
Peter Collingbourne
3b1ff75ab8 LTO: Add support for multi-module bitcode files.
Differential Revision: https://reviews.llvm.org/D27313

llvm-svn: 289621
2016-12-14 01:17:59 +00:00
Paul Robinson
fee2a350dc [DWARF] Preserve column number when emitting 'line 0' record
Follow-up to r289256, address a FIXME to avoid resetting the column
number. This reduced .debug_line by 2.6% in a RelWithDebInfo
self-build of clang.

llvm-svn: 289620
2016-12-14 00:27:35 +00:00
Mandeep Singh Grang
f3d88aa2be [llvm] Iterate SmallPtrSet in reverse order to uncover non-determinism in codegen
Summary:
Given a flag (-mllvm -reverse-iterate) this patch will enable iteration of SmallPtrSet in reverse order.
The idea is to compile the same source with and without this flag and expect the code to not change.
If there is a difference in codegen then it would mean that the codegen is sensitive to the iteration order of SmallPtrSet.
This is enabled only with LLVM_ENABLE_ABI_BREAKING_CHECKS.

Reviewers: chandlerc, dexonsmith, mehdi_amini

Subscribers: mgorny, emaste, llvm-commits

Differential Revision: https://reviews.llvm.org/D26718

llvm-svn: 289619
2016-12-14 00:15:57 +00:00
Evandro Menezes
f2e582c435 [ARM] Fix typo in checking prefix
llvm-svn: 289617
2016-12-14 00:02:03 +00:00
Evandro Menezes
d6a1cb395c Add support for Samsung Exynos M3 (NFC)
llvm-svn: 289613
2016-12-13 23:31:41 +00:00
Greg Clayton
b894e3a2a3 Update the header docs to match a recent checkin.
llvm-svn: 289612
2016-12-13 23:22:53 +00:00
Greg Clayton
0b5868cb61 Switch functions that returned bool and filled in a DWARFFormValue arg with ones that return Optional<DWARFFormValue>
Differential Revision: https://reviews.llvm.org/D27737

llvm-svn: 289611
2016-12-13 23:20:56 +00:00
Peter Collingbourne
592f0dc4a2 llvm-cat: Allow bitcode files to be created with no modules.
llvm-svn: 289610
2016-12-13 23:14:55 +00:00
Chris Bieneman
6d8e2f21ad [llvm-config] Fixing one check where shared libs implied dylib
We shouldn't print the dylib if LinkDylib is false.

llvm-svn: 289609
2016-12-13 23:08:52 +00:00
Derek Schuff
d4154aff29 llvm-config: Set LinkMode in addition to LinkDyLib when using --ignore-llvm
Summary:
LinkDyLib is only used (before arg processing) to set up the default for
LinkMode. So reset LinkMode as well, and process before --link-shared or
--link-static to allow those flags to continue to override it.

Reviewers: beanz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27736

llvm-svn: 289608
2016-12-13 23:01:53 +00:00
Kostya Serebryany
651d599e40 [libFuzzer] fix an UB (invalid shift) spotted by ubsan. The code worked fine by luck, because the way shifts actually work on clang+x86
llvm-svn: 289607
2016-12-13 22:49:14 +00:00
Chris Bieneman
4bed4ed8ab [llvm-config] Add --ignore-libllvm
This flag forces off linking libLLVM. This should resolve some issues reported on llvm-commits.

llvm-svn: 289605
2016-12-13 22:17:59 +00:00
Eugene Zelenko
271b3b84f5 [Hexagon] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 289604
2016-12-13 22:13:50 +00:00
Dehao Chen
4a5c82b02b Change CoverageTracker from a global variable to member variable to avoid breaking thread-safety. (NFC)
llvm-svn: 289603
2016-12-13 22:13:18 +00:00
Sanjoy Das
eb84cf6c51 Re-land "[SCEVExpander] Use llvm data structures; NFC"
This change re-lands r289215, by reverting r289482.  The underlying
issue that caused it to be reverted has been fixed by Tim Northover in
r289496.

Original commit message for r289215:

[SCEVExpander] Use llvm data structures; NFC

Original commit message for r289482:

Revert "[SCEVExpander] Use llvm data structures; NFC"

This reverts r289215 (git SHA1 cb7b86a1).  It breaks the ubsan build
because a DenseMap that keys off of `AssertingVH<T>` will hit UB when it
tries to cast the empty and tombstone keys to `T *` (due to insufficient
alignment).

This is the relevant stack trace (thanks to Mike Aizatsky):

    #0 0x25cf100 in llvm::AssertingVH<llvm::PHINode>::getValPtr() const llvm/include/llvm/IR/ValueHandle.h:212:39
    #1 0x25cea20 in llvm::AssertingVH<llvm::PHINode>::operator=(llvm::AssertingVH<llvm::PHINode> const&) llvm/include/llvm/IR/ValueHandle.h:234:19
    #2 0x25d0092 in llvm::DenseMapBase<llvm::DenseMap<llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >, llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >::clear() llvm/include/llvm/ADT/DenseMap.h:113:23

llvm-svn: 289602
2016-12-13 22:04:58 +00:00
Anna Thomas
d7d47f4ccb [IRCE] Avoid loop optimizations on pre and post loops
Summary:
This patch will add loop metadata on the pre and post loops generated by IRCE.
Currently, we have metadata for disabling optimizations such as vectorization,
unrolling, loop distribution and LICM versioning (and confirmed that these
optimizations check for the metadata before proceeding with the transformation).

The pre and post loops generated by IRCE need not go through loop opts (since
these are slow paths).

Added two test cases as well.

Reviewers: sanjoy, reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26806

llvm-svn: 289588
2016-12-13 21:05:21 +00:00
Michael Kuperstein
2c0f3dbd6f [LV] Don't vectorize when we have a small static bound on trip count
We currently check if the exact trip count is known and is smaller than the
"tiny loop" bound. We should be checking the maximum bound on the trip count
instead.

Differential Revision: https://reviews.llvm.org/D27690

llvm-svn: 289583
2016-12-13 20:38:18 +00:00