1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

29561 Commits

Author SHA1 Message Date
Jun Bum Lim
296a001019 [JumpThread] Enhance finding partial redundant loads by continuing scanning single predecessor
Summary: While scanning predecessors to find an available loaded value, if the predecessor has a single predecessor, we can continue scanning through the single predecessor.

Reviewers: mcrosier, rengolin, reames, davidxl, haicheng

Reviewed By: rengolin

Subscribers: zzheng, llvm-commits

Differential Revision: https://reviews.llvm.org/D29200

llvm-svn: 293896
2017-02-02 15:12:34 +00:00
Nirav Dave
d4909b474b In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Recommiting after fixing X86 inc/dec chain bug.

    * Simplify Consecutive Merge Store Candidate Search

    Now that address aliasing is much less conservative, push through
    simplified store merging search and chain alias analysis which only
    checks for parallel stores through the chain subgraph. This is cleaner
    as the separation of non-interfering loads/stores from the
    store-merging logic.

    When merging stores search up the chain through a single load, and
    finds all possible stores by looking down from through a load and a
    TokenFactor to all stores visited.

    This improves the quality of the output SelectionDAG and the output
    Codegen (save perhaps for some ARM cases where we correctly constructs
    wider loads, but then promotes them to float operations which appear
    but requires more expensive constant generation).

    Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

    Additional Minor Changes:

      1. Finishes removing unused AliasLoad code

      2. Unifies the chain aggregation in the merged stores across code
         paths

      3. Re-add the Store node to the worklist after calling
         SimplifyDemandedBits.

      4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
         arbitrary, but seems sufficient to not cause regressions in
         tests.

      5. Remove Chain dependencies of Memory operations on CopyfromReg
         nodes as these are captured by data dependence

      6. Forward loads-store values through tokenfactors containing
          {CopyToReg,CopyFromReg} Values.

      7. Peephole to convert buildvector of extract_vector_elt to
         extract_subvector if possible (see
         CodeGen/AArch64/store-merge.ll)

      8. Store merging for the ARM target is restricted to 32-bit as
         some in some contexts invalid 64-bit operations are being
         generated. This can be removed once appropriate checks are
         added.

    This finishes the change Matt Arsenault started in r246307 and
    jyknight's original patch.

    Many tests required some changes as memory operations are now
    reorderable, improving load-store forwarding. One test in
    particular is worth noting:

      CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
      forwarding converts a load-store pair into a parallel store and
      a memory-realized bitcast of the same value. However, because we
      lose the sharing of the explicit and implicit store values we
      must create another local store. A similar transformation
      happens before SelectionDAG as well.

    Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

llvm-svn: 293893
2017-02-02 14:39:42 +00:00
NAKAMURA Takumi
b8fd579c94 DIBuilder.h: Fix a typo. [-Wdocumentation]
llvm-svn: 293876
2017-02-02 09:55:22 +00:00
Adam Nemet
835438236b [LV] Also port failure remarks to new OptimizationRemarkEmitter API
llvm-svn: 293866
2017-02-02 05:41:51 +00:00
Omair Javaid
34390864ad Fix LLDB Android AArch64 GCC debug info build
Committing after fixing suggested changes and tested release/debug builds on 
x86_64-linux and arm/aarch64 builds.

Differential revision: https://reviews.llvm.org/D29042

llvm-svn: 293850
2017-02-02 01:17:49 +00:00
Rui Ueyama
e8d788b83b Re-submit r293820: Return Error instead of bool from mergeTypeStreams().
llvm-svn: 293847
2017-02-02 00:47:10 +00:00
Dehao Chen
efda70ec5a Change debug-info-for-profiling from a TargetOption to a function attribute.
Summary: LTO requires the debug-info-for-profiling to be a function attribute.

Reviewers: echristo, mehdi_amini, dblaikie, probinson, aprantl

Reviewed By: mehdi_amini, dblaikie, aprantl

Subscribers: aprantl, probinson, ahatanak, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D29203

llvm-svn: 293833
2017-02-01 22:45:09 +00:00
Rui Ueyama
5084390489 Revert r293820: Return Error instead of bool from mergeTypeStreams().
It broke buildbots.

llvm-svn: 293824
2017-02-01 22:28:43 +00:00
Rui Ueyama
5eb47df814 Return Error instead of bool from mergeTypeStreams().
Previously, mergeTypeStreams returns only true or false, so it was
impossible to know the reason if it failed. This patch changes the
function signature so that it returns an Error object.

Differential Revision: https://reviews.llvm.org/D29362

llvm-svn: 293820
2017-02-01 22:09:34 +00:00
Zachary Turner
bef0faee96 [pdb] Add a new command for analyzing hash collisions.
This introduces the `analyze` subcommand.  For now there is only
one option, to analyze hash collisions in the type streams.  In
the future, however, we could add many more things here, such
as performing size analyses, compacting, and statistics about
the type of records etc.

llvm-svn: 293795
2017-02-01 18:30:22 +00:00
Matthew Simpson
1981a68b4b [LV] Move interleaved access helper functions to VectorUtils (NFC)
This patch moves some helper functions related to interleaved access
vectorization out of LoopVectorize.cpp and into VectorUtils.cpp. We would like
to use these functions in a follow-on patch that improves interleaved load and
store lowering in (ARM/AArch64)ISelLowering.cpp. One of the functions was
already duplicated there and has been removed.

Differential Revision: https://reviews.llvm.org/D29398

llvm-svn: 293788
2017-02-01 17:45:46 +00:00
Javed Absar
4fcf0cfd20 [ARM] Enable Cortex-M23 and Cortex-M33 support.
Add both cores to the target parser and TableGen. Test that eabi
attributes are set correctly for both cores. Additionally, test the
absence and presence of MOVT in Cortex-M23 and Cortex-M33, respectively.

Committed on behalf of Sanne Wouda.
Reviewers : rengolin, olista01.

Differential Revision: https://reviews.llvm.org/D29073

llvm-svn: 293761
2017-02-01 11:55:03 +00:00
Evandro Menezes
f3c547dbf4 [CodeGen] Move MacroFusion to the target
This patch moves the class for scheduling adjacent instructions,
MacroFusion, to the target.

In AArch64, it also expands the fusion to all instructions pairs in a
scheduling block, beyond just among the predecessors of the branch at the
end.

Differential revision: https://reviews.llvm.org/D28489

llvm-svn: 293737
2017-02-01 02:54:34 +00:00
Dean Michael Berris
c91cd5f47b [XRay] Define the InstrumentationMap type
Summary:
This change implements the instrumentation map loading library which can
understand both YAML-defined instrumentation maps, and ELF 64-bit object
files that have the XRay instrumentation map section. We break it out
into a library on its own to allow for other applications to deal with
the XRay instrumentation map defined in XRay-instrumented binaries.

This type provides both raw access to the logical representation of the
instrumentation map entries as well as higher level functions for
converting a function ID into a function address.

At this point we only support ELF64 binaries and YAML-defined XRay
instrumentation maps. Future changes should extend this to support
32-bit ELF binaries, as well as other binary formats (like MachO).

As part of this change we also migrate all uses of the extraction logic
that used to be defined in tools/llvm-xray/ to use this new type and
interface for loading from files. We also remove the flag from the
`llvm-xray` tool that required users to specify the type of the
instrumentation map file being provided to instead make the library
auto-detect the file type.

Reviewers: dblaikie

Subscribers: mgorny, varno, llvm-commits

Differential Revision: https://reviews.llvm.org/D29319

llvm-svn: 293721
2017-02-01 00:05:29 +00:00
David Blaikie
c372dbadfb Add a verbose/human readable mode to llvm-symbolizer to investigate discriminators and other line table/backtrace features
Patch by Simon Que!

Differential Revision: https://reviews.llvm.org/D29094

llvm-svn: 293697
2017-01-31 22:19:38 +00:00
Daniel Berlin
00d455eb0e ScopedHashTable lookup should be const
llvm-svn: 293695
2017-01-31 22:01:08 +00:00
Matthew Simpson
24741d447a Fix VectorUtils include guard name (NFC)
VectorUtils was moved to Analysis from Transforms/Utils, but some comments and
the include guard name still reflect its old location.

llvm-svn: 293684
2017-01-31 20:29:10 +00:00
Tim Northover
43a4511ddc GlobalISel: merge invoke and call translation paths.
Well, sort of. But the lower-level code that invoke used to be using completely
botched the handling of varargs functions, which hopefully won't be possible if
they're using the same code.

llvm-svn: 293670
2017-01-31 18:36:11 +00:00
Peter Collingbourne
aa7dd32bfa MC: Introduce the ABS8 symbol modifier.
@ABS8 can be applied to symbols which appear as immediate operands to
instructions that have a 8-bit immediate form for that operand. It causes
the assembler to use the 8-bit form and an 8-bit relocation (e.g. R_386_8
or R_X86_64_8) for the symbol.

Differential Revision: https://reviews.llvm.org/D28688

llvm-svn: 293667
2017-01-31 18:28:44 +00:00
Nirav Dave
61036c7be7 [X86] Implement -mfentry
Summary: Insert calls to __fentry__ at function entry.

Reviewers: hfinkel, craig.topper

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D28000

llvm-svn: 293648
2017-01-31 17:00:27 +00:00
Sam Parker
12536f21d1 [ARM] Avoid using ARM instructions in Thumb mode
The Requires class overrides the target requirements of an instruction,
rather than adding to them, so all ARM instructions need to include the
IsARM predicate when they have overwitten requirements.

This caused the swp and swpb instructions to be allowed in thumb mode
assembly, and the ARM encoding of CDP to be selected in codegen (which
is different for conditional instructions).

Differential Revision: https://reviews.llvm.org/D29283

llvm-svn: 293634
2017-01-31 14:35:01 +00:00
Matt Arsenault
c230fcbb58 AMDGPU: Generalize matching of v_med3_f32
I think this is safe as long as no inputs are known to ever
be nans.

Also add an intrinsic for fmed3 to be able to handle all safe
math cases.

llvm-svn: 293598
2017-01-31 03:07:46 +00:00
Matt Arsenault
a92f7a0b6a NVPTX: Move InferAddressSpaces to generic code
llvm-svn: 293579
2017-01-31 01:10:58 +00:00
Reid Kleckner
9ef8b5626d Remove LLVM_CONFIG from config headers
It appears to be dead, and it needlessly caused me to rebuild all of
LLVM when I changed CMAKE_INSTALL_PREFIX.

llvm-svn: 293574
2017-01-31 00:34:23 +00:00
Derek Schuff
85b3f75e18 [WebAssembly] Add wasm support for llvm-readobj
Create a WasmDumper subclass of ObjDumper to support Webassembly binary
files.

Patch by Sam Clegg

Differential Revision: https://reviews.llvm.org/D27355

llvm-svn: 293569
2017-01-30 23:30:52 +00:00
Matt Arsenault
dd128e79e5 NVPTX: Refactor NVPTXInferAddressSpaces to check TTI
Add a new TTI hook for getting the generic address space value.

llvm-svn: 293563
2017-01-30 23:02:12 +00:00
Dehao Chen
da8eca39f5 Expose isLegalToPromot as a global helper function so that SamplePGO pass can call it for legality check.
Summary: SamplePGO needs to check if it is legal to promote a target before it actually promotes it.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29306

llvm-svn: 293559
2017-01-30 22:46:37 +00:00
Tim Northover
018849ece7 GlobalISel: translate memset & memmove.
llvm-svn: 293541
2017-01-30 19:33:07 +00:00
Justin Bogner
4634635ce4 SDAG: Update ChainNodesMatched during UpdateChains if a node is replaced
Previously, we would hit UB (or the ISD::DELETED_NODE assert) if we
happened to replace a node during UpdateChains, because it would be
left in the list we were iterating over. This nulls out the pointer
when that happens so that we can avoid the issue.

Fixes llvm.org/PR31710

llvm-svn: 293522
2017-01-30 18:29:46 +00:00
Benjamin Kramer
d087048e1f [Orc] Add missing include.
llvm-svn: 293511
2017-01-30 17:54:57 +00:00
David Blaikie
77e367a5a4 unique_ptrify some containers in GlobalISel::RegisterBankInfo
To simplify/clarify memory ownership, make leaks (as one was found/fixed
recently) harder to write, etc.

(also, while I was there - removed a duplicate lookup in a container)

llvm-svn: 293506
2017-01-30 17:13:56 +00:00
Rafael Espindola
74029e1a4e Bring back r293480. It is safe now.
Original message:

    Fix the values of two xcore ELF flags.

    The values in llvm grew from a pre-MC day when they would not show up
    in .o files and are outside of the SHF_MASKPROC.

    Fortunately the MC output is not currently used as xcore has its own
    assemble and that assembler uses valid values. This updates llvm to
    use the same values as the xmos assembler.

llvm-svn: 293486
2017-01-30 15:49:20 +00:00
Rafael Espindola
4b47b4e687 Only print architecture dependent flags for that architecture.
Different architectures can have different meaning for flags in the
SHF_MASKPROC mask, so we should always check what the architecture use
before checking the flag.

NFC for now, but will allow fixing the value of an xmos flag.

llvm-svn: 293484
2017-01-30 15:38:43 +00:00
Rafael Espindola
93b3f3ebf1 Revert "Fix the values of two xcore ELF flags."
This reverts commit r293480.

The patch is correct, but found bugs in other areas that need to be fixed.

llvm-svn: 293481
2017-01-30 14:39:48 +00:00
Rafael Espindola
51b0b3682c Fix the values of two xcore ELF flags.
The values in llvm grew from a pre-MC day when they would not show up
in .o files and are outside of the SHF_MASKPROC.

Fortunately the MC output is not currently used as xcore has its own
assemble and that assembler uses valid values. This updates llvm to
use the same values as the xmos assembler.

llvm-svn: 293480
2017-01-30 14:07:43 +00:00
Daniel Berlin
39077e1d87 Revert "[MemorySSA] Revert r293361 and r293363, as the tests fail under asan."
This reverts commit r293471, reapplying r293361 and r293363 with a fix
for an out-of-bounds read.

llvm-svn: 293474
2017-01-30 11:35:39 +00:00
Sam McCall
f48206b9c8 [MemorySSA] Revert r293361 and r293363, as the tests fail under asan.
llvm-svn: 293471
2017-01-30 09:19:50 +00:00
Kristof Beyls
942514e271 [GlobalISel] Add support for indirectbr
Differential Revision: https://reviews.llvm.org/D28079

llvm-svn: 293470
2017-01-30 09:13:18 +00:00
Matthias Braun
7adfbb9eae MachineInstr: Remove parameter from dump()
The primary use of the dump() functions in LLVM is for use in a
debugger. Unfortunately lldb does not seem to handle default arguments
so using `p SomeMI.dump()` fails and you have to type the longer `p
SomeMI.dump(nullptr)`. Remove the paramter to make the most common use
easy. (You can always construct something like `p
SomeMI.print(dbgs(),MyTII)` if you need more features).

Differential Revision: https://reviews.llvm.org/D29241

llvm-svn: 293440
2017-01-29 18:20:42 +00:00
Craig Topper
76620cef51 [SelectionDAG] Make SDNode::getConstantOperandVal an inline method.
It's operation already exists manually in many places without using the method.

llvm-svn: 293421
2017-01-29 06:08:02 +00:00
Lang Hames
d03dc789c1 [Orc][RPC] Have handleOne abandon pending responses upon channel failure.
llvm-svn: 293411
2017-01-29 04:25:16 +00:00
Lang Hames
666c80ef8b [Orc][RPC] Remove redundant braces. NFC.
llvm-svn: 293410
2017-01-29 04:09:01 +00:00
Xinliang David Li
88f1087cc7 Add support to dump dot graph block layout after MBP
Differential Revision: https://reviews.llvm.org/D29141

llvm-svn: 293408
2017-01-29 01:57:02 +00:00
David Majnemer
81a7274bf2 [Target] Add NoSignedZerosFPMath to the TargetOptions constructor
Most flags were already initialized by the TargetOptions constructor but
we missed out on one.  Also, simplify the constructor by using field
initializers when possible.

llvm-svn: 293406
2017-01-29 01:27:08 +00:00
Lang Hames
4b3e1d3723 [Orc][RPC] Remove a couple of redundant calls to abandonAllPendingResponses.
appendCallAsync, which all RPC call functions ultimately build on, will call
abandonAllPendingResponses on channel error. These extra calls are redundant.

llvm-svn: 293405
2017-01-29 00:51:17 +00:00
Mohammad Shahid
a2646e67e7 [SLP] Vectorize loads of consecutive memory accesses, accessed in non-consecutive (jumbled) way.
The jumbled scalar loads will be sorted while building the tree and these accesses will be marked to generate shufflevector after the vectorized load with proper mask.

Reviewers: hfinkel, mssimpso, mkuper

Differential Revision: https://reviews.llvm.org/D26905

Change-Id: I9c0c8e6f91a00076a7ee1465440a3f6ae092f7ad
llvm-svn: 293386
2017-01-28 17:59:44 +00:00
Arpith Chacko Jacob
82b0fc2a51 [NVPTX] Add intrinsics to support named barriers.
Support for barrier synchronization between a subset of threads
in a CTA through one of sixteen explicitly specified barriers.
These intrinsics are not directly exposed in CUDA but are
critical for forthcoming support of OpenMP on NVPTX GPUs.

The intrinsics allow the synchronization of an arbitrary
(multiple of 32) number of threads in a CTA at one of 16
distinct barriers. The two intrinsics added are as follows:

call void @llvm.nvvm.barrier.n(i32 10)
waits for all threads in a CTA to arrive at named barrier #10.

call void @llvm.nvvm.barrier(i32 15, i32 992)
waits for 992 threads in a CTA to arrive at barrier #15.

Detailed description of these intrinsics are available in the PTX manual.
http://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions

Reviewers: hfinkel, jlebar
Differential Revision: https://reviews.llvm.org/D17657

llvm-svn: 293384
2017-01-28 16:38:15 +00:00
Lang Hames
3b1a40837c [Orc][RPC] Unlock message send/receive locks on failure.
This fixes some destruction-of-locked-mutex errors in RawByteChannel.

llvm-svn: 293375
2017-01-28 10:19:47 +00:00
Daniel Berlin
9d9be9a654 MemorySSA: Allow movement to arbitrary places
Summary: Extend the MemorySSAUpdater API to allow movement to arbitrary places

Reviewers: davide, george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29239

llvm-svn: 293363
2017-01-28 02:26:39 +00:00
Quentin Colombet
b546b773a7 [RegisterBankInfo] Emit proper type for remapped registers.
When the OperandsMapper creates virtual registers, it used to just create
plain scalar register with the right size. This may confuse the
instruction selector because we lose the information of the instruction
using those registers what supposed to do. The MachineVerifier complains
about that already.

With this patch, the OperandsMapper still creates plain scalar register,
but the expectation is for the mapping function to remap the type
properly. The default mapping function has been updated to do that.

rdar://problem/30231850

llvm-svn: 293362
2017-01-28 02:23:48 +00:00