mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

44263 Commits

Author SHA1 Message Date
Artyom Skrobov
92d7e04b17 [Thumb1] The recently added tADCS and tSBCS pseudo-instructions were missing Uses = [CPSR]
Summary: Thanks to Oliver Stannard for helping catch this.

Reviewers: olista01, efriedma

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D31815

llvm-svn: 300951
2017-04-21 07:35:21 +00:00
Davide Italiano
4469a6f083 [PartialInliner] Fix crash when inlining functions with unreachable blocks.
CodeExtractor looks up the dominator node corresponding to return blocks
when splitting them. If one of these blocks is unreachable, there's no
node in the Dom and CodeExtractor crashes because it doesn't check
for domtree node validity.
In theory, we could add just a check for skipping null DTNodes in
`splitReturnBlock` but the fix I propose here is slightly different. To the
best of my knowledge, unreachable blocks are irrelevant for the algorithm,
therefore we can just skip them when building the candidate set in the
constructor.

Differential Revision:  https://reviews.llvm.org/D32335

llvm-svn: 300946
2017-04-21 04:25:00 +00:00
Akira Hatanaka
3808b889a6 Revert r300932 and r300930.
It seems that r300930 was creating an infinite loop in dag-combine when
compiling the following file:

MultiSource/Benchmarks/MiBench/consumer-typeset/z21.c

llvm-svn: 300940
2017-04-21 01:31:50 +00:00
Akira Hatanaka
62264dba17 [AArch64] Improve code generation for logical instructions taking
immediate operands.

This commit adds an AArch64 dag-combine that optimizes code generation
for logical instructions taking immediate operands. The optimization
uses demanded bits to change a logical instruction's immediate operand
so that the immediate can be folded into the immediate field of the
instruction.

This recommits r300913, which broke bots because I didn't fix a call to
ShrinkDemandedConstant in SIISelLowering.cpp after changing the APIs of
TargetLoweringOpt and TargetLowering.

rdar://problem/18231627

Differential Revision: https://reviews.llvm.org/D5591

llvm-svn: 300930
2017-04-21 00:05:16 +00:00
Eli Friedman
2e82973f03 Revert r300746 (SCEV analysis for or instructions).
There have been multiple reports of this causing problems: a
compile-time explosion on the LLVM testsuite, and a stack
overflow for an opencl kernel.

llvm-svn: 300928
2017-04-20 23:59:05 +00:00
Akira Hatanaka
e8da9bdf75 Revert "[AArch64] Improve code generation for logical instructions taking"
This reverts r300913.

This broke bots.

llvm-svn: 300916
2017-04-20 23:03:30 +00:00
Craig Topper
b0c6763f18 [Simplify] Add testcase to show that merging conditional stores for triangles is sensitive to the order of the branch targets on the conditional branches. NFC
llvm-svn: 300915
2017-04-20 22:57:36 +00:00
Akira Hatanaka
2966c6087f [AArch64] Improve code generation for logical instructions taking
immediate operands.

This commit adds an AArch64 dag-combine that optimizes code generation
for logical instructions taking immediate operands. The optimization
uses demanded bits to change a logical instruction's immediate operand
so that the immediate can be folded into the immediate field of the
instruction.

rdar://problem/18231627

Differential Revision: https://reviews.llvm.org/D5591

llvm-svn: 300913
2017-04-20 22:47:56 +00:00
Sanjay Patel
aaccb178b4 [InstCombine] allow shl+shr demanded bits folds with splat constants
llvm-svn: 300911
2017-04-20 22:33:54 +00:00
Sanjay Patel
5a205db771 [InstCombine] add tests for shl+shr demanded bits splat vector folds; NFC
llvm-svn: 300907
2017-04-20 22:18:47 +00:00
Tim Northover
c7dd952b87 ARM: lower "fence singlethread" to a pure compiler barrier.
Single-threaded fences aren't required to provide any synchronization with
other processing elements so there's no need for a DMB. They should still be a
barrier for compiler optimizations though.

llvm-svn: 300904
2017-04-20 21:56:52 +00:00
Sanjay Patel
9914890a01 [InstCombine] allow shl demanded bits folds with splat constants
More fixes are needed to enable the helper SimplifyShrShlDemandedBits().

llvm-svn: 300898
2017-04-20 21:33:02 +00:00
Sanjay Patel
bc10b86f5f [InstCombine] allow ashr/lshr demanded bits folds with splat constants
llvm-svn: 300888
2017-04-20 20:59:02 +00:00
Sanjay Patel
dc1a5027ea [InstCombine] add tests for demanded bits ashr/lshr splat constants; NFC
llvm-svn: 300884
2017-04-20 20:44:54 +00:00
Adrian Prantl
33d589de39 Don't emit locations that need a DW_OP_stack_value in DWARF 2 & 3.
https://bugs.llvm.org/show_bug.cgi?id=32382

llvm-svn: 300883
2017-04-20 20:42:33 +00:00
Tim Northover
cc3adfc204 ARM: handle post-indexed NEON ops where the offset isn't the access width.
Before, we assumed that any ConstantInt offset was precisely the access width,
so we could use the "[rN]!" form. ISelLowering only ever created that kind, but
further simplification during combining could lead to unexpected constants and
incorrect codegen.

Should fix PR32658.

llvm-svn: 300878
2017-04-20 19:54:02 +00:00
Paul Robinson
3707b00913 [DWARF] Versioning for DWARF constants; verify FORMs
Associate the version-when-defined with definitions of standard DWARF
constants.  Identify the "vendor" for DWARF extensions.
Use this information to verify FORMs in .debug_abbrev are defined as
of the DWARF version specified in the associated unit.
Removed two tests that had specified DWARF v1 (which essentially does
not exist).

Differential Revision: http://reviews.llvm.org/D30785

llvm-svn: 300875
2017-04-20 19:16:51 +00:00
Yaxun Liu
0c1bf45146 CodeGen: Let frame index value type match alloca addr space
An alloca address space was recently added to the data layout. Due to this
change, the pointer returned by alloca may have a different size than a
pointer in address space 0.

However, currently the value type of frame index is assumed to be of the
same size as pointer in address space 0.

This patch fixes that.

Most targets assume that alloca returns a pointer in address space 0, which
is the default alloca address space. Therefore it is NFC for them.

The AMDGCN target with the amdgiz environment requires this change since it
assumes that alloca returns a pointer in addr space 5 whose size is 32,
which differs from the size of a pointer in addr space 0, which is 64.

Differential Revision: https://reviews.llvm.org/D32021

llvm-svn: 300864
2017-04-20 18:15:34 +00:00
Amara Emerson
8d3887c14e [MVT][SVE] Scalable vector MVTs (2/3)
Adds scalable vector machine value types, and updates
the switch statements required for tablegen.

Patch by Graham Hunter.

Differential Revision: https://reviews.llvm.org/D32018

llvm-svn: 300840
2017-04-20 13:36:58 +00:00
Petar Jovanovic
2efe169f69 [mips][msa] Mask vectors holding shift amounts
Mask vectors that hold shift amounts when creating the following nodes:
ISD::SHL, ISD::SRL or ISD::SRA.
The instructions that use these nodes, and whose arguments have been altered,
are sll, srl, sra, bneg, bclr and bset.

For said instructions, the shift amount or the bit position that is
specified in the corresponding vector elements will be interpreted as the
shift amount/bit position modulo the size of the element in bits.

The problem lies in compiling with -O2 enabled, where the instructions for
formats .w and .d are not generated, but are instead optimized away.
In this case, having shift amounts that are either negative or greater than
the element bit size results in generation of incorrect results when
constant folding.

We remedy this by masking the operands for the nodes mentioned above before
actually creating them, so that the final result is correct before it is placed
into the constant pool.
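
The masking described above can be illustrated with a minimal scalar sketch of a single vector lane. This is not the MSA lowering code from the patch: `maskShiftAmount` and `shlLane32` are hypothetical names, and the sketch assumes the element size is a power of two, so that "modulo the element size" reduces to a bitwise AND.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): reduces a shift amount modulo
// the element size in bits. EltBits is assumed to be a power of two.
constexpr std::uint64_t maskShiftAmount(std::int64_t Amt, unsigned EltBits) {
  return static_cast<std::uint64_t>(Amt) & (EltBits - 1);
}

// Hypothetical model of one 32-bit lane of a vector shift left. Masking
// first keeps the shift in range, so folding it at compile time is safe.
std::uint32_t shlLane32(std::uint32_t Val, std::int64_t Amt) {
  const std::uint64_t Masked = maskShiftAmount(Amt, 32);
  assert(Masked < 32 && "masked shift amount must be in range");
  return Val << Masked;
}
```

With this model, a shift amount of 33 behaves like 1 and a shift amount of -1 behaves like 31, which matches the modulo interpretation described in the commit message.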

Patch by Stefan Maksimovic.

Differential Revision: https://reviews.llvm.org/D31331

llvm-svn: 300839
2017-04-20 13:26:46 +00:00
John Brawn
a597f91ffb [ARM] Fix handling of mapping symbols when changing sections
ChangeSection incorrectly registers LastEMSInfo as belonging to the previous
section, not the current section. This happens to work when changing sections
using .section, as the previous section is set to the current section before
the call to ChangeSection, but not when using .popsection.

Differential Revision: https://reviews.llvm.org/D32225

llvm-svn: 300831
2017-04-20 10:18:13 +00:00
John Brawn
c166bdc7f9 [AArch64] Fix handling of zero immediate in fmov instructions
Currently fmov #0 with a vector destination is handled incorrectly and results in
fmov #-1.9375 being emitted but should instead give an error. This is due to the
way we cope with fmov #0 with a scalar destination being an alias of fmov zr, so
fix this by actually doing it through an alias.

Differential Revision: https://reviews.llvm.org/D31949

llvm-svn: 300830
2017-04-20 10:13:54 +00:00
John Brawn
7a9b2d5f6f [AArch64] Fix handling of integer fp immediates
When an integer is used as an fp immediate we're failing to check the return
value of getFP64Imm, so invalid values are silently permitted. Fix this by
merging together the integer and real handling.

llvm-svn: 300828
2017-04-20 10:10:10 +00:00
Adrian Prantl
ff3f98982f Fix bug that caused DwarfExpression to drop DW_OP_deref from FI locations
- introduced in r300522 and found via the Swift LLDB testsuite.

The fix is to set the location kind to memory whenever a FrameIndex
location is emitted.

rdar://problem/31707602

llvm-svn: 300793
2017-04-19 23:42:25 +00:00
Reid Kleckner
bc7fea80e6 Simplify test for sret attribute in instcombine
This change is correct because the verifier requires that at most one
argument be marked 'sret'.

NFC, removes a use of AttributeList slot APIs.

llvm-svn: 300784
2017-04-19 23:17:47 +00:00
Galina Kistanova
8f924ac9d8 Temporarily revert r299221 to fix nondeterminism in ThinLTO builder.
llvm-svn: 300783
2017-04-19 23:16:14 +00:00
Matthias Braun
b69c27fe7e X86FrameLowering: Fix getFrameIndexReference() for 'fixed' objects
Debug information is calculated with getFrameIndexReference() which was
missing some logic for the fixed object cases (= parameters on the stack).

rdar://24557797

Differential Revision: https://reviews.llvm.org/D32204

llvm-svn: 300781
2017-04-19 23:10:43 +00:00
Kostya Serebryany
43072589c3 [sanitizer-coverage] remove some more stale code
llvm-svn: 300778
2017-04-19 22:42:11 +00:00
Sanjay Patel
fab0adbac6 [DAG] add splat vector support for 'or' in SimplifyDemandedBits
I've changed one of the tests to not fold away, but we didn't and still don't do the transform
that the comment claims we do (and I don't know why we'd want to do that).

Follow-up to:
https://reviews.llvm.org/rL300725
https://reviews.llvm.org/rL300763

llvm-svn: 300772
2017-04-19 22:00:00 +00:00
Kostya Serebryany
91e92ad3e0 [sanitizer-coverage] remove stale code
llvm-svn: 300769
2017-04-19 21:48:09 +00:00
Sanjay Patel
c44de937c8 [DAG] add splat vector support for 'xor' in SimplifyDemandedBits
This allows forming more 'not' ops, so we get improvements for ISAs that have and-not.

Follow-up to:
https://reviews.llvm.org/rL300725

llvm-svn: 300763
2017-04-19 21:23:09 +00:00
Matthias Braun
ad6c52ac72 ARMFrameLowering: Reserve emergency spill slot for large arguments
Re-commit after revert in r300668. Changed getMaxFPOffset() to a
more conservative heuristic instead of trying to be clever and missing
for some exotic calling conventions.

We need to reserve an emergency spill slot in cases with large argument
types that could overflow immediate offsets for FP relative address
calculations.

rdar://31317893

Differential Revision: https://reviews.llvm.org/D31643

llvm-svn: 300761
2017-04-19 21:11:44 +00:00
Simon Pilgrim
f84a10ee86 [InstCombine] Add frem constant folding test (PR3316)
llvm-svn: 300757
2017-04-19 21:09:19 +00:00
Matt Arsenault
72dd027fd6 AMDGPU: Custom lower illegal small select types
Promote them to i32 vectors to avoid unpacking and re-packing
the vectors.

llvm-svn: 300754
2017-04-19 20:53:07 +00:00
Simon Pilgrim
1db704ce33 [InstCombine] Add frem constant folding test (PR32177)
llvm-svn: 300750
2017-04-19 20:47:58 +00:00
Eli Friedman
226d22a6a5 [ARM] Use TableGen patterns to select vtbl. NFC.
Differential Revision: https://reviews.llvm.org/D32103

llvm-svn: 300749
2017-04-19 20:39:39 +00:00
Eli Friedman
c5a8782468 [SCEV] Make SCEV or modeling more aggressive.
Use haveNoCommonBitsSet to figure out whether an "or" instruction
is equivalent to addition. This handles more cases than just
checking for a constant on the RHS.
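
The arithmetic fact behind this change can be sketched on concrete values. The helpers below are hypothetical and operate on plain integers, whereas LLVM's haveNoCommonBitsSet reasons about known bits of IR values; the sketch only shows why an "or" with disjoint operands equals an addition.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical concrete-value analogue of the check (not LLVM's API):
// true when no bit position is set in both operands.
constexpr bool noCommonBitsSet(std::uint32_t A, std::uint32_t B) {
  return (A & B) == 0;
}

// With disjoint bits, adding never generates a carry, so A | B == A + B.
std::uint32_t addViaOr(std::uint32_t A, std::uint32_t B) {
  assert(noCommonBitsSet(A, B) && "operands must have disjoint bits");
  return A | B;
}
```

For example, 0xF0 and 0x0F share no bits, so `addViaOr(0xF0, 0x0F)` equals `0xF0 + 0x0F`. The operands 3 and 1 share bit 0, and there `3 | 1` is 3 while `3 + 1` is 4.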

Differential Revision: https://reviews.llvm.org/D32239

llvm-svn: 300746
2017-04-19 20:19:58 +00:00
Dehao Chen
5b504da948 Using address range map to speedup finding inline stack for address.
Summary:
In the current implementation, finding the inline stack for an address incurs an expensive linear search in 2 places:

* a linear search for the top-level DIE
* a recursive linear traversal of the DIE tree to find the path to the leaf DIE

In this patch, a map is built from address to its corresponding leaf DIE. The inline stack is built by traversing from the leaf DIE up to the root DIE. This speeds up batch symbolization by ~10X without noticeable memory overhead.
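
The lookup strategy described in this summary can be sketched as follows. This is an illustration under stated assumptions, not the patch's implementation: `Die`, `AddrDieMap`, `addRange`, `findLeaf`, and `inlineStack` are hypothetical names, and the sketch assumes address ranges are half-open and do not overlap.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a debug info entry; not LLVM's DWARFDie.
struct Die {
  std::string Name;
  const Die *Parent; // nullptr for the root
};

// Maps the start of each range [Low, High) to its end and its leaf DIE.
class AddrDieMap {
  std::map<std::uint64_t, std::pair<std::uint64_t, const Die *>> Ranges;

public:
  void addRange(std::uint64_t Low, std::uint64_t High, const Die *Leaf) {
    assert(Low < High && "range must be non-empty");
    Ranges[Low] = std::make_pair(High, Leaf);
  }

  // Logarithmic lookup: find the last range starting at or before Addr.
  const Die *findLeaf(std::uint64_t Addr) const {
    auto It = Ranges.upper_bound(Addr);
    if (It == Ranges.begin())
      return nullptr;
    --It;
    return Addr < It->second.first ? It->second.second : nullptr;
  }

  // Builds the stack by walking parent links from the leaf up to the root.
  std::vector<std::string> inlineStack(std::uint64_t Addr) const {
    std::vector<std::string> Stack;
    for (const Die *D = findLeaf(Addr); D; D = D->Parent)
      Stack.push_back(D->Name);
    return Stack;
  }
};
```

The ordered map replaces both linear searches with one `upper_bound` call, and the parent links make the upward walk proportional to the inlining depth rather than to the size of the DIE tree.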

Reviewers: dblaikie

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32177

llvm-svn: 300742
2017-04-19 20:09:38 +00:00
Dehao Chen
3208b9d0b7 Update the madd.ll test with utils/update_llc_test_checks.py (NFC)
llvm-svn: 300740
2017-04-19 20:08:14 +00:00
Dehao Chen
84ece8088b PR32710: Disable using PMADDWD for unsigned short.
Summary: PMADDWD can only handle signed short.

Reviewers: mkuper, wmi

Reviewed By: mkuper

Subscribers: andreadb, llvm-commits

Differential Revision: https://reviews.llvm.org/D32236

llvm-svn: 300737
2017-04-19 19:50:34 +00:00
Matt Arsenault
43d2e9a492 AMDGPU: Don't emit amd_kernel_code_t for callable functions
This is inserted directly in the text section. The relocation
for the function ends up resolving to the beginning of the
amd_kernel_code_t header rather than the actual function
entry point.

Also skip some of the comments for initialization
that only makes sense for kernels.

llvm-svn: 300736
2017-04-19 19:38:10 +00:00
Artem Tamazov
84d0c8fabe [AMDGPU][mc][tests][NFC] Update bulk ISA tests for Gfx7 and Gfx8
Added approx. 1100 gfx7 and 1040 gfx8 test cases.

llvm-svn: 300734
2017-04-19 19:12:06 +00:00
Matt Arsenault
7acc7fbaea StructurizeCFG: Directly invert cmp instructions
The most common case for a branch condition is
a single use compare. Directly invert the branch
predicate rather than adding a lot of xor i1 true
which the DAG will have to fold later.
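
A minimal sketch of the idea, assuming a small hypothetical predicate set rather than LLVM's CmpInst predicates: instead of negating the result of a compare (the "xor i1 true" form), the compare's predicate is swapped for its inverse, which gives the same value with one fewer operation.

```cpp
#include <cassert>

// Hypothetical predicate set for illustration; LLVM defines many more.
enum class Pred { EQ, NE, SLT, SGE, SGT, SLE };

// Returns the predicate that is true exactly when P is false.
Pred inversePredicate(Pred P) {
  switch (P) {
  case Pred::EQ:  return Pred::NE;
  case Pred::NE:  return Pred::EQ;
  case Pred::SLT: return Pred::SGE;
  case Pred::SGE: return Pred::SLT;
  case Pred::SGT: return Pred::SLE;
  case Pred::SLE: return Pred::SGT;
  }
  assert(false && "unknown predicate");
  return P;
}

// Evaluates a signed integer comparison under the given predicate.
bool evaluate(Pred P, int A, int B) {
  switch (P) {
  case Pred::EQ:  return A == B;
  case Pred::NE:  return A != B;
  case Pred::SLT: return A < B;
  case Pred::SGE: return A >= B;
  case Pred::SGT: return A > B;
  case Pred::SLE: return A <= B;
  }
  return false;
}
```

For every pair of inputs, `evaluate(inversePredicate(P), A, B)` equals `!evaluate(P, A, B)`, so a single-use compare can be inverted in place.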

This produces nicer to read structurizer output.

This produces some random changes in codegen
because the DAG swaps branch conditions itself
and then does a poor job of dealing with those
inverts.

llvm-svn: 300732
2017-04-19 18:29:07 +00:00
Sanjoy Das
8762f07b17 [GVN] Don't coerce non-integral pointers to integers or vice versa
Summary:
See http://llvm.org/docs/LangRef.html#non-integral-pointer-type

The NewGVN test does not fail without these changes (perhaps it does
not try to coerce pointers <-> integers to begin with?), but I added the
test case anyway.

Reviewers: dberlin

Subscribers: mcrosier, llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D32208

llvm-svn: 300730
2017-04-19 18:21:09 +00:00
Tim Northover
ed63cc7fe3 ARM: TLS calling convention doesn't preserve r9 or r12 on Darwin.
llvm-svn: 300726
2017-04-19 18:07:54 +00:00
Sanjay Patel
cdbefd846d [DAG] add splat vector support for 'and' in SimplifyDemandedBits
The patch itself is simple: stop discriminating against vectors in visitAnd() and again in 
SimplifyDemandedBits().

Some notes for reference:

1. We're not consistent about calls to SimplifyDemandedBits in the various visitXXX functions. 
   Sometimes, we check if the RHS is a constant first. Other times (like here), we just dive in.
2. I'd like to break the vector shackles in steps for the sake of risk minimization, but we could
    make similar simultaneous changes in other places if we think that would be better.
3. I don't know what the intent of the changed tests in this patch was supposed to be, but since 
   they wiggled in a positive way, I'm just going with that. :)
4. In the rotate tests, note that we can see through non-splat constants. This is a result of D24253.
5. My motivation for being here now is to make D31944 look better, so this is step 1 of N towards 
   improving the vector codegen in that patch without writing any actual new code.

Differential Revision: https://reviews.llvm.org/D32230

llvm-svn: 300725
2017-04-19 18:05:06 +00:00
Matt Arsenault
33960d0165 AMDGPU: Don't align callable functions to 256
llvm-svn: 300720
2017-04-19 17:42:39 +00:00
Matt Arsenault
b3625be5da AMDGPU: Change DivergenceAnalysis for function arguments
Stop assuming all functions are kernels.

llvm-svn: 300719
2017-04-19 17:42:34 +00:00
Sanjay Patel
219e228d73 [InstSimplify] fold identity shuffles (recursing if needed)
This patch simplifies the examples from D31509 and D31927 (PR30630) and catches 
the basic identity shuffle tests that Zvi recently added.

I'm not sure if we have something like this in DAGCombiner, but we should?

It's worth noting that "MaxRecurse / RecursionLimit" is only 3 on entry at the moment. 
We might want to bump that up if there are longer shuffle chains like this in the wild.

For now, we're ignoring shuffles that have undef mask elements because it's not
clear how those should be handled.

Differential Revision: https://reviews.llvm.org/D31960

llvm-svn: 300714
2017-04-19 16:48:22 +00:00
Krzysztof Parzyszek
c71372b507 [Hexagon] Generate proper offset in opt-addr-mode
Also, make a few changes to allow using the pass in .mir testcases.
Among other things, change the abbreviation from opt-amode to amode-opt,
because otherwise lit would expand the "opt" part to the full path to
the opt binary.

llvm-svn: 300707
2017-04-19 15:15:51 +00:00