mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

110195 Commits

Author SHA1 Message Date
Shiva Chen
d9734e1033 [RISCV] Add ELFObjectFileBase::getRISCVFeatures so that llvm-objdump can get RISCV target features
llvm-objdump can detect the C feature from the ELF::EF_RISCV_RVC e_flag,
so we don't have to add -mattr=+c on the command line.
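
A minimal sketch of the idea, not the code from the patch: the helper name `riscv_features_from_eflags` is hypothetical, and only the `EF_RISCV_RVC` bit (value 0x0001 in the RISC-V ELF specification) is modelled.

```python
# EF_RISCV_RVC marks an ELF file that uses the compressed (C) extension.
EF_RISCV_RVC = 0x0001


def riscv_features_from_eflags(e_flags):
    """Hypothetical helper: derive feature strings from the ELF header e_flags."""
    features = []
    if e_flags & EF_RISCV_RVC:
        # Equivalent in spirit to passing -mattr=+c by hand.
        features.append("+c")
    return features


if __name__ == "__main__":
    print(riscv_features_from_eflags(0x0001))
    print(riscv_features_from_eflags(0x0000))
```

Reading the flag from the file means the disassembler can pick the right feature set without extra command-line input.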

Differential Revision: https://reviews.llvm.org/D42629

llvm-svn: 324058
2018-02-02 06:01:02 +00:00
Craig Topper
d4d8a9b1d1 [X86] Legalize (v64i1 (bitcast (i64 X))) on 32-bit targets by extracting 32-bit halves from the i64, bitcasting each to v32i1, and concatenating.
This prevents the scalarization that would otherwise occur.
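
The split can be modelled on plain integers. This is an illustrative sketch only: the function names are hypothetical, and mask vectors are represented as Python lists with lane 0 holding the least significant bit (an assumption matching little-endian lane numbering).

```python
def bitcast_i32_to_v32i1(x):
    """Model a bitcast of a 32-bit integer to 32 one-bit lanes (lane 0 = LSB)."""
    return [(x >> i) & 1 for i in range(32)]


def bitcast_i64_to_v64i1_split(x):
    """Build the 64-lane mask from two 32-bit halves instead of bit by bit."""
    lo = x & 0xFFFFFFFF          # low 32-bit half of the i64
    hi = (x >> 32) & 0xFFFFFFFF  # high 32-bit half of the i64
    # Concatenate the two v32i1 values: low lanes first, then high lanes.
    return bitcast_i32_to_v32i1(lo) + bitcast_i32_to_v32i1(hi)


if __name__ == "__main__":
    value = 0x8000000000000001
    lanes = bitcast_i64_to_v64i1_split(value)
    print(lanes[0], lanes[63], sum(lanes))
```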

llvm-svn: 324057
2018-02-02 05:59:33 +00:00
Craig Topper
06dda9133b [X86] Legalize (i64 (bitcast (v64i1 X))) on 32-bit targets by extracting to v32i1 and bitcasting to i32.
This saves a trip through memory and seems to open up other combining opportunities.

llvm-svn: 324056
2018-02-02 05:59:31 +00:00
Shiva Chen
dced0d02fe [RISCV] Fix c.addi and c.addi16sp immediate constraints which should be non-zero
Differential Revision: https://reviews.llvm.org/D42782
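
As a rough illustration of the constraint (not the TableGen predicates from the patch), the checks below assume the immediate ranges from the RISC-V compressed-instruction specification: a 6-bit signed immediate for c.addi and a 10-bit signed multiple of 16 for c.addi16sp. The function names are hypothetical.

```python
def is_valid_c_addi_imm(imm):
    """c.addi: 6-bit signed immediate, which must be non-zero."""
    return -32 <= imm <= 31 and imm != 0


def is_valid_c_addi16sp_imm(imm):
    """c.addi16sp: 10-bit signed immediate, a multiple of 16, and non-zero."""
    return -512 <= imm <= 496 and imm % 16 == 0 and imm != 0


if __name__ == "__main__":
    print(is_valid_c_addi_imm(0), is_valid_c_addi_imm(5))
    print(is_valid_c_addi16sp_imm(0), is_valid_c_addi16sp_imm(32))
```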

llvm-svn: 324055
2018-02-02 02:43:23 +00:00
Shiva Chen
6ba8127514 [RISCV] Define getSetCCResultType for setting vector setCC type
To avoid triggering the "No default SetCC type for vectors!" assertion.

Differential Revision: https://reviews.llvm.org/D42675

llvm-svn: 324054
2018-02-02 02:43:18 +00:00
Amara Emerson
34c4b3c143 Fix debug spelling in ResetMachineFunction pass.
llvm-svn: 324048
2018-02-02 01:49:59 +00:00
Amara Emerson
3e28ce3dc2 [GlobalISel] Constrain the dest reg of IMPLICIT_DEF.
This fixes a crash where the user is a COPY, which deliberately does not
constrain its source operands, resulting in a vreg without a reg class escaping
selection.

Differential Revision: https://reviews.llvm.org/D42697

llvm-svn: 324047
2018-02-02 01:44:43 +00:00
David Blaikie
bd15210d03 Remove non-modular header containing static utility functions
The one place that uses these functions isn't particularly
long/complicated, so it's easier to just have these inline at that
location than trying to split it out into a true header. (in part also
because of the use of the DEBUG macros, which make this not really a
standalone header even if the static functions were made inline instead)

llvm-svn: 324044
2018-02-02 00:33:50 +00:00
David Blaikie
74be2045f5 Add missing includes
llvm-svn: 324040
2018-02-02 00:11:09 +00:00
Matthias Braun
3ca3ed7921 SplitKit: Fix liveness recomputation in some remat cases.
Example situation:
```
BB0:
  %0 = ...
  use %0
  ; ...
  condjump BB1
  jmp BB2

BB1:
  %0 = ...   ; rematerialized def from above (from earlier split step)
  jmp BB2

BB2:
  ; ...
  use %0
```

%0 will have a live interval with 3 value numbers (for the BB0, BB1 and
BB2 parts). Now SplitKit tries and succeeds in rematerializing the value
number in BB2 (this only works because it is a secondary split, so
SplitKit can trace this back to a single original def).

We need to recompute all live ranges affected by a value number that we
rematerialize. The case that we missed before is that when the value
that is rematerialized is at a join (Phi VNI) then we also have to
recompute liveness for the predecessor VNIs.
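
The rule can be sketched as a small worklist. This is a simplified model, not SplitKit's data structures: value numbers are plain strings, `phi_preds` is a hypothetical map from a phi value number to the value numbers flowing into it, and following predecessors transitively is an assumption of this sketch.

```python
def vnis_to_recompute(remat_vni, phi_preds):
    """Collect the value numbers whose liveness must be recomputed.

    phi_preds maps a phi value number to its incoming value numbers;
    value numbers that are not phis are simply absent from the map.
    """
    seen = set()
    worklist = [remat_vni]
    while worklist:
        vni = worklist.pop()
        if vni in seen:
            continue
        seen.add(vni)
        # If the value number is a phi, its predecessors are affected too.
        worklist.extend(phi_preds.get(vni, ()))
    return seen


if __name__ == "__main__":
    # The BB2 value number joins the values defined in BB0 and BB1.
    preds = {"vni_bb2": ["vni_bb0", "vni_bb1"]}
    print(sorted(vnis_to_recompute("vni_bb2", preds)))
```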

rdar://35699130

Differential Revision: https://reviews.llvm.org/D42667

llvm-svn: 324039
2018-02-02 00:08:19 +00:00
Craig Topper
3869523c0d [X86] Separate the call to LowerVectorAllZeroTest from EmitTest. NFCI
Every instruction that has the word TEST in its name seems to have been buried into EmitTest. But that code is largely concerned with trying to reuse the flags from instructions that update flags in a pretty normal way.

PTEST/TESTP/KTEST do not update flags in a normal way. They only update Z and C and the C flag update is non-standard. Rather than try to bend EmitTest's already complex logic to accommodate this, just move the call up to LowerSETCC and replicate the few pre-checks that are needed.

While there, add a FIXME for using the C flag to check for all 1s, which we definitely couldn't do from EmitTest.

llvm-svn: 324029
2018-02-01 23:21:20 +00:00
Amara Emerson
0ea3f874bf [GlobalISel][Legalizer] Relax a legalization loop detecting assert.
Legalizing vectors may keep the element type the same but change the number of
elements; the assert didn't take this into account.

llvm-svn: 324028
2018-02-01 23:10:57 +00:00
Sanjay Patel
b0b5f996ac [InstCombine] allow multi-use values in canEvaluate* if all uses are in 1 inst
This is the enhancement suggested in D42536 to fix a shortcoming in 
regular InstCombine's canEvaluate* functionality.
When we have multiple uses of a value, but they're all in one instruction, we can 
allow that expression to be narrowed or widened for the same cost as a single-use 
value.

AFAICT, this can only matter for multiply: sub/and/or/xor/select would be simplified 
away if the operands are the same value; add becomes shl; shifts with a variable shift 
amount aren't handled.
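
The multiply case can be checked on plain integers. The sketch below is only an arithmetic model of why narrowing `x * x` is safe, with 32-bit "wide" and 8-bit "narrow" types chosen as assumptions; it does not reflect InstCombine's implementation.

```python
WIDE_MASK = 0xFFFFFFFF  # assumed wide type: i32
NARROW_MASK = 0xFF      # assumed narrow type: i8


def square_wide_then_trunc(x):
    """zext i8 -> i32, multiply the value by itself, then trunc back to i8."""
    wide = x & NARROW_MASK                # zext: high bits are zero
    product = (wide * wide) & WIDE_MASK   # i32 multiply, both operands the same value
    return product & NARROW_MASK          # trunc to i8


def square_narrow(x):
    """The narrowed form: multiply directly in i8."""
    narrow = x & NARROW_MASK
    return (narrow * narrow) & NARROW_MASK


if __name__ == "__main__":
    print(all(square_wide_then_trunc(v) == square_narrow(v) for v in range(256)))
```

Both uses of the value sit in the same multiply, so narrowing it costs no more than narrowing a single-use value.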

Differential Revision: https://reviews.llvm.org/D42739

llvm-svn: 324014
2018-02-01 21:55:53 +00:00
Nemanja Ivanovic
b9fc7e1e61 [PowerPC] Tell VSX swap removal that scalar conversions are lane-sensitive
This is a rather non-controversial change. We were missing these instructions
from the list of instructions that are lane-sensitive. These two put the result
into lane 0 (BE) or 3 (LE) regardless of the input. This patch fixes PR36068.

llvm-svn: 324005
2018-02-01 21:09:04 +00:00
Craig Topper
2a3ca1333b [DAGCombiner] When folding (insert_subvector undef, (bitcast (extract_subvector N1, Idx)), Idx) -> (bitcast N1) make sure that N1 has the same total size as the original output
We were only checking the element count, but not the total width. This could cause illegal bitcasts to be created if for example the output was 512-bits, but N1 is 256 bits, and the extraction size was 128-bits.
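
A toy model of the added check, with vector types written as `(element count, element bits)` tuples. The function names are hypothetical and the real combine has further conditions that are not modelled here.

```python
def total_bits(vec_type):
    """Total width of a vector type given as (num_elements, element_bits)."""
    num_elts, elt_bits = vec_type
    return num_elts * elt_bits


def can_fold_to_bitcast(out_type, n1_type):
    """Allow the fold only when element count AND total width both match."""
    same_count = out_type[0] == n1_type[0]
    same_width = total_bits(out_type) == total_bits(n1_type)
    return same_count and same_width


if __name__ == "__main__":
    # 512-bit output vs. 256-bit N1 with equal element counts: must not fold.
    print(can_fold_to_bitcast((8, 64), (8, 32)))
    # Same count and same width: a bitcast is well-formed.
    print(can_fold_to_bitcast((8, 64), (8, 64)))
```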

Fixes PR36199

Differential Revision: https://reviews.llvm.org/D42809

llvm-svn: 324002
2018-02-01 20:48:50 +00:00
Amara Emerson
ae95e7a615 [GlobalISel] Fix assert failure when legalizing non-power-2 loads.
Until we support extending loads properly we're going to fall back for these.
We already handle stores in the same way, so this is just being consistent.

llvm-svn: 324001
2018-02-01 20:47:03 +00:00
Brock Wyma
c08d555184 [CodeView] Class record member counts should include base classes and ...
Increment the field list member count for base classes and virtual base
classes.

Differential Revision: https://reviews.llvm.org/D41874

llvm-svn: 324000
2018-02-01 20:37:38 +00:00
Benjamin Kramer
38d1af6c06 [ADT] Replace sys::MemoryFence with standard atomics.
This is a bit faster in theory; in practice it's cold code that's only
active in !NDEBUG, so it probably doesn't make a difference. This is one
of the last users of our homegrown Atomic.h.

llvm-svn: 323999
2018-02-01 20:28:33 +00:00
Sanjay Patel
b172311d95 [AArch64] remove bogus comment; NFC
I added this comment with D42323, but as discussed in D42806, the architecture
does the right thing for denorms. We don't even need the select on 0.0 here?

llvm-svn: 323996
2018-02-01 19:59:33 +00:00
Easwaran Raman
18789746d3 Remove CallGraphTraits and use equivalent methods in GraphTraits
Summary:
D42698 adds child_edge_{begin|end} and children_edges to GraphTraits
which are used here. The reason for this change is to make it easy to
use count propagation on ModuleSummaryIndex. As it stands,
CallGraphTraits is in Analysis while ModuleSummaryIndex is in IR.

Reviewers: davidxl, dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42703

llvm-svn: 323994
2018-02-01 19:40:35 +00:00
Geoff Berry
eb996ef33b [MachineCopyPropagation] Extend pass to do COPY source forwarding
Summary:
This change extends MachineCopyPropagation to do COPY source forwarding
and adds an additional run of the pass to the default pass pipeline just
after register allocation.

This version of this patch uses the newly added
MachineOperand::isRenamable bit to avoid forwarding registers in such a
way as to violate constraints that aren't captured in the
Machine IR (e.g. ABI or ISA constraints).

This change is a continuation of the work started in D30751.

Reviewers: qcolombet, javed.absar, MatzeB, jonpa, tstellar

Subscribers: tpr, mgorny, mcrosier, nhaehnle, nemanjai, jyknight, hfinkel, arsenm, inouehrs, eraman, sdardis, guyblank, fedor.sergeev, aheejin, dschuff, jfb, myatsina, llvm-commits

Differential Revision: https://reviews.llvm.org/D41835

llvm-svn: 323991
2018-02-01 18:54:01 +00:00
Changpeng Fang
d15f8bd42f AMDGPU/SI: Adjust the encoding family for D16 buffer instructions when the target has UnpackedD16VMem feature.
Reviewers:
  Matt and Brian

Differential Revision:
  https://reviews.llvm.org/D42548

llvm-svn: 323988
2018-02-01 18:41:33 +00:00
Simon Pilgrim
0cd5dedfd4 [X86][SSE] LowerBUILD_VECTORAsVariablePermute - add support for scaling index vectors
This allows us to use PSHUFB for v8i16/v4i32 and VPERMD/PERMPS for v4i64/v4f64 variable shuffles.
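
Scaling an index vector means turning each element index into the byte indices that cover that element. The sketch below models this with Python lists; `byte_shuffle` is a simplified stand-in for a byte-granular permute (it ignores PSHUFB's zeroing behaviour for indices with the high bit set), and all names are hypothetical.

```python
def scale_index_vector(indices, scale):
    """Expand element indices into byte indices: i -> i*scale, ..., i*scale + scale-1."""
    return [idx * scale + b for idx in indices for b in range(scale)]


def byte_shuffle(src_bytes, byte_indices):
    """Simplified byte permute: pick source bytes by index."""
    return [src_bytes[i] for i in byte_indices]


def element_shuffle_via_bytes(src_elems, indices, elem_bytes):
    """Shuffle wide elements using only a byte-granular permute."""
    src_bytes = [b for e in src_elems for b in e.to_bytes(elem_bytes, "little")]
    out = byte_shuffle(src_bytes, scale_index_vector(indices, elem_bytes))
    return [
        int.from_bytes(bytes(out[i:i + elem_bytes]), "little")
        for i in range(0, len(out), elem_bytes)
    ]


if __name__ == "__main__":
    v8i16 = [0x1000 + i for i in range(8)]
    print(element_shuffle_via_bytes(v8i16, [7, 6, 5, 4, 3, 2, 1, 0], 2))
```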

Differential Revision: https://reviews.llvm.org/D42487

llvm-svn: 323987
2018-02-01 18:10:30 +00:00
Craig Topper
5653f11c57 [X86] Remove custom lowering vXi1 extending loads and truncating stores.
Summary: Now that v2i1/v4i1 are legal without VLX, v32i1 is legalized by splitting rather than widening, and isVectorLoadExtDesirable returns false for vXi1, it appears this handling is dead because the operations simply don't exist.

Reviewers: RKSimon, zvi, guyblank, delena, spatel

Reviewed By: delena

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D42781

llvm-svn: 323983
2018-02-01 17:08:41 +00:00
Craig Topper
2f76252b8b [X86] Turn X86ISD::AND nodes that have no flag users back into ISD::AND just before isel to enable test instruction matching
Summary:
EmitTest sometimes creates X86ISD::AND specifically to hide the AND from DAG combine. But this prevents isel patterns that look for (cmp (and X, Y), 0) from being able to see it. So we end up with an AND and a TEST. The TEST gets removed by compare instruction optimization during the peephole pass.

This patch attempts to fix this by converting X86ISD::AND with no flag users back into ISD::AND during the DAG preprocessing just before isel.

In order to do this correctly I had to make the X86ISD::AND node created by EmitTest in this case really have a flag output. Which arguably it should have had anyway so that the number of operands would be consistent for the opcode in all cases. Then I had to modify the ReplaceAllUsesWith to understand that we might be looking at an instruction with 2 outputs. Though in this case there are no uses to replace since we just created the node, but that's what the code did before so I just made it keep working.

Reviewers: spatel, RKSimon, niravd, deadalnix

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42764

llvm-svn: 323982
2018-02-01 17:08:39 +00:00
Sanjay Patel
cf5a5a2801 [DAGCombiner] filter out denorm inputs when calculating sqrt estimate (PR34994)
As shown in the example in PR34994:
https://bugs.llvm.org/show_bug.cgi?id=34994
...we can return a very wrong answer (inf instead of 0.0) for square root when 
using a reciprocal square root estimate instruction.

Here, I've conditionalized the filtering out of denorms based on the function 
having "denormal-fp-math"="ieee" in its attributes. The other options for this 
attribute are 'preserve-sign' and 'positive-zero'.

So we don't generate this extra code by default with just '-ffast-math' (because 
then there's no denormal attribute string at all), but it works if you specify 
'-ffast-math -fdenormal-fp-math=ieee' from clang. 
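
The failure mode can be reproduced with a simulated estimate. Everything here is an assumption made for illustration: the estimate is modelled as an exact reciprocal square root that treats denormal inputs as zero, Python doubles stand in for the hardware float type, and the function names are hypothetical.

```python
import math
import sys

SMALLEST_NORMAL = sys.float_info.min  # smallest positive normal double


def rsqrt_estimate_flushing(x):
    """Simulated estimate instruction that flushes denormal inputs to zero."""
    if abs(x) < SMALLEST_NORMAL:
        return math.inf  # 1/sqrt(0) == inf
    return 1.0 / math.sqrt(x)


def sqrt_unfiltered(x):
    """sqrt(x) ~= x * rsqrt(x), guarding only the exact 0.0 input."""
    if x == 0.0:
        return 0.0
    return x * rsqrt_estimate_flushing(x)


def sqrt_filtered(x):
    """Same expansion, but denormal inputs are filtered out and yield 0.0."""
    if abs(x) < SMALLEST_NORMAL:
        return 0.0
    return x * rsqrt_estimate_flushing(x)


if __name__ == "__main__":
    denormal = 5e-324
    print(sqrt_unfiltered(denormal), sqrt_filtered(denormal), sqrt_filtered(4.0))
```

With the unfiltered form, a denormal input multiplies by infinity and produces inf, which matches the "very wrong answer" described in the PR.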

As noted in the review, there may be other problems in clang that affect the 
results depending on platform (Linux x86 at least), but this should allow 
creating the desired codegen.

Differential Revision: https://reviews.llvm.org/D42323

llvm-svn: 323981
2018-02-01 16:57:18 +00:00
Nirav Dave
e6ac7d026e [SelectionDAG] Fix UpdateChains handling of TokenFactors
Summary:
In Instruction Selection UpdateChains replaces all matched Nodes'
chain references including interior token factors and deletes them.
This may allow nodes which depend on these interior nodes but are not
part of the set of matched nodes to be left with a dangling dependence.
Avoid this by doing the replacement for matched non-TokenFactor nodes.

Fixes PR36164.

Reviewers: jonpa, RKSimon, bogner

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D42754

llvm-svn: 323977
2018-02-01 16:11:59 +00:00
Sjoerd Meijer
ebb063bbdd [ARM] FullFP16 LowerReturn Fix
Commit r323512 introduced an optimisation in LowerReturn for half-precision
return values. A missing check caused a crash when the return value is "undef"
(i.e. a node that has no operands).

Differential Revision: https://reviews.llvm.org/D42743

llvm-svn: 323968
2018-02-01 13:48:40 +00:00
David Green
34ffa45c61 Revert commit rL323951
Looks like it's causing timeouts on at least the ppc64le
buildbots.

llvm-svn: 323959
2018-02-01 13:05:25 +00:00
Aleksandar Beserminji
8acbccaf76 [mips] Include EVA instructions in Std2MicroMips mapping tables
This patch includes EVA instructions in the Std2MicroMips mapping
tables, which is required for direct object emission.

Differential Revision: https://reviews.llvm.org/D41771

llvm-svn: 323958
2018-02-01 12:53:26 +00:00
Clement Courbet
422dc7f90c [AArch64][NFC] Make all ProcResource definitions include their SchedModel.
This makes targets ExynosM1,ExynosM3,ThunderX2T99 consistent with all
other targets.

llvm-svn: 323955
2018-02-01 12:12:01 +00:00
Yvan Roux
ff2ede76d2 [ARM] Add support for unpredictable MVN instructions.
This fixes bugzilla 33011
https://bugs.llvm.org/show_bug.cgi?id=33011

Defines bits {19-16} as zero or unpredictable as specified by the ARM ARM in
sections A8.8.116 and A8.8.117.
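
A rough sketch of the "should be zero" check on a 32-bit instruction word. The helper name is hypothetical, and the example encodings are illustrative values rather than output taken from the patch's tests.

```python
def mvn_bits_19_16_are_zero(insn):
    """Return True when bits {19-16} of the instruction word are all zero."""
    return ((insn >> 16) & 0xF) == 0


def classify_mvn(insn):
    """Label an MVN encoding as 'ok' or 'unpredictable' from bits {19-16} only."""
    return "ok" if mvn_bits_19_16_are_zero(insn) else "unpredictable"


if __name__ == "__main__":
    print(classify_mvn(0xE1E00001))  # bits {19-16} clear
    print(classify_mvn(0xE1EF0001))  # bits {19-16} set
```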

It also fixes the use of the PC register as the destination register for the MVN
register-shifted register version, as specified in A8.8.117.

Differential Revision: https://reviews.llvm.org/D41905

llvm-svn: 323954
2018-02-01 12:06:57 +00:00
David Green
ce421d820d [InstCombine] Allow common type conversions to i8/i16/i32
This, in instcombine, allows conversions to i8/i16/i32 (very
common cases) even if the resulting type is not legal according
to the data layout. This can often open up extra combine
opportunities.

Differential Revision: https://reviews.llvm.org/D42424

llvm-svn: 323951
2018-02-01 11:06:18 +00:00
Yvan Roux
ffe1ecc492 Test commit: Fix a comment.
llvm-svn: 323947
2018-02-01 08:39:58 +00:00
Mikael Holmen
a7c9ce5fb8 [LSR] Don't force bases of foldable formulae to the final type.
Summary:
Before emitting code for scaled registers, we prevent
SCEVExpander from hoisting any scaled addressing mode
by emitting all the bases first. However, these bases
are being forced to the final type, resulting in some
odd code.

For example, if the type of the base is an integer and
the final type is a pointer, we will emit an inttoptr
for the base, a ptrtoint for the scale, and then a
'reverse' GEP where the GEP pointer is actually the base
integer and the index is the pointer. It's more intuitive
to use the pointer as a pointer and the integer as index.

Patch by: Bevin Hansson

Reviewers: atrick, qcolombet, sanjoy

Reviewed By: qcolombet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42103

llvm-svn: 323946
2018-02-01 06:38:34 +00:00
Dean Michael Berris
6f4f0c7384 [XRay][compiler-rt+llvm] Update XRay register stashing semantics
Summary:
This change expands the number of registers stashed by the entry and
`__xray_CustomEvent` trampolines.

We've found that, since the `__xray_CustomEvent` trampoline calls can show up in
situations where the scratch registers are being used, and since we don't
typically want to affect the code-gen around the disabled
`__xray_customevent(...)` intrinsic calls, we need to save and restore the
state of even the scratch registers in the handling of these custom events.

Reviewers: pcc, pelikan, dblaikie, eizan, kpw, echristo, chandlerc

Reviewed By: echristo

Subscribers: chandlerc, echristo, hiraditya, davide, dblaikie, llvm-commits

Differential Revision: https://reviews.llvm.org/D40894

llvm-svn: 323940
2018-02-01 02:21:54 +00:00
Rafael Espindola
14987cd2c8 [MC] Fix assembler infinite loop on EH table using LEB padding.
Fix the infinite loop reported in PR35809. It can occur with GCC-style
EH table assembly, where the compiler relies on the assembler to
calculate the offsets in the EH table.

Also see https://sourceware.org/bugzilla/show_bug.cgi?id=4029 for the
equivalent issue in the GNU assembler.

Patch by Ryan Prichard!

llvm-svn: 323934
2018-02-01 00:25:19 +00:00
Amara Emerson
fd88771d7b [GlobalOpt] Improve common case efficiency of static global initializer evaluation
For very, very large global initializers which can be statically evaluated, the
code would create vectors of temporary Constants, modifying them in place,
before committing the resulting Constant aggregate to the global's initializer
value. This had effectively O(n^2) complexity in the size of the global
initializer and would cause memory and non-termination issues compiling some
workloads.

This change performs the static initializer evaluation and creation in batches,
once for each global in the evaluated IR memory. The existing code is maintained
as a last resort when the initializers are more complex than simple values in a
large aggregate. This should theoretically be NFC; there is no test because the
example case is massive. The existing test cases pass with this, as well as the llvm test
suite.
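
The cost difference can be modelled with immutable tuples standing in for Constant aggregates. This is a sketch of the complexity argument only: the function names are hypothetical and nothing here mirrors the GlobalOpt code.

```python
def commit_one_at_a_time(initializer, stores):
    """Rebuild the whole immutable aggregate for every store: O(n) per store."""
    current = tuple(initializer)
    for index, value in stores:
        elems = list(current)   # copy every element of the aggregate
        elems[index] = value    # modify one slot
        current = tuple(elems)  # create a fresh aggregate
    return current


def commit_in_batch(initializer, stores):
    """Apply all simple stores to one scratch copy, then build the aggregate once."""
    elems = list(initializer)
    for index, value in stores:
        elems[index] = value
    return tuple(elems)


if __name__ == "__main__":
    n = 1000
    stores = [(i, i * 2) for i in range(n)]
    zeros = [0] * n
    print(commit_one_at_a_time(zeros, stores) == commit_in_batch(zeros, stores))
```

With n stores into an n-element aggregate, the first form copies on the order of n^2 elements while the batched form copies on the order of n.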

To give an example, consider the following C++ code adapted from the clang
regression tests:
struct S {
 int n = 10;
 int m = 2 * n;
 S(int a) : n(a) {}
};

template<typename T>
struct U {
 T *r = &q;
 T q = 42;
 U *p = this;
};

U<S> e;

The global static constructor for 'e' will need to initialize 'r' and 'p' of
the outer struct, while also initializing the inner 'q' struct's 'n' and 'm'
members. This batch algorithm will simply use the general CommitValueTo() method
to handle the complex nested S struct initialization of 'q', before
processing the outermost members in a single batch. Using CommitValueTo() to
handle members in the outer struct is inefficient when the struct/array is
very large, as we end up creating and destroying constant arrays for each
initialization.
For the above case, we expect the following IR to be generated:

%struct.U = type { %struct.S*, %struct.S, %struct.U* }
%struct.S = type { i32, i32 }
@e = global %struct.U { %struct.S* gep inbounds (%struct.U, %struct.U* @e,
                                                 i64 0, i32 1),
                        %struct.S { i32 42, i32 84 }, %struct.U* @e }
The %struct.S { i32 42, i32 84 } inner initializer is treated as a complex
constant expression, while the other two elements of @e are "simple".

Differential Revision: https://reviews.llvm.org/D42612

llvm-svn: 323933
2018-01-31 23:56:07 +00:00
Matt Arsenault
10c80524c6 DAG: Fix not truncating when promoting bswap/bitreverse
These need to convert back to the original type, like any
other promotion.
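
An integer model of promoting a 16-bit bswap to 32 bits, assuming an i16 value promoted to i32. The helper names are hypothetical; masking stands in for the final truncate back to the original type.

```python
def bswap32(x):
    """Byte-swap a 32-bit value."""
    return int.from_bytes((x & 0xFFFFFFFF).to_bytes(4, "little"), "big")


def bswap16_promoted(x_promoted):
    """bswap of an i16 carried in an i32 register.

    The high 16 bits of x_promoted may hold anything (any-extend).
    """
    swapped = bswap32(x_promoted)  # the wanted bytes land in the high half
    shifted = swapped >> 16        # move them back down
    return shifted & 0xFFFF        # truncate to the original i16 type


if __name__ == "__main__":
    print(hex(bswap16_promoted(0x1234)))
    print(hex(bswap16_promoted(0xABCD1234)))
```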

llvm-svn: 323932
2018-01-31 23:54:16 +00:00
Evgeniy Stepanov
7831dfc680 Revert "[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations"
Miscompiles code. Testcase pending.

This reverts commit r323869.

llvm-svn: 323929
2018-01-31 22:55:19 +00:00
Matt Arsenault
21a429be13 Utils: Fix DomTree update for entry block
If SplitBlockPredecessors was used on a function entry block,
it wouldn't update the dominator tree.

llvm-svn: 323928
2018-01-31 22:54:37 +00:00
Matt Arsenault
19a16ce6f6 AMDGPU: Fix missing SCC def from s_xor_b64_term
llvm-svn: 323927
2018-01-31 22:54:27 +00:00
Amjad Aboud
671bdf42ed [AggressiveInstCombine] Fixed TruncCombine class to handle TruncInst leaf node correctly.
This covers the case where TruncInst leaf node is a constant expression.
See PR36121 for more details.

Differential Revision: https://reviews.llvm.org/D42622

llvm-svn: 323926
2018-01-31 22:39:05 +00:00
Craig Topper
3ce0662dba [X86] Make the type checks in detectAVX512USatPattern more robust
This code currently uses isSimple and getSizeInBits in an attempt to prune types. But isSimple will return true for any type that any target supports natively. I don't think that's a good way to prune types. I also don't think the dest element type checks are very robust since we didn't do an isSimple check on the dest type.

This patch adds a check for the input type being legal to the one caller that didn't already check that. Then we explicitly check that the element types for the destination are i8, i16, or i32.

Differential Revision: https://reviews.llvm.org/D42706

llvm-svn: 323924
2018-01-31 22:26:31 +00:00
Puyan Lotfi
d4c615be8c Followup on Proposal to move MIR physical register namespace to '$' sigil.
Discussed here:

http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html

In preparation for adding support for named vregs we are changing the sigil for
physical registers in MIR to '$' from '%'. This will prevent name clashes of
named physical registers with named vregs.
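
With separate sigils, a register token can be classified from its first character alone. The sketch below is a hypothetical classifier, not the MIR parser, and the register names in the usage are made-up examples.

```python
def classify_register(token):
    """Classify a MIR register token by its sigil ('$' physical, '%' virtual)."""
    if token.startswith("$"):
        return "physical"
    if token.startswith("%"):
        return "virtual"
    raise ValueError("not a register token: " + token)


if __name__ == "__main__":
    # A named vreg and a physical register with the same name no longer clash.
    print(classify_register("$rax"), classify_register("%rax"), classify_register("%0"))
```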

llvm-svn: 323922
2018-01-31 22:04:26 +00:00
Krzysztof Parzyszek
ec143adab5 [Hexagon] Rename HexagonISelLowering::getNode to getInstr, NFC
llvm-svn: 323916
2018-01-31 21:17:03 +00:00
Chandler Carruth
263be0967a [x86] Make the retpoline thunk insertion a machine function pass.
Summary:
This removes the need for a machine module pass using some deeply
questionable hacks. This should address PR36123 which is a case where in
full LTO the memory usage of a machine module pass actually ended up
being significant.

We should revert this on trunk as soon as we understand and fix the
memory usage issue, but we should include this in any backports of
retpolines themselves.

Reviewers: echristo, MatzeB

Subscribers: sanjoy, mcrosier, mehdi_amini, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D42726

llvm-svn: 323915
2018-01-31 20:56:37 +00:00
Krzysztof Parzyszek
b2def67068 [Hexagon] Implement HVX codegen for vector shifts
llvm-svn: 323914
2018-01-31 20:49:24 +00:00
Krzysztof Parzyszek
ff64aa793a [Hexagon] Handle ANY_EXTEND_VECTOR_INREG in lowering
llvm-svn: 323912
2018-01-31 20:48:11 +00:00
Krzysztof Parzyszek
d369dd4439 [Hexagon] Handle SETCC on vector pairs in lowering
llvm-svn: 323911
2018-01-31 20:46:55 +00:00