1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00
Commit Graph

37066 Commits

Author SHA1 Message Date
Craig Topper
8e26afd797 [X86] Replace a SmallVector used to pass 2 values to an ArrayRef parameter with a fixed size array. NFC
llvm-svn: 267377
2016-04-25 04:30:29 +00:00
Junmo Park
df2e2c23ce Minor code cleanups. NFC.
llvm-svn: 267375
2016-04-25 01:40:54 +00:00
Saleem Abdulrasool
13a27b5a96 ARM: fix __chkstk Frame Setup on WoA
This corrects the MI annotations for the stack adjustment following the __chkstk
invocation.  We were marking the original SP usage as a Def rather than Kill.
The (new) assigned value is the definition, the original reference is killed.

Adjust the ISelLowering to mark Kills and FrameSetup as well.

This partially resolves PR27480.

llvm-svn: 267361
2016-04-24 20:12:48 +00:00
Nick Lewycky
e703258f62 Fix typo in comment. NFC
llvm-svn: 267354
2016-04-24 17:55:57 +00:00
Simon Pilgrim
713ecd864c [X86][SSE] getTargetShuffleMaskIndices - dropped (unused) UNDEF handling
We aren't currently making use of this in any successful mask decode and its actually incorrect as it inserts the wrong number of SM_SentinelUndef mask elements.

llvm-svn: 267350
2016-04-24 16:49:53 +00:00
Simon Pilgrim
81b1ce90a3 [X86][SSE] Use range loop. NFCI.
llvm-svn: 267349
2016-04-24 16:33:35 +00:00
Craig Topper
043de2291d [Lanai] Use EVT::getEVTString() to print a type as a string instead of an enum encoding value.
llvm-svn: 267348
2016-04-24 16:30:51 +00:00
Simon Pilgrim
9549c9fa45 [X86][XOP] Fixed VPPERM permute op decoding (PR27472).
Fixed issue with VPPERM target shuffle mask decoding that was incorrectly masking off the 3-bit permute op with a 2-bit mask.

llvm-svn: 267346
2016-04-24 15:05:04 +00:00
Simon Pilgrim
1211a431ae [X86][SSE] Improved support for decoding target shuffle masks through bitcasts
Reused the ability to split constants of a type wider than the shuffle mask to work with masks generated from scalar constants transfered to xmm.

This fixes an issue preventing PSHUFB target shuffle masks decoding rematerialized scalar constants and also exposes the XOP VPPERM bug described in PR27472.

llvm-svn: 267343
2016-04-24 14:53:54 +00:00
Marcin Koscielnicki
6b999dbbc0 [SystemZ] [SSP] Add support for LOAD_STACK_GUARD.
This fixes PR22248 on s390x.  The previous attempt at this was D19101,
which was before LOAD_STACK_GUARD existed.  Compared to the previous
version, this always emits a rather ugly block of 4 instructions, involving
a thread pointer load that can't be shared with other potential users.
However, this is necessary for SSP - spilling the guard value (or thread
pointer used to load it) is counter to the goal, since it could be
overwritten along with the frame it protects.

Differential Revision: http://reviews.llvm.org/D19363

llvm-svn: 267340
2016-04-24 13:57:49 +00:00
Craig Topper
87f8c97986 [X86] Merge LowerCTLZ and LowerCTLZ_ZERO_UNDEF into a single function that branches internally for the one difference, allowing the rest of the code to be common. NFC
llvm-svn: 267331
2016-04-24 06:27:39 +00:00
Craig Topper
5e5f4bfbbd [X86] Node need to check if AVX512 is supported when lowering vector CTLZ. The CTLZ operation is only Custom for vectors if AVX512 is enabled so if a vector gets here AVX512 is implied. NFC
llvm-svn: 267330
2016-04-24 06:27:35 +00:00
Gerolf Hoflehner
f601ce3709 [MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098)
The original patch caused crashes because it could derefence a null pointer
for SelectionDAGTargetInfo for targets that do not define it.

Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:

- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math

llvm-svn: 267328
2016-04-24 05:14:01 +00:00
Craig Topper
7cda2fdad5 [X86] Remove isel patterns for selecting tzcnt/lzcnt from cmove/ne+cttz/ctlz. These are folded by DAG combine now.
llvm-svn: 267326
2016-04-24 04:38:34 +00:00
Craig Topper
3b13c64dcf Fix an assertion that can never fire because the condition ANDed with the string is just true or 1.
llvm-svn: 267324
2016-04-24 04:38:29 +00:00
Craig Topper
ff7f9dafdb Fix a couple assertions that can never fire because they just contained the text string which always evaluates to true. Add a ! so they'll evaluate to false.
llvm-svn: 267312
2016-04-24 02:01:25 +00:00
Craig Topper
917b9d7a47 [X86] Fix patterns that turn cmove/cmovne+ctlz/cttz into lzcnt/tzcnt instructions. Only one of the conditions should be valid for each pattern, not both. Update tests accordingly.
llvm-svn: 267311
2016-04-24 02:01:22 +00:00
Davide Italiano
734b1e3ae5 [MC/ELF] Implement support for GOTPCRELX/REX_GOTPCRELX.
The option to control the emission of the new relocations
is -relax-relocations (blatantly copied from GNU as).
It can't be enabled by default because it breaks relatively
recent versions of ld.bfd/ld.gold (late 2015).

llvm-svn: 267307
2016-04-24 01:03:57 +00:00
Davide Italiano
bed5ce5d10 [MC/ELF] Pass Fixup to getRelocType64.
In preparation for other changes.

llvm-svn: 267300
2016-04-23 22:26:31 +00:00
Renato Golin
d3f349208c Revert "[AArch64] Fix optimizeCondBranch logic."
This reverts commit r267206, as it broke self-hosting on AArch64.

llvm-svn: 267294
2016-04-23 19:30:52 +00:00
Craig Topper
aa283acffa [Hexagon] Set ctlz_zero_undef/cttz_zero_undef to Expand so LegalizeDAG will convert them to ctlz/cttz. Remove the now unneccessary isel patterns. NFC
llvm-svn: 267266
2016-04-23 02:49:31 +00:00
Craig Topper
0c0d9161b6 [NVPTX] Set ctlz_zero_undef to Expand so LegalizeDAG will convert it to ctlz. Remove the now unneccessary isel patterns. NFC
llvm-svn: 267265
2016-04-23 02:49:29 +00:00
Craig Topper
a338f25008 [WebAssembly] Set ctlz_zero_undef/cttz_zero_undef to Expand so LegalizeDAG will convert them to ctlz/cttz. Remove the now unneccessary isel patterns. NFC
llvm-svn: 267264
2016-04-23 02:49:25 +00:00
Matt Arsenault
3cbb6d7d74 AMDGPU: sext_inreg (srl x, K), vt -> bfe x, K, vt.Size
llvm-svn: 267244
2016-04-22 22:59:16 +00:00
Matt Arsenault
b274219b5d AMDGPU: Re-visit nodes in performAndCombine
This fixes test regressions when i64 loads/stores are made promote.

llvm-svn: 267240
2016-04-22 22:48:38 +00:00
Sriraman Tallam
bdcd2a52a2 Differential Revision: http://reviews.llvm.org/D19040
llvm-svn: 267229
2016-04-22 21:41:58 +00:00
Peter Collingbourne
df3f55c3ab CodeGen: Use PLT relocations for relative references to unnamed_addr functions.
The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual
functions defined in other DSOs. The unnamed_addr attribute means that the
function's address is not significant, so we're allowed to substitute it
with the address of a PLT entry.

Also includes a bonus feature: addends for COFF image-relative references.

Differential Revision: http://reviews.llvm.org/D17938

llvm-svn: 267211
2016-04-22 20:40:10 +00:00
Quentin Colombet
dfaf1211a5 [AArch64] Fix optimizeCondBranch logic.
The opcode for the optimized branch does not depend on the size
of the activate bits in the AND masks, but the AND opcode itself.
Indeed, we need to use a X or W variant based on the AND variant
not based on whether the mask fits into the related variant.
Otherwise, we may end up using the W variant of the optimized branch
for 64-bit register inputs!

This fixes the last make check verifier issues for AArch64: PR27479.

llvm-svn: 267206
2016-04-22 20:09:58 +00:00
Quentin Colombet
e89cf71564 [AArch64] When creating MRS instruction, make sure the destination register is
declared as a definition.

This fixes the machine verifier error for CodeGen/AArch64/nzcv-save.ll.

llvm-svn: 267185
2016-04-22 18:46:17 +00:00
Quentin Colombet
e4d08c7e23 [AArch64][AdvSIMDScalar] Update the kill flags correctly.
We used to simply set the kill flags to true when transforming a scalar
instruction to a vector one.
SrcScalar1 = copy SrcVector1
... = opScalar SrcScalar1
=>
SrcScalar1 = copy SrcVector1
... = opVector SrcVector1<kill>

This is obviously wrong. The proper update consists in:
1. Propagate the kill status from the copy to the new opVector
2. Reset the kill status on the copy, since the live-range of
   SrcVector1 got extended.

This fixes some of the machine verifier errors for AArch64 with make check.

llvm-svn: 267180
2016-04-22 18:09:14 +00:00
Krzysztof Parzyszek
905ab44058 [Hexagon] Use common Pat classes for selecting code for intrinsics
llvm-svn: 267178
2016-04-22 18:05:55 +00:00
Krzysztof Parzyszek
7f1281b445 [Hexagon] Properly close live range in HexagonBlockRanges
llvm-svn: 267173
2016-04-22 17:27:22 +00:00
Konstantin Zhuravlyov
613d1574ee [AMDGPU] Insert nop pass: take care of outstanding feedback
- Switch few loops to range-based for loops
- Fix nop insertion at the end of BB
- Fix formatting
- Check for endpgm

Differential Revision: http://reviews.llvm.org/D19380

llvm-svn: 267167
2016-04-22 17:04:51 +00:00
Zoran Jovanovic
cc65c7cf04 [mips][microMIPS] Revert commit r266861.
Commit r266861 was the reason for failing tests in LLVM test suite.

llvm-svn: 267166
2016-04-22 16:53:15 +00:00
Krzysztof Parzyszek
e1e99be481 [Hexagon] Teach mux expansion how to deal with undef predicates
llvm-svn: 267165
2016-04-22 16:47:01 +00:00
Krzysztof Parzyszek
dc41008bf4 [Hexagon] Add definitions for trap/pause instructions
Also add tests for other instructions from HexagonSystemInst.td.

llvm-svn: 267162
2016-04-22 16:25:00 +00:00
Nirav Dave
2e439346e9 Emit code16 in assembly in 16-bit mode
Summary:
When generating assembly using -m16 we must explicitly mark it as
16-bit. Emit .code16 at beginning of file. Fixes wrong results when
using -fno-integrated-as.

Reviewers: dwmw2

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19392

llvm-svn: 267152
2016-04-22 13:36:11 +00:00
Simon Dardis
c64ea03044 [mips] Fix select patterns for MIPS64
When targetting MIPS64R6 some of the patterns for select were guarded by a
broken predicate. The predicate was supposed to test if a constant value
could fit in a 16 bit zero-extended field. Instead the value was tested to
fit in a 16 bit sign-extended field. For negative constants of native word
width this resulted in wrong code generation.

Reviewers: vkalintiris, dsanders

Differential Review: http://reviews.llvm.org/D19378

llvm-svn: 267151
2016-04-22 13:19:22 +00:00
Vasileios Kalintiris
fb33c878e4 [mips] Fix a small typo that would leave BLTZC out of getAnalyzableBrOpc().'
llvm-svn: 267149
2016-04-22 13:05:51 +00:00
Hrvoje Varga
0102a446f7 [mips][microMIPS] Implement SLT, SLTI, SLTIU, SLTU microMIPS32r6 instructions
Differential Revision: http://reviews.llvm.org/D19354

llvm-svn: 267137
2016-04-22 11:18:40 +00:00
Zoran Jovanovic
057e3e3388 [mips][microMIPS] Add R_MICROMIPS_PC18_S3 relocation
Differential Revision: http://reviews.llvm.org/D15026

llvm-svn: 267130
2016-04-22 10:15:12 +00:00
Daniel Sanders
ffe901fc26 Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64
It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others.

llvm-svn: 267127
2016-04-22 09:37:26 +00:00
Ashutosh Nema
4becb7c51e [X86]: Changing cost for “TRUNCATE v16i32 to v16i8” in SSE4.1 mode.
Summary:
rL256194 transforms truncations between vectors of integers into PACKUS/PACKSS
operations during DAG combine. This generates better code for truncate, so cost
of truncate needs to be changed but looks like it got changed only in SSE2 table
Whereas this change is also applicable for SSE4.1, so the cost of truncate needs
to be changed for that as well. Cost of “TRUNCATE v16i32 to v16i8” & “TRUNCATE 
v16i16 to v16i8” should be same in SSE4.1 & SSE2 table. Removing their cost from
SSE4.1, so it will fall back to SSE2.

Reviewers: Simon Pilgrim
llvm-svn: 267123
2016-04-22 08:34:05 +00:00
Chris Dewhurst
028d2e0885 [Sparc] This provides support for itineraries on Sparc.
Specifically, itineraries for LEON processors has been added, along with several LEON processor Subtargets. Although currently all these targets are pretty much identical, support for features that will differ among these processors will be added in the very near future.

The different Instruction Itinerary Classes (IICs) added are sufficient to differentiate between the instruction timings used by LEON and, quite probably, by generic Sparc processors too, but the focus of the exercise has been for LEON processors, as the requirement of my project. If the IICs are not sufficient for other Sparc processor types and you want to add a new itinerary for one of those, it should be relatively trivial to adapt this.

As none of the LEON processors has Quad Floats, or is a Version 9 processor, none of those instructions have itinerary classes defined and revert to the default "NoItinerary" instruction itinerary.

Phabricator Review: http://reviews.llvm.org/D19359

llvm-svn: 267121
2016-04-22 08:17:17 +00:00
Chris Dewhurst
a841bbdbcd The following code would not work before this patch, due to the inability to take the address of a global object:
void func1() {

...
}

int main(int argc, char** argv) {

void (*pFunc)();
pFunc = &func1
pFunc();
...
}

Phabricator review: http://reviews.llvm.org/D19368

llvm-svn: 267120
2016-04-22 08:13:47 +00:00
Zlatko Buljan
b2404307e8 [mips][microMIPS] Implement DVP, EVP and JALRC.HB instructions
Differential Revision: http://reviews.llvm.org/D18687

llvm-svn: 267114
2016-04-22 06:44:34 +00:00
David Majnemer
4c6833cc3f Fix some spelling mistakes
llvm-svn: 267112
2016-04-22 06:37:48 +00:00
Craig Topper
05d7be2051 [SystemZ] Mark CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF as Expand instead of Custom since the custom logic just did what Expand does when CTTZ/CTLZ are Legal. NFC
llvm-svn: 267109
2016-04-22 05:29:58 +00:00
Craig Topper
9b0556c083 [Lanai] Set CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF to Expand instead of Legal so they will be converted to CTLZ/CTTZ by LegalizeDAG. Remove extra instructions that only existed to to contain patterns that match the zero_undef operations. NFC
llvm-svn: 267108
2016-04-22 05:13:01 +00:00
Craig Topper
07012bab8b [Lanai] Remove unused methods declarations. NFC
llvm-svn: 267107
2016-04-22 05:12:57 +00:00