1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 12:41:49 +01:00

50122 Commits

Author SHA1 Message Date
Diana Picus
47cdb935e0 [ARM GlobalISel] Thumb2: casts between int and ptr
Mark as legal and add tests. Nothing special to do.

llvm-svn: 349147
2018-12-14 13:45:38 +00:00
Diana Picus
36a62f59bc [ARM GlobalISel] Minor refactoring. NFCI
Refactor the ARMInstructionSelector to cache some opcodes in the
constructor instead of checking all the time if we're in ARM or Thumb
mode.

llvm-svn: 349143
2018-12-14 12:37:24 +00:00
Diana Picus
1f523d85de [ARM GlobalISel] Allow simple binary ops in Thumb2
Mark G_ADD, G_SUB, G_MUL, G_AND, G_OR and G_XOR as legal for both ARM
and Thumb2.

Extract the legalizer tests for these opcodes into another file.

Add tests for the instruction selector.

llvm-svn: 349142
2018-12-14 11:58:14 +00:00
Craig Topper
486c1ccebe [DAGCombiner][X86] Prevent visitSIGN_EXTEND from returning N when (sext (setcc)) already has the target desired type for the setcc
Summary:
If the setcc already has the target desired type we can reach the getSetCC/getSExtOrTrunc after the MatchingVecType check with the exact same types as the nodes we started with. This causes those causes VsetCC to be CSEd to N0 and the getSExtOrTrunc will CSE to N. When we return N, the caller will think that meant we called CombineTo and did our own worklist management. But that's not what happened. This prevents target hooks from being called for the node.

To fix this, I've now returned SDValue if the setcc is already the desired type. But to avoid some regressions in X86 I've had to disable one of the target combines that wasn't being reached before in the case of a (sext (setcc)). If we get vector widening legalization enabled that entire function will be deleted anyway so hopefully this is only for the short term.

Reviewers: RKSimon, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D55459

llvm-svn: 349137
2018-12-14 08:28:24 +00:00
Craig Topper
5cd99cd6e6 [X86] Demote EmitTest to a helper function of EmitCmp. Route all callers except EmitCmp through EmitCmp.
This requires the two callers to manifest a 0 to make EmitCmp call EmitTest.

I'm looking into changing how we combine TEST and flag setting instructions to not be part of lowering. And instead be part of DAG combine or isel. Which will mean EmitTest will probably become gutted and maybe disappear entirely.

llvm-svn: 349094
2018-12-13 23:55:30 +00:00
Evandro Menezes
6b61a418d6 [AArch64] Fix Exynos predicates (NFC)
Fix the logic in the definition of the `ExynosShiftExPred` as a more
specific version of `ExynosShiftPred`.  But, since `ExynosShiftExPred` is
not used yet, this change has NFC.

llvm-svn: 349091
2018-12-13 23:19:46 +00:00
Aakanksha Patil
ee2b4d8ed1 Revert r348971: [AMDGPU] Support for "uniform-work-group-size" attribute
This patch breaks RADV (and probably RadeonSI as well)

llvm-svn: 349084
2018-12-13 21:23:12 +00:00
Matt Arsenault
c25740dde7 AMDGPU/GlobalISel: Legalize/regbankselect block_addr
llvm-svn: 349081
2018-12-13 20:34:15 +00:00
Mircea Trofin
8fb4b5dbb4 [llvm] Address base discriminator overflow in X86DiscriminateMemOps
Summary:
Macros are expanded on a single line. In case of large expansions,
with sufficiently many instructions with memory operands (and when
-fdebug-info-for-profiling is requested), we may be unable to generate
new base discriminator values - new values overflow (base
discriminators may not be larger than 2^12).

This CL warns instead of asserting in such a case. A subsequent CL
will add APIs to check for overflow before creating new debug info.

See https://bugs.llvm.org/show_bug.cgi?id=39890

Reviewers: davidxl, wmi, gbedwell

Reviewed By: davidxl

Subscribers: aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D55643

llvm-svn: 349075
2018-12-13 19:40:59 +00:00
Simon Pilgrim
240f871115 [X86][SSE] Add SSE vector imm/var shift support to SimplifyDemandedVectorEltsForTargetNode
llvm-svn: 349057
2018-12-13 16:39:29 +00:00
Simon Pilgrim
e579f1bff3 [X86][SSE] Fix all remaining modulo vector rotation amounts (PR38243)
There's still a couple of minor SimplifyDemandedElts regressions in some of the shift amount splats that will be fixed in future patches.

llvm-svn: 349052
2018-12-13 15:50:31 +00:00
Daniel Cederman
e3cb6ccb12 [Sparc] Add membar assembler tags
Summary: The Sparc V9 membar instruction can enforce different types of
memory orderings depending on the value in its immediate field.  In the
architectural manual the type is selected by combining different assembler
tags into a mask. This patch adds support for these tags.

Reviewers: jyknight, venkatra, brad

Reviewed By: jyknight

Subscribers: fedor.sergeev, jrtc27, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D53491

llvm-svn: 349048
2018-12-13 15:29:12 +00:00
Simon Pilgrim
8f5885f53e [X86][SSE] Fix modulo rotation amounts for v8i16/v16i16/v4i32 (PR38243)
llvm-svn: 349047
2018-12-13 15:23:09 +00:00
Daniel Cederman
a4358d0136 [Sparc] Use float register for integer constrained with "f" in inline asm
Summary:
Constraining an integer value to a floating point register using "f"
causes an llvm_unreachable to trigger. This patch allows i32 integers
to be placed in a single precision float register and i64 integers to
be placed in a double precision float register. This matches the behavior
of GCC.

For other types the llvm_unreachable is removed to instead trigger an
error message that points out the offending line.

Reviewers: jyknight, venkatra

Reviewed By: jyknight

Subscribers: eraman, fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D51614

llvm-svn: 349045
2018-12-13 15:13:29 +00:00
Jinsong Ji
ba49832db6 [PowerPC][NFC] Sorting out Pseudo related classes to avoid confusion
There are several Pseudo in PowerPC backend. 
eg:

* ISel Pseudo-instructions , which has let usesCustomInserter=1 in td 
ExpandISelPseudos -> EmitInstrWithCustomInserter will deal with them.
* Post-RA pseudo instruction, which has let isPseudo = 1 in td, or Standard pseudo (SUBREG_TO_REG,COPY etc.) 
ExpandPostRAPseudos -> expandPostRAPseudo will expand them
* Multi-instruction pseudo operations will expand them PPCAsmPrinter::EmitInstruction
* Pseudo instruction in CodeEmitter, which has encoding of 0.

Currently, in td files, especially PPCInstrVSX.td, 
we did not distinguish Post-RA pseudo instruction and Pseudo instruction in CodeEmitter very clearly.

This patch is to

* Rename Pseudo<> class to PPCEmitTimePseudo, which means encoding of 0 in CodeEmitter
* Introduce new class PPCPostRAExpPseudo <> for previous PostRA Pseudo
* Introduce new class PPCCustomInserterPseudo <> for previous Isel Pseudo

Differential Revision: https://reviews.llvm.org/D55143

llvm-svn: 349044
2018-12-13 15:12:57 +00:00
Simon Pilgrim
bab6760889 [X86][SSE] Merge the vXi16/vXi32 vector rotation expansion cases. NFCI.
Merged the repeated code into a single if().

llvm-svn: 349040
2018-12-13 14:51:28 +00:00
Jonas Paulsson
6239df0eb2 [SystemZ] Pass copy-hinted regs first from getRegAllocationHints().
When computing register allocation hints for a GRX32Bit register, make sure
that any of the hinted registers that are also copy hints are returned first
in the list.

Review: Ulrich Weigand.
llvm-svn: 349037
2018-12-13 14:37:05 +00:00
Simon Pilgrim
b4fa86589e [X86][BWI] Don't custom lower vXi8 rotations.
We always expand to shifts anyhow - test changes are just different scheduling only.

llvm-svn: 349034
2018-12-13 13:44:33 +00:00
Chen Zheng
c2a6277f77 [PowerPC] intrinsic llvm.eh.sjlj.setjmp should not have flag isBarrier.
Differential Revision: https://reviews.llvm.org/D55499

llvm-svn: 349029
2018-12-13 12:25:20 +00:00
Simon Pilgrim
1e9c4bce6b [DAGCombine] Moved X86 rotate_amount % bitwidth == 0 early out to DAGCombiner
Remove common code from custom lowering (code is still safe if somehow a zero value gets used).

llvm-svn: 349028
2018-12-13 12:23:32 +00:00
Diana Picus
5026fe4d53 [ARM GlobalISel] Support exts and truncs for Thumb2
Mark G_SEXT, G_ZEXT and G_ANYEXT to 32 bits as legal and add support for
them in the instruction selector. This uses handwritten code again
because the patterns that are generated with TableGen are tuned for what
the DAG combiner would produce and not for simple sext/zext nodes.
Luckily, we only need to update the opcodes to use the Thumb2 variants,
everything else can be reused from ARM.

llvm-svn: 349026
2018-12-13 12:06:54 +00:00
Simon Pilgrim
9527e2e426 [TargetLowering] Add ISD::ROTL/ROTR vector expansion
Move existing rotation expansion code into TargetLowering and set it up for vectors as well.

Ideally this would share more of the funnel shift expansion, but we handle the shift amount modulo quite differently at the moment.

Begun removing x86 vector rotate custom lowering to use the expansion.

llvm-svn: 349025
2018-12-13 11:20:48 +00:00
Alex Bradbury
12e6230097 [RISCV] Add support for the various RISC-V FMA instruction variants
Adds support for the various RISC-V FMA instructions (fmadd, fmsub, fnmsub, fnmadd).

The criteria for choosing whether a fused add or subtract is used, as well as
whether the product is negated or not, is whether some of the arguments to the
llvm.fma.* intrinsic are negated or not. In the tests, extraneous fadd
instructions were added to avoid the negation being performed using a xor
trick, which prevented the proper FMA forms from being selected and thus
tested.

The FMA instruction patterns might seem incorrect (e.g., fnmadd: -rs1 * rs2 -
rs3), but they should be correct. The misleading names were inherited from
MIPS, where the negation happens after computing the sum.

The llvm.fmuladd.* intrinsics still do not generate RISC-V FMA instructions,
as that depends on TargetLowering::isFMAFasterthanFMulAndFAdd.

Some comments in the test files about what type of instructions are there
tested were updated, to better reflect the current content of those test
files.

Differential Revision: https://reviews.llvm.org/D54205
Patch by Luís Marques.

llvm-svn: 349023
2018-12-13 10:49:05 +00:00
Arnaud A. de Grandmaison
9ed5c53397 [AArch64] Catch some more CMN opportunities.
Fixes https://bugs.llvm.org/show_bug.cgi?id=33486

llvm-svn: 349022
2018-12-13 10:31:32 +00:00
Matt Arsenault
35d66b880e AMDGPU/GlobalISel: Legalize f64 fadd/fmul
llvm-svn: 349014
2018-12-13 08:27:48 +00:00
Matt Arsenault
bd00984550 AMDGPU/GlobalISel: RegBankSelect some simple operations
llvm-svn: 349012
2018-12-13 08:23:51 +00:00
Craig Topper
87cbe51492 [X86] Remove assert leftover from when i1 was a legal type. Add more accurate assert. NFC
llvm-svn: 349007
2018-12-13 06:14:25 +00:00
Stanislav Mekhanoshin
f7aedb309c [AMDGPU] Fix build failure, second attempt
Some compilers complain that variable is captured and some
complain when it is not. Switch to [&].

llvm-svn: 349006
2018-12-13 05:52:11 +00:00
Stanislav Mekhanoshin
edd6d8c36b [AMDGPU] Fix build failure
Fixed error 'lambda capture 'CondReg' is not required to be captured
for this use'.

llvm-svn: 349005
2018-12-13 05:21:25 +00:00
Stanislav Mekhanoshin
198d2e45b0 [AMDGPU] Simplify negated condition
Optimize sequence:

  %sel = V_CNDMASK_B32_e64 0, 1, %cc
  %cmp = V_CMP_NE_U32 1, %1
  $vcc = S_AND_B64 $exec, %cmp
  S_CBRANCH_VCC[N]Z
=>
  $vcc = S_ANDN2_B64 $exec, %cc
  S_CBRANCH_VCC[N]Z

It is the negation pattern inserted by DAGCombiner::visitBRCOND() in the
rebuildSetCC().

Differential Revision: https://reviews.llvm.org/D55402

llvm-svn: 349003
2018-12-13 03:17:40 +00:00
Craig Topper
43dcc4cc5f [X86] Don't emit MULX by default with BMI2
MULX has somewhat improved register allocation constraints compared to the legacy MUL instruction. Both output registers are encoded instead of fixed to EAX/EDX, but EDX is used as input. It also doesn't touch flags. Unfortunately, the encoding is longer.

Prefering it whenever BMI2 is enabled is probably not optimal. Choosing it should somehow be a function of register allocation constraints like converting adds to three address. gcc and icc definitely don't pick MULX by default. Not sure what if any rules they have for using it.

Differential Revision: https://reviews.llvm.org/D55565

llvm-svn: 348975
2018-12-12 21:21:31 +00:00
Aakanksha Patil
57ff848c4d [AMDGPU] Support for "uniform-work-group-size" attribute
Updated the annotate-kernel-features pass to support the propagation of uniform-work-group attribute from the kernel to the called functions. Once this pass is run, all kernels, even the ones which initially did not have the attribute, will be able to indicate weather or not they have uniform work group size depending on the value of the attribute. 

Differential Revision: https://reviews.llvm.org/D50200

llvm-svn: 348971
2018-12-12 20:49:17 +00:00
Scott Linder
2405f803ac [AMDGPU] Emit MessagePack HSA Metadata for v3 code object
Continue to present HSA metadata as YAML in ASM and when output by tools
(e.g. llvm-readobj), but encode it in Messagepack in the code object.

Differential Revision: https://reviews.llvm.org/D48179

llvm-svn: 348963
2018-12-12 19:39:27 +00:00
Craig Topper
0f93413ef8 [X86] Emit SBB instead of SETCC_CARRY from LowerSELECT. Break false dependency on the SBB input.
I'm hoping we can just replace SETCC_CARRY with SBB. This is another step towards that.

I've explicitly used zero as the input to the setcc to avoid a false dependency that we've had with the SETCC_CARRY. I changed one of the patterns that used NEG to instead use an explicit compare with 0 on the LHS. We needed the zero anyway to avoid the false dependency. The negate would clobber its input register. By using a CMP we can avoid that which could be useful.

Differential Revision: https://reviews.llvm.org/D55414

llvm-svn: 348959
2018-12-12 19:20:21 +00:00
Simon Pilgrim
800ca97d88 [SelectionDAG] Add a generic isSplatValue function
This patch introduces a generic function to determine whether a given vector type is known to be a splat value for the specified demanded elements, recursing up the DAG looking for BUILD_VECTOR or VECTOR_SHUFFLE splat patterns.

It also keeps track of the elements that are known to be UNDEF - it returns true if all the demanded elements are UNDEF (as this may be useful under some circumstances), so this needs to be handled by the caller.

A wrapper variant is also provided that doesn't take the DemandedElts or UndefElts arguments for cases where we just want to know if the SDValue is a splat or not (with/without UNDEFS).

I had hoped to completely remove the X86 local version of this function, but I'm seeing some regressions in shift/rotate codegen that will take a little longer to fix and I hope to get this in sooner so I can continue work on PR38243 which needs more capable splat detection.

Differential Revision: https://reviews.llvm.org/D55426

llvm-svn: 348953
2018-12-12 18:32:29 +00:00
Artem Belevich
f4a6455e0c [NVPTX] do not rely on cached subtarget info.
If a module has function references, but no functions
themselves, we may end up never calling runOnMachineFunction
and therefore would never initialize nvptxSubtarget field
which would eventually cause a crash.

Instead of relying on nvptxSubtarget being initialized by
one of the methods, retrieve subtarget info directly.

Differential Revision: https://reviews.llvm.org/D55580

llvm-svn: 348952
2018-12-12 18:31:04 +00:00
Sanjay Patel
cfc854315e [x86] allow 8-bit adds to be promoted by convertToThreeAddress() to form LEA
This extends the code that handles 16-bit add promotion to form LEA to also allow 8-bit adds. 
That allows us to combine add ops with register moves and save some instructions. This is 
another step towards allowing add truncation in generic DAGCombiner (see D54640).

Differential Revision: https://reviews.llvm.org/D55494

llvm-svn: 348946
2018-12-12 17:58:27 +00:00
Neil Henning
c6adc5adda [AMDGPU] Extend the SI Load/Store optimizer to combine more things.
I've extended the load/store optimizer to be able to produce dwordx3
loads and stores, This change allows many more load/stores to be combined,
and results in much more optimal code for our hardware.

Differential Revision: https://reviews.llvm.org/D54042

llvm-svn: 348937
2018-12-12 16:15:21 +00:00
Simon Atanasyan
65c370858c [mips] Enable using of integrated assembler in all cases.
llvm-svn: 348934
2018-12-12 15:32:03 +00:00
Piotr Sobczak
9daba5888d [AMDGPU] Set metadata access for explicit section
Summary:
This patch provides a means to set Metadata section kind
for a global variable, if its explicit section name is
prefixed with ".AMDGPU.metadata."
This could be useful to make the global variable go to
an ELF section without any section flags set.

Reviewers: dstuttard, tpr, kzhuravl, nhaehnle, t-tye

Reviewed By: dstuttard, kzhuravl

Subscribers: llvm-commits, arsenm, jvesely, wdng, yaxunl, t-tye

Differential Revision: https://reviews.llvm.org/D55267

llvm-svn: 348922
2018-12-12 11:20:04 +00:00
Diana Picus
8868d7ab14 [ARM GlobalISel] Select load/store for Thumb2
Unfortunately we can't use TableGen for this because it doesn't yet
support predicates on the source pattern root. Therefore, add a bit of
handwritten code to the instruction selector to handle the most basic
cases.

Also mark them as legal and extract their legalizer test cases to a new
test file.

llvm-svn: 348920
2018-12-12 10:32:15 +00:00
Jonas Paulsson
f3854426fe [SystemZ] Minor cleanup of SchedModels
Some fixes of a few InstRWs for z13 and z14.

Review: Ulrich Weigand
llvm-svn: 348917
2018-12-12 08:26:24 +00:00
Craig Topper
3be120f5bd [X86] Combine vpmovdw+vpacksswb into vpmovdb.
This is similar to the combine we already have for vpmovdw+vpackuswb.

llvm-svn: 348910
2018-12-12 05:56:01 +00:00
Mandeep Singh Grang
cb7f7e69ee [COFF, ARM64] Emit COFF function header
Summary:
Emit COFF header when printing out the function. This is important as the
header contains two important pieces of information: the storage class for the
symbol and the symbol type information. This bit of information is required for
the linker to correctly identify the type of symbol that it is dealing with.

This patch mimics X86 and ARM COFF behavior for function header emission.

Reviewers: rnk, mstorsjo, compnerd, TomTan, ssijaric

Reviewed By: mstorsjo

Subscribers: dmajor, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D55535

llvm-svn: 348875
2018-12-11 18:36:14 +00:00
Craig Topper
0b7311fee6 Fix not correct imm operand assertion for SUB32ri in X86CondBrFolding::analyzeCompare
Summary:
When doing X86CondBrFolding::analyzeCompare, it will meet the SUB32ri instruction as below to use the global address for its operand,
  %733:gr32 = SUB32ri %62:gr32(tied-def 0), @img2buf_normal, implicit-def $eflags
  JNE_1 %bb.41, implicit $eflags

so the assertion "assert(MI.getOperand(ValueIndex).isImm() && "Expecting Imm operand")" is not correct and change the assert to if make X86CondBrFolding::analyzeCompare return false as not finding the compare for this

Patch by Jianping Chen

Reviewers: smaslov, LuoYuanke, liutianle, Jianping

Reviewed By: Jianping

Subscribers: lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D54250

llvm-svn: 348853
2018-12-11 15:32:14 +00:00
Sanjay Patel
b4996575fa [x86] clean up code for converting 16-bit ops to LEA; NFC
As discussed in D55494, we want to extend this to handle 8-bit
ops too, but that could be extended further to enable this on
32-bit systems too.

llvm-svn: 348851
2018-12-11 15:29:40 +00:00
Sanjay Patel
f95897fead [x86] remove dead code for 16-bit LEA formation; NFC
As discussed in:
D55494
...this code has been disabled/dead for a long time (the code references
Athlon and Pentium 4), and there's almost no chance that it will be used 
given the last decade of uarch evolution. Also, in SDAG we promote 16-bit 
ops to 32-bit, so there's almost no way to test this code any more.

llvm-svn: 348845
2018-12-11 14:05:03 +00:00
Martell Malone
c0aecf6cb0 [PPC][NFC] store operands are dst not src
Differential Revision: https://reviews.llvm.org/D55502

llvm-svn: 348826
2018-12-11 03:14:56 +00:00
Heejin Ahn
73b2f90c13 [WebAssembly] Add '.eventtype' directive support
Summary:
This patch supports `.eventtype` directive printing and parsing in the
same syntax with `.functype`.

Reviewers: aardappel, sbc100

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D55353

llvm-svn: 348818
2018-12-11 01:11:04 +00:00
Heejin Ahn
24418123e0 [WebAssembly] TargetStreamer cleanup (NFC)
Summary:
- Unify mixed argument names (`Symbol` and `Sym`) to `Sym`
- Changed `MCSymbolWasm*` argument of `emit***` functions to `const
  MCSymbolWasm*`. It seems not very intuitive that emit function in the
  streamer modifies symbol contents.
- Moved empty function bodies to the header
- clang-format

Reviewers: aardappel, dschuff, sbc100

Subscribers: jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D55347

llvm-svn: 348816
2018-12-11 00:53:59 +00:00