1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00
Commit Graph

166352 Commits

Author SHA1 Message Date
Sander de Smalen
d90e3449c5 [AArch64][SVE] Asm: Support for predicated FP rounding instructions.
This patch also adds instructions for predicated FP square-root and
reciprocal exponent.

The added instructions are:
- FRINTI  Round to integral value (current FPCR rounding mode)
- FRINTX  Round to integral value (current FPCR rounding mode, signalling inexact)
- FRINTA  Round to integral value (to nearest, with ties away from zero)
- FRINTN  Round to integral value (to nearest, with ties to even)
- FRINTZ  Round to integral value (toward zero)
- FRINTM  Round to integral value (toward minus Infinity)
- FRINTP  Round to integral value (toward plus Infinity)
- FSQRT   Floating-point square root
- FRECPX  Floating-point reciprocal exponent

llvm-svn: 336322
2018-07-05 08:38:30 +00:00
Sjoerd Meijer
816d15d7e2 [ARM] ParallelDSP: only support i16 loads for now
We were miscompiling i8 loads, so reject them as unsupported narrow operations
for now.

Differential Revision: https://reviews.llvm.org/D48944

llvm-svn: 336319
2018-07-05 08:21:40 +00:00
Sander de Smalen
a44c2b47fb [AArch64][SVE] Asm: Support for signed/unsigned MIN/MAX/ABD
This patch implements the following varieties:

- Unpredicated signed max,   e.g. smax z0.h, z1.h, #-128
- Unpredicated signed min,   e.g. smin z0.h, z1.h, #-128

- Unpredicated unsigned max, e.g. umax z0.h, z1.h, #255
- Unpredicated unsigned min, e.g. umin z0.h, z1.h, #255

- Predicated signed max,     e.g. smax z0.h, p0/m, z0.h, z1.h
- Predicated signed min,     e.g. smin z0.h, p0/m, z0.h, z1.h
- Predicated signed abd,     e.g. sabd z0.h, p0/m, z0.h, z1.h

- Predicated unsigned max,   e.g. umax z0.h, p0/m, z0.h, z1.h
- Predicated unsigned min,   e.g. umin z0.h, p0/m, z0.h, z1.h
- Predicated unsigned abd,   e.g. uabd z0.h, p0/m, z0.h, z1.h

llvm-svn: 336317
2018-07-05 07:54:10 +00:00
Lei Huang
8805523414 [Power9] Optimize codgen for conversions of int to float128
Optimize code sequences for integer conversion to fp128 when the integer is a result of:
  * float->int
  * float->long
  * double->int
  * double->long

Differential Revision: https://reviews.llvm.org/D48429

llvm-svn: 336316
2018-07-05 07:46:01 +00:00
Craig Topper
852bb77ea4 [X86] Remove X86 specific scalar FMA intrinsics and upgrade to tart independent FMA and extractelement/insertelement.
llvm-svn: 336315
2018-07-05 06:52:55 +00:00
Lei Huang
6c9349ab49 [Power9][NFC] add back-end tests for passing homogeneous fp128 aggregates by value
Tests to verify that we are passing fp128 via VSX registers as per ABI.
These are related to clang commit rL336308.

Differential Revision: https://reviews.llvm.org/D48310

llvm-svn: 336314
2018-07-05 06:51:38 +00:00
Lei Huang
af2a11cc9c [Power9] Add tests for passing float128 in VSX reg for non-homogenous aggregates
Add missing testcase for rL336310

llvm-svn: 336313
2018-07-05 06:29:28 +00:00
Serge Pavlov
e883ffb7d8 [demangler] Avoid alignment warning
The alignment specified by a constant for the field
`BumpPointerAllocator::InitialBuffer` exceeded the alignment
guaranteed by `malloc` and `new` on Windows. This change set
the alignment value to that of `long double`, which is defined
by the used platform.

It fixes https://bugs.llvm.org/show_bug.cgi?id=37944.

Differential Revision: https://reviews.llvm.org/D48889

llvm-svn: 336311
2018-07-05 06:22:39 +00:00
Lei Huang
6d8417f993 [Power9] Ensure float128 in non-homogenous aggregates are passed via VSX reg
Non-homogenous aggregates are passed in consecutive GPRs, in GPRs and in memory,
or in memory. This patch ensures that float128 members of non-homogenous
aggregates are passed via VSX registers.

This is done via custom lowering a bitcast of a build_pari(i64,i64) to float128
to a new PPCISD node, BUILD_FP128.

Differential Revision: https://reviews.llvm.org/D48308

llvm-svn: 336310
2018-07-05 06:21:37 +00:00
Lei Huang
31dbbfb8fb [Power9]Legalize and emit code for quad-precision convert from single-precision
Legalize and emit code for quad-precision floating point operation conversion of
single-precision value to quad-precision.

Differential Revision: https://reviews.llvm.org/D47569

llvm-svn: 336307
2018-07-05 04:18:37 +00:00
Lei Huang
3f7002dfe9 [Power9] Implement float128 parameter passing and return values
This patch enable parameter passing and return by value for float128 types.
Passing aggregate/union which contain float128 members will be submitted in
subsequent patches.

Differential Revision: https://reviews.llvm.org/D47552

llvm-svn: 336306
2018-07-05 04:10:15 +00:00
Craig Topper
80bf706c39 [X86] Remove some isel patterns for X86ISD::SELECTS that specifically looked for the v1i1 mask to have come from a scalar_to_vector from GR8.
We have patterns for SELECTS that top at v1i1 and we have a pattern for (v1i1 (scalar_to_vector GR8)). The patterns being removed here do the same thing as the two other patterns combined so there is no need for them.

llvm-svn: 336305
2018-07-05 03:01:29 +00:00
Craig Topper
6dde0076c9 [X86] Add support for combining FMSUB/FNMADD/FNMSUB ISD nodes with an fneg input.
Previously we could only negate the FMADD opcodes. This used to be mostly ok when we lowered FMA intrinsics during lowering. But with the move to llvm.fma from target specific intrinsics, we can combine (fneg (fma)) to (fmsub) earlier. So if we start with (fneg (fma (fneg))) we would get stuck at (fmsub (fneg)).

This patch fixes that so we can also combine things like (fmsub (fneg)).

llvm-svn: 336304
2018-07-05 02:52:56 +00:00
Craig Topper
f1b7bcc114 [X86] Remove some of the packed FMA3 intrinsics since we no longer use them in clang.
There's a regression in here due to inability to combine fneg inputs of X86ISD::FMSUB/FNMSUB/FNMADD nodes.

More removals to come, but I wanted to stop and fix the regression that showed up in this first.

llvm-svn: 336303
2018-07-05 02:52:54 +00:00
Lei Huang
46c288db0b [Power9]Legalize and emit code for round & convert quad-precision values
Legalize and emit code for round & convert float128 to double precision and
single precision.

Differential Revision: https://reviews.llvm.org/D46997

llvm-svn: 336299
2018-07-04 21:59:16 +00:00
Aaron Ballman
1415773fba Silence an MSVC C4189 warning about a local variable being initialized but not used; NFC.
llvm-svn: 336298
2018-07-04 21:22:28 +00:00
Vladimir Stefanovic
f9066a105b [mips] Warn when crc, ginv, virt flags are used with too old revision
CRC and GINV ASE require revision 6, Virtualization requires revision 5.
Print a warning when revision is older than required.

Differential Revision: https://reviews.llvm.org/D48843

llvm-svn: 336296
2018-07-04 19:26:31 +00:00
Stefan Pintilie
f220e7cb76 [PowerPC] Replace the Post RA List Scheduler with the Machine Scheduler
We want to run the Machine Scheduler instead of the List Scheduler after RA.
  Checked with a performance run on a Power 9 machine with SPEC 2006 and while
  some benchmarks improved and others degraded the geomean was slightly improved
  with the Machine Scheduler.

  Differential Revision: https://reviews.llvm.org/D45265

llvm-svn: 336295
2018-07-04 18:54:25 +00:00
Jakub Kuderski
d5576313e7 [Dominators] Add DomTreeUpdater constructor from DT* and PDT*
Summary:
Previously, if a function accepts an optional DT pointer,
```
void Foo (.., DominatorTree * DT = nullptr) {
  ...
  if(DT)
    DomTreeUpdater(*DT, ...).insertEdge(A, B);
  if(DT){
    DomTreeUpdater DTU(*DT, ...);
    ... // Construct the update vector and applyUpdates
  }
  ...
  if(DT){
    DomTreeUpdater DTU(*DT, ...);
    ... // Construct the update vector and applyUpdates
  }
}
```
After this patch, it can be simplified as
```
void Foo (.., DominatorTree * DT = nullptr) {
  DomTreeUpdater DTU(DT, ...);
  ...
  DTU.insertEdge(A, B);
  if(DT){
    ... // Construct the update vector and applyUpdates
  }
  ...
  if(DT){
    ... // Construct the update vector and applyUpdates
  }
}
```
Patch by Chijun Sima <simachijun@gmail.com>.

Reviewers: kuhar, brzycki, dmgreen

Reviewed By: kuhar

Author: NutshellySima

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48923

llvm-svn: 336294
2018-07-04 18:37:15 +00:00
Sanjay Patel
680e014fa4 [InstCombine] allow narrowing of min/max/abs
We have bailout hacks based on min/max in various places in instcombine 
that shouldn't be necessary. The affected test was added for:
D48930 
...which is a consequence of the improvement in:
D48584 (https://reviews.llvm.org/rL336172)

I'm assuming the visitTrunc bailout in this patch was added specifically 
to avoid a change from SimplifyDemandedBits, so I'm just moving that 
below the EvaluateInDifferentType optimization. A narrow min/max is still
a min/max.

llvm-svn: 336293
2018-07-04 17:44:04 +00:00
Roman Lebedev
213b1bad80 [X86][BtVer2][MCA][NFC] Add CMPEQ dependency-breaking one-idioms tests
Summary: As per `Agner's Microarchitecture doc
(21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions)`,
these, like zero-idioms, are dependency-breaking,
although they produce ones and still consume resources.

FIXME: as discussed in D48877, llvm-mca handling is broken for these.

Reviewers: andreadb

Reviewed By: andreadb

Subscribers: gbedwell, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D48876

llvm-svn: 336292
2018-07-04 17:32:44 +00:00
Simon Pilgrim
d7c9b3a70c Fix some irregular whitespace/indentation. NFCI.
llvm-svn: 336291
2018-07-04 17:24:05 +00:00
Sanjay Patel
fa6effb718 [InstCombine] add value names to test; NFC
That makes it easier to mix and match lines into other tests.

llvm-svn: 336289
2018-07-04 16:56:35 +00:00
Volodymyr Turanskyy
e5db439699 [ARM] [Assembler] Support negative immediates: cover few missing cases
Support for negative immediates was implemented in
https://reviews.llvm.org/rL298380, however few instruction options were missing.

This change adds negative immediates support and respective tests
for the following:

ADD
ADDS
ADDS.W
AND.W
ANDS
BIC.W
BICS
BICS.W
SUB
SUBS
SUBS.W

Differential Revision: https://reviews.llvm.org/D48649

llvm-svn: 336286
2018-07-04 16:11:15 +00:00
Yvan Roux
af85981149 [MachineOutliner] Fix typo in getOutliningCandidateInfo function name
getOutlininingCandidateInfo -> getOutliningCandidateInfo

Differential Revision: https://reviews.llvm.org/D48867

llvm-svn: 336285
2018-07-04 15:37:08 +00:00
Paul Semel
ad69d123c9 [llvm-objdump] Add --file-headers (-f) option
llvm-svn: 336284
2018-07-04 15:25:03 +00:00
Simon Pilgrim
d35bdcd1f6 [X86][SSE] Add v16i16 shl x,c -> pmullw test
llvm-svn: 336277
2018-07-04 14:20:58 +00:00
Andrew Ng
e9c64a9995 [ThinLTO] Update ThinLTO cache file atimes when on Windows
ThinLTO cache file access times are used for expiration based pruning
and since Vista, file access times are not updated by Windows by
default:

https://blogs.technet.microsoft.com/filecab/2006/11/07/disabling-last-access-time-in-windows-vista-to-improve-ntfs-performance

This means on Windows, cache files are currently being pruned from
creation time. This change manually updates cache files that are
accessed by ThinLTO, when on Windows.

Patch by Owen Reynolds.

Differential Revision: https://reviews.llvm.org/D47266

llvm-svn: 336276
2018-07-04 14:17:10 +00:00
Sander de Smalen
561fb22163 [AArch64][SVE] Asm: Support for reversed subtract (SUBR) instruction.
This patch adds both a vector and an immediate form, e.g.                  
                                                                           
- Vector form:                                                             
                                                                           
    subr z0.h, p0/m, z0.h, z1.h                                            
                                                                           
  subtract active elements of z0 from z1, and store the result in z0.      
                                                                           
- Immediate form:                                                          
                                                                           
    subr z0.h, z0.h, #255                                                  
                                                                           
  subtract elements of z0, and store the result in z0.

llvm-svn: 336274
2018-07-04 14:05:33 +00:00
Simon Pilgrim
141d1c9004 [X86][SSE] Add SSE2 target to some shift tests
Show the difference in behaviour cf SSE41 (no PMULLD, PBLENDW etc.)

Raised by D48936

llvm-svn: 336271
2018-07-04 13:58:13 +00:00
Gabor Buella
0b1675d3f5 NFC - Various typo fixes in tests
llvm-svn: 336268
2018-07-04 13:28:39 +00:00
Sander de Smalen
cbd4b2fcb6 [AArch64][SVE] Asm: Support for instructions to set/read FFR.
Includes instructions to read the First-Faulting Register (FFR):
- RDFFR (unpredicated)
    rdffr   p0.b
- RDFFR (predicated)
    rdffr   p0.b, p0/z
- RDFFRS (predicated, sets condition flags)
    rdffr   p0.b, p0/z

Includes instructions to set/write the FFR:
- SETFFR (no arguments, sets the FFR to all true)
    setffr
- WRFFR  (unpredicated)
    wrffr   p0.b

llvm-svn: 336267
2018-07-04 12:58:46 +00:00
Clement Courbet
e3160346fa [llvm-exegesis] Remove dead comment.
llvm-svn: 336266
2018-07-04 12:31:00 +00:00
Sander de Smalen
9918de7ecc [AArch64][SVE] Asm: Support for FP conversion instructions.
The variants added are:

- fcvt   (FP convert precision)
- scvtf  (signed int -> FP) 
- ucvtf  (unsigned int -> FP) 
- fcvtzs (FP -> signed int (round to zero))
- fcvtzu (FP -> unsigned int (round to zero))

For example:
  fcvt   z0.h, p0/m, z0.s  (single- to half-precision FP) 
  scvtf  z0.h, p0/m, z0.s  (32-bit int to half-precision FP) 
  ucvtf  z0.h, p0/m, z0.s  (32-bit unsigned int to half-precision FP) 
  fcvtzs z0.s, p0/m, z0.h  (half-precision FP to 32-bit int)
  fcvtzu z0.s, p0/m, z0.h  (half-precision FP to 32-bit unsigned int)

llvm-svn: 336265
2018-07-04 12:13:17 +00:00
Max Kazantsev
97daf8f423 [NFC] Add test that shows that InstCombine can do better
llvm-svn: 336258
2018-07-04 10:32:02 +00:00
Anastasis Grammenos
6f5c420176 [DebugInfo][LoopVectorize] Preserve DL in generated phi instruction
When creating `phi` instructions to resume at the scalar part of the loop,
copy the DebugLoc from the original phi over to the new one.

Differential Revision: https://reviews.llvm.org/D48769

llvm-svn: 336256
2018-07-04 10:16:55 +00:00
Anastasis Grammenos
72ae081881 [DebugInfo][InstCombine] Preserve DI after combining zext
When zext is EvaluatedInDifferentType, InstCombine
drops the dbg.value intrinsic. This patch tries to
preserve said DI, by inserting the zext's old DI in the
resulting instruction. (Only for integer type for now)

Differential Revision: https://reviews.llvm.org/D48331

llvm-svn: 336254
2018-07-04 09:55:46 +00:00
Simon Pilgrim
900e069df2 [X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values (REAPPLIED)
We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends. This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case.

Reapplied with a fixed (extra null tests) version of rL336113 after reversion in rL336189 - extra test case added at rL336247.

llvm-svn: 336250
2018-07-04 09:12:48 +00:00
Simon Pilgrim
c411e5c9c8 [X86][SSE] Add reduced crash test case for r336113 - [X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values
The patch was reverted at r336189 due to crashes

llvm-svn: 336247
2018-07-04 08:55:23 +00:00
Sander de Smalen
813960f161 [AArch64][SVE] Asm: Support for SVE condition code aliases
SVE overloads the AArch64 PSTATE condition flags and introduces
a set of condition code aliases for the assembler. The 
details are described in section 2.2 of the architecture
reference manual supplement for SVE.

In short:

  SVE alias =>  AArch64 name
  --------------------------
  NONE      => EQ
  ANY       => NE
  NLAST     => HS
  LAST      => LO
  FIRST     => MI
  NFRST     => PL
  PMORE     => HI
  PLAST     => LS
  TCONT     => GE
  TSTOP     => LT

Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D48869

llvm-svn: 336245
2018-07-04 08:50:49 +00:00
Max Kazantsev
0edbc15caf [ImplicitNullChecks] Check for rewrite of register used in 'test' instruction
The following code pattern:

       mov %rax, %rcx
       test %rax, %rax
       %rax = ....
       je  throw_npe
       mov(%rcx), %r9
       mov(%rax), %r10

gets transformed into the following incorrect code after implicit null check pass:
        mov %rax, %rcx
       %rax = ....
       faulting_load_op("movl (%rax), %r10", throw_npe)
       mov(%rcx), %r9

For implicit null check pass, if the register that is checked for null value (ie, the register used in the 'test' instruction) is written into before the condition jump, we should avoid doing the optimization.

Patch by Surya Kumari Jangala!

Differential Revision: https://reviews.llvm.org/D48627
Reviewed By: skatkov

llvm-svn: 336241
2018-07-04 08:01:26 +00:00
Fangrui Song
7dbaffa4bd [Support] Remove SaveOr which is no longer used
llvm-svn: 336237
2018-07-03 23:31:19 +00:00
Jacques Pienaar
a56fb83f33 [lanai] Handle atomic load of i8 like regular load.
Loads and stores less than 64-bits are already atomic, this adds support for a special case thereof. This needs to be expanded.

llvm-svn: 336236
2018-07-03 22:57:51 +00:00
Fangrui Song
5e326ef514 [X86][AsmParser] Fix inconsistent declaration parameter name in r336218
llvm-svn: 336232
2018-07-03 21:40:03 +00:00
Benjamin Kramer
1fc1daf8d2 [NVPTX] Expand v2f16 INSERT_VECTOR_ELT
Vectorization can create them.

llvm-svn: 336227
2018-07-03 20:40:04 +00:00
Craig Topper
4dd6e9e3e3 [X86] Remove repeated 'the' from multiple comments that have been copy and pasted. NFC
llvm-svn: 336226
2018-07-03 20:39:55 +00:00
Roman Lebedev
c5f5fae430 [X86] Add tests for low/high bit clearing with different attributes.
D48768 may turn some of these into shifts.

Reviewers: spatel

Reviewed By: spatel

Subscribers: spatel, RKSimon, llvm-commits, craig.topper

Differential Revision: https://reviews.llvm.org/D48767

llvm-svn: 336224
2018-07-03 19:12:37 +00:00
Fangrui Song
ee91dd12b7 [ARM] Fix inconsistent declaration parameter name in r336195
llvm-svn: 336223
2018-07-03 19:12:27 +00:00
Fangrui Song
1b8bb0dc71 [AArch64] Make function parameter names in declarations match those of definitions
llvm-svn: 336222
2018-07-03 19:07:53 +00:00
Sanjay Patel
626487bab5 [InstCombine] add tests for shuffle+binop with constant op1; NFC
This adds coverage for a planned enhancement for ConstantExpr::getBinOpIdentity() noted in D48830.

llvm-svn: 336220
2018-07-03 18:43:46 +00:00