1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00
Commit Graph

193846 Commits

Author SHA1 Message Date
Qiu Chaofan
3d8d872b5b [NFC] [PowerPC] Prepare test for FMA negate check
This patch adds a test file, covering outputs when some operands in FMA
is negative.
2020-03-23 11:40:07 +08:00
David Blaikie
676f67456e Roll an expression into an assert to remove the need for a (void) cast. 2020-03-22 18:18:27 -07:00
Craig Topper
a6bf9abdbd [X86] Remove maximum vector length limit from combineBasicSADPattern.
createPSADBW uses SplitsOpsAndApply so should be able to handle
any size.

Restrict the extract result type to i32 or i64 since that's what
we have coverage for today and probably matches what the
isSimple() check gave us before.

Differential Revision: https://reviews.llvm.org/D76560
2020-03-22 15:02:05 -07:00
Sylvestre Ledru
a40c68ffd7 doc: use the right url to bugzilla 2020-03-22 22:49:40 +01:00
Sylvestre Ledru
8598ae94d7 Doc: Links should use https 2020-03-22 22:49:33 +01:00
Florian Hahn
87bac6d16f [SCCP] Add a few more tests for conditional propagation,XOR. 2020-03-22 21:43:33 +00:00
Sylvestre Ledru
0fad3bc3c0 update of the llvm doc: we moved to git 2020-03-22 22:36:21 +01:00
Craig Topper
83013c5746 [X86] More accurately model the cost of horizontal reductions.
This patch attempts to more accurately model the reduction of
power of 2 vectors of types we natively support. This takes into
account the narrowing of vectors that occur as we go from 512
bits to 256 bits, to 128 bits. It also takes into account the use
of wider elements in the shuffles for the first 2 steps of a
reduction from 128 bits. And uses a v8i16 shift for the final step
of vXi8 reduction.

The default implementation uses the legalized type for the arithmetic
for all levels. And uses the single source permute cost of the
legalized type for all levels. This penalizes things like
lack of v16i8 pshufb on pre-sse3 targets and the splitting and
joining that needs to be done for integer types on AVX1. We never
need v16i8 shuffle for a reduction and we only need split AVX1 ops
when type the type wide and needs to be split. I think we're still
over costing splits and joins for AVX1, but we're closer now.

I've also removed all pairwise special casing because I don't
think we ever want to generate that on X86. I've also adjusted
the add handling to more accurately account for any type splitting
that occurs before we reach a legal type.

Differential Revision: https://reviews.llvm.org/D76478
2020-03-22 14:20:15 -07:00
Simon Atanasyan
01b3e42958 [mips] Implement .cpadd directive
This directive inserts code to add $gp to the argument's register when
support for position independent code is enabled.

For example, this code:
  .cpadd $4
expands to:
  addu $4, $4, $gp
2020-03-22 23:34:32 +03:00
Simon Atanasyan
d2c968e63d [mips] Implement sne pseudo instruction
The `sne Dst, Src1, Src2/Imm` pseudo instruction sets register `Dst` to
1 if register `Src1` is not equal to `Src2/Imm` and to 0 otherwise.
2020-03-22 23:34:31 +03:00
Simon Atanasyan
86c237d29d [mips] Implement sle/sleu pseudo instructions
The `sle/sleu Dst, Src1, Src2/Imm` pseudo instructions set register
`Dst` to 1 if register `Src1` is less than or equal `Src2/Imm` and
to 0 otherwise.
2020-03-22 23:34:31 +03:00
Simon Atanasyan
927588cd01 [mips] Remove instructions related to "wired paired single" from the P5600 model. 2020-03-22 23:34:31 +03:00
Simon Atanasyan
90f2f0b26c [mips] Add HasMips3D to the list of features unsupported by P5600 model. 2020-03-22 23:34:31 +03:00
Simon Atanasyan
1ce9b181a9 [mips] Rename target feature Mips3D => HasMips3D. NFC 2020-03-22 23:34:31 +03:00
Yaxun (Sam) Liu
501520a697 Add Triple::isAMDGPU
Differential Revision: https://reviews.llvm.org/D57707
2020-03-22 14:20:28 -04:00
Craig Topper
89351512e6 [X86] Remove maximum vector width restriction from combineLoopSADPattern.
SplitsOpsAndApply will take care of any needed splitting correctly.
All that we need to check is that the vector element count is a
power of 2.

Differential Revision: https://reviews.llvm.org/D76558
2020-03-22 11:09:14 -07:00
Nico Weber
2cbb88169f Remove a dead function. 2020-03-22 13:27:51 -04:00
Matt Arsenault
b69205cb6b Verifier: Check bswap is supported size
Make sure it is a multiple of 2 bytes as specified in the LangRef.
2020-03-22 12:15:25 -04:00
Nikita Popov
0ba06da958 [InstCombine] Remove ExpensiveCombines option
D75801 removed the last and only user of this option, so we can
drop it now. The original idea behind this was to only run expensive
transforms under -O3, but apart from the one known bits transform,
this has never really taken off. I believe nowadays the recommendation
is to put expensive transforms in AggressiveInstCombine instead,
though that isn't terribly popular either :)

Differential Revision: https://reviews.llvm.org/D76540
2020-03-22 16:56:28 +01:00
Matt Arsenault
c1bc330e49 Utils: Mostly convert memcpy expansion to use Align
The TTI hooks aren't converted. I also think the intrinsics should
have mandatory alignment and never return MaybeAlign.
2020-03-22 11:21:44 -04:00
Qiu Chaofan
4b2fb2854d [DAGCombiner] Require nsz for aggressive fma fold
For folding pattern `x-(fma y,z,u*v) -> (fma -y,z,(fma -u,v,x))`, if
`yz` is 1, `uv` is -1 and `x` is -0, sign of result would be changed.

Differential Revision: https://reviews.llvm.org/D76419
2020-03-22 23:10:07 +08:00
Qiu Chaofan
6acf177dfd [NFC] [PowerPC] Remove unsafe-fp-math in FMA test 2020-03-22 22:40:49 +08:00
Simon Pilgrim
58068c3f9e [X86][SSE] Add some additional irregular AVG tests
Finally resurrecting D56506 and want to improve test coverage.
2020-03-22 14:28:31 +00:00
Bjorn Pettersson
302b728560 [ValueTracking] Avoid blind cast from Operator to Instruction
Summary:
Avoid blind cast from Operator to ExtractElementInst in
computeKnownBitsFromOperator. This resulted in some crashes
in downstream fuzzy testing. Instead we use getOperand directly
on the Operator when accessing the vector/index operands.

Haven't seen any problems with InsertElement and ShuffleVector,
but I believe those could be used in constant expressions as well.
So the same kind of fix as for ExtractElement was also applied for
InsertElement.

When it comes to ShuffleVector we now simply bail out if a dynamic
cast of the Operator to ShuffleVectorInst fails. I've got no
reproducer indicating problems for ShuffleVector, and a fix would be
slightly more complicated as getShuffleDemandedElts is involved.

Reviewers: RKSimon, nikic, spatel, efriedma

Reviewed By: RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76564
2020-03-22 14:45:31 +01:00
Nikita Popov
3662d648c1 [SLP] Avoid repeated visitation in getVectorElementSize(); NFC
We need to insert into the Visited set at the same time we insert
into the worklist. Otherwise we may end up pushing the same
instruction to the worklist multiple times, and only adding it to
the visited set later.
2020-03-22 14:34:29 +01:00
Qiu Chaofan
913eed32b6 [NFC] [PowerPC] Update FMA association test 2020-03-22 20:55:32 +08:00
Nikita Popov
93b266cf4d [LVI] Use SmallDenseMap for getValueFromCondition(); NFC
For the common case, we will only insert one value into this map.
2020-03-22 10:38:44 +01:00
Fangrui Song
8e10dffd51 [AsmPrinter] Simplify AsmPrinter::emitXXStructorList after D61547 2020-03-21 23:18:23 -07:00
Fangrui Song
4b5ee2a22d Delete TargetLoweringObjectFile::Ctx
We can use the parent MCObjectFileInfo::Ctx which has the same value.
2020-03-21 22:36:29 -07:00
Fangrui Song
c88c804312 [X86] Delete unneeded X86ELFTargetObjectFile::Initialize. NFC 2020-03-21 22:03:42 -07:00
Lang Hames
885fe25f2c [ORC] Move ostream operators for debugging output out of Core.h.
DebugUtils.h seems like a more appropriate home for these.
2020-03-21 18:27:28 -07:00
Craig Topper
44dfbcbe96 [X86] Add nonloop v64i8 test to sad.ll. 2020-03-21 17:49:20 -07:00
Craig Topper
31e5c7940c [X86] Add test for v4i8 loop sad pattern.
This cases produces a psadbw that doesn't need to be widened or
extracted so takes a slightly different code path in
combineLoopSADPattern.
2020-03-21 15:27:29 -07:00
LLVM GN Syncbot
8e72b10003 [gn build] Port 34fd007aaf8 2020-03-21 20:48:15 +00:00
Ehud Katz
adea146bf8 Revert "[ADT] Implement the Waymarking as an independent utility"
This reverts commit 73cf8abbe695aede9aac804f960513bb7355004a.
2020-03-21 22:47:17 +02:00
Simon Pilgrim
aec13b39bb [InstCombine] Add ctpop -> cttz combine tests (PR43513) 2020-03-21 19:30:22 +00:00
Simon Pilgrim
ce880537e2 [X86] getTargetShuffleAndZeroables - add insert_subvector(undef, sub, c) handling.
We often widen xmm/ymm vectors to ymm/zmm by insertion into an undef base vector. By letting getTargetShuffleAndZeroables track the undef elts we can help avoid a lot of unnecessary cross-lane shuffles.

Fixes PR44694
2020-03-21 19:11:42 +00:00
Simon Pilgrim
71844d1b0c [X86][AVX] Add HADDPD test case for PR44694 2020-03-21 19:11:41 +00:00
Georgii Rymar
d7296e555d [obj2yaml] - Simplify and reduce ELFDumper<ELFT>::dumpSections. NFCI.
This method it a bit too large.
It is becoming inconvenient to update it.
This patch suggests a way to reduce and cleanup it.

Differential revision: https://reviews.llvm.org/D76499
2020-03-21 18:15:26 +03:00
Simon Pilgrim
47a5355aae [X86] Combine concat(shufps,shufps) -> shufps(concat,concat)
Now that rG18c19441d105 has improved VPERM2X128 handling, we can perform this to improve x64->x32 truncation without poor cross-lane issues.

Someday combineX86ShufflesRecursively will handle this, but we're still really bad at dealing with different vector widths.
2020-03-21 12:44:10 +00:00
Nikita Popov
a8fa9746b9 [ValueTracking] Short-circuit computeKnownBitsAddSub(); NFCI
If one operand is unknown (and we don't have nowrap), don't compute
the second operand.

Also don't create an unnecessary extra KnownBits variable, it's
okay to reuse KnownOut.

This reduces instructions on libclamav_md5.c by 40%.
2020-03-21 13:42:10 +01:00
LLVM GN Syncbot
6ff402883a [gn build] Port 73cf8abbe69 2020-03-21 12:33:41 +00:00
Ehud Katz
c23ba21e86 [ADT] Implement the Waymarking as an independent utility
This is the Waymarking algorithm implemented as an independent utility.
The utility is operating on a range of sequential elements.
First we "tag" the elements, by calling `fillWaymarks`.
Then we can "follow" the tags from every element inside the tagged
range, and reach the "head" (the first element), by calling
`followWaymarks`.

Differential Revision: https://reviews.llvm.org/D74415
2020-03-21 14:30:32 +02:00
Simon Pilgrim
ce32263121 Revert rGe6a7e3b5e3e7 "[X86][SSE] matchShuffleWithSHUFPD - add support for unary shuffles."
This reverts commit e6a7e3b5e3e779a3bfb617c8d9ed4302edab2cef.

Avoids register pressure regression reported at PR45263
2020-03-21 12:14:19 +00:00
Simon Pilgrim
df09235ad5 Revert rGd5d8569df14e95e2c53d167bd1b37995bcbec565 "Fix static analysis warnings about classes with virtual methods not having virtual destructors"
This reverts commit d5d8569df14e95e2c53d167bd1b37995bcbec565.
2020-03-21 11:39:34 +00:00
Simon Pilgrim
64102f02b9 Fix static analysis warnings about classes with virtual methods not having virtual destructors 2020-03-21 11:30:44 +00:00
Simon Pilgrim
45466ad003 [InstCombine][X86] simplifyX86immShift - remove ConstantAggregateZero handling. NFC.
The llvm::computeKnownBits path now handles this.
2020-03-21 11:30:44 +00:00
Simon Pilgrim
4873a84372 Fix Wdocumentation warning. NFCI. 2020-03-21 11:21:57 +00:00
LLVM GN Syncbot
6af85d4699 [gn build] Port 0f4c70dd3ec 2020-03-21 11:04:46 +00:00
Simon Pilgrim
d76e062375 [DAG] Don't permit EXTLOAD when combining FSHL/FSHR consecutive loads (PR45265)
Technically we can permit EXTLOAD of the LHS operand but only if all the extended bits are shifted out. Until we test coverage for that case, I'm just disabling this to fix PR45265.
2020-03-21 10:52:41 +00:00