1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00
Commit Graph

206192 Commits

Author SHA1 Message Date
Nico Weber
ed2dff4f4d [gn build] (manually) port c6eaa14e11 2020-11-02 10:43:38 -05:00
Florian Hahn
1db8566f5e Reland "[TTI] Add VecPred argument to getCmpSelInstrCost."
This reverts the revert commit 408c4408facc3a79ee4ff7e9983cc972f797e176.

This version of the patch includes a fix for a crash caused by
treating ICmp/FCmp constant expressions as instructions.

Original message:

On some targets, like AArch64, vector selects can be efficiently lowered
if the vector condition is a compare with a supported predicate.

This patch adds a new argument to getCmpSelInstrCost, to indicate the
predicate of the feeding select condition. Note that it is not
sufficient to use the context instruction when querying the cost of a
vector select starting from a scalar one, because the condition of the
vector select could be composed of compares with different predicates.

This change greatly improves modeling the costs of certain
compare/select patterns on AArch64.

I am also planning on putting up patches to make use of the new argument in
SLPVectorizer & LV.
2020-11-02 15:39:29 +00:00
Matt Arsenault
5f462fb314 AMDGPU: Reorder checks 2020-11-02 10:21:48 -05:00
Nico Weber
f30ddf1336 [gn build] (manually) port 76a168bce01 better 2020-11-02 10:14:58 -05:00
Matt Arsenault
f4106504f2 RegisterCoalescer: Use Register 2020-11-02 10:14:50 -05:00
Dávid Bolvanský
6a8623d8e0 [SLP] Added testcase for PR47623 2020-11-02 16:02:50 +01:00
Evgeny Leviant
6daa076b36 [TableGen][SchedModels] Fix read/write variant substitution
Patch fixes case when sched class has write and read variants belonging
to different processor models.

Differential revision: https://reviews.llvm.org/D89777
2020-11-02 17:39:04 +03:00
Ben Dunbobbin
1b5bdf4e65 [PS4] Support dllimport/export attributes
For PS4 development we support dllimport/export annotations in
source code. This patch enables the dllimport/export attributes
on PS4 by adding a new function to query the triple for whether
dllimport/export are used and using that function to decide
whether these attributes are supported. This replaces the current
method of checking if the target is Windows.

This means we can drop the use of "TargetArch" in the .td file
(which is an improvement as dllimport/export support isn't really
a function of the architecture).

I have included a simple codgen test to show that the attributes
are accepted and have an effect on codegen for PS4. I have also
enabled the DLLExportStaticLocal and DLLImportStaticLocal
attributes, which we support downstream. However, I am unable to
write a test for these attributes until other patches for PS4
dllimport/export handling land upstream. Whilst writing this
patch I noticed that, as these attributes are internal, they do
not need to be target specific (when these attributes are added
internally in Clang the target specific checks have already been
run); however, I think leaving them target specific is fine
because it isn't harmful and they "really are" target specific
even if that has no functional impact.

Differential Revision: https://reviews.llvm.org/D90442
2020-11-02 14:25:34 +00:00
Nico Weber
63eb68a1b7 [gn build] (manually) port 76a168bce01 2020-11-02 09:22:44 -05:00
Clement Courbet
230b955ed6 Revert "[llvm-exegesis] Save target state before running the benchmark."
_fxsave64 is not available on some buildbots.

This reverts commit 274de447fe9621082a523a7227157aeb84702a7d.
2020-11-02 15:11:45 +01:00
Clement Courbet
0a181c6ec2 [llvm-exegesis] Save target state before running the benchmark.
Some benchmarked instructions might set target state. Preserve this
state. See PR26418.

Differential Revision: https://reviews.llvm.org/D90592
2020-11-02 15:02:54 +01:00
Jay Foad
330e9dc042 Revert "Fix ds_read2/write2 unaligned offsets"
This reverts commit 2e7e898c8f0b38dc11fbce2553fc715067aaf42f.

It was committed by mistake.
2020-11-02 14:01:33 +00:00
Jay Foad
e2e08ec61b Fix ds_read2/write2 unaligned offsets 2020-11-02 13:57:13 +00:00
Jay Foad
69f257782a [AMDGPU] Precommit ds_read2/write2 with unaligned offset tests. NFC. 2020-11-02 13:57:08 +00:00
Jay Foad
bc8237c619 [AMDGPU] Generate test checks. NFC. 2020-11-02 13:56:46 +00:00
Jay Foad
dffa7c50b8 [AMDGPU] Remove a comment. NFC.
This was obsoleted by f78687df9b7 which added gfx9 aligned/unaligned
tests.
2020-11-02 13:56:46 +00:00
Simon Pilgrim
d3858d9474 [LV][X86] Regenerate gather_scatter tests. NFCI.
Reduce diff in D90554
2020-11-02 11:57:37 +00:00
Simon Pilgrim
e437303a09 [SLP][X86] Add AVX512VL test target coverage for PR47629
As suggested on D90445 - the AVX512F test case alone won't handle 128/256-bit vector gather pattern very well
2020-11-02 11:40:40 +00:00
Simon Pilgrim
1b06c024fd [RISCV] Avoid std::pair<> in FPReg StringSwitch to avoid MSVC compile failures. NFCI.
As discussed on D90322, some MSVC builds are failing with is_trivially_copyable static asserts (see D86126) - we can avoid this by not using the std::pair<unsigned,unsigned> which held both the FP+DP Registers, just handle the FP register and convert to DP on the fly.
2020-11-02 11:30:57 +00:00
Clement Courbet
c19cc98e95 [llvm-exegesis] Print signal name when the snippet crashed.
Differential Revision: https://reviews.llvm.org/D90453
2020-11-02 10:41:17 +01:00
Georgii Rymar
a9c8c69899 [yaml2obj] - Add support of Offset for .strtab/.shstrtab/.dynstr sections.
These sections are implicit and handled a bit differently.
Currently the "Offset" is ignored for them.
This patch fixes an issue.

Differential revision: https://reviews.llvm.org/D90446
2020-11-02 11:56:32 +03:00
Caroline Concatto
d88ee71498 Revert "[AArch64][AsmParser] Remove 'x31' alias for 'sp/xzr' register."
This reverts commit 8b281bfaf35d00d42c2993fd5a80d749cc21f45e.
2020-11-02 08:15:50 +00:00
Caroline Concatto
19fb2444af [AArch64][AsmParser] Remove 'x31' alias for 'sp/xzr' register.
Only the aliases 'xzr' and 'sp' exist for the physical register x31.
The reason for wanting to remove the alias 'x31' is because it allows users
to write invalid asm that is not accepted by the GNU assembler.

Is there any objection to removing this alias? Or do we want to keep
this for compatibility with existing code that uses w31/x31?

Differential Revision: https://reviews.llvm.org/D90153
2020-11-02 07:57:05 +00:00
Wang, Pengfei
565658e854 [CodeGen][X86] Remove unused check-prefix in strict FP tests. 2020-11-02 14:41:06 +08:00
Craig Topper
1ff00ea7a3 [RISCV] Add a test case for another issue in SelectRORIW. NFC
When validating C3 in (sext_inreg (or (shl X, C2), (shr (and Y, C3), C1)), i32)
we are truncating it to 32 bits before checking its value. We need
to check all 64 bits.
2020-11-01 22:36:14 -08:00
Qiu Chaofan
fe86a99411 [PowerPC] Fix a crash in POWER 9 setb peephole
Variable InnerIsSel references FalseRes, while FalseRes might be
zext/sext. So InnerIsSel should reference SetOrSelCC, otherwise a crash
will happen.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D90142
2020-11-02 14:29:43 +08:00
Craig Topper
b02682f887 [RISCV] Add a test case to show a bug in SelectRORIW. NFC
The function is matching (sext_inreg (or (shl X, C2), (shr (and Y, C3), C1))),
with appropriate checks for the constants to be a rotate. But it
fails to check that X and Y are the same which is also necessary.
2020-11-01 20:30:55 -08:00
Craig Topper
7e7a22b774 [RISCV] Add more rev32 and rev16 test cases using fshl/fshr intrinsics. NFC
fshl/fshr intrinsics turn into rotl/rotr ISD opcodes and we don't
have a complete set of patterns.

We pattern match rotl, but we have a custom match for rori that gets
priority. We don't pattern match rotr and we don't have patterns
or custom code for rori from rotr.
2020-11-01 20:30:55 -08:00
Wang, Pengfei
ecbb96277d [CodeGen][X86] Remove unused check-prefix in adx tests.
Not needed after 226261353288e.
2020-11-02 11:52:24 +08:00
Chen Zheng
39e1336a34 [MachineSink] sink more profitable loads
Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D86864
2020-11-01 21:13:27 -05:00
QingShan Zhang
5f9db1f559 [Scheduling] Fall back to the fast cluster algorithm if the DAG is too complex
We have added a new load/store cluster algorithm in D85517. However, AArch64 see
some compiling deg with the new algorithm as the IsReachable() is not cheap if
the DAG is complex. O(M+N) See https://bugs.llvm.org/show_bug.cgi?id=47966
So, this patch added a heuristic to switch to old cluster algorithm if the DAG is too complex.

Reviewed By: Owen Anderson

Differential Revision: https://reviews.llvm.org/D90144
2020-11-02 02:11:52 +00:00
Teresa Johnson
c0b0490aa0 [MemProf] Pass down memory profile name with optional path from clang
Similar to -fprofile-generate=, add -fmemory-profile= which takes a
directory path. This is passed down to LLVM via a new module flag
metadata. LLVM in turn provides this name to the runtime via the new
__memprof_profile_filename variable.

Additionally, always pass a default filename (in $cwd if a directory
name is not specified vi the = form of the option). This is also
consistent with the behavior of the PGO instrumentation. Since the
memory profiles will generally be fairly large, it doesn't make sense to
dump them to stderr. Also, importantly, the memory profiles will
eventually be dumped in a compact binary format, which is another reason
why it does not make sense to send these to stderr by default.

Change the existing memprof tests to specify log_path=stderr when that
was being relied on.

Depends on D89086.

Differential Revision: https://reviews.llvm.org/D89087
2020-11-01 17:38:23 -08:00
Nikita Popov
405d95e24a [SCEV] Delay strengthening of nowrap flags
Strengthening nowrap flags is relatively expensive. Make sure we
only do it if we're actually going to use the flags -- we don't
use them for many recursive invocations. Additionally, if we're
reusing an existing SCEV node, there's no point in trying to
strengthen the flags if we don't have any new baseline facts.

This change falls slightly short of being NFC, because the way
flags during add+addrec / mul+addrec folding are handled may be
more precise (as less operands are included in the calculation).
2020-11-01 22:18:07 +01:00
Craig Topper
b462ae3ac5 [RISCV] Add tests to show missed opportunities to use rori for fshr intrinsic with same inputs. NFC
The fshr intrinsic with same inputs produces rotr ISD node. The
fshl intrinsic produces rotl ISD node.

There were only test cases and isel patterns for the fshl/rotl case.
This patch adds fshr/rotr test cases.
2020-11-01 12:25:47 -08:00
Craig Topper
97e7bd0f19 Recommit "[RISCV] Remove include of RISCVRegisterInfo.h from RISCVBaseInfo.h. NFCI"
This reverts 781917254dba17df7fb357a5f621ada2ac1b36e3 and recommits
781917254dba17df7fb357a5f621ada2ac1b36e3.

I've changed getRegForInlineAsmConstraint to not use a std::pair
of Register in a previous commit. Hopefully that fixes the reported
issue with expensive checks on Windows. I'm still not sure exactly
why this commit removing an include affected a different file.

Original message:

RISCVRegisterInfo.h is part of the CodeGen layer. The Utils library
is intended to be shared with the MC layer so shouldn't use files
from the CodeGen layer.

The register enum names are already available from
RISCVMCTargetDesc.h. It appears what was coming from this include
was a transitive include of the Register class which I've replaced
with MCRegister. Register has a constructor from MCRegister so it
should be convertible.
2020-11-01 10:35:37 -08:00
Craig Topper
0396eeb565 [RISCV] Use 'unsigned' instead of Register in getRegForInlineAsmConstraint. NFC
The return value of this interface still uses an 'unsigned' on all
targets. So we convert Register back to unsigned at the end.

I'm hoping this will prevent the issue that caused the revert of
D90322.
2020-11-01 10:16:52 -08:00
Nikita Popov
f67bbf3be8 [SCEV] Construct GEP expression more efficiently (NFCI)
Instead of performing a sequence of pairwise additions, directly
construct a multi-operand add expression.

This should be NFC modulo any SCEV canonicalization deficiencies.
2020-11-01 19:00:57 +01:00
Florian Hahn
cdc39cbc67 [VPlan] Assert no users remaining when deleting a VPValue.
When deleting a VPValue, all users must already by deleted. Add an
assertion to make sure and catch violations.
2020-11-01 17:44:53 +00:00
David Green
598e436f7f [ARM] Add extra MVE tests for various patches. NFC 2020-11-01 16:24:23 +00:00
Qiu Chaofan
6e7484a58c [PowerPC] Avoid unnecessary fadd for unsigned to ppcf128
Unsigned 32-bit or shorter integer to ppcf128 conversion are currently
expanded as signed-to-double with an extra fadd to 'complement'. But on
PowerPC we have native instruction to directly convert unsigned to
double since ISA v2.06. This patch exploits it.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D89786
2020-11-01 23:22:47 +08:00
Christudasan Devadasan
f694eea8d3 [AMDGPU] Some refactoring after D90404. NFC. 2020-11-01 13:18:53 +05:30
Christudasan Devadasan
45dd9c1c1d [AMDGPU] Add alignment check for v3 to v4 load type promotion
It should be enabled only when the load alignment is at least 8-byte.

Fixes: SWDEV-256824

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D90404
2020-11-01 12:05:34 +05:30
Fangrui Song
0536175cbb [test] Fix some unused check prefixes in test/Analysis/CostModel/X86 2020-10-31 23:29:57 -07:00
Ayke van Laethem
a7eefcbc05 [AVR] Improve inline rotate/shift expansions
These expansions were rather inefficient and were done with more code
than necessary. This change optimizes them to use expansions more
similar to GCC. The code size is the same (when optimizing for code
size) but somehow LLVM reorders blocks in a non-optimal way. Still, this
should be an improvement with a reduction in code size of around 0.12%
(when building compiler-rt).

Differential Revision: https://reviews.llvm.org/D86418
2020-10-31 23:15:49 +01:00
Florian Hahn
dc7616bd85 [DSE] Use same logic as legacy impl to check if free kills a location.
This patch updates DSE + MemorySSA to use the same check as the legacy
implementation to determine if a location is killed by a free call.

This changes the existing behavior so that a free does not kill
locations before the start of the freed pointer.

This should fix PR48036.
2020-10-31 20:09:25 +00:00
Florian Hahn
cdee713390 [DSE] Add additional tests with free, regenerate check lines.
The new test cases are inspired by PR48036.
2020-10-31 20:03:06 +00:00
Simon Pilgrim
4feeec2fcd Add missing EOL. NFCI. 2020-10-31 17:32:04 +00:00
Florian Hahn
94fd7ed787 Reland "[SLP] Consider alternatives for cost of select instructions."
This reverts the revert commit a1b53db32418cb6ed6f5b2054d15a22b5aa3aeb9.

This patch includes a fix for a reported issue, caused by
matchSelectPattern returning UMIN for selects of pointers in
some cases by looking to some connected casts.

For now, ensure integer instrinsics are only returned for selects of
ints or int vectors.
2020-10-31 16:52:36 +00:00
Paul C. Anagnostopoulos
2d0f8816ca [TableGen] Eliminate uses of true and false in .td files.
They occurred in one NVPTX file and some test files.

Differential Revision: https://reviews.llvm.org/D90513
2020-10-31 10:54:33 -04:00
David Green
e722c3abc4 [ARM] Fix crash for gather of pointer costs.
If the elt size is unknown due to it being a pointer, a comparison
against 0 will cause an assert. Make sure the elt size is large enough
before comparing and for the moment just return the scalar cost.
2020-10-31 13:10:14 +00:00