mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 12:12:47 +01:00
Commit Graph

203817 Commits

Author SHA1 Message Date
Jianzhou Zhao
1e77ba8067 Use one more byte to silence a warning from Visual C++ 2020-09-18 16:42:38 +00:00
Simon Pilgrim
39e440fde2 [X86][AVX] lowerBuildVectorAsBroadcast - improve i64 BROADCASTM lowering on 32-bit targets
We already handle the cases where we have a 'zero extended splat' build vector (a, 0, 0, 0, a, 0, 0, 0, ...) but were missing the case where the 'a' scalar was zero-extended as well - such as i64 -> vXi64 splat cases on 32-bit targets.
2020-09-18 16:59:57 +01:00
Simon Pilgrim
975db31f6e [X86][AVX] Add missing i686 broadcastm test coverage 2020-09-18 16:10:24 +01:00
Simon Pilgrim
99d78e28b2 [DAG] BuildVectorSDNode::getSplatValue - pull out repeated getNumOperands() calls. NFCI. 2020-09-18 16:10:23 +01:00
David Tenty
655fbd5c46 [AIX] Enable large code model when building with clang 2020-09-18 11:03:22 -04:00
Sanjay Patel
0432f77854 [InstSimplify] fix fmin/fmax miscompile for partial undef vectors (PR47567)
It would also be correct to return the variable operand in these cases,
but eliminating a variable use is probably better for optimization.
2020-09-18 10:05:44 -04:00
Matt Arsenault
8bd8036a64 IR: Move denormal mode parsing from MachineFunction to Function
This was just inspecting the IR to begin with, and is useful to check
in some places in the IR.
2020-09-18 09:55:47 -04:00
Matt Arsenault
15d1f58f1b emacs: Add nofree and willreturn to list of attributes 2020-09-18 09:48:33 -04:00
Matt Arsenault
c124c9a532 Revert "[amdgpu] Lower SGPR-to-VGPR copy in the final phase of ISel."
This reverts commit c3492a1aa1b98c8d81b0969d52cea7681f0624c2.

I think this is the wrong strategy and wrong place to do this
transform anyway. This also reverts the follow-up commit
7d593d0d6905b55ca1124fca5e4d1ebb17203138.
2020-09-18 09:48:33 -04:00
Alexey Bataev
50c5b40f69 [SLP] Allow reordering of vectorization trees with reused instructions.
If some leaves have the same instructions to be vectorized, we may
incorrectly evaluate the best order for the root node (it is built for the
vector of instructions without repeated instructions and thus has fewer
elements than the root node). In this case we simply cannot try to reorder
the tree, and we may also calculate the wrong number of nodes that require
the same reordering.
For example, if the root node is <a+b, a+c, a+d, f+e>, then the leaves
are <a, a, a, f> and <b, c, d, e>. When we try to vectorize the first
leaf, it will be shrunk to <a, f>. If the instructions in this leaf should
be reordered, the best order will be <1, 0>. We need to extend this
order to the root node. For the root node this order should look like
<3, 0, 1, 2>. This patch allows the orders of nodes with reused
instructions to be extended in this way.
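
As a purely illustrative sketch of the order extension described above (not
the actual SLPVectorizer code; the function and parameter names here are
hypothetical):

    #include <cstddef>
    #include <vector>

    // LaneToUnique[i] = index of the unique scalar feeding root lane i,
    // e.g. {0, 0, 0, 1} for the leaf <a, a, a, f> with uniques <a, f>.
    // UniqueOrder is the best order computed for the de-duplicated leaf,
    // e.g. {1, 0}. The result is an order over all root lanes.
    std::vector<size_t> extendOrder(const std::vector<size_t> &LaneToUnique,
                                    const std::vector<size_t> &UniqueOrder) {
      std::vector<size_t> Extended;
      Extended.reserve(LaneToUnique.size());
      for (size_t U : UniqueOrder)          // emit lanes of 'f' first, then 'a'
        for (size_t Lane = 0; Lane < LaneToUnique.size(); ++Lane)
          if (LaneToUnique[Lane] == U)
            Extended.push_back(Lane);
      return Extended;                      // {3, 0, 1, 2} for the example
    }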

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D45263
2020-09-18 09:34:59 -04:00
Mirko Brkusanin
46431af84c [AMDGPU] Set DS alignment requirements to be more strict
Alignment requirements for ds_read/write_b96/b128 for gfx9 and onward are
now the same as for other GCN subtargets. This way we can avoid any
unintentional use of these instructions on systems that do not support dword
alignment and instead require natural alignment.
This also makes 'SH_MEM_CONFIG.alignment_mode == STRICT' the default.

Differential Revision: https://reviews.llvm.org/D87821
2020-09-18 15:26:24 +02:00
Sanjay Patel
88960bf64d [InstSimplify] add another test for NaN propagation; NFC 2020-09-18 09:20:26 -04:00
Xing GUO
3c2809fd0d [DWARFYAML] Make the include_directories, file_names and opcodes fields of the line table optional.
This patch makes the include_directories, file_names and opcodes fields
of the line table optional. This helps us simplify some tests.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D87878
2020-09-18 20:21:11 +08:00
Xing GUO
2a69cdbaed [DWARFYAML][test] Use 'CHECK-NEXT:' to make checkers stricter. NFC.
This patch makes checkers stricter so that we are able to avoid
some potential problems earlier.

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D87876
2020-09-18 20:19:33 +08:00
David Greene
6628767888 [UpdateCCTestChecks] Include generated functions if asked
Add the --include-generated-funcs option to update_cc_test_checks.py so that any
functions created by the compiler that don't exist in the source will also be
checked.

We need to maintain the output order of generated function checks so that
CHECK-LABEL works properly.  To do so, maintain a list of functions output for
each prefix in the order they are output.  Use this list to output checks for
generated functions in the proper order.

Differential Revision: https://reviews.llvm.org/D83004
2020-09-18 06:34:59 -05:00
Max Kazantsev
6be1d3bebc [Test] Missing range check removal opportunity 2020-09-18 17:55:23 +07:00
Florian Hahn
0312420155 Recommit "[DSE] Switch to MemorySSA-backed DSE by default."
This switches to using DSE + MemorySSA by default again, after
fixing the issues reported after the first commit.

Notable fixes: fc8200633122, a0017c2bc258.

This reverts commit 3a59628f3cc26eb085acfc9cbdc97243ef71a6c5.
2020-09-18 11:05:00 +01:00
Florian Hahn
447cd8eb56 [SCEV] Generalize SCEVParameterRewriter to accept SCEV expression as target.
This patch extends SCEVParameterRewriter to support rewriting unknown
expressions to arbitrary SCEV expressions. It will be used by further
patches.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D67176
2020-09-18 10:05:02 +01:00
Gabriel Hjort Åkerlund
7fb62030d9 [TableGen][GlobalISel] Fix handling of zero_reg
When generating matching tables for GlobalISel, TableGen would output
"::zero_reg" whenever encountering the zero_reg, which in turn would
result in compilation error. This patch fixes that by instead outputting
NoRegister (== 0), which is the same result that TableGen produces when
generating matching tables for ISelDAG.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D86215
2020-09-18 11:01:11 +02:00
Tim Northover
bc600a0484 AArch64: make sure jump table entries can reach entire image
This turns all jump table entries into deltas within the target
function because in the small memory model all code & static data must
be in a 4GB block somewhere in memory.

When the entries were a delta between the table location and a basic
block, the 32-bit signed entries were not enough to guarantee
reachability.

https://reviews.llvm.org/D87286
2020-09-18 09:50:40 +01:00
Nikita Popov
07e4739f3c Revert "[InstCombine] Canonicalize SPF_ABS to abs intrinsic"
This reverts commit 05d4c4ebc2fb006b8a2bd05b24c6aba10dd2eef8.

mstorsjo reports a miscompile after this change in
https://reviews.llvm.org/D87188#2281093. Reverting until I can
investigate this.
2020-09-18 09:38:26 +02:00
Andrew Wei
f8a859443e [AArch64] Add tests for zext pattern match with AssertZext/AssertSext operand, NFC 2020-09-18 15:02:43 +08:00
Serge Pavlov
35714dfbb7 [FPEnv] Use typed accessors in FPOptions
Previously, methods `FPOptions::get*` returned an unsigned value even if the
corresponding property was represented by a specific enumeration type. With
this change such methods return the actual type of the property. This also
allows printing the value of a property as text rather than as an integer code.
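
A minimal sketch of the idea (simplified stand-ins, not the actual FPOptions
code or its real property names):

    // Before: the stored bits leak out as an untyped unsigned.
    // After: callers get the property's real enumeration type back.
    enum class RoundingModeSketch : unsigned { NearestTiesToEven, TowardZero };

    class FPOptionsSketch {
      unsigned RoundingBits = 0;

    public:
      unsigned getRoundingModeRaw() const {        // old-style accessor
        return RoundingBits;
      }
      RoundingModeSketch getRoundingMode() const { // typed accessor
        return static_cast<RoundingModeSketch>(RoundingBits);
      }
    };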

Differential Revision: https://reviews.llvm.org/D87812
2020-09-18 14:16:43 +07:00
Craig Topper
c7c58df71f [X86] Add some demanded bits test cases for PDEP with constant mask
The number of ones in the mask for the PDEP determines how many
bits of the other operand are used. If the mask is constant we
can use this to build a mask for SimplifyDemandedBits. This can
be used to replace the extends in the test with anyextend.
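
As a small worked sketch of the demanded-bits reasoning (illustrative only,
not the SimplifyDemandedBits code; requires C++20 for std::popcount):

    #include <bit>
    #include <cstdint>

    // PDEP deposits the low popcount(Mask) bits of its source into the set
    // positions of Mask, so only those low source bits are demanded.
    uint64_t pdepDemandedSourceBits(uint64_t Mask) {
      unsigned Ones = std::popcount(Mask);
      return Ones >= 64 ? ~0ULL : (1ULL << Ones) - 1;
    }
    // e.g. Mask = 0x0000FF00 has 8 set bits, so only the low 8 bits of the
    // source are demanded and a zero/sign extend of it can become any-extend.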
2020-09-17 22:48:19 -07:00
Andrew Wei
c2b264b734 [AArch64] Emit zext move when the source of the zext is AssertZext or AssertSext
When the source of the zext is AssertZext or AssertSext, it is hard to know anything about the upper 32 bits,
so we should insert a zext move before emitting SUBREG_TO_REG to define the lower 32 bits.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87771
2020-09-18 12:48:41 +08:00
Amara Emerson
680fe78a3d [AArch64][GlobalISel] Make G_STORE <8 x s8> legal. 2020-09-17 16:42:18 -07:00
Amara Emerson
c09ce4f3bc [AArch64][GlobalISel] clang-format AArch64LegalizerInfo.cpp. NFC. 2020-09-17 16:41:10 -07:00
Amy Kwan
e85d496ffd [PowerPC] Add Set Boolean Condition Instruction Definitions and MC Tests
This patch adds the instruction definitions and assembly/disassembly tests for
the set boolean condition instructions. This also includes the negative, and
reverse variants of the instruction.

Differential Revision: https://reviews.llvm.org/D86252
2020-09-17 18:20:54 -05:00
Amy Kwan
00f4e38665 [PowerPC] Implement Vector Count Mask Bits builtins in LLVM/Clang
This patch implements the vec_cntm function prototypes in altivec.h in order to
utilize the vector count mask bits instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82726
2020-09-17 18:20:53 -05:00
Philip Reames
eab425518c [MemorySSA] Fix an unused variable warning [NFC] 2020-09-17 16:07:59 -07:00
Zhaoshi Zheng
7d4e6e8ff5 [RISCV] Support Shadow Call Stack
Currently we assume x18 is used as the pointer to the shadow call stack. Users
shall pass the flags:

"-fsanitize=shadow-call-stack -ffixed-x18"

Runtime support is needed to set up x18.

If SCS is desired, all parts of the program should be built with -ffixed-x18 to
maintain interoperability.

There's no particular reason that we must use x18 as the SCS pointer. Any
register may be used, as long as it does not already have a designated purpose,
like RA or passing call arguments.

Differential Revision: https://reviews.llvm.org/D84414
2020-09-17 16:02:35 -07:00
Philip Reames
f9eea5b8c6 [AArch64] Enable implicit null check transformation
This change enables the generic implicit null transformation for the AArch64 target. As background for those unfamiliar with our implicit null check support:

    An implicit null check uses a signal handler to catch a null-pointer fault and redirect it to a handler. Specifically, it replaces an explicit conditional branch with such a redirect. This is only done for very cold branches under frontend control with appropriate metadata.
    FAULTING_OP is used to wrap the faulting instruction. It is modelled as a conditional branch to reflect the fact that it can transfer control in the CFG.
    FAULTING_OP does not need to be an analyzable branch to achieve its purpose. (Or at least, that's the x86 model. I find this slightly questionable.)
    When lowering to MC, we convert the FAULTING_OP back into the actual instruction, record the labels, and lower the original instruction.

As can be seen in the test changes, the AArch64 backend currently does not eliminate the unconditional branch to the fallthrough block. I've tried two approaches, neither of which worked. I plan to return to this in a separate change set once I've wrapped my head around the interactions a bit better. (X86 handles this via AllowModify on analyzeBranch, but adding the obvious code causes BranchFolding to crash. I haven't yet figured out whether it's a latent bug in BranchFolding or something I'm doing wrong.)
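
For intuition only (source-level C++, not the MIR-level FAULTING_OP
machinery), this is the kind of very cold, explicitly checked pattern the
transform targets:

    struct Node { int Value; };

    int getValue(Node *N) {
      if (N == nullptr)   // expected to be extremely cold; with suitable
        return -1;        // profile metadata the explicit branch can go away
      return N->Value;    // and the load itself becomes the "check" -- a
    }                     // fault here is caught by a signal handler instead.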

Differential Revision: https://reviews.llvm.org/D87851
2020-09-17 16:00:19 -07:00
Arthur Eubanks
9ae2247c05 [test] Fix FullUnroll.ll
I believe the intention of this test added in
https://reviews.llvm.org/D71687 was to test LoopFullUnrollPass with
clang's -fno-unroll-loops, not its interaction with optnone. Loop
unrolling passes don't run under optnone/-O0.

Also added back unintentionally removed -disable-loop-unrolling from
https://reviews.llvm.org/D85578.

Reviewed By: echristo

Differential Revision: https://reviews.llvm.org/D86485
2020-09-17 15:56:13 -07:00
Quentin Colombet
f17e1f0936 [TargetRegisterInfo] Add a couple of target hooks for the greedy register allocator
Before this patch, the last chance recoloring and deferred spilling
techniques were controlled solely by command-line options.
This patch adds target hooks for these two techniques so that it
is easier for backend writers to override the default behavior.

The default behavior of the hooks preserves the default values of
the related command line options.

NFC
2020-09-17 15:23:15 -07:00
Derek Schuff
28f861215e Support dwarf fission for wasm object files
Initial support for dwarf fission sections (-gsplit-dwarf) on wasm.
The most interesting change is support for writing 2 files (.o and .dwo) in the
wasm object writer. My approach moves object-writing logic into its own function
and calls it twice, swapping out the endian::Writer (W) in between calls.
It also splits the import-preparation step into its own function (and skips it when writing a dwo).
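
A hedged sketch of the "write twice, swap the writer" shape described above
(the member and helper names here are hypothetical, not the actual
WasmObjectWriter API):

    #include "llvm/Support/EndianStream.h"
    #include "llvm/Support/raw_ostream.h"

    struct TwoFileWriterSketch {
      llvm::support::endian::Writer *W = nullptr; // swapped between calls

      void writeOneObject(llvm::raw_pwrite_stream &OS, bool IsDwo) {
        llvm::support::endian::Writer Writer(OS, llvm::support::little);
        W = &Writer;            // all section-writing helpers go through W
        if (!IsDwo)
          prepareImports();     // hypothetical: only needed for the main .o
        writeSections(IsDwo);   // hypothetical: .dwo gets only debug sections
      }

      void writeBoth(llvm::raw_pwrite_stream &O, llvm::raw_pwrite_stream &Dwo) {
        writeOneObject(O, /*IsDwo=*/false);
        writeOneObject(Dwo, /*IsDwo=*/true);
      }

      void prepareImports() {}
      void writeSections(bool) {}
    };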

Differential Revision: https://reviews.llvm.org/D85685
2020-09-17 14:42:41 -07:00
Florian Hahn
57f2631c85 [MemorySSA] Be more conservative when traversing MemoryPhis.
I think we need to be even more conservative when traversing memory
phis, to make sure we catch any loop-carried dependences.

This approach updates fillInCurrentPair to use unknown sizes for
locations when we walk over a phi, unless the location is guaranteed to
be loop-invariant for any possible loop. Using an unknown size for
locations should ensure we catch all memory accesses to locations after
the given memory location, which includes loop-carried dependences.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87778
2020-09-17 22:09:53 +01:00
Arthur Eubanks
fd6a8d455b [NewPM] Fix pr45927.ll under NPM 2020-09-17 13:57:55 -07:00
Alexander Shaposhnikov
29027aaeab [llvm-install-name-tool] Update the command-line guide 2020-09-17 13:44:26 -07:00
Nikita Popov
67c06cafcb [InstCombine] Canonicalize SPF_ABS to abs intrinsic
Enable canonicalization of SPF_ABS and SPF_NABS to the abs intrinsic.

To be conservative, the one-use check on the comparison is retained,
this may be relaxed if all goes well.

It's pretty likely that this will uncover places that are missing
handling for the abs() intrinsic. Please report any performance
regressions you see.
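
Illustrative only (source-level C++ intuition, not the InstCombine code): the
select-based pattern that used to be matched as SPF_ABS and is now turned
into a call to the llvm.abs intrinsic in IR:

    int absPattern(int X) {
      return X < 0 ? -X : X;   // icmp + neg + select -> llvm.abs.i32
    }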

Differential Revision: https://reviews.llvm.org/D87188
2020-09-17 22:28:34 +02:00
Whitney Tsang
ce3c753838 [LoopUnrollAndJam] Allow unroll and jam loops forced by user.
Summary: Allow unroll and jam loops forced by user.
LoopUnrollAndJamPass is still disabled by default in the NPM pipeline,
and can be controlled by -enable-npm-unroll-and-jam.

Reviewed By: Meinersbur, dmgreen

Differential Revision: https://reviews.llvm.org/D87786
2020-09-17 19:40:14 +00:00
Nikita Popov
868b57b4b2 [GVN] Use that assume(!X) implies X==false (PR47496)
We already use that assume(X) implies X==true, do the same for
assume(!X) implying X==false. This fixes PR47496.
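
A source-level sketch of the new fold (using clang's __builtin_assume, which
lowers to llvm.assume; this is intuition, not the GVN code itself):

    bool example(bool X) {
      __builtin_assume(!X);   // after this point, !X is known to hold
      if (X)                  // GVN can now treat X as false and fold the branch
        return true;
      return false;           // the whole function simplifies to 'return false'
    }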
2020-09-17 21:34:44 +02:00
Nikita Popov
c4f8370723 [GVN] Add additional assume tests (NFC)
The other assume tests seem to be dealing with equalities in
particular. Test implication for the condition itself, especially
the negated case from PR47496.
2020-09-17 21:34:43 +02:00
Florian Hahn
5cee4034b8 [SCEV] Add test cases for max BTC with loop guard info.
This adds test cases for PR40961 and PR47247. They illustrate cases in
which the max backedge-taken count can be improved by information from
the loop guards.
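
Illustrative only (not taken from the referenced PRs): a loop whose dominating
guard bounds the trip count, so the max backedge-taken count can be tightened
from the type-based bound to the guard-implied one:

    void guardedLoop(unsigned N, int *A) {
      if (N < 8) {                       // guard: here N <= 7
        for (unsigned I = 0; I < N; ++I) // backedge taken at most 6 times
          A[I] = 0;
      }
    }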
2020-09-17 20:27:48 +01:00
Victor Huang
0b4d51eda3 Disable hoisting MI to hotter basic blocks when using pgo
This is a follow up patch for https://reviews.llvm.org/D63676 to
enable the feature when using pgo.

Differential Revision: https://reviews.llvm.org/D85240
2020-09-17 14:17:00 -05:00
Jon Roelofs
a7e645bc20 AArch64::ArchKind's underlying type is uint64_t 2020-09-17 12:13:57 -07:00
LLVM GN Syncbot
363f3d6efc [gn build] Port 7e4c6fb8546 2020-09-17 19:09:34 +00:00
Andrew Litteken
f7aaee70de [IRSim] Adding IR Instruction Mapper
This introduces the IRInstructionMapper, and the associated wrapper for
instructions, IRInstructionData, that maps IR level Instructions to
unsigned integers.

Mapping is done mainly by using the "isSameOperationAs" comparison
between two instructions.  If they return true, the opcode, result type,
and operand types of the instruction are used to hash the instruction
with an unsigned integer.  The mapper accepts instruction ranges, and
adds each resulting integer to a list, and each wrapped instruction to
a separate list.

At present, branches and phi nodes are not mapped, and exception handling
is treated as illegal.  Debug instructions are not considered.
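
A hedged sketch of the hashing idea described above (not the actual
IRInstructionMapper; it only shows how opcode, result type and operand types
might be folded into one unsigned value):

    #include "llvm/ADT/Hashing.h"
    #include "llvm/IR/Instruction.h"

    unsigned mapInstructionSketch(const llvm::Instruction &I) {
      // Start from the opcode and result type...
      llvm::hash_code H = llvm::hash_combine(I.getOpcode(), I.getType());
      // ...then mix in each operand's type.
      for (const llvm::Use &U : I.operands())
        H = llvm::hash_combine(size_t(H), U.get()->getType());
      return static_cast<unsigned>(size_t(H));
    }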

The different mapping schemes are tested in
unittests/Analysis/IRSimilarityIdentifierTest.cpp

Recommit of: b04c1a9d3127730c05e8a22a0e931a12a39528df

Differential Revision: https://reviews.llvm.org/D86968
2020-09-17 14:06:16 -05:00
Cameron McInally
5054c85a3d [SVE][WIP] Implement lowering for fixed length VSELECT to Scalable
Map fixed length VSELECT to its Scalable equivalent.

Differential Revision: https://reviews.llvm.org/D85364
2020-09-17 14:02:57 -05:00
Amara Emerson
7f57bac5cc [AArch64][GlobalISel] Widen G_EXTRACT_VECTOR_ELT element types if < 8b.
In order to not unnecessarily promote the source vector to greater than our
native vector size of 128b, I've added some cascading rules to widen based on
the number of elements.
2020-09-17 11:50:33 -07:00
Amara Emerson
b1ec1a0462 [AArch64][GlobalISel] Make <8 x s16> and <16 x s8> legal for shifts. 2020-09-17 11:50:32 -07:00