1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00
Commit Graph

202230 Commits

Author SHA1 Message Date
David Green
f2589e2340 [ARM] Allow tail predication of VLDn
VLD2/4 instructions cannot be predicated, so we cannot tail predicate
them from autovec. From intrinsics though, they should be valid as they
will just end up loading extra values into off vector lanes, not
effecting the on lanes. The same is true for loads in general where so
long as we are not using the other vector lanes, an unpredicated load
can be converted to a predicated one.

This marks VLD2 and VLD4 instructions as validForTailPredication and
allows any unpredicated load in tail predication loop, which seems to be
valid given the other checks we have.

Differential Revision: https://reviews.llvm.org/D86022
2020-08-18 17:15:45 +01:00
Sam Tebbs
928822abd7 [ARM] Use mov operand if the mov cannot be moved while tail predicating
There are some cases where the instruction that sets up the iteration
count for a tail predicated loop cannot be moved before the dlstp,
stopping tail predication entirely. This patch checks if the mov operand
can be used and if so, uses that instead.

Differential Revision: https://reviews.llvm.org/D86087
2020-08-18 17:10:29 +01:00
Fangrui Song
af8011f23e [llvm-dwarfdump][test] Add a --statistics test for a DW_AT_artificial variable
There is an untested but useful case: `this` (even if not written) is counted as a
source variable.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D86044
2020-08-18 09:08:38 -07:00
Jamie Schmeiser
831953d6d2 [NFC] Add raw_ostream parameter to printIR routines
This is a non-functional-change to generalize the printIR routines so that
the output can be saved and manipulated rather than being directly output
to dbgs(). This is a prerequisite change for many upcoming changes that
allow new ways of examining changes made to the IR in the new pass manager.

Reviewed By: aeubanks (Arthur Eubanks)

Differential Revision: https://reviews.llvm.org/D85999
2020-08-18 16:05:27 +00:00
Jessica Paquette
7e08e6c7a3 [GlobalISel][CallLowering] Look through call parameters for flags
We weren't looking through the parameters on calls at all.

E.g., say you had

```
declare i32 @zext(i32 zeroext %x)

...
%y = call i32 @zext(i32 %something)
...

```

At the point of the call, we wouldn't know that the %something should have the
zeroext attribute.

This sets flags in about the same way as
TargetLoweringBase::ArgListEntry::setAttributes.

Differential Revision: https://reviews.llvm.org/D86125
2020-08-18 08:48:56 -07:00
jasonliu
26926d02d9 [XCOFF] emit .rename for .lcomm when necessary
Summary:

This is a follow up for D82481. For .lcomm directive, although it's
not necessary to have .rename emitted, it's still desirable to do
it so that we do not see internal 'Rename..' gets print out in
symbol table. And we could have consistent naming between TC entry
and .lcomm. And also have consistent naming between IR and final
object file.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D86075
2020-08-18 15:32:45 +00:00
Simon Pilgrim
f9607d1829 [X86] Regenerate load-slice test labels. NFCI.
Pulled out a superfluous diff from D66004
2020-08-18 16:08:35 +01:00
David Green
805498d035 [LV] Predicated reduction tests. NFC 2020-08-18 16:02:21 +01:00
Simon Pilgrim
aaddcf1562 [X86][AVX] lowerShuffleWithPERMV - pad 128/256-bit shuffles on non-VLX targets
Allow non-VLX targets to use 512-bits VPERMV/VPERMV3 for 128/256-bit shuffles.

TBH I'm not sure these targets actually exist in the wild, but we're testing for them and its good test coverage for shuffle lowering/combines across different subvector widths.
2020-08-18 15:46:02 +01:00
Simon Pilgrim
5b3aace9c9 [X86][AVX] lowerShuffleWithVTRUNC - extend to support v16i16/v32i8 binary shuffles.
This requires a few additional SrcVT vs DstVT padding cases in getAVX512TruncNode.
2020-08-18 15:30:02 +01:00
Sanjay Patel
b8e421a05f [SLP] remove instcombine dependency from regression test; NFC
InstCombine doesn't do that much here - sinks some instructions
and improves alignments - but that should not be part of the
SLP pass unit testing.
2020-08-18 10:18:22 -04:00
Simon Pilgrim
bc23e1d476 [X86][AVX] lowerShuffleWithVTRUNC - pull out TRUNCATE/VTRUNC creation into helper code. NFCI.
Prep work toward adding v16i16/v32i8 support for lowerShuffleWithVTRUNC and improving lowerShuffleWithVPMOV.
2020-08-18 14:52:42 +01:00
Matt Arsenault
e8f21df36b AMDGPU/GlobalISel: Select llvm.amdgcn.groupstaticsize
Previously, it would successfully select and assert if not HSA or PAL
when expanding the pseudoinstruction. We don't need the
pseudoinstruction anymore since we know the total size after
legalization.
2020-08-18 09:28:01 -04:00
Matt Arsenault
441f0f056c AMDGPU/GlobalISel: Fix selection of s1/s16 G_[F]CONSTANT
The code to determine the value size was overcomplicated and only
correct in the case where the result register already had a register
class assigned. We can always take the size directly from the
register's type.
2020-08-18 09:28:01 -04:00
Georgii Rymar
dfdc7b8a6d [llvm-readobj/elf] - Refine testing of broken Android's packed relocation sections.
This uses modern `split-file` tool to merge 5 `packed-relocs-error*.s` tests to a
new `packed-relocs-errors.s` and adds testing for GNU style.

Differential revision: https://reviews.llvm.org/D85835
2020-08-18 16:23:41 +03:00
Sanjay Patel
e60caae143 [InstCombine] fold fabs of select with negated operand
This is the FP example shown in:
https://bugs.llvm.org/PR39474
2020-08-18 09:23:07 -04:00
Sanjay Patel
5005da6598 [InstCombine] add tests for fneg+fabs; NFC 2020-08-18 09:23:07 -04:00
Georgii Rymar
7dc8b6931c [yaml2obj] - Don't crash when FileHeader declares an empty Flags key in specific situations.
We currently call the `llvm_unreachable` for the following YAML:

```
--- !ELF
FileHeader:
  Class:   ELFCLASS32
  Data:    ELFDATA2LSB
  Type:    ET_REL
  Machine: EM_NONE
  Flags:   [ ]
```

it happens because the `Flags` key is present, though `EM_NONE` is a
machine type that has no known `EF_*` values and we call `llvm_unreachable` by mistake.

Differential revision: https://reviews.llvm.org/D86138
2020-08-18 16:09:28 +03:00
Ronak Chauhan
6e3663ae70 [ELF] Hide target specific methods as private
Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D86136
2020-08-18 18:26:08 +05:30
Simon Pilgrim
b682d6311b [X86][AVX] lowerShuffleWithVTRUNC - avoid unnecessary division in element counts. NFCI.
(256 / SrcEltBits) == ((2 * EltSizeInBits * NumElts) / (EltSizeInBits * Scale)) == (2 * (NumElts / Scale)) == NumSrcElts
2020-08-18 13:48:22 +01:00
Nico Weber
7f41844cac Revert "PR44685: DebugInfo: Handle address-use-invalid type units referencing non-type units"
This reverts commit be3ef93bf58aa5546c7baadfb21d43b75fbb4e24.
Test fails on macOS and Windows, e.g. http://45.33.8.238/win/22216/step_11.txt
2020-08-18 08:40:36 -04:00
Ronak Chauhan
0f95014e38 [llvm-objdump][AMDGPU] Detect CPU string
AMDGPU ISA isn't backwards compatible and hence -mcpu must always be specified during disassembly.
However, the AMDGPU target CPU is stored in e_flags in the ELF object.

This patch allows targets to implement CPU string detection, and also implements it for AMDGPU by looking at e_flags.

Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D84519
2020-08-18 17:43:16 +05:30
Paul Walker
17b7f05023 [SVE] Fix shift-by-imm patterns used by asr, lsl & lsr intrinsics.
Right shift patterns will no longer incorrectly accept a shift
amount of zero.  At the same time they will allow larger shift
amounts that are now saturated to their upper bound.

Patterns have been extended to enable immediate forms for shifts
taking an arbitrary predicate.

This patch also unifies the code path for immediate parsing so the
i64 based shifts are no longer treated specially.

Differential Revision: https://reviews.llvm.org/D86084
2020-08-18 11:41:26 +01:00
Sam Parker
ba17d3abc4 [NFC] Add some more Arm tests for IndVarSimplify
Copy some generic functions and apply minsize for arm.
2020-08-18 11:24:35 +01:00
Paul Walker
9d2dcdd5f6 [SVE] Lower fixed length vector ISD::SPLAT_VECTOR operations.
Also strengthens the CHECK lines for scalable vector splat tests.

Differential Revision: https://reviews.llvm.org/D86070
2020-08-18 11:19:43 +01:00
Simon Pilgrim
17226c7625 [X86][AVX] Lower v16i8/v8i16 binary shuffles using VTRUNC/TRUNCATE
This patch adds lowerShuffleWithVTRUNC to handle basic binary shuffles that can be lowered either as a pure ISD::TRUNCATE or a X86ISD::VTRUNC (with undef/zero values in the remaining upper elements).

We concat the binary sources together into a single 256-bit source vector. To avoid regressions we perform this after we've tried to lower with PACKS/PACKUS which typically does a cleaner job than a concat.

For non-AVX512VL cases we have to canonicalize VTRUNC cases to use a 512-bit source vectors (inserting undefs/zeros in the upper elements as necessary), truncate and then (possibly) extract the 128-bit result.

This should address the last regressions in D66004

Differential Revision: https://reviews.llvm.org/D86093
2020-08-18 11:11:58 +01:00
QingShan Zhang
3efdf8c410 [Test][NFC] Add a new test to verify if scheduler can cluster two ld/st
even with different preds
2020-08-18 09:42:15 +00:00
LLVM GN Syncbot
1548f29af9 [gn build] Port 00d7b7d014f 2020-08-18 09:10:43 +00:00
Shinji Okumura
8816b1755f [Attributor] Deduce noundef attribute
This patch introduces a new abstract attribute `AANoUndef` which corresponds to `noundef` IR attribute and deduce them.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D85184
2020-08-18 18:05:54 +09:00
Georgii Rymar
5d42b65c1b [llvm-readobj/elf] - Refine the malformed-pt-dynamic.test.
This is splitted out from D85519, but significantly reworked.

Changes:
1) This test was changed to stop using python.
2) Use NoHeaders: true instead of `llvm-objcopy --strip-sections`.
3) Test llvm-readelf too (not just llvm-readobj).
4) Simplify the YAML used a bit (e.g. remove PT_LOAD).
5) Test 2 different cases: objects with section header table and without.

Differential revision: https://reviews.llvm.org/D86073
2020-08-18 11:51:03 +03:00
Georgii Rymar
c7885d9767 [llvm-readobj/elf] - Merge mips-got-overlapped.test to mips-got.test and refine testing.
The `mips-got-overlapped.test` was introduced in D16968 and its intention is
to check that when there is an empty section at the same address as `.got`,
then we are able to locate `.got` and dump it.

The issue is that this test does not test llvm-readelf and uses a precompiled
object. This path starts using YAML instead and merges
mips-got-overlapped.test to mips-got.test.

Differential revision: https://reviews.llvm.org/D86080
2020-08-18 11:37:34 +03:00
David Blaikie
b29b31a4a9 PR44685: DebugInfo: Handle address-use-invalid type units referencing non-type units
Theory was that we should never reach a non-type unit (eg: type in an
anonymous namespace) when we're already in the invalid "encountered an
address-use, so stop emitting types for now, until we throw out the
whole type tree to restart emitting in non-type unit" state. But that's
not the case (prior commit cleaned up one reason this wasn't exposed
sooner - but also makes it easier to test/demonstrate this issue)
2020-08-17 21:42:00 -07:00
David Blaikie
cafedbe4d6 DebugInfo: Emit class template parameters first, before members
This reads more like what you'd expect the DWARF to look like (from the
lexical order of C++ - template parameters come before members, etc),
and also happens to make it easier to tickle (& thus test) a bug related
to type units and Split DWARF I'm about to fix.
2020-08-17 21:42:00 -07:00
Johannes Doerfert
4cf2813246 [Attributor] Bail early if AAMemoryLocation cannot derive anything
Before this change we looked through all memory operations in a function
even if the first was an unknown call that could do anything. This did
cost a lot of time but there is little use to do so. We also avoid
creating AAs for things that we would have looked at in case no other AA
will; that is the reason for the test changes.

Running only the attributor-cgscc pass on a IR version of
`llvm-test-suite/MultiSource/Applications/SPASS/clause.c` reduced the
time we spend in `AAMemoryLocation::update` from 4% total to
0.9% (disclaimer: no accurate measurements).
2020-08-17 23:36:36 -05:00
Johannes Doerfert
827bd088ab [Attributor] We (should) keep the CG updated so we can mark it as preserved 2020-08-17 23:36:36 -05:00
Johannes Doerfert
117fb9d08e [Attributor][NFC] Directly return proper type to avoid casts 2020-08-17 23:36:36 -05:00
Johannes Doerfert
b94c4517c7 [Attributor][FIX] Handle function pointers properly in AANonNull
Before we tired to create a dominator tree for a declaration when we
wanted to determine if the function pointer is `nonnull`. We now avoid
looking at global values if `Value::getPointerDereferenceableBytes` not
already determined `nonnull`.
2020-08-17 23:36:35 -05:00
Harmen Stoppels
cca791c2df Use find_library for ncurses
Currently it is hard to avoid having LLVM link to the system install of
ncurses, since it uses check_library_exists to find e.g. libtinfo and
not find_library or find_package.

With this change the ncurses lib is found with find_library, which also
considers CMAKE_PREFIX_PATH. This solves an issue for the spack package
manager, where we want to use the zlib installed by spack, and spack
provides the CMAKE_PREFIX_PATH for it.

This is a similar change as https://reviews.llvm.org/D79219, which just
landed in master.

Differential revision: https://reviews.llvm.org/D85820
2020-08-17 19:52:52 -07:00
Amy Kwan
c630ca1f5c [PowerPC] Implement Vector Extract Mask builtins in LLVM/Clang
This patch implements the vec_extractm function prototypes in altivec.h in
order to utilize the vector extract with mask instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82675
2020-08-17 21:14:17 -05:00
Arthur Eubanks
60ce7a3cb9 [NewPM] Pin various tests under Other/ to legacy PM
These all are legacy PM-specific or have a corresponding NPM RUN line.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D86124
2020-08-17 18:58:08 -07:00
Hamilton Tobon Mosquera
59a4de0434 [OpenMPOpt][HideMemTransfersLatency] Split __tgt_target_data_begin_mapper into its "issue" and "wait" counterparts.
WIP that tries to hide the latency of runtime calls that involve host to
device memory transfers by splitting them into their "issue" and "wait"
versions. The "issue" is moved upwards as much as possible. The "wait" is
moved downards as much as possible. The "issue" issues the memory transfer
asynchronously, returning a handle. The "wait" waits in the returned
handle for the memory transfer to finish. We still lack of the movement.
2020-08-17 20:56:10 -05:00
Hongtao Yu
029f749bc7 [llvm-objdump] Attempt to fix html doc generation issue.
https://reviews.llvm.org/D84191 caused a html doc build issue with the changes in `llvm-objdump.rst`. It looks like a blank line is missing from the `code-block` directives.

Test Plan:

Differential Revision: https://reviews.llvm.org/D86123
2020-08-17 18:06:22 -07:00
Aditya Kumar
758be46c16 NFC: [GVNHoist] Outline functions from the class
Reviewers: sebpop
Reviewed By: hiraditya

Differential Revision: https://reviews.llvm.org/D86032
2020-08-17 17:40:04 -07:00
Craig Topper
949794f1c8 [X86] When manually creating intrinsic nodes in X86ISelLowering, make sure we use getTargetConstant and pointer type for the intrinsic ID.
Doesn't really matter in practice but that's how the nodes are
normally created by SelectionDAGBuilder. So we should match.

Found by temporarily hacking type checks into isel table.
2020-08-17 17:25:53 -07:00
Craig Topper
6fa69f2b25 [X86] Rename INTR_TYPE_4OP to INTR_TYPE_4OP_IMM8 and truncate immediates to MVT::i8
This makes sure VPTERNLOG is generated with MVT::i8 immediate
as its SDNode declaration in X86InstrFragmentsSIMD.td declares.
2020-08-17 17:25:52 -07:00
Craig Topper
96e6eb3fdc [X86] Truncate immediate to i8 for INTR_TYPE_3OP_IMM8
This is used for DBPSADBW which has a i32 immediate for its
intrinsic and an i8 immediate in tablegen isel patterns.
2020-08-17 17:25:51 -07:00
Craig Topper
ba2ac944b3 [X86] Make PreprocessISelDAG create X86ISD::VRNDSCALE nodes with i32 constants instead of i8.
This is the type declared in X86InstrFragmentsSIMD.td. ISel pattern
matching doesn't check so it doesn't matter in practice. Maybe for
SelectionDAG CSE it would matter.
2020-08-17 17:25:51 -07:00
Mircea Trofin
3359e4e021 [MLInliner] In development mode, obtain the output specs from a file
Different training algorithms may produce models that, besides the main
policy output (i.e. inline/don't inline), produce additional outputs
that are necessary for the next training stage. To facilitate this, in
development mode, we require the training policy infrastructure produce
a description of the outputs that are interesting to it, in the form of
a JSON file. We special-case the first entry in the JSON file as the
inlining decision - we care about its value, so we can guide inlining
during training - but treat the rest as opaque data that we just copy
over to the training log.

Differential Revision: https://reviews.llvm.org/D85674
2020-08-17 16:56:47 -07:00
Hongtao Yu
43bf988191 [llvm-objdump] Symbolize binary addresses for low-noisy asm diff.
When diffing disassembly dump of two binaries, I see lots of noises from mismatched jump target addresses and global data references, which unnecessarily causes diffs on every function, making it impractical. I'm trying to symbolize the raw binary addresses to minimize the diff noise.
In this change, a local branch target is modeled as a label and the branch target operand will simply be printed as a label. Local labels are collected by a separate pre-decoding pass beforehand. A global data memory operand will be printed as a global symbol instead of the raw data address. Unfortunately, due to the way the disassembler is set up and to be less intrusive, a global symbol is always printed as the last operand of a memory access instruction. This is less than ideal but is probably acceptable from checking code quality point of view since on most targets an instruction can have at most one memory operand.

So far only the X86 disassemblers are supported.

Test Plan:

llvm-objdump -d  --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr :
```
Disassembly of section .text:

<_start>:
               	push	rax
               	mov	dword ptr [rsp + 4], 0
               	mov	dword ptr [rsp], 0
               	mov	eax, dword ptr [rsp]
               	cmp	eax, dword ptr [rip + 4112]  # 202182 <g>
               	jge	0x20117e <_start+0x25>
               	call	0x201158 <foo>
               	inc	dword ptr [rsp]
               	jmp	0x201169 <_start+0x10>
               	xor	eax, eax
               	pop	rcx
               	ret
```

llvm-objdump -d  **--symbolize-operands** --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr :
```
Disassembly of section .text:

<_start>:
               	push	rax
               	mov	dword ptr [rsp + 4], 0
               	mov	dword ptr [rsp], 0
<L1>:
               	mov	eax, dword ptr [rsp]
               	cmp	eax, dword ptr  <g>
               	jge	 <L0>
               	call	 <foo>
               	inc	dword ptr [rsp]
               	jmp	 <L1>
<L0>:
               	xor	eax, eax
               	pop	rcx
               	ret
```

Note that the jump instructions like `jge 0x20117e <_start+0x25>` without this work is printed as a real target address and an offset from the leading symbol. With a change in the optimizer that adds/deletes an instruction, the address and offset may shift for targets placed after the instruction. This will be a problem when diffing the disassembly from two optimizers where there are unnecessary false positives due to such branch target address changes. With `--symbolize-operand`, a label is printed for a branch target instead to reduce the false positives. Similarly, the disassemble of PC-relative global variable references is also prone to instruction insertion/deletion.

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D84191
2020-08-17 16:55:12 -07:00
Johannes Doerfert
4c565c83b8 [Attributor] Properly use the call site argument position 2020-08-17 18:21:09 -05:00