1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00
Commit Graph

212219 Commits

Author SHA1 Message Date
Fangrui Song
6f69fc9933 [rs4gc] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds 2021-03-06 11:42:27 -08:00
Roman Lebedev
eb30ce19c4 [NFCI] SCEVExpander: emit intrinsics for integral {u,s}{min,max} SCEV expressions
These intrinsics, not the icmp+select are the canonical form nowadays,
so we might as well directly emit them.

This should not cause any regressions, but if it does,
then then they would needed to be fixed regardless.

Note that this doesn't deal with `SCEVExpander::isHighCostExpansion()`,
but that is a pessimization, not a correctness issue.

Additionally, the non-intrinsic form has issues with undef,
see https://reviews.llvm.org/D88287#2587863
2021-03-06 21:52:46 +03:00
Sean Fertile
551e5c7de1 [PowePC][AIX] Handle variadic vector call operands.
Patch adds support for passing vector call operands to variadic
functions. Arguments which are fixed shadow GPRs and stack space even
when they are passed in vector registers, while arguments passed through
ellipses are passed in properly aligned GPRs if available and on the
stack once all GPR arguments registers are consumed.

Differential Revision: https://reviews.llvm.org/D97956
2021-03-06 13:49:55 -05:00
Ta-Wei Tu
87e3ae17fd [NPM] Add -enable-loopinterchange option to NPM
We have the `enable-loopinterchange` option in legacy pass manager but not in NPM.
Add `LoopInterchange` pass to the optimization pipeline (at the same position as before)
when `enable-loopinterchange` is turned on.

Reviewed By: aeubanks, fhahn

Differential Revision: https://reviews.llvm.org/D98116
2021-03-07 02:39:28 +08:00
William S. Moses
3600c78d3c [Attributor] Enable heap-to-stack of any size
Enable Attributor's heap-to-stack to lower unbounded allocations given a max size of -1

Differential Revision: https://reviews.llvm.org/D97873
2021-03-06 12:57:32 -05:00
Philip Reames
2c93a90304 [tests] Update an autogen test for format change 2021-03-06 09:49:27 -08:00
Philip Reames
83f044db4a [gvn] Handle simply phi equivalence cases
GVN basically doesn't handle phi nodes at all. This is for a reason - we can't value number their inputs since the predecessor blocks have probably not been visited yet.

However, it also creates a significant pass ordering problem. As it stands, instcombine and simplifycfg ends up implementing CSE of phi nodes. This means that for any series of CSE opportunities intermixed with phi nodes, we end up having to alternate instcombine/simplifycfg and gvn to make progress.

This patch handles the simplest case by simply preprocessing the phi instructions in a block, and CSEing them if they are syntactically identical. This turns out to be powerful enough to handle many cases in a single invocation of GVN since blocks which use the cse'd phi results are visited after the block containing the phi. If there's a CSE opportunity in one the phi predecessors required to recognize the phi CSE opportunity, that will require a second iteration on the function. (Still within a single run of gvn though.)

Compile time wise, this could go either way. On one hand, we're potentially causing GVN to iterate over the function more. On the other, we're cutting down on iterations between two passes and potentially shrinking the IR aggressively. So, a bit unclear what to expect.

Note that this does still rely on instcombine to canonicalize block order of the phis, but that's a one time transformation independent of the values incoming to the phi.

Differential Revision: https://reviews.llvm.org/D98080
2021-03-06 09:31:12 -08:00
Philip Reames
84c1d0a603 [rs4gc/tests] Remove use of internal debug flags
As a pragmatic tradeoff, the ease of updating the tests outweighs the slightly easier to understand test conditions.  Where revevant, debug output was converted to comments to help human understanding.
2021-03-06 09:20:02 -08:00
Philip Reames
4ba57ecf74 [rs4gc] autogen a bunch of tests for ease of update 2021-03-06 09:04:00 -08:00
Philip Reames
5b78c86a21 [rs4gc] track the original value in the state use for base pointer rewriting
I'd originally intended to build on this for another purpose and have decided not to, but at a minimum, the stronger asserts are useful.
2021-03-06 08:46:15 -08:00
Philip Reames
bddbaa0fa3 [rs4gc] minor code style improvement 2021-03-06 08:46:15 -08:00
Nikita Popov
fbb0e66968 [Loads] Restructure getAvailableLoadStore implementation (NFC)
Separate out some conditions with early exits, to make it easier to
support additional cases.
2021-03-06 16:58:11 +01:00
Nikita Popov
3182b065d2 [InstCombine] Add tests for non-trivial store to load forward (NFC)
Examples of things we mostly don't handle.
2021-03-06 16:58:11 +01:00
Alexey Lapshin
468d967ea8 [X86][VARARG] Avoid spilling xmm registers for va_start.
That review is extracted from D69372.
It fixes https://bugs.llvm.org/show_bug.cgi?id=42219 bug.

For the noimplicitfloat mode, the compiler mustn't generate
floating-point code if it was not asked directly to do so.
This rule does not work with variable function arguments currently.
Though compiler correctly guards block of code, which copies xmm vararg
parameters with a check for %al, it does not protect spills for xmm registers.
Thus, such spills are generated in non-protected areas and could break code,
which does not expect floating-point data. The problem happens in -O0
optimization mode. With this optimization level there is used
FastRegisterAllocator, which spills virtual registers at basic block boundaries.
Register Allocator does not protect spills with additional control-flow modifications.
Thus to resolve that problem, it is suggested to not copy incoming physical
registers into virtual registers. Instead, store incoming physical xmm registers
into the memory from scratch.

Differential Revision: https://reviews.llvm.org/D80163
2021-03-06 15:25:47 +03:00
Nikita Popov
fa0178256e [ConstantFold] Handle vectors in ConstantFoldLoadThroughBitcast()
There seems to be an impedance mismatch between what the type
system considers an aggregate (structs and arrays) and what
constants consider an aggregate (structs, arrays and vectors).

Rather than adjusting the type check, simply drop it entirely,
as getAggregateElement() is well-defined for non-aggregates: It
simply returns null in that case.
2021-03-06 12:17:56 +01:00
Nikita Popov
d3cd0c6c1e [GVN] Regenerate test checks (NFC) 2021-03-06 12:11:16 +01:00
Juneyoung Lee
5caecc36c3 [LangRef] dos2unix (NFC) 2021-03-06 18:44:40 +09:00
Nikita Popov
9f2cfa55ab [LVI] Simplify and generalize handling of clamp patterns
Instead of handling a number of special cases for selects, handle
this generally when inferring ranges from conditions. We already
infer ranges from `x + C pred C2` to `x`, so doing the same for
`x pred C2` to `x + C` is straightforward.
2021-03-06 10:42:41 +01:00
Nikita Popov
4eecbe8800 [CVP] Add additional tests for clamp patterns (NFC)
These are the same as the existing tests, but using different
predicates that are not handled by the current code.
2021-03-06 10:42:40 +01:00
Raul Tambre
8f820ce4a5 [runtimes] Fix crosscompiling after a7cad6680b4087eff8994f1f99ac40c661a6621f (D97451)
It moved the logic for CMake target arguments into llvm_ExternalProject_Add().
No handling was added for CMAKE_CROSSCOMPILING, which has a separate set of compiler_args.
This broke crosscompiling, as now the runtimes builds defaulted to the compiler's default.

I've also added passing of CMAKE_ASM_COMPILER, which was missing before although we were passing the triple for it.

Reviewed By: zero9178

Differential Revision: https://reviews.llvm.org/D97855
2021-03-06 11:35:14 +02:00
Nikita Popov
634e7e2d73 [LVI] Pass offset by reference (NFC)
Instead of by pointer. This allows us to use offsets that are not
materialized in the IR.
2021-03-06 10:24:44 +01:00
Nikita Popov
7fce4c6c26 [CVP] Fix tests for clamp patterns (NFC)
These tests didn't test the pattern they were supposed to, because
%a instead of %add was used in the select, which turned this into
a normal min/max).

Noticed this when commenting out the clamp handling code did not
result in any test failures...
2021-03-06 10:24:44 +01:00
Jay Foad
f22c8bd800 Revert "Revert "[AMDGPU] Restore the s_memtime instruction in gfx1030""
This reverts commit e58d68fcd06ddc7743e0419c0b364df3d44121b6.

This reinstates commit fc28f600e558c1344618bda149a068d6162b6f0b
with a fix to initialize HasShaderCyclesRegister. See
https://reviews.llvm.org/D97928.
2021-03-06 09:00:01 +00:00
Fangrui Song
f13eb14aac [MC][RISCV] Support .reloc *, BFD_RELOC_{NONE,32,64}, *
BFD_RELOC_NONE is useful for ld --gc-sections: it provides a generic way indicating a dependency between two sections.
2021-03-05 21:45:11 -08:00
Fangrui Song
6bc3220cd9 [MC][ARM] Support .reloc *, BFD_RELOC_{NONE,8,16,32}, *
BFD_RELOC_NONE is useful for ld --gc-sections: it provides a generic way indicating a dependency between two sections.
2021-03-05 21:39:16 -08:00
Fangrui Song
b781b17cdc [MC][test] Fix reloc-directive-elf-*.s 2021-03-05 21:37:29 -08:00
Fangrui Song
3abc3e0ea7 [MC][PowerPC] Support .reloc *, BFD_RELOC_{NONE,16,32,64}, *
BFD_RELOC_NONE is useful for ld --gc-sections: it provides a generic way indicating a dependency between two sections.
2021-03-05 21:31:45 -08:00
Fangrui Song
c0f3b9dd4e [MC][AArch64] Support .reloc *, BFD_RELOC_{NONE,16,32,64}, *
BFD_RELOC_NONE is useful for ld --gc-sections: it provides a generic way indicating a dependency between two sections.
2021-03-05 21:31:08 -08:00
Fangrui Song
d0c3033896 [MC][X86] Support .reloc *, BFD_RELOC_{NONE,8,16,32,64}, *
The names are unfortunate, but BFD_RELOC_NONE provides a generic way indicating
a dependency between two sections, which is useful for ld --gc-sections.
See https://sourceware.org/bugzilla/show_bug.cgi?id=27530
2021-03-05 21:31:05 -08:00
Mitch Phillips
073a491a84 Revert "[AMDGPU] Restore the s_memtime instruction in gfx1030"
Broke the ASan/MSan buildbots. See more comments in the original patch,
https://reviews.llvm.org/D97928.

Build failure at http://lab.llvm.org:8011/#/builders/5/builds/5327

This reverts commit fc28f600e558c1344618bda149a068d6162b6f0b.
2021-03-05 18:24:59 -08:00
Jessica Clarke
310b95be5f [benchmark] Replace references to M680x0 with M68k
The former was the old unusual name of the out-of-tree backend but it
was renamed to M68k during the code review process to conform with how
almost everything refers to the Motorola 68000 family of processors.
Thus, update the comments to avoid confusion when the backend lands.
2021-03-06 01:04:36 +00:00
Philip Reames
9572ae12ef [tests] precommit tests for D98082 2021-03-05 15:21:34 -08:00
Philip Reames
6567af9d6c [tests] precommit tests for phi handling in GVN 2021-03-05 14:24:21 -08:00
Jay Foad
c48f07ac4a [AMDGPU] Restore the s_memtime instruction in gfx1030
gfx1030 added a new way to implement readcyclecounter using the
SHADER_CYCLES hardware register, but the s_memtime instruction still
exists, so the MC layer should still accept it and the
llvm.amdgcn.s.memtime intrinsic should still work.

Differential Revision: https://reviews.llvm.org/D97928
2021-03-05 20:19:11 +00:00
Vy Nguyen
f20ce64ecf Reland 293e8fa13d3f05e993771577a4c022deee5cbf6e
[llvm-exegesis] Disable the LBR check on AMD

    https://bugs.llvm.org/show_bug.cgi?id=48918

    The bug reported a hang (or very very slow runtime) on a Zen2. Unfortunately, we don't have the hardware right now to debug it and I was not able to reproduce the bug on a HSW.
    Theory we've got is that the lbr-checking code could be confused on AMD.

    Differential Revision: https://reviews.llvm.org/D97504

New change:
 - Surround usages of x86 helper in llvm-exegesis/X86/Target.cpp with ifdef
 - Fix bug which caused the caller of getVendorSignature to not have a copy of EAX that it expected.
2021-03-05 13:23:42 -05:00
Philip Reames
551c0ac82d [gvn] CSE gc.relocates based on meaning, not spelling
The last two operands to a gc.relocate represent indices into the associated gc.statepoint's gc bundle list. (Effectively, gc.relocates are projections from the gc.statepoints multiple return values.)

We can use this to recognize when two gc.relocates are equivalent (and can be CSEd), even when the indices are non-equal. This is particular useful when considering a chain of multiple statepoints as it lets us eliminate all duplicate gc.relocates in a single pass.

Differential Revision: https://reviews.llvm.org/D97974

(Note: Part of the reviewed change was split and landed as f352463a)
2021-03-05 10:16:12 -08:00
Philip Reames
cc9b819224 Mark gc.relocate and gc.result as readnone
For some reason, we had been marking gc.relocates as reading memory. There's no known reason for this, and I suspect it to be a legacy of very early implementation conservatism.  gc.relocate and gc.result are simply projections of the return values from the associated statepoint.  Note that the LangRef has always declared them readnone.

The EarlyCSE change is simply moving the special casing from readonly to readnone handling.

As noted by the test diffs, this does allow some additional CSE when relocates are separated by stores, but since we generate gc.relocates in batches, this is unlikely to help anything in practice.

This was reviewed as part of https://reviews.llvm.org/D97974, but split at reviewer request before landing.  The motivation is to enable the GVN changes in that patch.
2021-03-05 10:07:17 -08:00
Philip Reames
07baf1e7c4 [tests] precommit some additional tests for D97974 2021-03-05 10:04:07 -08:00
Philip Reames
05b84ca7e8 [rs4gc] avoid insert base computation instructions for deopt uses
If we have a value live over a call which is used for deopt at the call, we know that the value must be a base pointer. We can avoid potentially inserting IR to materialize a base for this value.

In it's current form, this is mostly a compile time optimization.   Building the base pointer graph (and then optimizing it away again) is a relatively expensive operation.  We also sometimes end up with better codegen in practice - due to failures in optimizing away the inserted base pointer propogation - but those are optimization bugs we're fixing concurrently.

The alternative to this would be to extend the base pointer inference with the ability to generally reuse multiple-base input instructions (phis and selects).  That's somewhat invasive and complicated, so we're defering it a bit longer.

Differential Revision: https://reviews.llvm.org/D97885
2021-03-05 09:55:36 -08:00
Zarko Todorovski
d3df4db91e [PowerPC][AIX] Enable the default AltiVec ABI on AIX
This patch adds support for the default AltiVec ABI for AIX.

Vector registers 20 through 31 are marked as reserved and cannot
be used in the default ABI. This patch adds handling for this case
and also remove the default AltiVec ABI errors.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D96351
2021-03-05 12:46:27 -05:00
Andrzej Warzynski
bed7df6e8b [Utils] Add missing attributes in syntax files
Added the following attributes to all LLVM syntax files:
  * allocsize
  * cold
  * convergent
  * dereferenceable_or_null
  * hot
  * inaccessiblemem_or_argmemonly
  * inaccessiblememonly
  * inalloca
  * jumptable
  * nocallback
  * nocf_check
  * noduplicate
  * nofree
  * nomerge
  * noprofile
  * nosync
  * null_pointer_is_valid
  * optforfuzzing
  * preallocated
  * safestack
  * sanitize_hwaddress
  * sanitize_memtag
  * shadowcallstack
  * speculative_load_hardening
  * swifterror
  * syncscope
  * tailcc
  * willreturn

I generated that list by comparing:
  * Attributes.inc (generated from Attributes.td), and
  * the Vim syntax file: llvm/utils/vim/syntax/llvm.vim

My original intention was to focus on the Vim syntax file. Since other
syntax files are also out-of-date, I added these attributes (if missing)
to other files as well. Note that in the other sytnax files (i.e. for
Emacs, VScode and Kate), there will be other attributes missing too.

I've also sorted all attributes alphabetically. Otherwise it's really
hard to automate adding new attributes. And I think that it was the
original intent to keep all of them ordered alphabetically.

Differential Revision: https://reviews.llvm.org/D97627
2021-03-05 17:36:09 +00:00
LemonBoy
f00c577e2d [LegalizeDAG] Implement promotion rules for SELECT_CC
Implement the promotion rule for SELECT_CC nodes by upcasting all the parameters and downcasting the result.
The AArch64 target makes use of this rule and, since it was not implemented, in some cases the instruction selector would hit an assertion upon encountering the illegal node.

This patch requires D97840, the included test cases hit both problems.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97859
2021-03-05 18:22:55 +01:00
RamNalamothu
b6b6e47136 [AMDGPU] Do not attempt sgpr spills to vgpr, when it is disabled
This covers a path missed in https://reviews.llvm.org/D95768.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D98013
2021-03-05 22:47:21 +05:30
Nico Weber
cef0f85d70 [gn build] allow setting clang_base_path to a source-absolute path
With this, you can set `clang_base_path = "//out/gn1"` in `out/gn2/args.gn` and
the build in out/gn2 will use clang and lld from out/gn1.

Setting `clang_base_path` to an absolute path (with e.g.
`clang_base_path = getenv("HOME") + "/src/..."`) should behave as before.

Differential Revision: https://reviews.llvm.org/D97989
2021-03-05 12:13:51 -05:00
Jinsong Ji
0026d38644 [PowerPC] Update Copy/Paste encodings according to ISA3.1
Copy-paste P9 insns were added back in 2016,
however, looks like the opcodes has changed in ISA3.1.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D97416
2021-03-05 17:05:50 +00:00
gbtozers
c52cf11f42 [DebugInfo] Add DIArgList MD to store multple values in DbgVariableIntrinsics
This patch adds a new metadata node, DIArgList, which contains a list of SSA
values. This node is in many ways similar in function to the existing
ValueAsMetadata node, with the difference being that it tracks a list instead of
a single value. Internally, it uses ValueAsMetadata to track the individual
values, but there is also a reasonable amount of DIArgList-specific
value-tracking logic on top of that. Similar to ValueAsMetadata, it is a special
case in parsing and printing due to the fact that it requires a function state
(as it may reference function-local values).

This patch should not result in any immediate functional change; it allows for
DIArgLists to be parsed and printed, but debug variable intrinsics do not yet
recognize them as a valid argument (outside of parsing).

Differential Revision: https://reviews.llvm.org/D88175
2021-03-05 17:02:24 +00:00
Simon Pilgrim
5ac93b57f2 [X86] X86ISelDAGToDAG.cpp - include cstdint instead of stdint.h NFCI.
Fixes clang-tidy warning
2021-03-05 15:58:20 +00:00
Simon Pilgrim
f3b4ff0bf8 [X86] X86DAGToDAGISel::Select - merge X86::TEST load bitsize checks. NFCI. 2021-03-05 15:58:20 +00:00
LemonBoy
dd61803fe7 [AArch64] Legalize horizontal fmax/fmin reductions on f16 vectors
Expand the horizontal reduction during the instruction selection phase, but only if the target doesn't support the full fp16 instruction set.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49401

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D97840
2021-03-05 16:09:37 +01:00
Ilya Leoshkevich
13dd94d4c3 [BPF] Add support for floats and doubles
Some BPF programs compiled on s390 fail to load, because s390
arch-specific linux headers contain float and double types. At the
moment there is no BTF_KIND for floats and doubles, so the release
version of LLVM ends up emitting type id 0 for them, which the
in-kernel verifier does not accept.

Introduce support for such types to libbpf by representing them using
the new BTF_KIND_FLOAT.

Reviewed By: yonghong-song

Differential Revision: https://reviews.llvm.org/D83289
2021-03-05 15:10:11 +01:00